Managed cache store for mirrored repository #1549

Open
solonovamax opened this issue Sep 20, 2022 · 0 comments
Labels
feature New feature request

Request details

When mirroring a repository with Reposilite, you can enable "store" to cache artifacts on the local server.
However, this cache is unmanaged: it can grow without bound, and Reposilite never prunes infrequently used artifacts. For users concerned about disk space this is a problem, because there is no easy way to identify which artifacts take up a lot of space but are rarely accessed, so that they can be deleted.

The limit on stored artifacts would be determined by the following criteria:

  • Limit the cache to at most x artifacts.
  • Limit the cache to a disk space quota.
  • Keep only artifacts that were downloaded in the past x days.
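
The three limits above could be modelled as optional settings that are checked against the current cache contents. This is only a minimal sketch: `CacheLimits`, `CachedArtifact`, and `exceeded_limits` are hypothetical names, not part of Reposilite, and "quota" is assumed to mean a size in bytes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CachedArtifact:
    gav: str                # group:artifact:version identifier
    size_bytes: int         # size of the stored files on disk
    last_access: datetime   # time of the most recent download


@dataclass
class CacheLimits:
    # Hypothetical settings; None means the limit is disabled.
    max_artifacts: Optional[int] = None
    max_bytes: Optional[int] = None
    max_age_days: Optional[int] = None


def exceeded_limits(artifacts: List[CachedArtifact],
                    limits: CacheLimits,
                    now: datetime) -> List[str]:
    """Return the names of the limits the cache currently violates."""
    exceeded = []
    if limits.max_artifacts is not None and len(artifacts) > limits.max_artifacts:
        exceeded.append("max_artifacts")
    total_bytes = sum(a.size_bytes for a in artifacts)
    if limits.max_bytes is not None and total_bytes > limits.max_bytes:
        exceeded.append("max_bytes")
    if limits.max_age_days is not None:
        cutoff = now - timedelta(days=limits.max_age_days)
        # Any artifact last accessed before the cutoff violates the age limit.
        if any(a.last_access < cutoff for a in artifacts):
            exceeded.append("max_age_days")
    return exceeded
```

Making each limit optional lets an administrator combine them freely, for example a size quota together with a maximum age.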

Artifacts to remove could be selected by the following criteria:

  • Longest time since last access.
  • A formula combining time since last access and number of downloads, either user-customizable or hard-coded (for example: days since access * (downloads/10)). The goal is to prioritize deleting artifacts that are both old and infrequently accessed.
  • Number of downloads.
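
As a rough illustration, the three selection criteria can be written as scoring functions where a higher score means "remove first". All names here are hypothetical. The combined score divides by the download count so that frequently downloaded artifacts rank lower; that is an assumption made for this sketch to match the stated goal, not the example formula from the request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ArtifactStats:
    gav: str                  # group:artifact:version identifier
    days_since_access: float  # days since the last download
    downloads: int            # total number of downloads


# Each strategy maps an artifact to a score; higher score = removed earlier.
def by_last_access(a: ArtifactStats) -> float:
    return a.days_since_access


def by_downloads(a: ArtifactStats) -> float:
    return -a.downloads  # fewest downloads gets the highest score


def by_combined(a: ArtifactStats) -> float:
    # Assumed formula: age discounted by popularity (not from the request).
    return a.days_since_access / (1 + a.downloads / 10)


STRATEGIES: Dict[str, Callable[[ArtifactStats], float]] = {
    "last_access": by_last_access,
    "downloads": by_downloads,
    "combined": by_combined,
}


def eviction_order(artifacts: List[ArtifactStats], strategy: str) -> List[str]:
    """Return artifact identifiers ordered from first to last removal."""
    ranked = sorted(artifacts, key=STRATEGIES[strategy], reverse=True)
    return [a.gav for a in ranked]
```

With a user-customizable formula, `by_combined` would be replaced by whatever expression the administrator configures; the ordering logic stays the same.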

From an implementation perspective, statistics on download count and download time are already stored. Another database table could be added to record which artifacts, by group-artifact-version, come from a mirrored repository and which repository they came from. The cache could then be pruned in the following ways:
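
One possible shape for such a table, sketched with Python's built-in `sqlite3` module. The table name `mirrored_artifact`, its columns, and the helper functions are all hypothetical and do not reflect Reposilite's actual schema.

```python
import sqlite3
from typing import List


def create_schema(conn: sqlite3.Connection) -> None:
    # One row per mirrored group-artifact-version, with its source repository.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS mirrored_artifact (
            group_id          TEXT NOT NULL,
            artifact_id       TEXT NOT NULL,
            version           TEXT NOT NULL,
            source_repository TEXT NOT NULL,
            PRIMARY KEY (group_id, artifact_id, version)
        )
        """
    )


def record_mirrored(conn: sqlite3.Connection, group_id: str, artifact_id: str,
                    version: str, source_repository: str) -> None:
    # Called when an artifact is fetched from a mirror and stored locally.
    conn.execute(
        "INSERT OR REPLACE INTO mirrored_artifact VALUES (?, ?, ?, ?)",
        (group_id, artifact_id, version, source_repository),
    )


def mirrored_from(conn: sqlite3.Connection, source_repository: str) -> List[str]:
    """List group:artifact:version identifiers cached from one mirror."""
    rows = conn.execute(
        "SELECT group_id, artifact_id, version FROM mirrored_artifact "
        "WHERE source_repository = ? "
        "ORDER BY group_id, artifact_id, version",
        (source_repository,),
    )
    return [":".join(row) for row in rows]
```

Joining this table with the existing download statistics would give the pruning task everything it needs to rank cached artifacts without touching locally deployed ones.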

  • Run a pruning task at a fixed, predetermined interval.
  • When the limit is reached, schedule a task that prunes artifacts down to ~95% of the limit, leaving room for further downloads before another prune is triggered.
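
The "prune to ~95% of the limit" idea could look roughly like this for a size quota. `select_for_pruning` and its parameters are hypothetical, and the candidate list is assumed to be already ordered with the most removable artifacts first.

```python
from typing import List, Tuple


def select_for_pruning(candidates: List[Tuple[str, int]],
                       limit_bytes: int,
                       target_ratio: float = 0.95) -> List[str]:
    """Pick artifacts to delete once the cache has reached its limit.

    candidates: (identifier, size_bytes) pairs, most removable first.
    Returns the identifiers to delete, in order.
    """
    total = sum(size for _, size in candidates)
    if total < limit_bytes:
        return []  # limit not reached, nothing to do

    # Prune below the limit so new downloads do not immediately re-trigger.
    target = limit_bytes * target_ratio
    removed = []
    for gav, size in candidates:
        if total <= target:
            break
        removed.append(gav)
        total -= size
    return removed
```

The same function works for the interval-based task: it simply returns an empty list whenever the cache is still under its limit.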

@dzikoysk added the feature (New feature request) label and removed the triage label on Sep 25, 2022