Allow setting a database size limit, suspend DHT crawler if exceeded #187

Open
nodiscc opened this issue Feb 29, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@nodiscc

nodiscc commented Feb 29, 2024

  • I have checked the existing issues to avoid duplicates
  • I have redacted any info hashes and content metadata from any logs or screenshots attached to this issue

Is your feature request related to a problem? Please describe

Since the database size will tend to grow indefinitely as more and more torrents are indexed (does it ever stop? once you've indexed the whole DHT? 🤔), I would like to be able to set a limit in the configuration file; once it is exceeded, bitmagnet would stop crawling and just emit a message in the logs on start/every few minutes.

Describe the solution you'd like

dht_crawler:
  db_size_limit: 53687091200 # 50GiB in bytes, could also use human-friendly format 50G

[INFO] DHT crawler suspended, configured maximum database size of 53687091200 bytes reached

Describe alternatives you've considered

Manually disabling the DHT crawler component (by passing --keys=http_server in my systemd .service file) once I get low on disk space. However, a configuration setting would be more "set-and-forget" and could prevent running low on space in the first place (after which manual trimming of the database would be needed to reclaim some disk space, I guess).
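
For illustration, a systemd drop-in along these lines could apply that workaround; the unit name, binary path and "worker run" subcommand are assumptions, only the --keys=http_server flag comes from the setup described above:

# hypothetical /etc/systemd/system/bitmagnet.service.d/no-dht-crawler.conf
[Service]
# reset ExecStart, then start bitmagnet with only the HTTP server worker (no dht_crawler)
ExecStart=
ExecStart=/usr/local/bin/bitmagnet worker run --keys=http_server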

Additional context

Related to #186 and #70

nodiscc added the enhancement label Feb 29, 2024
nodiscc changed the title from "Short description of feature" to "Allow setting a database size limit, suspend DHT crawler if exceeded" Feb 29, 2024
@Technetium1
Contributor

You should get an aggressive notification somewhere that you've hit such a catastrophic failure; I guess that could be done in Grafana as a default.

@nodiscc
Author

nodiscc commented Mar 1, 2024

catastrophic failure

It's not; it's normal operation (respecting configured limits). This deserves an INFO-level log message, at worst a WARNING.

@Technetium1
Contributor

It should absolutely be a warning, as that's a serious problem! If you run out of space, there's potential for logging to fail, among other "catastrophic" things.

@nodiscc
Author

nodiscc commented Mar 2, 2024

We misunderstand each other.

Hitting the maximum configured database size is fine, it needs an INFO message so that the user/admin is not left wondering "why is it not indexing anymore".

Running out of disk space is bad, but it's not bitmagnet's job to warn you about it; that's a job for your monitoring software. Anyway, setting a db size limit in bitmagnet would help prevent this.

@akmad

akmad commented Mar 7, 2024

Rather than just suspending the DHT crawler, it would be nice to instead start purging old (by some definition of "old") resources from the collection. Assuming that this is technically feasible, it would be similar to a rolling log, which can only grow so large before old messages/files are removed. Not to mention, I would expect that there is an inverse relationship between the age of an item in the DHT and the health of the resource.

@nodiscc
Author

nodiscc commented Mar 7, 2024

Just suspending the DHT crawler is fine. Let's not overcomplicate this

old messages/files are removed

I don't expect the application to start losing data without manual intervention. As I said:

manual trimming of the database would be needed to reclaim some disk space, I guess

Database cleanup possibilities can be discussed in another issue.

@mgdigital
Collaborator

Could this be done by splitting the dht_crawler worker out into a separate Docker service, with a custom healthcheck that fails if the size limit is exceeded?
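
A rough sketch of what that could look like in docker-compose; the image tag, command, database credentials and the psql-based size query are all assumptions (and the container would need psql available for this particular check):

services:
  bitmagnet-dht-crawler:
    image: ghcr.io/bitmagnet-io/bitmagnet:latest
    command: worker run --keys=dht_crawler
    healthcheck:
      # report unhealthy once the bitmagnet database exceeds 50 GiB,
      # so an external tool or orchestrator can stop just this service
      test: >
        test "$$(psql -h postgres -U postgres -d bitmagnet -tAc
        "SELECT pg_database_size(current_database())")" -lt 53687091200
      interval: 5m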

@nodiscc
Author

nodiscc commented Mar 10, 2024

separate Docker service

I don't use Docker; I run the binary as a systemd service (see the ansible role here). I could hack together a script that checks the db size and restarts bitmagnet without the dht_crawler if a certain size is exceeded, but... I think the check should be in the application rather than depend on an external mechanism, which feels like a hack.
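
Something along these lines, for example (the database name, credentials, threshold and the crawler-less unit name are all placeholders):

#!/bin/sh
# stop the full bitmagnet service and start a crawler-less variant once the
# database grows past a threshold (both unit names are hypothetical)
LIMIT=53687091200  # 50 GiB
SIZE=$(sudo -u postgres psql -d bitmagnet -tAc "SELECT pg_database_size('bitmagnet')")
if [ "$SIZE" -ge "$LIMIT" ]; then
    systemctl stop bitmagnet.service
    systemctl start bitmagnet-no-crawler.service
fi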

@mgdigital
Collaborator

mgdigital commented Mar 10, 2024

I guess my concern is that any internal behaviour around this could get complex. It isn't currently supported to start/stop/restart individual workers without stopping the whole process - and I don't know if it needs to be, given that some external tool (Docker, Ansible, whatever) would be capable of doing this. Database size can go down as well as up, and having this behaviour trigger during something like a migration, where disk usage spikes and then is recovered, could have unintended effects.

a configuration setting would be more "set-and-forget"

I don't know that this could ever be a "set and forget" thing, as the worker won't be able to either resolve the disk space issue or restart itself, so it will require intervention unless you intend it to be stopped forever once a threshold is reached.

Running out of disk space is bad, but that's not bitmagnet's job to warn you about it, that's a job for your monitoring software.

I agree with this - I think some monitoring software (or script) could also capably handle the stopping of the process?

I could hack together a script that checks the db size, and restarts bitmagnet without the dht_crawler if a certain size is exceeded

If going down this route I'd probably separate the crawler to its own process that can be started/stopped independently.

I think I'd need convincing that this use case was in enough demand, and that some external orchestration would be sub-optimal, to require something implemented internally (even after other upcoming disk space mitigations have been implemented, see below).

As an aside, I'm currently building a custom workflow feature that (among other things) would allow you to auto-delete any torrent based on custom rules. It's not the feature you've described here but it will certainly help with keeping the DB size more manageable by deleting torrents that are stale or that you're not interested in.

@nodiscc
Author

nodiscc commented Mar 13, 2024

It isn't currently supported to start/stop/restart individual workers without stopping the whole process

Thanks, that limits our possibilities indeed.

I think some monitoring software (or script) could also capably handle the stopping of the process?
If going down this route I'd probably separate the crawler to its own process that can be started/stopped independently.

I will try to get this working and post the results.

Although I still argue that there should be some way to suspend the DHT crawler (possibly without actually stopping the worker, just having it idle and poll the db size every 60s...) given bitmagnet's tendency to eat up disk space indefinitely (other services with a similar behavior, such as Elasticsearch, will actively avoid storing new data when a certain disk usage threshold is reached).
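
A very rough sketch of that kind of in-process check, in Go; this is not bitmagnet code, and the pause/resume hooks, the interval and the query are assumptions:

package crawler

import (
	"context"
	"database/sql"
	"log"
	"time"
)

// watchDatabaseSize polls the database size every 60s and suspends or resumes
// the crawler via the provided callbacks, which stand in for whatever
// mechanism the worker would actually expose.
func watchDatabaseSize(ctx context.Context, db *sql.DB, limit int64, pause, resume func()) {
	ticker := time.NewTicker(60 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			var size int64
			err := db.QueryRowContext(ctx,
				"SELECT pg_database_size(current_database())").Scan(&size)
			if err != nil {
				log.Printf("database size check failed: %v", err)
				continue
			}
			if size >= limit {
				log.Printf("DHT crawler suspended, configured maximum database size of %d bytes reached", limit)
				pause()
			} else {
				resume()
			}
		}
	}
}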

@mgdigital
Collaborator

mgdigital commented Mar 13, 2024

It isn't currently supported to start/stop/restart individual workers without stopping the whole process

Thanks, that limits our possibilities indeed.

Not to say it never could; I just think we'd need clear use cases and well-defined behaviour. I like the model of running individual workers that can simply be terminated if required for any externally determined reason, unless there's a good reason not to do it this way (partly because, for the moment at least, it keeps things simple).
