bottomless-cli: added snapshot command #1229

Open
wants to merge 1 commit into
base: main

Horusiath
Contributor

@Horusiath Horusiath commented Mar 19, 2024

This PR extends bottomless-cli with a new snapshot command, which generates a snapshot for a given generation. Example usage:

bottomless-cli -e "<aws-endpoint>" -b "<bucket>" -d "<local-db-dir>" -n "ns-<db-id>:<namespace>" snapshot -g "<generation>"

If you don't know which generations exist, you can list them with:

bottomless-cli -e "<aws-endpoint>" -b "<bucket>" -d "<local-db-dir>" -n "ns-<db-id>:<namespace>" ls

To test, assuming you have a local bottomless storage with at least 2 generations that depend on each other:

  1. Go to a generation folder in AWS. Ensure that this folder has a .dep file. Remove the db.zstd and .changecounter files; this makes bottomless treat the generation as if it had no snapshot. Set export RUST_LOG=info,bottomless=debug for better visibility. At this point bottomless can still restore the database, e.g. bottomless-cli -e "http://localhost:9000/" -b "bottomless" -d "./data.sqld" -n "ns-<db-id>:<namespace>" restore -g "<generation>", but it will use the .dep file to reach back across earlier generations to restore the db. At the end of the restore, a SQLite integrity check is performed.
  2. Call bottomless-cli with the new command: bottomless-cli -e "http://localhost:9000/" -b "bottomless" -d "./data.sqld" -n "ns-<db-id>:<namespace>" snapshot -g "<generation>". It will recreate and re-upload the snapshot at the beginning of the generation. If you try to restore the db again, you'll see that the regenerated snapshot is used, so the restore process is faster.
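
The two steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the variable names (BOTTOMLESS_CLI, ENDPOINT, BUCKET, DB_DIR, NAMESPACE) and the function names are introduced here for illustration, and only the flags and subcommands come from this PR description. Removing db.zstd and .changecounter from the bucket is left as a manual step, since the object key layout is not described here.

```shell
# Sketch of the manual test flow. All variable and function names below are
# hypothetical; only the flags (-e, -b, -d, -n, -g) and the restore/snapshot
# subcommands are taken from the PR description.
BOTTOMLESS_CLI="${BOTTOMLESS_CLI:-bottomless-cli}"
ENDPOINT="${ENDPOINT:-http://localhost:9000/}"
BUCKET="${BUCKET:-bottomless}"
DB_DIR="${DB_DIR:-./data.sqld}"
NAMESPACE="${NAMESPACE:-}"   # expected form: ns-<db-id>:<namespace>

bcli() {
  # Run bottomless-cli with the shared connection flags, then the subcommand.
  "$BOTTOMLESS_CLI" -e "$ENDPOINT" -b "$BUCKET" -d "$DB_DIR" -n "$NAMESPACE" "$@"
}

check_snapshot_flow() {
  # $1: generation whose db.zstd and .changecounter were removed beforehand.
  export RUST_LOG=info,bottomless=debug
  # Restore without a snapshot: bottomless falls back to the .dep chain.
  bcli restore -g "$1" || return 1
  # Recreate and re-upload the snapshot for this generation.
  bcli snapshot -g "$1" || return 1
  # Restore again: the logs should now show the regenerated snapshot in use.
  bcli restore -g "$1"
}
```

After setting NAMESPACE, calling check_snapshot_flow "<generation>" runs the three commands in order and stops at the first one that fails.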

The idea is that this lets us skip creating snapshots from within the database VMs themselves. Instead, a separate machine or cron job can create them periodically in the background, which avoids putting the load of uploading large DB snapshots on the database machine.
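
A background job along those lines might look roughly like the sketch below. It assumes generation ids are supplied on stdin, one per line; how they are extracted from the ls output is left out, because the output format is not described in this PR. The variable and function names are hypothetical.

```shell
# Sketch of a periodic snapshot job. Names are hypothetical; only the flags
# and the snapshot subcommand come from the PR description.
BOTTOMLESS_CLI="${BOTTOMLESS_CLI:-bottomless-cli}"
ENDPOINT="${ENDPOINT:-http://localhost:9000/}"
BUCKET="${BUCKET:-bottomless}"
DB_DIR="${DB_DIR:-./data.sqld}"
NAMESPACE="${NAMESPACE:-}"   # expected form: ns-<db-id>:<namespace>

snapshot_generations() {
  # Reads generation ids from stdin and regenerates a snapshot for each one.
  # Keeps going after a failure and returns non-zero if any snapshot failed.
  failed=0
  while IFS= read -r generation; do
    [ -n "$generation" ] || continue   # skip blank lines
    # </dev/null keeps the CLI from consuming the list being read by the loop.
    if ! "$BOTTOMLESS_CLI" -e "$ENDPOINT" -b "$BUCKET" -d "$DB_DIR" \
        -n "$NAMESPACE" snapshot -g "$generation" </dev/null; then
      echo "snapshot failed for generation $generation" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

A cron entry on the separate machine could then feed a list of generations into snapshot_generations on whatever schedule suits the deployment.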

Additionally, snapshot creation performs a local restore and an integrity check before upload, so it also confirms that the database can be restored to an uncorrupted state from the existing backup.
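
For an independent check of a restored database file, SQLite's built-in PRAGMA integrity_check can be run through the sqlite3 command-line shell, which prints ok for a healthy database. Whether bottomless runs exactly this pragma internally is an assumption; the function name and the SQLITE3 variable are hypothetical, and the path to the restored file has to be supplied by the caller.

```shell
# Sketch: verify a restored SQLite file outside of bottomless.
# Assumption: the sqlite3 command-line shell is installed.
SQLITE3="${SQLITE3:-sqlite3}"

integrity_ok() {
  # $1: path to a restored SQLite database file.
  # Succeeds only if the pragma reports the single line "ok".
  result="$("$SQLITE3" "$1" 'PRAGMA integrity_check;')" || return 1
  [ "$result" = "ok" ]
}
```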
