Backup discussion #14

Open
sberryman opened this issue Apr 7, 2017 · 2 comments

Comments

@sberryman
Contributor

sberryman commented Apr 7, 2017

I'm considering implementing backup using the autopilot redis example as a starting point.

Any prior knowledge I should be aware of?

From what little I know about hosted MongoDB clusters, they tend to run backups against a hidden replica set member. That may be overkill for this example, but it is worth considering.

Since MongoDB clusters are likely to be rather large, how can we do snapshots AND full backups? I've never used Manta, so I'm using this as a way to learn more about it. Should I just implement full backups for the example and possibly expand on that in the future? I'm assuming I can pipe data into Manta?
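To make the piping question concrete, here is a minimal sketch of streaming one process's output straight into another from Python, so the dump never has to be held in memory. The helper name `pipe_commands` is made up for illustration. Whether Manta's `mput` accepts the object on stdin when no file is given is an assumption that should be checked against the Manta CLI docs before relying on it.

```python
import subprocess


def pipe_commands(producer_cmd, consumer_cmd):
    """Run the equivalent of `producer_cmd | consumer_cmd`.

    Both arguments are argv-style lists. Returns a tuple of
    (producer_exit_code, consumer_exit_code).
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Close our copy of the pipe's read end so the producer receives
    # SIGPIPE if the consumer exits early, instead of blocking forever.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc
```

A backup could then look like `pipe_commands(["mongodump", "--archive", "--gzip"], ["mput", "~~/stor/backups/mongodb.archive.gz"])`, where the Manta path is a placeholder and the stdin behaviour of `mput` is the assumption noted above. Both exit codes should be checked, since a failed upload does not make the dump fail.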

MongoDB Cloud Manager uses the oplog to offer snapshots and point-in-time recovery. Again, I'm assuming that is overkill for this example.

MongoDB's backup documentation clearly states that mongodump and mongorestore are designed only for smaller datasets. At the very least, I will make sure that is apparent in the readme if that is the route you guys want to go. I'll use --oplog for the dump and --oplogReplay for the restore, running the backup against a replica.
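For the restore side, a sketch of how the mongorestore arguments might be assembled. The function name and the `host` parameter are hypothetical; the flags are standard mongorestore options, and `--archive` with no filename makes mongorestore read the archive from stdin. This assumes the archive was produced with `--gzip` and `--oplog`.

```python
def build_mongorestore_args(host, replay_oplog=True):
    """Build an argv list for restoring a gzipped archive read from stdin."""
    args = [
        "mongorestore",
        "--host", host,
        "--gzip",     # archive was compressed at dump time
        "--archive",  # no filename: read the archive from stdin
    ]
    if replay_oplog:
        # Replay the oplog entries captured by `mongodump --oplog`.
        args.append("--oplogReplay")
    return args
```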

I'll need to take into account access control if you plan on merging PR #11 any time soon.

mongodump

I'm planning on using the following options:

  1. --readPreference
  2. --gzip
  3. --archive
  4. --oplog - I need to research whether this works in conjunction with --archive as it is not mentioned in the documentation.
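As a rough sketch of how those four options could be combined, assuming `--oplog` does turn out to work together with `--archive` (still unverified, as noted in item 4). The function name, the `host` parameter, and the default read preference of `secondary` are illustrative choices, not settled decisions.

```python
def build_mongodump_args(host, read_preference="secondary", use_oplog=True):
    """Build an argv list for a gzipped archive dump written to stdout."""
    args = [
        "mongodump",
        "--host", host,
        "--readPreference", read_preference,
        "--gzip",
        "--archive",  # no filename: the archive is written to stdout
    ]
    if use_oplog:
        # ASSUMPTION: --oplog is compatible with --archive; needs testing.
        args.append("--oplog")
    return args
```

Because `--archive` without a filename writes to stdout, the resulting command can be piped directly into whatever uploads the backup.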
@sberryman
Contributor Author

Ping @geek

@geek
Contributor

geek commented Apr 12, 2017

@sberryman another good starting point is MySQL (https://github.com/autopilotpattern/mysql/blob/master/bin/manage.py#L228-L276)

Feel free to ignore the access control PR for now, to keep things simpler. The options you mention look like the right approach. I'd go for the initial plan to implement full backups for the example for now.
