Automated offsite backup #23

Open
3 of 12 tasks
khuedoan opened this issue Jan 23, 2022 · 6 comments
Labels: enhancement, help wanted

Comments

@khuedoan
Owner

khuedoan commented Jan 23, 2022

  • Evaluate object storage providers (cost is a major factor; S3-compatible is a plus):
  • Evaluate backup tools:
    • Built-in backup feature in Longhorn (doesn't support multiple destinations; see the sketch after this list)
    • K8up
    • Velero
  • Implement the above
  • Update docs
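For reference, Longhorn's built-in option is driven by a RecurringJob resource. A minimal sketch, assuming Longhorn v1.2+; the schedule, group, and retention values here are placeholders, not decisions:

```yaml
# Hypothetical Longhorn recurring backup job; all names/values are illustrative.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: offsite-backup
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"  # daily at 03:00
  task: backup       # back up to the configured backup target
  groups:
    - default        # applies to all volumes in the default group
  retain: 7          # keep the last 7 backups
  concurrency: 2
```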
khuedoan added the enhancement and help wanted labels Jan 23, 2022
@batistein

@khuedoan I would recommend Wasabi (pricing).
By the way, very nice project!

@khuedoan
Owner Author

Thanks, added to the list.

@khuedoan
Owner Author

khuedoan commented Feb 7, 2022

After comparing the prices and considering my current usage (a few hundred GB), I decided to go with AWS S3 Glacier Deep Archive.

Backup cost:

  • Storage cost: $1.01376/TB/month
  • Inbound data transfer: Free

Restore cost:

  • Retrieval (bulk, within 48 hours): $2.56/TB
  • Outbound data transfer: $0.09/GB or $92.16/TB

Full calculation:

    1,024 GB per month / 10 GB average item size = 102.40 unrounded number of items
    Round up by 1 (102.4000) = 103 number of items
    103 number of items x 32 KB = 3,296.00 KB overhead
    3,296.00 KB overhead / 1048576 KB in a GB = 0.003143 GB overhead
    0.003143 GB overhead x 0.00099 USD = 0.0000031 USD (Glacier Deep Archive storage overhead cost)
    Glacier Deep Archive storage overhead cost: 0.0000031 USD
    103 number of items x 8 KB = 824.00 KB overhead
    824.00 KB overhead / 1048576 KB in a GB = 0.000786 GB overhead
    Tiered price for: 0.000786 GB
    0.000786 GB x 0.0230000000 USD = 0.00 USD
    Total tier cost = 0.0000181 USD (S3 Standard storage overhead cost)
    S3 Standard storage overhead cost: 0.0000181 USD
    1,024 GB per month x 0.00099 USD = 1.01376 USD (Glacier Deep Archive storage cost)
    Glacier Deep Archive storage cost: 1.01376 USD
    0.0000031 USD + 0.0000181 USD + 1.01376 USD = 1.013781 USD (total Glacier Deep Archive storage cost)
    1,000 requests x 0.00005 USD = 0.05 USD (Cost for PUT, COPY, POST, LIST requests)
    1 requests x 0.000025 USD = 0.00 USD (Cost for Restore requests (Bulk))
    1,024 GB per month x 0.0025 USD = 2.56 USD (Cost for Glacier Deep Archive Data Retrieval (Bulk))
    1.013781 USD + 0.05 USD + 2.56 USD = 3.62 USD (S3 Glacier Deep Archive cost)

    **S3 Glacier Deep Archive cost (monthly): 3.62 USD**

    Inbound:
    Internet: 1024 GB x 0 USD per GB = 0.00 USD
    Outbound:
    Internet: 1024 GB x 0.09 USD per GB = 92.16 USD 

    **Data Transfer cost (monthly): 92.16 USD**

So the data will be backed up in 2 places:

  • Local MinIO on my NAS
  • AWS S3 Glacier

Distributed storage also improves resilience with 2 (or 3) replicas.
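Since Longhorn only takes a single backup target, the wiring for the MinIO side might look roughly like this via the Longhorn Helm chart's default settings plus a credentials Secret (endpoint, bucket, and secret names below are hypothetical, and the MinIO-to-Glacier replication would be a separate job):

```yaml
# Hypothetical credentials Secret for a local MinIO backup target.
apiVersion: v1
kind: Secret
metadata:
  name: minio-backup-credentials   # placeholder name
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access key>
  AWS_SECRET_ACCESS_KEY: <secret key>
  AWS_ENDPOINTS: https://minio.local:9000  # placeholder NAS endpoint
```

```yaml
# Hypothetical Longhorn Helm values pointing at that target.
defaultSettings:
  backupTarget: s3://homelab-backup@us-east-1/   # bucket name and virtual region are placeholders
  backupTargetCredentialSecret: minio-backup-credentials
```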

@khuedoan khuedoan closed this as completed Feb 7, 2022
@khuedoan khuedoan reopened this Feb 7, 2022
@khuedoan
Owner Author

khuedoan commented Feb 7, 2022

Accidentally closed. Still need to implement the backup.

@locmai
Sponsor Contributor

locmai commented Feb 20, 2022

You probably already know this, but we should also test the backup.
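If K8up ends up being the tool, a restore test could be as simple as applying a Restore object into a scratch PVC and diffing the contents. A rough sketch; the PVC, secret, and bucket names are hypothetical:

```yaml
# Hypothetical K8up restore into a throwaway PVC for verification.
apiVersion: k8up.io/v1
kind: Restore
metadata:
  name: restore-test
spec:
  restoreMethod:
    folder:
      claimName: restore-test-pvc   # placeholder scratch PVC
  backend:
    repoPasswordSecretRef:
      name: backup-repo             # placeholder restic repo password secret
      key: password
    s3:
      endpoint: https://s3.amazonaws.com
      bucket: homelab-backup        # placeholder bucket
      accessKeyIDSecretRef:
        name: backup-credentials
        key: username
      secretAccessKeySecretRef:
        name: backup-credentials
        key: password
```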

zanehala added a commit to zanehala/homelab that referenced this issue Dec 11, 2022
@ClashTheBunny
Contributor

Hey, just a few thoughts after messing around with stuff.

K8up does a great job of backing up per namespace. That's kind of a pain here since we're ostensibly single-tenant and namespaces are more for correctness. This means the S3 secret and backup schedule must be created everywhere; not a big deal, just tedious. The thing that actually matters with K8up is that it only backs up ReadWriteMany volumes, while all of the current volumes are ReadWriteOnce. This means it runs, succeeds, and you don't actually get what you want out of it. I'm not sure if Velero is any different, but I would be interested to know whether it works in a similar way: spin up a job container, mount the volume again, run the backup.
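For context, this is the kind of K8up Schedule that has to be stamped out in every namespace (a rough sketch; the namespace, secrets, bucket, and cron values are all hypothetical):

```yaml
# Hypothetical per-namespace K8up schedule; repeat in each namespace to back up.
apiVersion: k8up.io/v1
kind: Schedule
metadata:
  name: backup-schedule
  namespace: some-app           # placeholder; one of these per namespace
spec:
  backend:
    repoPasswordSecretRef:
      name: backup-repo         # placeholder restic repo password secret
      key: password
    s3:
      endpoint: https://s3.amazonaws.com
      bucket: homelab-backup    # placeholder bucket
      accessKeyIDSecretRef:
        name: backup-credentials
        key: username
      secretAccessKeySecretRef:
        name: backup-credentials
        key: password
  backup:
    schedule: '0 3 * * *'       # daily backup
  prune:
    schedule: '0 5 * * 0'       # weekly prune
    retention:
      keepDaily: 7
```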

The Longhorn integrated backups work without this ReadWriteMany caveat, so that's something to think about. But that also means they are block backups rather than logical/file backups, which makes them harder to interact with outside of Longhorn itself.

retX0 pushed a commit to retX0/homelab that referenced this issue Jan 28, 2024