
Add Backblaze B2 integration for backups #134014

Draft: wants to merge 2 commits into dev

Conversation

frenck (Member) commented Dec 25, 2024:

⚠️ This PR/integration isn't ready yet.

At this point, it works; it is using the SDK provided by Backblaze themselves, which is all super nice.

However, their library is sync-only and built on top of requests, which makes it less ideal for our backup agent implementation. I do want to test this on some bigger backup sizes.

Additionally, there is a mypy error left that I've not resolved. Dunno why; I'm probably overlooking something simple at this point.

🤗 Feel free to jump in and push improvements to this branch directly ❤️

Proposed change

This PR takes the first steps toward adding an integration for Backblaze B2.

The integration provides a backup agent that works with the Home Assistant backup solution introduced in Home Assistant 2025.1.

[Six screenshots attached (CleanShot, 2024-12-25).]

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If user-exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.

To help with the load of incoming pull requests:

Comment on lines +38 to +46
await hass.async_add_executor_job(
    backblaze.authorize_account,
    "production",
    entry.data[CONF_APPLICATION_KEY_ID],
    entry.data[CONF_APPLICATION_KEY],
)
bucket = await hass.async_add_executor_job(
    backblaze.get_bucket_by_id, entry.data[CONF_BUCKET]
)
Member:

Wrap in a sync function and do this with 1 executor call.
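
A minimal sketch of that suggestion; the helper name is hypothetical, and backblaze, entry, and hass come from the surrounding setup code in the diff above:

def _authorize_and_get_bucket():
    """Run both blocking SDK calls inside a single executor job."""
    # Hypothetical helper; backblaze, entry, and hass are from the setup code above.
    backblaze.authorize_account(
        "production",
        entry.data[CONF_APPLICATION_KEY_ID],
        entry.data[CONF_APPLICATION_KEY],
    )
    return backblaze.get_bucket_by_id(entry.data[CONF_BUCKET])

bucket = await hass.async_add_executor_job(_authorize_and_get_bucket)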

frenck (Author):

I did in other places; not sure why I've skipped this one. Will do 👍

try:
    await self._hass.async_add_executor_job(
        self._bucket.upload_bytes,
        b"".join([chunk async for chunk in stream]),
Member:

This means you're pushing the whole backup into memory. This won't work for bigger backups.

Does upload_bytes take an iterator? If so, you could write a sync iterator that uses run_coroutine_threadsafe to get a megabyte of data from the async iterator.
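
For reference, a rough sketch of that idea, assuming the upload call could consume a plain iterator (as it turns out below, the SDK does not accept one); sync_chunks is a hypothetical helper, and the reviewer's per-chunk buffering to around a megabyte is omitted for brevity:

import asyncio
from collections.abc import AsyncIterator, Iterator

def sync_chunks(
    stream: AsyncIterator[bytes], loop: asyncio.AbstractEventLoop
) -> Iterator[bytes]:
    """Yield chunks from an async iterator while running in an executor thread."""
    while True:
        try:
            # Schedule the next read on the event loop and block for the result.
            chunk = asyncio.run_coroutine_threadsafe(
                stream.__anext__(), loop
            ).result()
        except StopAsyncIteration:
            return
        yield chunk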

frenck (Author) commented Dec 26, 2024:

Yeah that is the bad side here... and no, it doesn't take an iterator.

There is a method that takes any stream, which makes a lot of sense... (I expect that to be more common than an iterator for this)

But I have no clue how to wrap an async iterator into a stream. A BufferedReader (or BufferedWriter) could maybe work? Dunno.

Working on this made me wonder why we made this an iterator to begin with 😬 Although nice in many cases, the one I'm working with in this PR isn't uncommon.

frenck (Author):

I think I have an idea how to wrap it. Will whip something up during the day.

frenck (Author) commented Dec 26, 2024:

Wrote a BufferSyncIteratorToSyncStream on top of the Python io stream base to handle this case. Used your suggestion, including a buffer that reads ahead up to a set buffer size each time we run the coroutine once the buffer is depleted.

Set the buffer size in this case to 8 megabytes.

I have tested it by dropping a few gigabytes of binary files into my Home Assistant configuration folder. It passed nicely, but felt a little on the slow end (nothing to back this up with any reasoning, just a feeling), indicating there might be room for improvement in terms of efficiency.
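
Roughly the shape of that idea (a sketch, not the actual BufferSyncIteratorToSyncStream from this PR; note that for raw streams, read() may legitimately return fewer bytes than requested):

import asyncio
import io
from collections.abc import AsyncIterator

BUFFER_SIZE = 8 * 1024 * 1024  # 8 megabytes, matching the buffer size above


class AsyncIteratorToSyncStream(io.RawIOBase):
    """Expose an async byte iterator as a blocking, readable stream."""

    def __init__(
        self, stream: AsyncIterator[bytes], loop: asyncio.AbstractEventLoop
    ) -> None:
        self._stream = stream
        self._loop = loop
        self._buffer = b""
        self._eof = False

    def readable(self) -> bool:
        return True

    def _refill(self) -> None:
        # Run the async iterator on the event loop (from this executor
        # thread) until about BUFFER_SIZE bytes are buffered or it ends.
        try:
            while len(self._buffer) < BUFFER_SIZE:
                self._buffer += asyncio.run_coroutine_threadsafe(
                    self._stream.__anext__(), self._loop
                ).result()
        except StopAsyncIteration:
            self._eof = True

    def read(self, size: int = -1) -> bytes:
        """Return up to size bytes; b"" signals end of stream."""
        if not self._buffer and not self._eof:
            self._refill()
        if size < 0:
            size = len(self._buffer)
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

The stream-accepting upload method mentioned earlier could then consume an object like this from a single executor job.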

mkohns commented Dec 27, 2024:

Hi @frenck, I just took your code and replaced the Backblaze client with a boto3 client to support any kind of S3 bucket. I tested it with IDrive e2, Azure Storage, and MinIO. Is this something for the current beta, or for later? What do you think?
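
For context, the core of that swap would look something like this (a sketch, not mkohns' actual code; the endpoint URL, credentials, bucket, and file names are placeholders):

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",  # placeholder; any S3-compatible endpoint
    aws_access_key_id="KEY_ID",                              # placeholder credentials
    aws_secret_access_key="APPLICATION_KEY",
)

# upload_fileobj streams from any file-like object, so a sync-stream
# wrapper like the one sketched above would plug in here as well.
with open("backup.tar", "rb") as fileobj:
    s3.upload_fileobj(fileobj, "my-backup-bucket", "backup.tar")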

"bucket": "Bucket"
},
"data_description": {
"bucket": "Select the bucked to store backups in."


Suggested change:
- "bucket": "Select the bucked to store backups in."
+ "bucket": "Select the bucket to store backups in."
