feat(handler): add geom_uzip handler #1143

rxpha3l · 2025-03-03T16:23:20Z

geom_uzip is a FreeBSD feature for creating compressed disk images (usually containing UFS). The compression is done in blocks, and the resulting .uzip file can be mounted via the GEOM framework on FreeBSD.

The mkuzip header includes a table with block counts and sizes. The header declares the block size (size of decompressed blocks) and total number of blocks. Block size must be a multiple of 512 and defaults to 16384 in mkuzip.
It has the following structure:

Magic, which is a shebang, stored in 10 bytes.
Version, which can change, stored in 13 bytes.
Command, which can change, stored in 105 bytes.
Block size, stored in 4 bytes.
Block count, stored in 4 bytes.
Table of contents (TOC), whose size depends on the file length.
The TOC is a list of uint64_t offsets into the file for each block. To determine the length of a given block, read the next TOC entry and subtract the current offset from the next offset (this is why there is an extra TOC entry at the end). Each block is compressed using zlib. A standard zlib decompressor will decode them to a block of size block_size.
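
The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the handler's code: `read_toc` and `PREAMBLE_LEN` are hypothetical names, the 128-byte preamble is simply the sum of the three field sizes listed above, and big-endian integers are an assumption.

```python
import struct

# Magic (10) + version (13) + command (105), per the field list above.
PREAMBLE_LEN = 10 + 13 + 105


def read_toc(data: bytes) -> tuple[int, list[tuple[int, int]]]:
    """Return (block_size, [(offset, length), ...]) for an image held in memory."""
    # Assumption: all integers are stored big-endian.
    block_size, block_count = struct.unpack_from(">II", data, PREAMBLE_LEN)
    toc_start = PREAMBLE_LEN + 8
    # block_count + 1 entries: the extra one marks the end of the last block.
    offsets = struct.unpack_from(f">{block_count + 1}Q", data, toc_start)
    # Length of block i is offsets[i + 1] - offsets[i].
    blocks = [(start, end - start) for start, end in zip(offsets, offsets[1:])]
    return block_size, blocks
```

The last TOC entry is also the end offset of the whole image, which is what makes it possible to carve the file out of a larger blob.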

Unblob parses the TOC to determine the start and end offsets of the uzip file. It then locates the compressed blocks, decompresses each of them with zlib, and concatenates the results to recover the decompressed file. Empty chunks are skipped, which is why the file extracted by unblob can be slightly smaller than the original one.
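
A minimal sketch of that extraction loop, assuming the `(offset, length)` pairs have already been derived from the TOC. `decompress_blocks` is a hypothetical helper for illustration and only covers the zlib case; it is not the code in this PR.

```python
import zlib


def decompress_blocks(data: bytes, blocks: list[tuple[int, int]]) -> bytes:
    """Decompress each (offset, length) block with zlib and join the results."""
    out = bytearray()
    for offset, length in blocks:
        if length == 0:
            # Empty chunk: nothing is stored for it, so nothing is written.
            continue
        out += zlib.decompress(data[offset:offset + length])
    return bytes(out)
```

Because zero-length chunks contribute no bytes, the output can end up shorter than the original image, as noted above.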

[Sources]
https://github.com/freebsd/freebsd-src/blob/master/sys/geom/uzip/g_uzip.c

python/unblob/handlers/compression/geom_uzip.py

tests/integration/compression/uzip/__input__/myfs.img.uzip

tests/integration/compression/uzip/__output__/myfs.img.uzip_extract/myfs.img

qkaiser · 2025-03-06T11:29:13Z

@vlaci what would be the easiest way to add pyzstd to the unblob dependencies in Nix here? It's not yet upstream at https://github.com/NixOS/nixpkgs/blob/0fa90d642277de2c67e93204cc5870aba8af5878/pkgs/by-name/un/unblob/package.nix#L59 so we need a way to define it in this branch in the meantime.

qkaiser · 2025-03-06T12:01:49Z

@rxpha3l I'm using this fix locally, but not sure if it's idiomatic Nix

diff --git a/overlay.nix b/overlay.nix
index 9c5051e..265cd79 100644
--- a/overlay.nix
+++ b/overlay.nix
@@ -29,6 +29,8 @@ final: prev:
         ];
       };
 
+      dependencies = (super.dependencies or []) ++ [ prev.python3.pkgs.pyzstd ];
+
       # remove this when packaging changes are upstreamed
       cargoDeps = final.rustPlatform.importCargoLock {
         lockFile = ./Cargo.lock;

qkaiser

check the comments and rebase so there is no fixup commit

python/unblob/handlers/compression/uzip.py

martonilles

small leak, otherwise looks ok!

python/unblob/handlers/compression/uzip.py

geom_uzip is a FreeBSD feature for creating compressed disk images (usually containing UFS). The compression is done in blocks, and the resulting .uzip file can be mounted via the GEOM framework on FreeBSD.

The mkuzip header includes a table with block counts and sizes. The header declares the block size (size of decompressed blocks) and total number of blocks. Block size must be a multiple of 512 and defaults to 16384 in mkuzip. It has the following structure:

> Magic, which is a shebang and compression identifier stored in 16 bytes.
> Format, which is a shell command that provides some general information.
> Block size, stored in 4 bytes.
> Block count, stored in 4 bytes.
> Table of contents (TOC), whose size depends on the file length.

The TOC is a list of uint64_t offsets into the file for each block. To determine the length of a given block, read the next TOC entry and subtract the current offset from the next offset (this is why there is an extra TOC entry at the end). Each block is compressed using zlib. A standard zlib decompressor will decode each one to a block of size block_size.

Unblob parses the TOC to determine the start and end offsets of the compressed file. It detects the compression method (zlib, lzma or zstd). Finally, the chunks are decompressed to recover the initial file. Empty chunks are skipped, which is why the file extracted by unblob can be slightly smaller than the original one.

[Sources]
https://github.com/mikeryan/unuzip
https://www.baeldung.com/linux/filesystem-in-a-file
https://docs.python.org/3/library/zlib.html
https://github.com/freebsd/freebsd-src/blob/master/sys/geom/uzip/g_uzip.c
https://parchive.sourceforge.net/docs/specifications/parity-volume-spec/article-spec.html
https://www.mail-archive.com/[email protected]/msg34955.html
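
For readers following along, the compression detection step in the description could look roughly like the sketch below. The identifier letters (`V` for zlib, `L` for lzma, `Z` for zstd) and their position right after the shebang are assumptions based on my reading of the FreeBSD driver, and `detect_compression` is a hypothetical name, so treat this as an illustration rather than the handler's actual logic.

```python
SHEBANG = b"#!/bin/sh\n"

# Assumption: one identifier letter per compression method, following "#".
COMPRESSION_BY_ID = {b"V": "zlib", b"L": "lzma", b"Z": "zstd"}


def detect_compression(magic: bytes) -> str:
    """Map the 16-byte magic to a compression name, or raise ValueError."""
    marker = len(SHEBANG)  # index of the "#" that follows the shebang line
    if not magic.startswith(SHEBANG) or magic[marker:marker + 1] != b"#":
        raise ValueError("not a uzip magic")
    ident = magic[marker + 1:marker + 2]
    if ident not in COMPRESSION_BY_ID:
        raise ValueError(f"unknown compression identifier: {ident!r}")
    return COMPRESSION_BY_ID[ident]
```

Returning a name instead of a decompressor keeps the sketch free of third-party imports such as pyzstd.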

the change has been applied

qkaiser self-assigned this Mar 3, 2025

qkaiser force-pushed the geom_uzip branch from 9592034 to f3bc4ea Compare March 3, 2025 16:34

qkaiser linked an issue Mar 3, 2025 that may be closed by this pull request

Add Support for geom_uzip Compression (FreeBSD mkuzip) #1125

Closed

qkaiser added this to the Internship 2025 milestone Mar 3, 2025

qkaiser added the enhancement and format:compression labels Mar 3, 2025

martonilles requested changes Mar 3, 2025

qkaiser reviewed Mar 3, 2025

rxpha3l force-pushed the geom_uzip branch 2 times, most recently from ea64c85 to b095981 Compare March 5, 2025 14:45

qkaiser reviewed Mar 6, 2025

rxpha3l force-pushed the geom_uzip branch from b6935e7 to 08ae071 Compare March 7, 2025 10:13

qkaiser force-pushed the geom_uzip branch from 08ae071 to 93bca99 Compare March 7, 2025 10:16

qkaiser requested review from martonilles and e3krisztian March 7, 2025 10:19

vlaci reviewed Mar 7, 2025

rxpha3l force-pushed the geom_uzip branch 4 times, most recently from 519c7e2 to 34c6696 Compare March 7, 2025 14:24

qkaiser approved these changes Mar 7, 2025

martonilles reviewed Mar 7, 2025

e3krisztian reviewed Mar 7, 2025

rxpha3l force-pushed the geom_uzip branch 2 times, most recently from 5c2c785 to 20704a6 Compare March 10, 2025 15:18

qkaiser reviewed Mar 26, 2025

rxpha3l force-pushed the geom_uzip branch from b0fae31 to 43b6fcf Compare March 28, 2025 14:53

e3krisztian approved these changes Mar 28, 2025

e3krisztian requested a review from martonilles March 28, 2025 15:19

rxpha3l force-pushed the geom_uzip branch from 43b6fcf to 043fdd9 Compare March 28, 2025 15:42

martonilles previously requested changes Mar 28, 2025

rxpha3l force-pushed the geom_uzip branch from 043fdd9 to 61ff4ab Compare March 28, 2025 16:11

qkaiser added this pull request to the merge queue Mar 29, 2025

Merged via the queue into onekey-sec:main with commit 3aa401a Mar 29, 2025
22 checks passed
