rdsquashfs feature suggestion: hardlink duplicate files on extract #73
Only unpacking duplicated files once and creating copy-on-write reflinks sounds like a very interesting idea. On Linux this would be done with an …
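The comment above is cut off in this copy of the thread. On Linux, copy-on-write reflinks are created with the `FICLONE` ioctl (the same mechanism behind `cp --reflink`), which only works on CoW-capable filesystems such as Btrfs or XFS. A minimal sketch of that approach, with a plain-copy fallback for filesystems that reject the ioctl; whether this matches the mechanism the commenter had in mind is an assumption:

```python
import fcntl
import shutil

# FICLONE from linux/fs.h: _IOW(0x94, 9, int)
FICLONE = 0x40049409

def reflink_or_copy(src, dst):
    """Try to create a copy-on-write clone of src at dst; fall back to a
    regular copy on filesystems (e.g. ext4, tmpfs) without reflink support.
    Returns "reflink" or "copy" to indicate which path was taken."""
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            pass  # EOPNOTSUPP/EINVAL: filesystem cannot reflink
    shutil.copyfile(src, dst)
    return "copy"
```

A reflink shares data extents between the two files but gives each its own inode, so later writes to one copy do not affect the other, unlike a hardlink.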
tl;dr: I have a squashfs file with millions of duplicated files in it; it would be awesome to be able to extract the image and hardlink (or reflink) the duplicated files.
My specific use case is an abuse of the intended functionality of squashfs: I have been using squashfs as a directory archival tool to consolidate dozens of Apple Time Machine backup folders [1]. Time Machine uses directory hardlinks to snapshot entire filesystems and preserve space, but I have Time Machine backups from different drives and systems which don't share those hardlinks yet contain very similar files. mksquashfs has been the only tool able to scale to the number of files and hardlinks I'm dealing with and properly deduplicate as I append directories to my single squashfs file.
I can always mount the squashfs image and browse to the specific files/folders I want to retrieve, but I was thinking it would be cool to be able to extract the image and use the deduplication table to create the files on disk as hardlinks, or as reflinks on COW filesystems such as BTRFS. I'm not sure how hard this would be to implement in rdsquashfs.
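The general idea can be sketched independently of rdsquashfs internals. Below, a content hash stands in for the deduplication lookup (inside rdsquashfs the squashfs block/fragment tables would identify duplicates directly, so no hashing would be needed); the function name and the `(path, data)` entry format are invented for illustration:

```python
import hashlib
import os

def extract_with_hardlinks(entries, out_dir):
    """entries: iterable of (relative_path, content_bytes) pairs.
    Writes each unique content once; subsequent files with identical
    content become hardlinks to the first extracted copy."""
    seen = {}  # content digest -> path of first extracted copy
    for rel, data in entries:
        path = os.path.join(out_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        digest = hashlib.sha256(data).digest()
        if digest in seen:
            os.link(seen[digest], path)  # hardlink to the earlier copy
        else:
            with open(path, "wb") as f:
                f.write(data)
            seen[digest] = path
```

Hardlinks work on any POSIX filesystem but share one inode (so metadata and later in-place edits are shared too); reflinks avoid that at the cost of requiring a CoW filesystem.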
[1] There are pitfalls with using mksquashfs on Apple Time Machine folders. Namely, squashfs does not support all the crazy xattr stuff that macOS applies to files, so some things don't restore completely, but as a file archive, it works fine.