[r] support for stream deduplication? #23

Open
@xplshn

Would it be in scope for this project to add stream deduplication to some formats?

It could drastically reduce the file sizes of archives that contain repeated data. Basically, identical blocks are marked as such, and the header only includes a reference to the first occurrence of each. Not all formats may be compatible, though.
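
To make the idea concrete, here is a minimal sketch of block-level deduplication. It assumes fixed-size blocks and SHA-256 as the block fingerprint; the names `Dedup` and `Restore` are hypothetical and are not taken from this project or from klauspost/dedup, which may split and index blocks differently.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Dedup splits data into fixed-size blocks and keeps each distinct block once.
// refs[i] is the index into unique of the i-th block of the input.
// Assumption: blocks with the same SHA-256 digest are treated as identical.
// blockSize must be greater than zero.
func Dedup(data []byte, blockSize int) (unique [][]byte, refs []int) {
	seen := make(map[[sha256.Size]byte]int) // block digest -> index in unique
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data) // last block may be shorter
		}
		block := data[off:end]
		sum := sha256.Sum256(block)
		idx, ok := seen[sum]
		if !ok {
			// First time this block is seen: store it.
			idx = len(unique)
			unique = append(unique, block)
			seen[sum] = idx
		}
		refs = append(refs, idx)
	}
	return unique, refs
}

// Restore rebuilds the original stream from the unique blocks and references.
func Restore(unique [][]byte, refs []int) []byte {
	var out []byte
	for _, idx := range refs {
		out = append(out, unique[idx]...)
	}
	return out
}

func main() {
	data := []byte("aaaabbbbaaaacc")
	unique, refs := Dedup(data, 4)
	fmt.Println(len(unique), refs)                             // 3 [0 1 0 2]
	fmt.Println(string(Restore(unique, refs)) == string(data)) // true
}
```

The second `aaaa` block is not stored again; only its reference (index 0) is recorded, which is the "reference to the first of them" described above.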

This would come as a step before compression, so it could be implemented as a format option, basically "tar+dedup".compressionFormat
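
As a rough illustration of why the step would sit before compression: gzip's DEFLATE window is 32 KiB, so it cannot see repeats that are further apart than that, while a dedup pass can. The sketch below assumes fixed 4 KiB blocks and a made-up serialized layout (reference count, references, then unique blocks); `dedupStream`, `gzipSize`, and `sampleInput` are hypothetical helpers, not part of any existing API or archive format.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// dedupStream keeps the first copy of every fixed-size block and serializes
// the result as: uint32 reference count, uint32 references, unique blocks.
// This layout is an assumption for illustration only.
func dedupStream(data []byte, blockSize int) []byte {
	seen := make(map[[sha256.Size]byte]uint32) // block digest -> block index
	var refs []uint32
	var unique bytes.Buffer
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		sum := sha256.Sum256(data[off:end])
		idx, ok := seen[sum]
		if !ok {
			idx = uint32(len(seen))
			seen[sum] = idx
			unique.Write(data[off:end])
		}
		refs = append(refs, idx)
	}
	var out bytes.Buffer
	// Writes to a bytes.Buffer do not fail, so errors are ignored here.
	binary.Write(&out, binary.LittleEndian, uint32(len(refs)))
	binary.Write(&out, binary.LittleEndian, refs)
	out.Write(unique.Bytes())
	return out.Bytes()
}

// gzipSize returns the size of data after gzip compression.
func gzipSize(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Len()
}

// sampleInput returns one pseudo-random 64 KiB chunk repeated twice, so the
// duplicate lies outside gzip's 32 KiB window.
func sampleInput() []byte {
	chunk := make([]byte, 64<<10)
	rand.New(rand.NewSource(1)).Read(chunk)
	return append(append([]byte{}, chunk...), chunk...)
}

func main() {
	data := sampleInput()
	fmt.Println("gzip only:       ", gzipSize(data))
	fmt.Println("dedup, then gzip:", gzipSize(dedupStream(data, 4096)))
}
```

On this synthetic input the deduplicated stream should compress to roughly half the size of the plain gzip output; real archives will vary depending on how much data repeats and how far apart the repeats are.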

More about the technique can be read at https://github.com/klauspost/dedup, which is also a stream-deduplication library.

There's an article explaining everything in great detail here too: https://blog.klauspost.com/fast-stream-deduplication-in-go/

Labels: enhancement (New feature or request)