Add methods to get compressed size and offset for each stream (folder) #310

Open
orisano opened this issue Jan 22, 2025 · 1 comment

Comments

@orisano
Contributor

orisano commented Jan 22, 2025

First of all, thank you for creating and maintaining this excellent Go library for reading 7z files. It has been very helpful for our project.

Background

I'm working on implementing parallel access to 7z files stored in object storage. To achieve this efficiently, I need to be able to:

  1. Know the compressed size of each stream to make ranged GET requests
  2. Know the offset of each compressed stream to access specific parts

Feature Request

Please add two new methods to the Reader type:

  1. A method to get the compressed size for each stream
  2. A method to get the offset position of each compressed stream in the archive

Use Case

This would enable efficient parallel processing of 7z files stored in object storage by:

  • Making precise ranged GET requests for specific streams
  • Allowing multiple workers to process different streams concurrently
  • Minimizing unnecessary data transfer
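To make the use case concrete, here is a minimal sketch of how a caller might turn an (offset, compressed size) pair into an HTTP Range header and fan the requests out to a bounded pool of workers. StreamRange, RangeHeader, and FetchAll are hypothetical names chosen for illustration; none of them exist in the library, and the fetch callback stands in for whatever object storage client is actually used.

```go
package main

import (
	"fmt"
	"sync"
)

// StreamRange describes where one compressed stream lives in the archive.
// Both fields are hypothetical outputs of the methods requested in this issue.
type StreamRange struct {
	Offset uint64
	Size   uint64
}

// RangeHeader formats an HTTP Range header value covering the stream.
// HTTP byte ranges are inclusive at both ends, hence the -1.
func RangeHeader(r StreamRange) (string, error) {
	if r.Size == 0 {
		return "", fmt.Errorf("stream at offset %d is empty", r.Offset)
	}
	return fmt.Sprintf("bytes=%d-%d", r.Offset, r.Offset+r.Size-1), nil
}

// FetchAll calls fetch once per stream with at most `workers` calls in flight.
// The returned slice holds one error (or nil) per stream, in input order.
func FetchAll(ranges []StreamRange, workers int, fetch func(i int, header string) error) []error {
	if workers < 1 {
		workers = 1
	}
	errs := make([]error, len(ranges))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, r := range ranges {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` fetches are running
		go func(i int, r StreamRange) {
			defer wg.Done()
			defer func() { <-sem }()
			header, err := RangeHeader(r)
			if err != nil {
				errs[i] = err
				return
			}
			errs[i] = fetch(i, header)
		}(i, r)
	}
	wg.Wait()
	return errs
}

func main() {
	// Example values only; real offsets and sizes would come from the archive.
	ranges := []StreamRange{{Offset: 32, Size: 100}, {Offset: 132, Size: 50}}
	errs := FetchAll(ranges, 2, func(i int, header string) error {
		fmt.Printf("stream %d: Range: %s\n", i, header)
		return nil
	})
	fmt.Println(len(errs), "streams requested")
}
```

Each goroutine writes only to its own slot in the error slice, so no extra locking is needed. A real fetch callback would issue the GET with the given Range header and hand the response body to a decoder.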

Technical Details

The requested methods could look something like:

// Returns the compressed size of the specified stream
func (r *Reader) GetStreamCompressedSize(streamIndex int) (uint64, error)

// Returns the offset of the specified stream in the archive
func (r *Reader) GetStreamOffset(streamIndex int) (uint64, error)

I'm happy to create a Pull Request for this feature if you think it would be helpful.
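As a rough sketch of what could sit behind the two proposed methods: in the 7z format the packed streams are stored back to back after the 32-byte signature header, so an offset can be derived from the pack position plus the sizes of the preceding streams. The packInfo type and its fields below are hypothetical stand-ins, not the library's actual internals, and the methods are attached to that stand-in rather than to Reader.

```go
package main

import (
	"errors"
	"fmt"
)

// signatureHeaderSize is the fixed-size header at the start of a 7z archive.
const signatureHeaderSize = 32

// packInfo is a hypothetical stand-in for the archive's parsed pack
// information. It is NOT a type from the library.
type packInfo struct {
	packPos   uint64   // start of the first packed stream, relative to the end of the signature header
	packSizes []uint64 // compressed size of each packed stream, in archive order
}

var errStreamIndex = errors.New("stream index out of range")

// GetStreamCompressedSize returns the compressed size of the given packed stream.
func (p *packInfo) GetStreamCompressedSize(streamIndex int) (uint64, error) {
	if streamIndex < 0 || streamIndex >= len(p.packSizes) {
		return 0, errStreamIndex
	}
	return p.packSizes[streamIndex], nil
}

// GetStreamOffset returns the absolute offset of the given packed stream,
// computed as header size + pack position + sizes of all earlier streams.
func (p *packInfo) GetStreamOffset(streamIndex int) (uint64, error) {
	if streamIndex < 0 || streamIndex >= len(p.packSizes) {
		return 0, errStreamIndex
	}
	offset := uint64(signatureHeaderSize) + p.packPos
	for _, size := range p.packSizes[:streamIndex] {
		offset += size
	}
	return offset, nil
}

func main() {
	// Example values only.
	p := &packInfo{packPos: 0, packSizes: []uint64{100, 50, 25}}
	for i := range p.packSizes {
		offset, _ := p.GetStreamOffset(i)
		size, _ := p.GetStreamCompressedSize(i)
		fmt.Printf("stream %d: offset=%d size=%d\n", i, offset, size)
	}
}
```

One caveat for the "stream (folder)" wording: a folder is not always a single packed stream (BCJ2 folders, for example, use several), so an index over packed streams and an index over folders may not line up one to one.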

orisano changed the title from "Add methods to get compressed size and offset for each stream" to "Add methods to get compressed size and offset for each stream (folder)" on Jan 24, 2025
bodgit self-assigned this on Feb 14, 2025
bodgit added the enhancement (New feature or request) label on Feb 14, 2025
@bodgit
Owner

bodgit commented Feb 14, 2025

I'm trying to understand how you'd make use of this. Is this so you would just fetch the raw streams and read them separately? How do you know what compression algorithm(s) are used? Surely you need that information as well?


2 participants