Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(puffin): Parse Puffin FileMetadata #765

Merged
merged 12 commits into from
Jan 14, 2025

Conversation

fqaiser94
Copy link
Contributor

@fqaiser94 fqaiser94 commented Dec 7, 2024

Part of #744

Summary

Add support for parsing Puffin FileMetadata

Context

@@ -23,3 +23,8 @@

mod compression;
pub use compression::CompressionCodec;

mod metadata;
Copy link
Contributor Author

@fqaiser94 fqaiser94 Dec 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I have deliberately chosen to not expose anything at the moment.
I want to create a separate, intentional PR to add pub modifiers after the reader and writer implementations have been added.
This way, we can develop inside this crate without having to worry about breaking changes until we're ready.
If it helps though, I intend to expose the following outside of the crate eventually:

pub use metadata::{BlobMetadata, FileMetadata, CREATED_BY_PROPERTY};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great!

@fqaiser94 fqaiser94 marked this pull request as ready for review December 7, 2024 02:07
@fqaiser94 fqaiser94 force-pushed the puffin_metadata branch 4 times, most recently from 4d3dd9f to 011945e Compare December 13, 2024 12:22
@c-thiel
Copy link
Collaborator

c-thiel commented Dec 13, 2024

@fqaiser94, just added the higher level statistic files in #799 FYI. I would guess you would end up building those soon too.

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @fqaiser94 for working on this! Here are some minor suggestions.

crates/iceberg/src/puffin/compression.rs Outdated Show resolved Hide resolved
}

#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub(crate) enum Flag {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we have other flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the Puffin spec only supports one flag (as documented here) but allows for the possibility that there may be more flags in the future.

crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
@fqaiser94 fqaiser94 force-pushed the puffin_metadata branch 2 times, most recently from c06c3d1 to aa321ee Compare December 19, 2024 02:09
Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fqaiser94 for this great pr! Generally LGTM, left some minor suggestion.

crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
@fqaiser94 fqaiser94 force-pushed the puffin_metadata branch 2 times, most recently from 3caf54e to 378c97a Compare January 6, 2025 03:21
Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fqaiser94 for this great pr, and all the tests! Left some minor suggestions, others look great to me!

crates/iceberg/src/puffin/metadata.rs Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
crates/iceberg/src/puffin/metadata.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @fqaiser94 for this great pr!

@liurenjie1024
Copy link
Contributor

Let's wait for a moment to see if others need to take a look.

@liurenjie1024 liurenjie1024 merged commit 7addc3f into apache:main Jan 14, 2025
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants