
zstd:chunked metadata ambiguity #2014

Closed
@mtrmac

Filing separately; the earlier discussion is around #1888 (comment):

zstd:chunked layers, when pulling, contain metadata in three places:

  • The ordinary uncompressed tar format
  • TOC metadata
  • tar-split metadata

When pulling, we ignore the uncompressed tar, build files from the TOC metadata, and record the tar-split metadata.

When pushing, we ignore the metadata of individual files, and use the tar-split metadata.

The net outcome is that a user can “pull; inspect; push”, and the pushed metadata will be different from what the user saw.
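To make the mismatch concrete, here is a minimal sketch. The `FileMeta` and `Layer` types and the helper functions are hypothetical simplifications for illustration, not c/storage's real data structures; the only thing taken from the description above is that pull consults the TOC and push consults tar-split.

```go
package main

import (
	"fmt"
	"sort"
)

// FileMeta is a hypothetical, simplified stand-in for per-file metadata.
type FileMeta struct {
	Name string
	Mode uint32
	UID  int
}

// Layer models a zstd:chunked layer that describes the same files in two
// places. The uncompressed tar is ignored on pull, so it is omitted here.
type Layer struct {
	TOC      map[string]FileMeta // used to create files on pull
	TarSplit map[string]FileMeta // used to rebuild the tar on push
}

// pulledView returns what a user would see on disk after a pull.
func pulledView(l Layer, name string) FileMeta { return l.TOC[name] }

// pushedView returns what would be emitted for the file on push.
func pushedView(l Layer, name string) FileMeta { return l.TarSplit[name] }

// ambiguous lists, in sorted order, the TOC names whose tar-split metadata
// is missing or different. Names present only in tar-split are not checked.
func ambiguous(l Layer) []string {
	var out []string
	for name, t := range l.TOC {
		if ts, ok := l.TarSplit[name]; !ok || ts != t {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	l := Layer{
		TOC:      map[string]FileMeta{"bin/tool": {Name: "bin/tool", Mode: 0o755, UID: 1000}},
		TarSplit: map[string]FileMeta{"bin/tool": {Name: "bin/tool", Mode: 0o4755, UID: 0}},
	}
	fmt.Printf("inspected after pull: %+v\n", pulledView(l, "bin/tool"))
	fmt.Printf("emitted on push:      %+v\n", pushedView(l, "bin/tool"))
	fmt.Println("ambiguous entries:", ambiguous(l))
}
```

In this toy layer the user inspects an unprivileged file, but the push emits a setuid root one, which is the “pull; inspect; push” discrepancy in miniature.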

I think this _must_ be addressed.


Vaguely, I think that could happen either by having the tar-split “drive” the chunked pull, using the TOC only to look up data; or by overwriting the TOC metadata with the tar-split data.

The latter seems a bit easier because the “push” compression code already has tar → TOC conversion code, so we would “only” need to read through tar-split. The former is conceptually nicer because, hypothetically, we could eventually have a single “apply tar header → filesystem metadata” code path for all of c/storage, instead of the current “tar → metadata” for ordinary overlay, “TOC → metadata” for chunked, and things like #1653 (comment). But it would probably be more disruptive, and not practical within the current short time limits.

Cc: @giuseppe — I’ll be looking at this over the next few days, but I’d very much appreciate any insight, advice, or help.
