Record (TOC digest → DiffID) mapping in BlobInfoCache #2321

Merged: 11 commits, Jul 30, 2024

Commits on Jul 30, 2024

  1. Compute the schema1 DiffID list based on m.LayerInfos()

    We are already calling m.LayerInfos() anyway, so there is ~no
    extra cost. And using LayerInfos means we don't need to worry
    about reversing the order of layers, and we will have access
    to the layer index, allowing us to access the indexTo* fields
    in the future.
    
    Should not change behavior.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    b8c1dd7
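
The idea in commit 1 can be sketched as follows. This is a minimal illustration, not the code from the PR: `layerInfo`, `diffIDsForLayers`, and the `blobDiffIDs` table are hypothetical names, the layers are assumed to arrive base layer first, and skipping empty layers is an assumption about how the DiffID list is built.

```go
package main

import "fmt"

// layerInfo is a hypothetical, simplified stand-in for one entry of m.LayerInfos().
type layerInfo struct {
	Digest     string // blob digest as listed in the manifest
	EmptyLayer bool   // true for entries that carry no filesystem changes
}

// diffIDsForLayers walks the layers in the order given (assumed base layer first)
// and returns the DiffID of every non-empty layer. blobDiffIDs is a hypothetical
// lookup table from blob digest to DiffID.
func diffIDsForLayers(infos []layerInfo, blobDiffIDs map[string]string) ([]string, error) {
	diffIDs := make([]string, 0, len(infos))
	for i, info := range infos {
		if info.EmptyLayer {
			continue // assumption: empty layers contribute no DiffID
		}
		diffID, ok := blobDiffIDs[info.Digest]
		if !ok {
			// The layer index is available here, so it can be reported
			// (or used to consult per-index data).
			return nil, fmt.Errorf("no DiffID known for layer %d (%s)", i, info.Digest)
		}
		diffIDs = append(diffIDs, diffID)
	}
	return diffIDs, nil
}

func main() {
	infos := []layerInfo{
		{Digest: "sha256:base"},
		{Digest: "sha256:empty", EmptyLayer: true},
		{Digest: "sha256:top"},
	}
	blobDiffIDs := map[string]string{
		"sha256:base": "sha256:diff-base",
		"sha256:top":  "sha256:diff-top",
	}
	diffIDs, err := diffIDsForLayers(infos, blobDiffIDs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(diffIDs)
}
```

Because the loop already iterates in layer order with an index, no reversal step is needed, which is the simplification the commit message describes.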
  2. Improve the documentation of layer identification

    - Don't claim that we only use compressed digests.
    - Explicitly document that we assume TOC digests to be unambiguous.
    - Actually define the term "DiffID".
    - Be more precise in computeID about the criteria being layer identity,
      not where we pull the layer from.
    
    Should not change behavior.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    b520fad
  3. Allow returning (and reporting) unexpected errors from computeID

    Some errors are severe enough that just logging and continuing is
    not really worthwhile.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    e7a01b8
  4. Centralize collecting _consistent_ layer data into trustedLayerIdentityDataLocked
    
    Currently we "only" have indexToTOCDigest and blobDiffIDs, but we will make this
    more complex.
    
    Centralizing the consumption of these fields into trustedLayerIdentityDataLocked
    ensures that all consumers interpret the data exactly consistently (and it also
    allows us to use a single "trusted" variable instead of 2/3 individual ones).
    
    Should not change behavior.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    6dc3f0c
  5. Add TOC digest <-> uncompressed digest mapping to BIC

    The new code is not called, so it should not change behavior
    (apart from extending the BoltDB/SQLite schema).
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    757d726
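
A rough sketch of what a TOC digest to uncompressed digest mapping might look like, assuming a simple in-memory cache. The type `tocCache` and its methods are hypothetical names chosen for illustration; the PR extends the BoltDB/SQLite-backed caches, which this sketch does not model, and only the TOC to uncompressed direction is shown.

```go
package main

import (
	"fmt"
	"sync"
)

// tocCache is a hypothetical in-memory cache that remembers which
// uncompressed digest (DiffID) a given TOC digest corresponds to.
type tocCache struct {
	mu                sync.Mutex
	uncompressedByTOC map[string]string
}

func newTOCCache() *tocCache {
	return &tocCache{uncompressedByTOC: map[string]string{}}
}

// recordTOCUncompressedPair records that the layer described by tocDigest
// unpacks to content with the digest uncompressed.
func (c *tocCache) recordTOCUncompressedPair(tocDigest, uncompressed string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.uncompressedByTOC[tocDigest] = uncompressed
}

// uncompressedDigestForTOC returns the recorded uncompressed digest,
// or "" if nothing is known about tocDigest.
func (c *tocCache) uncompressedDigestForTOC(tocDigest string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.uncompressedByTOC[tocDigest]
}

func main() {
	cache := newTOCCache()
	// Two different TOC digests may describe the same uncompressed content.
	cache.recordTOCUncompressedPair("sha256:toc-a", "sha256:diffid-1")
	cache.recordTOCUncompressedPair("sha256:toc-b", "sha256:diffid-1")

	fmt.Println(cache.uncompressedDigestForTOC("sha256:toc-a"))
	fmt.Println(cache.uncompressedDigestForTOC("sha256:toc-b"))
	fmt.Println(cache.uncompressedDigestForTOC("sha256:unknown") == "")
}
```

Keying the map by TOC digest lets several TOC digests point at one DiffID, which matches the many-to-one relationship mentioned in the later commits.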
  6. Set up storage destination to support TOC-based layers identified in storage by DiffID
    
    If we can, prefer identifying layers by DiffID, because multiple TOCs can map to the
    same DiffID; and because it maximizes reuse with non-TOC layers.
    
    For now, the new situation is unreachable.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    0ad8604
  7. Split reusedBlobFromLayerLookup from tryReusingBlobAsPending

    We will add one more instance of this, so share the code.
    
    Should not change behavior (it does remove one unreachable code path).
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    924d853
  8. Use BlobInfoCache to share more layers if (TOC->Uncompressed) mapping is known
    
    - Multiple TOC values might correspond to a single DiffID (e.g. if different
      compression levels are used); try to share them all, identified by DiffID
      (so that we also reuse with non-TOC pulls).
      - LayersByTOCDigest only uses a single TOC digest per layer; BlobInfoCache
        allows multiple matches, matches layers which have been since deleted,
        and potentially matches TOC digests which we have created by pushing
        but haven't pulled yet.
    - On reuse, we can now use DiffID-based layer identities even if the reuse
      was TOC-driven.
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    d910afa
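
The reuse logic described in commit 8 might be sketched like this. All names (`layerStore`, `reuseCandidate`, `findReusableLayer`) are hypothetical, and the lookup order shown (DiffID first, TOC digest as a fallback) is an assumption made for illustration rather than a statement about the actual implementation.

```go
package main

import "fmt"

// layerStore is a hypothetical, simplified view of local storage.
type layerStore struct {
	byDiffID map[string]string // DiffID -> layer ID
	byTOC    map[string]string // TOC digest -> layer ID (a single TOC digest per layer)
}

// reuseCandidate describes a layer found for reuse and how it was identified.
type reuseCandidate struct {
	LayerID      string
	IdentifiedBy string // "diffid" or "toc"
}

// findReusableLayer first translates the TOC digest to a DiffID using the
// cached mapping and looks for a layer with that DiffID; only if that fails
// does it fall back to a direct TOC digest lookup.
func findReusableLayer(tocDigest string, tocToDiffID map[string]string, store layerStore) (reuseCandidate, bool) {
	if diffID, ok := tocToDiffID[tocDigest]; ok {
		if id, ok := store.byDiffID[diffID]; ok {
			return reuseCandidate{LayerID: id, IdentifiedBy: "diffid"}, true
		}
	}
	if id, ok := store.byTOC[tocDigest]; ok {
		return reuseCandidate{LayerID: id, IdentifiedBy: "toc"}, true
	}
	return reuseCandidate{}, false
}

func main() {
	store := layerStore{
		byDiffID: map[string]string{"sha256:diffid-1": "layer-1"},
		byTOC:    map[string]string{"sha256:toc-a": "layer-1"},
	}
	// toc-b was never pulled, but the cache knows it has the same DiffID as toc-a.
	tocToDiffID := map[string]string{
		"sha256:toc-a": "sha256:diffid-1",
		"sha256:toc-b": "sha256:diffid-1",
	}
	candidate, ok := findReusableLayer("sha256:toc-b", tocToDiffID, store)
	fmt.Println(candidate.LayerID, candidate.IdentifiedBy, ok)
}
```

In this sketch a TOC digest that was never pulled locally can still hit an existing layer, because the match goes through the shared DiffID rather than the per-layer TOC digest.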
  9. Record the (TOC digest, uncompressed digest) data when we compress layers
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    acdd064
  10. Use the uncompressed digest we got from a BlobInfoCache for chunked layers
    
    - Rely on it instead of triggering the "untrusted DiffID" logic
    - Also propagate it to storage
    
    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    403d0a2
  11. HACK: Don't compress with zstd:chunked when encrypting

    Signed-off-by: Miloslav Trmač <[email protected]>
    mtrmac committed Jul 30, 2024
    f49cb62