-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] overlay: make sure directories omitted from layers have the right permissions #1653
base: main
Are you sure you want to change the base?
Conversation
93c8282
to
c28137b
Compare
When extracting layer contents, if we find ourselves needing to implicitly create a directory (due to its not being included in the layer diff), try to give it the permissions, ownership, attributes, and datestamp of the corresponding directory from the layer which would be stacked below it. When a layer omits any of the directories which contain items which that layer adds or modifies, this should prevent the default values that we would use from overriding those which correspond to the same directory in a lower layer, which could later be mistaken as an indication that one or more of those was intentionally changed, forcing the directory to be pulled up. Signed-off-by: Nalin Dahyabhai <[email protected]>
When applying a chunked diff, ApplyDiff() iterates through the entries in the diff's table of contents, creating directories and other zero-length items like empty files, symbolic links, and directories, building a queue of hard links, and asking goroutines to locate files in other layers and configured OSTree repositories which contain content that matches what should eventually end up in non-zero-length files in the layer. If a file's entire contents couldn't be found locally, but it was split into chunks in the chunked diff, ApplyDiff() then searches for the chunks in local layers. Lastly, the contents of files with non-zero lengths, which ApplyDiff() had attempted to find locally, but which weren't found locally, are then requested from the registry. Track which directories in the layer are created implicitly by ApplyDiff(), so that the overlay driver can reset their ownership, permissions, extended attributes, and datestamps to match the immediately lower layer, if there is one. Signed-off-by: Nalin Dahyabhai <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this makes sense to me, but oh it’s a lot of code to make that possible. That’s an impressive amount of effort.
(I have only fairly briefly skimmed the tests. But their existence adds a lot of confidence anyway :) )
@@ -188,6 +188,7 @@ type DriverWithDifferOutput struct { | |||
Metadata string | |||
BigData map[string][]byte | |||
TarSplit []byte | |||
ImplicitDirs []string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some documentation of what this field contains would be nice.
if hdr == nil { | ||
continue | ||
} | ||
path := filepath.Join(stagingDirectory, implicitDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a vague feeling that the code to apply a *tar.Header
to a filesystem object should already exist somewhere, and that duplicating it is not the ideal approach.
I have no idea where is the best place for that code (pkg/archive
?) and what all options need to be provided to it, though…
As specific examples, looking at overlay.ApplyDiffWithDiffer
, it seems that d.options.ignoreChownErrors
or options.ForceMask
are not implemented in this new copy. I guess ForceMask
is irrelevant because it should have been applied do the parent layer correctly, already… can that be assumed even with additional layer stores?
idmap = idtools.NewIDMappingsFromMaps(append([]idtools.IDMap{}, uidMaps...), append([]idtools.IDMap{}, gidMaps...)) | ||
} | ||
} | ||
return &backfiller{idmap: idmap, lowerDiffDirs: append([]string{}, lowerDiffDirs...)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Absolutely non-blocking: This might be a tiny bit more readable when using x/exp/slices.Clone
; that will be in the standard library in future Go 1.21. OTOH until then the x/exp package is technically without any stability promise…)
// ownership/permissions/attributes of a directory from a lower layer so that | ||
// we don't have to create it in an upper layer using default values that will | ||
// be mistaken for a reason that the directory was pulled up to that layer. | ||
func NewBackfiller(idmap *idtools.IDMappings, lowerDiffDirs []string) *backfiller { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a public c/storage API, or would something like c/storage/internal/unionbackfill
work?
(I don’t have a strong opinion, I just want this to be an intentional decision.)
// ownership/permissions/attributes of a directory from a lower layer so that | ||
// we don't have to create it in an upper layer using default values that will | ||
// be mistaken for a reason that the directory was pulled up to that layer. | ||
func NewBackfiller(idmap *idtools.IDMappings, lowerDiffDirs []string) *backfiller { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If lowerDiffDirs
is expected to be from ordered from children to parents (is it?), that would be useful to document at least at subpackage/API boundaries like this.
// that an entry we inserted had no content. | ||
func (r *Reader) Read(b []byte) (int, error) { | ||
if r.currentIsQueued { | ||
return 0, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://pkg.go.dev/archive/tar#Reader.Read documents 0, io.EOF
on both special file types, and empty regular files.
|
||
// Read will either read from a real entry in the archive, or pretend very hard | ||
// that an entry we inserted had no content. | ||
func (r *Reader) Read(b []byte) (int, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What … happens … when the parent layer contains a regular file in place of an expected parent directory?
I suppose it doesn’t matter what we do here, because extracting the current layer should fail, shouldn’t it?
hdr, err := tr.Next() | ||
defer func() { | ||
closeErr := tw.Close() | ||
io.Copy(wc, reader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do? If tw.Close
succeeded, the stream is already terminated by the tar EOF blocks.
And even if that were not true, both the reader and the writer are in an indeterminate (or anyway hard-for-me-to-reason-about) state.
} else if closeErr != nil { | ||
wc.CloseWithError(closeErr) | ||
} else { | ||
wc.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Close()
is equivalent to CloseWithError(nil)
, so this could be:
if err == nil { err = closeErr }
wc.CloseWithError(err)
@@ -0,0 +1,159 @@ | |||
package tarbackfill |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a public c/storage API, or would something like c/storage/internal/tarbackfill
work?
(I don’t have a strong opinion, I just want this to be an intentional decision.)
@nalind Waiting on you? |
Another way to solve this possibly: Add an xattr to the directory like |
When extracting layer contents, if we find ourselves needing to implicitly create a directory (due to its not being included in the layer diff), try to give it the permissions, ownership, attributes, and datestamp of the corresponding directory from the layer which would be stacked below it.
When a layer omits any of the directories which contain items which that layer adds or modifies, this should prevent the default values that we would use from overriding those which correspond to the same directory
in a lower layer, which could later be mistaken as an indication that one or more of those was intentionally changed, forcing the directory to be pulled up.