Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config,storage: support populating directories from archives #1498

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Snaipe
Copy link

@Snaipe Snaipe commented Nov 12, 2022

Tarballs are ubiquitous as a binary release format, and without any builtin ability to fetch, validate, and extract them, we are left with half-baked hacks to do all of this from the confines of a oneshot systemd service, or worse, extracting and providing the entirety of the archive contents as files and directory entries in the ignition config, resulting in very large json documents.

Let's not do this. Instead, this commit formalizes the use of archives via adding a new (optional) "contents" key under a directory entry.

This new contents key is identical in function as its eponymous version in the "files" entries, except that it incorporates a new "archive" subkey to specify the archive format rather than guessing with heuristics. Today, only "archive": "tar" is supported, though this commit is structured to allow the addition of other archive types if needed.

@bgilbert bgilbert self-requested a review November 12, 2022 19:48
@bgilbert
Copy link
Contributor

Thanks for the detailed PR. I think this is worth considering. I won't have a chance to do a detailed review for some weeks, but a few initial thoughts:

  • We should document the limits of tar support: which member types (e.g. not device special files), which tar formats (I think currently USTAR, PAX, and GNU), etc.
  • It looks like sparse files aren't extracted sparsely. That's probably fine, but we should document it too.
  • Config validation ensures that the same pathname can't be specified multiple times, e.g. in both directories and files. This is needed to ensure that the config is declarative; otherwise the result would be determined by the order in which Ignition happens to process the config sections. There's a similar risk here: a user could specify an archive that writes /foo/bar/baz and separately a files entry for /foo/bar/baz. Since we're already disallowing unpacking into non-empty directories, one possible fix is to disallow any links or files entry with a path underneath a directory that has archive set.

@Snaipe
Copy link
Author

Snaipe commented Nov 13, 2022

Thanks for the early feedback, I've addressed your points about conflict validation and added a test to ensure it, and added some documentation around what exactly is being supported.

This allows some schema types to specify a key prefix for which keys
matching said prefix should conflict. This is used by a subsequent
commit where a directory is populated from an archive, and any file,
directory, or link whose path is under the original directory should
conflict.
Tarballs are ubiquitous as a binary release format, and without any
builtin ability to fetch, validate, and extract them, we are left
with half-baked hacks to do all of this from the confines of a oneshot
systemd service, or worse, extracting and providing the entirety of the
archive contents as files and directory entries in the ignition config,
resulting in very large json documents.

Let's not do this. Instead, this commit formalizes the use of archives
via adding a new (optional) "contents" key under a directory entry.

This new contents key is identical in function as its eponymous version
in the "files" entries, except that it incorporates a new "archive"
subkey to specify the archive format rather than guessing with
heuristics. Today, only "archive": "tar" is supported, though this
commit is structured to allow the addition of other archive types if
needed.
@cgwalters
Copy link
Member

Worth noting here that https://github.com/coreos/enhancements/blob/main/os/coreos-layering.md AKA https://fedoraproject.org/wiki/Changes/OstreeNativeContainerStable is basically saying that we already have a way to version, mirror, manage, sign tarballs and use them to ship both binaries and application and OS configuration - that's what OCI/Docker container images are.

@bgilbert
Copy link
Contributor

Not all Linux distros that use Ignition also use ostree. For some of the ones that do, discussion is still ongoing about when layering will be the recommended customization flow and when it won't, e.g. coreos/fedora-coreos-tracker#1219. So, layering isn't a full substitute for this PR.

@cgwalters
Copy link
Member

Yes, though the larger point I was trying to make here is about the overlap between Ignition fetching raw tarballs and container images. ostree isn't a requirement for the "boot a container" model, just one implementation. In fact I consider hiding ostree to be a key goal of that effort.

For example, I could also imagine that instead of this, Ignition could learn to unpack container images (e.g. FROM scratch style ones) too. That would have different tradeoffs than the "host derivation" one in that configuration and other binaries could still be lifecycled separately from the host.

One great example of a specific problem one runs into quickly when growing from Ignition's base of "configuration" to shipping code around is multi-arch support. OCI already has that problem solved. Now of course one can switch to generating Ignition fragments and having different e.g. verification digests per architecture injected, etc.

So again my point here is I think both ostree and Ignition have problems that are best solved using container tooling which is already what our users are familiar with.

@Snaipe
Copy link
Author

Snaipe commented Nov 14, 2022

I don't think this PR precludes incorporating some way to unpack container images into the rootfs, but the reason why I've made this PR is that it's immediately useful to us without needing to include a docker registry as part of bootstrapping the cluster that hosts our docker registry.

In contrast, downloading a tarball over HTTP is probably as simple as you can get, and just requires an HTTP server (which we already have as far as bootstrapping is concerned since we configure HTTP boot on most of our cluster machines)

@cgwalters
Copy link
Member

I understand. Don't get me wrong, I'm not against this PR. (To be clear: I'm also not an Ignition maintainer)

In contrast, downloading a tarball over HTTP is probably as simple as you can get, and just requires an HTTP server

Yes, though it's also entirely possible to implement the registry API with a static webserver: https://github.com/NicolasT/static-container-registry (There may be better examples of this, but it's really just a bit of JSON and tarballs...that's all OCI is)

@Snaipe
Copy link
Author

Snaipe commented Mar 1, 2023

Any news on this? @bgilbert

@bgilbert bgilbert removed their request for review July 28, 2023 06:49
@bgilbert
Copy link
Contributor

bgilbert commented Aug 1, 2023

Thanks for your work on this PR. I won't be able to get back to this, unfortunately, but hopefully one of the other folks on the team can take a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants