WIP: krunkit driver #20826
The serial console name depends on the driver. We had a setting for qemu that does not work for vfkit and krunkit, breaking boot from the minikube iso.

Fixed by using 2 console= options: one is known to work for qemu, and one for vfkit and krunkit. With this we can use the same iso image with qemu, vfkit, and krunkit.

This will allow simplifying the vfkit driver. Previously we had to extract the kernel and initrd and start it using the legacy --kernel, --kernel-cmdline and --initrd options.

I tested this by building the iso with this fix and running with --iso-url.

Example run with qemu:

    % minikube start -p qemu --driver qemu --container-runtime containerd \
        --iso-url file://$PWD/minikube-arm64.iso
    😄 [qemu] minikube v1.36.0 on Darwin 15.5 (arm64)
    ✨ Using the qemu2 driver based on user configuration
    🌐 Automatically selected the socket_vmnet network
    👍 Starting "qemu" primary control-plane node in "qemu" cluster
    🔥 Creating qemu2 VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
    📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
        ▪ Generating certificates and keys ...
        ▪ Booting up control plane ...
        ▪ Configuring RBAC rules ...
    🔗 Configuring bridge CNI (Container Networking Interface) ...
    🔎 Verifying Kubernetes components...
        ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    🌟 Enabled addons: default-storageclass, storage-provisioner
    🏄 Done! kubectl is now configured to use "qemu" cluster and "default" namespace by default

Example run with krunkit:

    % minikube start -p krunkit --driver krunkit --container-runtime containerd \
        --iso-url file://$PWD/minikube-arm64.iso
    😄 [krunkit] minikube v1.36.0 on Darwin 15.5 (arm64)
    ✨ Using the krunkit (experimental) driver based on user configuration
    👍 Starting "krunkit" primary control-plane node in "krunkit" cluster
    🔥 Creating krunkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
    📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
        ▪ Generating certificates and keys ...
        ▪ Booting up control plane ...
        ▪ Configuring RBAC rules ...
    🔗 Configuring bridge CNI (Container Networking Interface) ...
    🔎 Verifying Kubernetes components...
        ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    🌟 Enabled addons: default-storageclass, storage-provisioner
    🏄 Done! kubectl is now configured to use "krunkit" cluster and "default" namespace by default
This speeds up machine boot by 5 seconds. The timeout may be helpful for debugging boot issues, but we currently don't have a way to access the serial console for debugging.

Testing shows about 4-5 seconds speedup.

| driver  | timeout (s) | start time (s) |
|---------|-------------|----------------|
| vfkit   | 5.0         | 24.01          |
| vfkit   | 0.0         | 19.90          |
| qemu    | 5.0         | 29.46          |
| qemu    | 0.0         | 24.28          |
| krunkit | 5.0         | 25.14          |
| krunkit | 0.0         | 20.51          |

vfkit was tested booting from the iso instead of direct kernel boot. Direct kernel boot is a little bit faster than booting from the iso, even with timeout=0.
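To make the per-driver savings explicit, here is a small sketch that subtracts the timeout=0 start time from the timeout=5 start time for each driver. The numbers are copied from the table; the type and function names are only illustrative.

```go
package main

import "fmt"

// bootTiming holds one driver's measured start times in seconds,
// copied from the table in the commit message.
type bootTiming struct {
	driver      string
	withTimeout float64 // start time with timeout=5
	zeroTimeout float64 // start time with timeout=0
}

// timings lists the three drivers that were measured.
var timings = []bootTiming{
	{"vfkit", 24.01, 19.90},
	{"qemu", 29.46, 24.28},
	{"krunkit", 25.14, 20.51},
}

// speedup returns the seconds saved by setting the timeout to 0.
func speedup(t bootTiming) float64 {
	return t.withTimeout - t.zeroTimeout
}

func main() {
	for _, t := range timings {
		fmt.Printf("%-8s %.2fs faster\n", t.driver, speedup(t))
	}
}
```

The differences come out at roughly 4.1 s for vfkit, 5.2 s for qemu, and 4.6 s for krunkit, which matches the "about 4-5 seconds" summary.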
@afbjorklund can you review this? I think the reason we do not have /dev/dri is that the kernel is too old. We are using 5.10.207 while libkrun seems to require 5.16 or later. libkrun for macOS is using venus, which requires
So it seems that we need to move to a newer kernel. Do you know why we are stuck with 5.10?
The libkrun virtio-net driver enables TSO offloading and checksum offloading by default, so we must use vmnet-helper --enable-tso and --enable-checksum-offload with krunkit. These options do not work with vfkit.
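A minimal sketch of how the offload flags could be selected per driver. The two flags are the ones named in the commit message; the function name and the driver-name strings are hypothetical and not taken from the minikube code.

```go
package main

import "fmt"

// offloadArgs returns the extra vmnet-helper arguments for a driver.
// Hypothetical helper: only krunkit gets the offload flags, since the
// commit message says these options do not work with vfkit.
func offloadArgs(driver string) []string {
	if driver == "krunkit" {
		return []string{"--enable-tso", "--enable-checksum-offload"}
	}
	return nil
}

func main() {
	for _, d := range []string{"krunkit", "vfkit"} {
		fmt.Printf("%s: %q\n", d, offloadArgs(d))
	}
}
```

Keeping the decision in one place would make it easy to revisit if vfkit gains offload support later.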
krunkit is a tool to launch configurable virtual machines using the libkrun platform, optimized for GPU accelerated virtual machines and AI workloads on Apple silicon. It is mostly compatible with vfkit; the driver is a simplified copy of the vfkit driver. Unlike vfkit, krunkit is available only on Apple silicon.

Changes compared to vfkit driver:

- krunkit requires a unix socket for networking, so we must use vmnet-helper.
- krunkit can be controlled only via HTTP, not via unix socket.
- krunkit does not support --kernel-cmdline.
- We must enable vmnet offloading, required for krunkit.
- The code was simplified since vmnet-helper is always used.
- Code was cleaned up to use .ResolveStorePath().

Limitations:

- Only one machine can be created since we use the same port for krunkit --restful-uri. This should be fixed to use an unused port, or use a unix socket when unix socket is supported[1].

[1] containers/krunkit#47
Previously it was used only for vfkit, so we suggested falling back to the `nat` network. This advice is not relevant to krunkit or to qemu (which can also use vmnet-helper). Change the error to recommend installing vmnet-helper. We need to think about how we can recommend other networks for vfkit and qemu. Another solution is to create an error for every driver+network combination, but this seems hard to manage.
This is the same way that we test vfkit. This test is not running in the CI.

Issues:

- Need to install and configure vmnet-helper (requires root).
I don't think it is stuck with anything, just that it was using the LTS versions... It seems that a new kernel version was not included, in the minikube OS upgrade. Buildroot supports many: https://github.com/buildroot/buildroot/blob/2025.02.x/linux/linux.hash
See also https://www.kernel.org/ ("longterm") |
So should we try to update the kernel to latest longterm version (6.12.30)? |
There was some talk about bumping to a newer kernel: (6.6?) But that was last year, and for the 2024.02.x OS. So maybe. It seems that it is only needed for this AI feature, though? |
Yes, this is for AI on Apple silicon. I can try to build an iso to play with it, and when we have a working iso we can discuss how to proceed with the upgrade. |
So you could still use vfkit for all other (non-AI) usage of Kubernetes. Or maybe even make it a single driver, if the syntax is close enough... |
}

func (d *Driver) stopKrunkit() error {
	if err := d.SetKrunkititState("Stop"); err != nil {
krunkit ignores this value, but it is documented, so we should keep it the same as vfkit.
https://github.com/containers/krunkit/blob/9421c0e1daca8af0eb84d6afc67bfb55b78214c4/src/status.rs#L101
func (d *Driver) killkrunkit() error {
	if err := d.SetKrunkititState("HardStop"); err != nil {
		// Typically fails with EOF due to https://github.com/crc-org/krunkit/issues/277.
		log.Debugf("Failed to set krunkit state to 'HardStop': %s", err)
krunkit ignores the value and does not document this value, so we should not set the state here but just kill it.
https://github.com/containers/krunkit/blob/9421c0e1daca8af0eb84d6afc67bfb55b78214c4/src/status.rs#L101
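A rough sketch of the "just kill it" approach, assuming the driver knows the krunkit process ID (for example from a pidfile, which is an assumption here). `killProcess` and `startSleeper` are hypothetical names; `sleep` only stands in for the krunkit process.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// killProcess sends SIGKILL to the process with the given pid.
// A process that is already gone is not treated as an error.
func killProcess(pid int) error {
	proc, err := os.FindProcess(pid)
	if err != nil {
		return err
	}
	if err := proc.Signal(syscall.SIGKILL); err != nil {
		if errors.Is(err, os.ErrProcessDone) {
			return nil
		}
		return fmt.Errorf("failed to kill pid %d: %w", pid, err)
	}
	return nil
}

// startSleeper starts a long-running child process standing in for krunkit.
func startSleeper() (*exec.Cmd, error) {
	cmd := exec.Command("sleep", "60")
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

func main() {
	cmd, err := startSleeper()
	if err != nil {
		fmt.Println("start:", err)
		return
	}
	if err := killProcess(cmd.Process.Pid); err != nil {
		fmt.Println("kill:", err)
	}
	// Wait reaps the child and reports that it was terminated by a signal.
	fmt.Println("wait:", cmd.Wait())
}
```

This avoids the HTTP round trip entirely, so the EOF failure mentioned in the code comment would no longer be relevant for the hard-stop path.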
}

// Make a boot2docker VM disk image.
func (d *Driver) generateDiskImage(size int) error {
Duplicate of the vfkit version, which may itself be a duplicate of the qemu one?
	State string `json:"state"`
}

func (d *Driver) GetkrunkitState() (string, error) {
Same as vfkit, but using a plain HTTP client instead of a unix socket HTTP client.
	return vmstate.State, nil
}

func (d *Driver) SetKrunkititState(state string) error {
Same as vfkit, but using a plain HTTP client instead of a unix socket HTTP client.
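For illustration, a minimal sketch of a get/set state client over plain HTTP. The `state` JSON field and the "Stop" value come from the diff; the `/vm/state` path, the "Running"/"Stopped" values, and all type and function names are assumptions, and the fake server only stands in for krunkit.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// vmState mirrors the JSON body with a single "state" field.
type vmState struct {
	State string `json:"state"`
}

// stateClient talks to the VM control endpoint over TCP.
type stateClient struct {
	baseURL string
	client  *http.Client
}

func newStateClient(baseURL string) *stateClient {
	return &stateClient{baseURL: baseURL, client: &http.Client{Timeout: 5 * time.Second}}
}

// getState fetches the current state. The "/vm/state" path is assumed.
func (c *stateClient) getState() (string, error) {
	resp, err := c.client.Get(c.baseURL + "/vm/state")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	var s vmState
	if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
		return "", err
	}
	return s.State, nil
}

// setState posts the requested state change, for example "Stop".
func (c *stateClient) setState(state string) error {
	body, err := json.Marshal(vmState{State: state})
	if err != nil {
		return err
	}
	resp, err := c.client.Post(c.baseURL+"/vm/state", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

// newFakeServer is an in-process stand-in for the VM endpoint.
// Its state names are placeholders, not krunkit's real values.
func newFakeServer() *httptest.Server {
	var mu sync.Mutex
	current := "Running"
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		if r.Method == http.MethodPost {
			var req vmState
			if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			if req.State == "Stop" {
				current = "Stopped"
			}
		}
		json.NewEncoder(w).Encode(vmState{State: current})
	}))
}

func main() {
	srv := newFakeServer()
	defer srv.Close()
	c := newStateClient(srv.URL)
	before, _ := c.getState()
	_ = c.setState("Stop")
	after, _ := c.getState()
	fmt.Println(before, "->", after)
}
```

Since only the transport differs from vfkit, a shared client that accepts an `*http.Client` could serve both drivers, with vfkit passing one configured to dial a unix socket.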
	return nil
}

func WaitForTCPWithDelay(addr string, duration time.Duration) error {
Duplicate from vfkit, likely duplicated in qemu and other drivers. Can move to some util package.
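A sketch of what a shared helper in a util package might look like. The body of WaitForTCPWithDelay is not shown in this diff, so this is not a copy of it: `waitForTCP` is a hypothetical variant that retries a TCP dial at a fixed interval and, as an added assumption, gives up after an overall timeout.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForTCP dials addr repeatedly, sleeping interval between attempts,
// until a connection succeeds or timeout has elapsed.
func waitForTCP(addr string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err == nil {
			return conn.Close()
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %s: %w", addr, err)
		}
		time.Sleep(interval)
	}
}

// listenLocal opens a listener on an OS-chosen loopback port.
// It stands in for the guest port that a driver would wait for.
func listenLocal() (net.Listener, error) {
	return net.Listen("tcp", "127.0.0.1:0")
}

func main() {
	ln, err := listenLocal()
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()
	err = waitForTCP(ln.Addr().String(), 100*time.Millisecond, 2*time.Second)
	fmt.Println("reachable:", err == nil)
}
```

With a single shared implementation, the vfkit, krunkit, and qemu drivers could all call the same function instead of carrying their own copies.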
	return fmt.Errorf("hosts without a driver cannot be upgraded")
}

func (d *Driver) getKrunkitState() (state.State, error) {
same as vfkit
	return os.OpenFile(logfile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o600)
}

func (d *Driver) setupIP(mac string) error {
Same as vfkit (and likely qemu and other drivers).
- Update to longterm kernel 6.6.92[1]
- aarch64: Enable Virtio GPU, needed for krunkit driver

Generated by running:

    make iso-menuconfig-aarch64
    make linux-menuconfig-aarch64
    make iso-menuconfig-x86_64
    make linux-menuconfig-x86_64

This generated many changes in the configs; maybe they were updated manually previously.

With this change we can boot krunkit with the built iso:

    % minikube start -p krunkit --driver krunkit --container-runtime containerd --iso-url file://$PWD/minikube-arm64-vgpu.iso
    😄 [krunkit] minikube v1.36.0 on Darwin 15.5 (arm64)
    ✨ Using the krunkit (experimental) driver based on user configuration
    👍 Starting "krunkit" primary control-plane node in "krunkit" cluster
    🔥 Creating krunkit VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...
    📦 Preparing Kubernetes v1.33.1 on containerd 1.7.23 ...
        ▪ Generating certificates and keys ...
        ▪ Booting up control plane ...
        ▪ Configuring RBAC rules ...
    🔗 Configuring bridge CNI (Container Networking Interface) ...
    🔎 Verifying Kubernetes components...
        ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
    🌟 Enabled addons: storage-provisioner, default-storageclass
    🏄 Done! kubectl is now configured to use "krunkit" cluster and "default" namespace by default

And now we have accelerated gpu:

    $ tree /dev/dri
    /dev/dri
    |-- by-path
    |   |-- platform-a007000.virtio_mmio-card -> ../card0
    |   `-- platform-a007000.virtio_mmio-render -> ../renderD128
    |-- card0
    `-- renderD128

[1] https://www.kernel.org/
ok-to-build-iso |
See the logs at: for example for this PR |
Yes, the only interesting feature in krunkit is accelerated GPU, so if you don't need this I don't see a reason to use it.
We could use a single driver that uses either vfkit or krunkit, maybe selected with something like --enable-gpu-acceleration. But calling it vfkit and using krunkit would be confusing. Another issue is networking: for vfkit we have nat (default) or vmnet-helper (optional, for multiple clusters or a multi-node cluster). For krunkit we must use vmnet-helper, so the --network option is not used. I think having 2 separate drivers is the best way for user experience and maintenance. We have a lot of duplicate code, but this can be solved by extracting shared pieces out of the drivers. Most of the duplicate code is also duplicated from other drivers (e.g. qemu).

func (d *Driver) restfulURI() string {
	// TODO: use unused port or a unix socket when supported.
	// https://github.com/containers/krunkit/issues/47
We can use getAvailableTCPPortFromRange() from qemu driver.
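As an alternative illustration (not the qemu helper, whose signature is not shown here), one common way to get an unused port is to ask the OS for one by listening on port 0. `freeTCPPort` and this standalone `restfulURI` are hypothetical, and the `tcp://localhost:<port>` form is an assumption about what krunkit --restful-uri accepts.

```go
package main

import (
	"fmt"
	"net"
)

// freeTCPPort asks the OS for an unused loopback TCP port.
// The port is released before returning, so another process could
// grab it before krunkit binds to it; callers should be ready to retry.
func freeTCPPort() (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	return ln.Addr().(*net.TCPAddr).Port, nil
}

// restfulURI formats the value passed to --restful-uri (assumed format).
func restfulURI(port int) string {
	return fmt.Sprintf("tcp://localhost:%d", port)
}

func main() {
	port, err := freeTCPPort()
	if err != nil {
		fmt.Println("no free port:", err)
		return
	}
	fmt.Println(restfulURI(port))
}
```

The chosen port would need to be stored in the driver config so that later state requests reach the same endpoint, which would remove the one-machine limitation.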
Hi @nirs, we have updated your PR with the reference to newly built ISO. Pull the changes locally if you want to test with them or update your PR further. |
krunkit is a tool to launch configurable virtual machines using the libkrun platform, optimized for GPU accelerated virtual machines and AI workloads on Apple silicon.
It is mostly compatible with vfkit; the driver is a simplified copy of the vfkit driver.
To get accelerated gpu the arch iso was updated to longterm kernel 6.6.92 and Virtio GPU was enabled.
Limitations:
- Only one machine can be created since we use the same port for krunkit --restful-uri. This should be fixed to use an unused port, or use a unix socket when supported[1].
[1] containers/krunkit#47
Status:
Based on these PRs for testing:
Fixes #20803