Add krunkit driver supporting GPU acceleration on macOS #20826
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nirs. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.

Hi @nirs. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Can one of the admins verify this patch?
Force-pushed from 7cda3a5 to cbe0012
@afbjorklund can you review this? I think the issue of not having /dev/dri is that we are using a too old kernel. We are using 5.10.207, while libkrun seems to require 5.16 or later. libkrun on macOS uses venus, which is what requires the newer kernel.

So it seems that we need to move to a newer kernel. Do you know why we are stuck with 5.10?
I don't think it is stuck with anything, just that it was using the LTS versions... It seems that a new kernel version was not included in the minikube OS upgrade. Buildroot supports many: https://github.com/buildroot/buildroot/blob/2025.02.x/linux/linux.hash

See also https://www.kernel.org/ ("longterm")
So should we try to update the kernel to the latest longterm version (6.12.30)?
There was some talk about bumping to a newer kernel (6.6?), but that was last year, and for the 2024.02.x OS. So maybe. It seems that it is only needed for this AI feature, though?
Yes, this is for AI on Apple silicon. I can try to build an ISO to play with it, and once we have a working ISO we can discuss how to proceed with the upgrade.
So you could still use vfkit for all other (non-AI) usage of Kubernetes. Or maybe even make it a single driver, if the syntax is close enough...
// Make a boot2docker VM disk image.
func (d *Driver) generateDiskImage(size int) error {
Duplicate of the vfkit version, which may itself be a duplicate of the qemu version?
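For context, a minimal sketch of what such a boot2docker-style disk image helper does: write a small payload at the start of a raw file and extend it to the requested size. The function name, payload handling, and paths below are illustrative assumptions, not the actual vfkit/qemu/krunkit driver code.

```go
package main

import (
	"fmt"
	"os"
)

// generateRawDisk writes a small payload (for example a tar of SSH keys
// and userdata) at the start of a raw file and extends the file to the
// requested size, producing a sparse disk image for the VM.
func generateRawDisk(path string, payload []byte, sizeMB int) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()

	if _, err := f.Write(payload); err != nil {
		return fmt.Errorf("writing payload: %w", err)
	}
	// Extend the file to the target size; unwritten space stays sparse.
	return f.Truncate(int64(sizeMB) * 1024 * 1024)
}

func main() {
	if err := generateRawDisk("disk.img", []byte("userdata"), 1024); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Since this logic is essentially the same across drivers, it could plausibly be shared instead of copied per driver.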
ok-to-build-iso
Hi @nirs, we have updated your PR with the reference to the newly built ISO. Pull the changes locally if you want to test with them or update your PR further.
Force-pushed from 8e1f14e to 48d867c
Force-pushed from 304e9ac to faddf3a
kvm2 driver with docker runtime
Times for minikube (PR 20826) start: 48.6s 49.5s 50.6s 48.4s 51.8s
Times for minikube ingress: 14.5s 15.5s 15.0s 15.5s 14.5s

docker driver with docker runtime
Times for minikube start: 22.2s 25.5s 25.2s 22.2s 21.0s
Times for minikube ingress: 12.3s 12.2s 12.7s 11.2s 12.2s

docker driver with containerd runtime
Times for minikube start: 25.0s 20.9s 21.9s 22.6s 21.8s
Times for minikube ingress: 22.7s 24.2s 22.7s 23.3s 23.2s
Comparing start time: krunkit, vfkit, qemu

krunkit starts a cluster with --no-kubernetes 1.24 times faster than vfkit and 1.91 times faster than qemu. vfkit is 1.04 times faster when starting a full cluster.

vm only

% hyperfine -r 10 -C "krunkit/out/minikube delete" \
"krunkit/out/minikube start --driver krunkit --no-kubernetes" \
"krunkit/out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes" \
"krunkit/out/minikube start --driver qemu --no-kubernetes"
Benchmark 1: krunkit/out/minikube start --driver krunkit --no-kubernetes
Time (mean ± σ): 8.355 s ± 1.112 s [User: 0.283 s, System: 0.236 s]
Range (min … max): 7.067 s … 10.269 s 10 runs
Benchmark 2: krunkit/out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes
Time (mean ± σ): 10.349 s ± 0.536 s [User: 0.373 s, System: 0.243 s]
Range (min … max): 9.511 s … 11.129 s 10 runs
Benchmark 3: krunkit/out/minikube start --driver qemu --no-kubernetes
Time (mean ± σ): 15.977 s ± 0.433 s [User: 0.507 s, System: 0.250 s]
Range (min … max): 15.570 s … 16.787 s 10 runs
Summary
krunkit/out/minikube start --driver krunkit --no-kubernetes ran
1.24 ± 0.18 times faster than krunkit/out/minikube start --driver vfkit --network vmnet-shared --no-kubernetes
1.91 ± 0.26 times faster than krunkit/out/minikube start --driver qemu --no-kubernetes

kubernetes cluster

% hyperfine -r 5 -C "krunkit/out/minikube delete" \
"krunkit/out/minikube start --driver krunkit --container-runtime containerd" \
"krunkit/out/minikube start --driver vfkit --container-runtime containerd --network vmnet-shared" \
"krunkit/out/minikube start --driver qemu --container-runtime containerd"
Benchmark 1: krunkit/out/minikube start --driver krunkit --container-runtime containerd
Time (mean ± σ): 20.958 s ± 1.098 s [User: 0.978 s, System: 0.909 s]
Range (min … max): 19.347 s … 21.887 s 5 runs
Benchmark 2: krunkit/out/minikube start --driver vfkit --container-runtime containerd --network vmnet-shared
Time (mean ± σ): 20.108 s ± 0.774 s [User: 1.057 s, System: 1.298 s]
Range (min … max): 19.036 s … 21.078 s 5 runs
Benchmark 3: krunkit/out/minikube start --driver qemu --container-runtime containerd
Time (mean ± σ): 27.047 s ± 1.633 s [User: 0.932 s, System: 0.901 s]
Range (min … max): 25.457 s … 28.802 s 5 runs
Summary
krunkit/out/minikube start --driver vfkit --container-runtime containerd --network vmnet-shared ran
1.04 ± 0.07 times faster than krunkit/out/minikube start --driver krunkit --container-runtime containerd
1.35 ± 0.10 times faster than krunkit/out/minikube start --driver qemu --container-runtime containerd
Generated by running `make iso-menuconfig-x86_64`, updating the kernel version to longterm kernel 6.6.95 and the kernel headers to 6.6.x, and then running `make linux-menuconfig-x86_64` to update the linux config. Additionally, update the hyperv-daemons package to use kernel 6.x.

Generated by running `make iso-menuconfig-aarch64`, updating the kernel version to longterm kernel 6.6.95 and the kernel headers to 6.6.x, and then running `make linux-menuconfig-aarch64` to update the linux config.
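For illustration only, the resulting buildroot defconfig change would look roughly like the fragment below. The symbol names are standard buildroot kernel options and are my assumption about the shape of the diff, not copied from this PR:

```
# Assumed buildroot defconfig fragment (illustrative, not from the PR):
BR2_LINUX_KERNEL_CUSTOM_VERSION=y
BR2_LINUX_KERNEL_CUSTOM_VERSION_VALUE="6.6.95"
# Kernel headers bumped to the matching 6.6.x series; the exact headers
# symbol depends on the buildroot version in use.
```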
The krunkit driver exposes the host GPU via VirtIO GPU, enabling AI workloads in the guest.
krunkit is a tool to launch configurable virtual machines using the libkrun platform, optimized for GPU-accelerated virtual machines and AI workloads on Apple silicon. It is mostly compatible with vfkit; the driver is a simplified copy of the vfkit driver. Unlike vfkit, krunkit is available only on Apple silicon.

Changes compared to the vfkit driver:
- krunkit requires a unix socket for networking, so we must use vmnet-helper.
- krunkit does not support HardStop, so we kill it using SIGKILL (see the sketch after this list).
- We must enable vmnet offloading, required by krunkit.
- The code was simplified since vmnet-helper is always used.
- Code was cleaned up to use .ResolveStorePath().

We require krunkit 0.2.2, which supports --restful-uri=unix://.
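Because krunkit has no HardStop equivalent, the driver falls back to killing the process. A minimal sketch of that idea, assuming a pidfile-based setup; the pidfile path and function name are illustrative, not the driver's actual code:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// killHard reads the VM's pidfile and sends SIGKILL, since krunkit
// cannot be stopped via a HardStop-style API.
func killHard(pidfile string) error {
	data, err := os.ReadFile(pidfile)
	if err != nil {
		return err
	}
	pid, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return fmt.Errorf("parsing pidfile %s: %w", pidfile, err)
	}
	return syscall.Kill(pid, syscall.SIGKILL)
}

func main() {
	if err := killHard("krunkit.pid"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```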
Previously it was used only for vfkit, so we suggested falling back to the `nat` network. This advice is not relevant to krunkit or to qemu (which can also use vmnet-helper). Change the error to recommend installing vmnet-helper. We need to think about how we can recommend other networks for vfkit and qemu. Another solution is to create an error for every driver+network combination, but that seems hard to manage.
This is the same way that we test vfkit. This test is not running in the CI.

Issues:
- Need to install and configure vmnet-helper (requires root).
To experiment with the krunkit driver see https://github.com/medyagh/ai-playground-minikube/tree/main/macos
Benchmarking GPU compute workloads shows a two-orders-of-magnitude improvement with VirtIO GPU and a fully utilized GPU.
Status
Based on #20995 for testing.
Fixes #20803