Logs from the same app aren't retrievable (even with --previous)
kubectl logs -n akri ds/akri-agent-daemonset
...
...
...
[2024-11-11T15:52:45Z TRACE agent::plugin_manager::device_plugin_instance_controller] Plugin Manager: Reconciling akri-udev-fdb118
[2024-11-11T15:52:48Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2024-11-11T15:52:48Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2024-11-11T15:52:58Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2024-11-11T15:52:58Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2024-11-11T15:53:08Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] reclaiming unused slots - start
[2024-11-11T15:53:08Z TRACE agent::plugin_manager::device_plugin_slot_reclaimer] register - before call to register with the kubelet at socket /var/lib/kubelet/pod-resources/kubelet.sock
[2024-11-11T15:53:15Z WARN agent::plugin_manager::device_plugin_instance_controller] Error during reconciliation of Instance Some("akri")::akri-udev-fdb118, retrying in 16s: Other(HyperError: error trying to connect: deadline has elapsed
Caused by:
0: error trying to connect: deadline has elapsed
1: deadline has elapsed)
Additional context
I have followed the cluster setup guide, except for the section about granting the regular user admin privileges to the kubeconfig. That step seems like a security concern to me, and I wonder whether it is really necessary. Why does Akri need access to the kubelet socket? Can't it use the Kubernetes API, like other applications?
And if Akri does need access to the kubeconfig, how can this be made more secure than it currently is?
Hi @ruzko, thank you for your question! We've recently been pushing a lot of changes in preparation for a release. I see that you're using the akri-dev chart; can you please try reinstalling with the latest dev chart and see if the same behavior occurs?
@ruzko, can you increase the capacity field on your configuration to something greater than 1? That field sets how many containers are allowed to use a device at once. It looks like you have 1 node, so keeping it at 1 should be fine, but there may be a race condition here, and it may be more prevalent in the rewrite of the agent. It may also be worth trying out the previous v0.12.20 release.
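For reference, a rough sketch of where the capacity field sits in an Akri Configuration. This is a fragment rather than a complete manifest: the discovery handler and broker sections are left out. The name `akri-udev` is an assumption inferred from the instance name `akri-udev-fdb118` in the agent log, and the value 2 is just an example of "greater than 1".

```yaml
# Fragment only: a real Configuration also needs its discoveryHandler section.
apiVersion: akri.sh/v0
kind: Configuration
metadata:
  name: akri-udev      # assumed name, inferred from instance akri-udev-fdb118
  namespace: akri
spec:
  capacity: 2          # example value; how many containers may use a device at once
```

If the Configuration was installed through the Helm chart, changing the corresponding chart value and upgrading the release is probably the cleaner route than editing the resource by hand.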
Describe the bug
Pods requesting an Akri instance cannot be scheduled due to `admission error: unable to claim slot`
Output of
kubectl get pods,akrii,akric -o wide
kubectl get pods,akric,akrii,services -o wide -n akri
Kubernetes Version: k3s version v1.30.3+k3s1 (f6466040)
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Pods requesting an akri instance are scheduled with access to that instance
Logs (please share snips of applicable logs)
kubectl get pod nginx-fcb89c6f8-xh7cz -oyaml
Logs from the same app aren't retrievable (even with --previous)
kubectl logs -n akri ds/akri-agent-daemonset
kubectl logs -n akri deploy/akri-webhook-configuration
kubectl logs -n akri akri-udev-discovery-daemonset-5fj55
kubectl logs -n akri deploy/akri-controller-deployment
journalctl -u k3s -r -g akri
kubectl get akric -n akri -oyaml