Kubelet memory leak when a plugin is registered twice #124716
Comments
/area kubelet
/sig-node I agree this seems to be a bug; however, I wonder how often plugins fight each other for registration and how severe the memory leak is. With precise numbers we can re-evaluate the prioritization.
/assign
The plugin registers every 5 seconds, and two registration requests are sent each time. The kubelet process grows by about 1 GB in roughly four days. The kubelet plugin registration log is as follows:
E0506 02:43:49.962434 11715 client.go:88] "ListAndWatch ended unexpectedly for device plugin" err="rpc error: code = Unavailable desc = error reading from server: EOF" resource="fuse"
How is your device plugin registered with the kubelet?
According to the message
We are using mode 1, but the plugin repeats the registration every 5 seconds.
/sig node
What happened?
We found that kubelet memory usage kept increasing, so we exported a pprof goroutine profile. It showed that gRPC goroutines were leaking, which causes the memory leak.
We found that the cause was one of the plugins, which kept registering twice and used the same name for both registrations. The following code causes this situation: one client is lost.
kubernetes/pkg/kubelet/cm/devicemanager/plugin/v1beta1/handler.go
Lines 90 to 96 in 1dc30bf
When two registration requests arrive at the same time under the same name, only one client is retained in s.clients.
kubernetes/pkg/kubelet/cm/devicemanager/plugin/v1beta1/handler.go
Lines 106 to 117 in 1dc30bf
Therefore, after c.Run() returns inside runClient for each of the two clients, s.getClient can only return the one client that is still registered. As a result, c.grpc.Close() is never invoked for the other client, which leaks its goroutines and memory.
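The mechanism described above can be illustrated with a simplified, self-contained model. This is a sketch only and not the kubelet code: `fakeClient`, `registry`, `connect`, `disconnect`, and the `closeReplaced` flag are hypothetical names invented for the illustration. It assumes a map keyed by resource name in which a second registration overwrites the first, and a cleanup path that closes only the client currently found under that name. The `closeReplaced` option shows one possible mitigation (closing the replaced client on overwrite); it is an assumption, not necessarily the fix the maintainers would choose.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeClient is a hypothetical stand-in for a plugin client that owns a
// connection which must be closed to release its goroutines.
type fakeClient struct {
	closed bool
}

func (c *fakeClient) Close() { c.closed = true }

// registry is a simplified model of a name-keyed client map.
type registry struct {
	mu            sync.Mutex
	clients       map[string]*fakeClient
	closeReplaced bool // hypothetical mitigation: close a client when it is overwritten
}

func newRegistry(closeReplaced bool) *registry {
	return &registry{clients: make(map[string]*fakeClient), closeReplaced: closeReplaced}
}

// connect stores c under name, overwriting any client already stored there.
func (r *registry) connect(name string, c *fakeClient) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if old, ok := r.clients[name]; ok && r.closeReplaced {
		old.Close()
	}
	r.clients[name] = c
}

// disconnect closes whichever client is currently stored under name, if any.
func (r *registry) disconnect(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if c, ok := r.clients[name]; ok {
		c.Close()
		delete(r.clients, name)
	}
}

// leakedAfterDoubleRegister registers two clients under the same name, runs
// the name-based cleanup once per client, and counts clients left unclosed.
func leakedAfterDoubleRegister(closeReplaced bool) int {
	r := newRegistry(closeReplaced)
	a, b := &fakeClient{}, &fakeClient{}
	r.connect("fuse", a)
	r.connect("fuse", b) // overwrites a; a is no longer reachable by name
	r.disconnect("fuse") // closes b
	r.disconnect("fuse") // finds nothing; a is never closed unless closeReplaced is set
	leaked := 0
	for _, c := range []*fakeClient{a, b} {
		if !c.closed {
			leaked++
		}
	}
	return leaked
}

func main() {
	fmt.Println("leaked without mitigation:", leakedAfterDoubleRegister(false))
	fmt.Println("leaked with mitigation:", leakedAfterDoubleRegister(true))
}
```

In this model, every duplicate registration leaves one client unclosed, which is consistent with steady growth when a plugin re-registers twice every 5 seconds.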
What did you expect to happen?
Even in this case, kubelet should not leak memory.
How can we reproduce it (as minimally and precisely as possible)?
1. The plugin registers every 5 seconds and sends two registration requests at the same time.
2. The kubelet memory usage keeps increasing.
Anything else we need to know?
No response
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)