Include initContainers when calculating pod overhead #3572
Conversation
2i2c-org#3569 changed the cryptnono daemonset to have different resource requests for the init containers as well as the main container. While working on 2i2c-org#3566, I noticed this was generating wrong choices - the overhead was being calculated wrong (too small).

We were intentionally ignoring init containers while calculating overhead, but it turns out the scheduler and the autoscaler both take them into consideration. The effective resource request for a pod is the higher of the resource requests for the containers *or* the init containers - this ensures that a pod with higher requests for init containers than for regular containers (like our cryptnono pod!) will actually run. This is documented at https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resource-sharing-within-containers, and implemented in Kubernetes itself at https://github.com/kubernetes/kubernetes/blob/9bd0ef5f173de3cc2d1d629a4aee499d53690aee/pkg/api/v1/resource/helpers.go#L50 (this is the library code that the cluster autoscaler uses).

This PR updates the two places we currently have that calculate effective resource requests (I assume these will eventually be merged into one - I haven't kept up with the team's work here last quarter).

I've also updated the node-capacity-info.json file, which is what the generator script seems to use right now.
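To make the rule concrete, here is a minimal sketch (a hypothetical helper, not the actual deployer code) of how the effective request for a single resource is computed, mirroring the behavior of the Kubernetes helper linked above: regular containers run concurrently so their requests sum, while init containers run sequentially so only the largest one matters.

```python
def effective_request(container_requests, init_container_requests):
    """Effective pod request for one resource (e.g. memory bytes or milli-CPU).

    Per the Kubernetes init-container resource-sharing rules, this is
    max(sum of regular container requests, max of init container requests).
    """
    regular = sum(container_requests)
    init = max(init_container_requests, default=0)
    return max(regular, init)

# A pod like the cryptnono one, where an init container requests more than
# the regular container: the init container's request dominates.
print(effective_request([64], [512]))  # → 512, not 64
```

This is why ignoring init containers undercounted the overhead: whenever an init container's request exceeds the sum of the regular containers' requests, the scheduler reserves the larger amount.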
Brings in the new memory / CPU limits set up for 2i2c-org#3572 - without them, the largest configured size doesn't actually spawn on r5.4xlarge due to insufficient memory.
Put up the adjusted memory limits in #3573 - I'm not sure why an update for r5.4xlarge was needed (it seems unrelated to the fixes here, as the fix should be a no-op).
@yuvipanda this looks great, I added a commit where I re-executed the generate command for all clusters, based on a comment in the daemonset-requests.yaml file:

```
# This file isn't updated by automation, but can easily be updated by manually
# running a command once for each cluster:
#
#    ls config/clusters | xargs -I {} deployer generate resource-allocation daemonset-requests {}
```
Ah, thank you @consideRatio! I'll do that next time I change any code that affects these generated files.
Thank you for working on this, Yuvi!!