Use generated resource allocation options for veda hub #3566
Conversation
While starting to work on 2i2c-org#3565, I realized that VEDA was still using the older style 'node share' options rather than the generated 'resource allocation' options. I've swapped the options over to be based on images for users to choose from, plus resource allocation options generated by our resource allocation script. This matches openscapes, and there has generally been pretty big positive feedback on this mode.

I've kept the initial cloning happening only on the pangeo image as it currently exists, without making any changes. That should be cleaned up as part of 2i2c-org#3565.
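For context, the generated options plug into kubespawner's `profile_options` mechanism. The sketch below is illustrative only, not the actual VEDA config or generator output; all display names, image references, and numbers are made up:

```python
# Hand-written sketch of the kubespawner profile_options structure that
# the resource allocation generator produces. Every value here is
# hypothetical and for illustration only.
profile = {
    "display_name": "Choose your environment and resources",
    "profile_options": {
        "image": {
            "display_name": "Image",
            "choices": {
                "pangeo": {
                    "display_name": "Pangeo Notebook",
                    # Image reference is a placeholder, not the real one
                    "kubespawner_override": {"image": "example/pangeo-notebook:latest"},
                },
            },
        },
        "resource_allocation": {
            "display_name": "Resource Allocation",
            "choices": {
                "mem_1_9": {
                    # Guarantees/limits below are made-up example numbers
                    "display_name": "1.9 GB RAM, up to 3.7 CPUs",
                    "kubespawner_override": {
                        "mem_guarantee": 1991244775,
                        "mem_limit": 1991244775,
                        "cpu_guarantee": 0.234,
                        "cpu_limit": 3.7,
                    },
                },
            },
        },
    },
}
```

Users pick one choice per option, and kubespawner merges the selected `kubespawner_override` dictionaries into the spawn configuration.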
Merging this PR will trigger the following deployment actions:
- Support and Staging deployments
- Production deployments
I'd suggest just rounding down the numbers to the nearest 0.5?
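A minimal sketch of the suggested rounding (not code from this repo): flooring to the nearest 0.5 increment for display purposes.

```python
import math

def round_down_to_half(value: float) -> float:
    """Round a number down to the nearest 0.5 increment,
    e.g. so a displayed memory size of 1.9 GB becomes 1.5 GB."""
    return math.floor(value * 2) / 2
```

For example, `round_down_to_half(1.9)` gives `1.5`, and `round_down_to_half(3.75)` gives `3.5`.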
This does seem like an improvement, though I wonder why more than one section is necessary. Also linking to external descriptions/docs on what software is installed, or how to use the custom image choice.
@wildintellect that issue is pending jupyterhub/kubespawner#778 getting fixed.
I've approved pending any tweaks asked for by the community :)
I'll open another issue to deal with the rounding, as it's reasonably complex, and deploy this now. The community seemed not to object to this on Slack :)
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/7413484076
2i2c-org#3569 changed the cryptnono daemonset to have different resource requests for the init containers as well as the main container. While working on 2i2c-org#3566, I noticed this was generating wrong choices - the overhead was calculated incorrectly (too small). We were intentionally ignoring init containers while calculating overhead, but it turns out the scheduler and the autoscaler both take them into consideration.

The effective resource request for a pod is the higher of the resource requests for the containers *or* the init containers - this ensures that a pod with higher requests for init containers than containers (like our cryptnono pod!) will actually run. This is documented at https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resource-sharing-within-containers, and implemented in Kubernetes itself at https://github.com/kubernetes/kubernetes/blob/9bd0ef5f173de3cc2d1d629a4aee499d53690aee/pkg/api/v1/resource/helpers.go#L50 (this is the library code that the cluster autoscaler uses).

This PR updates the two places we currently have that calculate effective resource requests (I assume eventually these will be merged into one - I haven't kept up with the team's work last quarter here). I've also updated the node-capacity-info.json file, which is what the generator script currently uses.
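The rule described above can be sketched as follows. Per the linked Kubernetes docs, the effective request is the larger of (a) the sum of all app container requests and (b) the largest single init container request (init containers run sequentially, so only the biggest one matters). The numbers in the example are hypothetical:

```python
def effective_pod_request(container_requests, init_container_requests):
    """Effective resource request for a pod, following the Kubernetes
    init-container resource sharing rule: the higher of the sum of all
    app container requests and the largest init container request."""
    app_total = sum(container_requests)
    init_max = max(init_container_requests, default=0)
    return max(app_total, init_max)

# A cryptnono-like pod where an init container requests more than the
# main container: the init container request dominates. (Hypothetical
# numbers, in MiB.)
overhead = effective_pod_request([100], [350])
```

Ignoring init containers here would have yielded 100 MiB of overhead instead of 350 MiB, which is exactly the "too small" miscalculation described above.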
@wildintellect I opened #3584 with the explanation for why we show the odd numbers in memory.
Brings in 2i2c-org#3566 (and follow-ups) to the GHG hub. Ref 2i2c-org#3565.
While starting to work on #3565, I realized that VEDA was still using the older style 'node share' options rather than the generated 'resource allocation' options. I've swapped the options over to be based on images for users to choose from, plus resource allocation options generated by our resource allocation script. This matches openscapes, and there has generally been pretty big positive feedback on this mode. It also gives more visibility to the R & QGIS options.
I've kept the initial cloning happening only on the pangeo image as it currently exists, without making any changes. That should be cleaned up as part of #3565.
I also had to update the node-info.json file by running the appropriate command.
Old: (screenshot of the previous profile options)
New: (screenshot of the generated resource allocation options)