Use generated resource allocation options for veda hub #3566
Conversation
While starting to work on 2i2c-org#3565, I realized that VEDA was still using the older style 'node share' options rather than the generated 'resource allocation' options. I've swapped the options over to be based on images for users to choose from, plus resource allocation options generated by our resource allocation script. This matches openscapes, and there has generally been pretty big positive feedback on this mode.

I've kept the initial cloning happening only on the pangeo image as it currently exists, without making any changes. That should be cleaned up as part of 2i2c-org#3565.
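For context, the generated options plug into kubespawner's `profile_options` mechanism. The sketch below is illustrative only, not the actual VEDA config or generator output; all display names, image references, and numbers are made up:

```python
# Hand-written sketch of the kubespawner profile_options structure that
# the resource allocation generator produces. Every value here is
# hypothetical and for illustration only.
profile = {
    "display_name": "Choose your environment and resources",
    "profile_options": {
        "image": {
            "display_name": "Image",
            "choices": {
                "pangeo": {
                    "display_name": "Pangeo Notebook",
                    # Image reference is a placeholder, not the real one
                    "kubespawner_override": {"image": "example/pangeo-notebook:latest"},
                },
            },
        },
        "resource_allocation": {
            "display_name": "Resource Allocation",
            "choices": {
                "mem_1_9": {
                    # Guarantees/limits below are made-up example numbers
                    "display_name": "1.9 GB RAM, up to 3.7 CPUs",
                    "kubespawner_override": {
                        "mem_guarantee": 1991244775,
                        "mem_limit": 1991244775,
                        "cpu_guarantee": 0.234,
                        "cpu_limit": 3.7,
                    },
                },
            },
        },
    },
}
```

Users pick one choice per option, and kubespawner merges the selected `kubespawner_override` dictionaries into the spawn configuration.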
Merging this PR will trigger the following deployment actions:
- Support and Staging deployments
- Production deployments
I'd suggest just rounding down the numbers to the nearest 0.5?
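A minimal sketch of the suggested rounding (not code from this repo): flooring to the nearest 0.5 increment for display purposes.

```python
import math

def round_down_to_half(value: float) -> float:
    """Round a number down to the nearest 0.5 increment,
    e.g. so a displayed memory size of 1.9 GB becomes 1.5 GB."""
    return math.floor(value * 2) / 2
```

For example, `round_down_to_half(1.9)` gives `1.5`, and `round_down_to_half(3.75)` gives `3.5`.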
This does seem like an improvement, though I wonder why more than one section is necessary. Also linking to external descriptions/docs on what software is installed, or how to use the custom image choice.
@wildintellect that issue is pending jupyterhub/kubespawner#778 getting fixed.
I've approved pending any tweaks asked for by the community :)
I'll open another issue to deal with the rounding, as it's reasonably complex, and deploy this now. The community seemed not to object to this on Slack :)
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/7413484076
2i2c-org#3569 changed the cryptnono daemonset to have different resource requests for the init containers as well as the main container. While working on 2i2c-org#3566, I noticed this was generating wrong choices - the overhead was calculated incorrectly (too small). We were intentionally ignoring init containers while calculating overhead, but it turns out the scheduler and the autoscaler both take them into consideration.

The effective resource request for a pod is the higher of the resource requests for the containers *or* the init containers - this ensures that a pod with higher requests for init containers than containers (like our cryptnono pod!) will actually run. This is documented at https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resource-sharing-within-containers, and implemented in Kubernetes itself at https://github.com/kubernetes/kubernetes/blob/9bd0ef5f173de3cc2d1d629a4aee499d53690aee/pkg/api/v1/resource/helpers.go#L50 (this is the library code that the cluster autoscaler uses).

This PR updates the two places we currently have that calculate effective resource requests (I assume eventually these will be merged into one - I haven't kept up with the team's work last quarter here). I've also updated the node-capacity-info.json file, which is what the generator script currently uses.
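The rule described above can be sketched as follows. Per the linked Kubernetes docs, the effective request is the larger of (a) the sum of all app container requests and (b) the largest single init container request (init containers run sequentially, so only the biggest one matters). The numbers in the example are hypothetical:

```python
def effective_pod_request(container_requests, init_container_requests):
    """Effective resource request for a pod, following the Kubernetes
    init-container resource sharing rule: the higher of the sum of all
    app container requests and the largest init container request."""
    app_total = sum(container_requests)
    init_max = max(init_container_requests, default=0)
    return max(app_total, init_max)

# A cryptnono-like pod where an init container requests more than the
# main container: the init container request dominates. (Hypothetical
# numbers, in MiB.)
overhead = effective_pod_request([100], [350])
```

Ignoring init containers here would have yielded 100 MiB of overhead instead of 350 MiB, which is exactly the "too small" miscalculation described above.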
@wildintellect I opened #3584 with the explanation for why we show the odd numbers in memory.
Brings in 2i2c-org#3566 (and follow-ups) to the GHG hub. Ref 2i2c-org#3565.
While starting to work on #3565, I realized that VEDA was still using the older style 'node share' options rather than the generated 'resource allocation' options. I've swapped the options over to be based on images for users to choose from, plus resource allocation options generated by our resource allocation script. This matches openscapes, and there has generally been pretty big positive feedback on this mode. It also gives more visibility to the R & QGIS options.
I've kept the initial cloning happening only on the pangeo image as it currently exists, without making any changes. That should be cleaned up as part of #3565.
I also had to update the node-info.json file by running the appropriate command.
Old: (screenshot of the previous profile options)
New: (screenshot of the generated resource allocation options)