-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CPU requests and limits for build pod #1348
base: main
Are you sure you want to change the base?
Conversation
Thanks for submitting your first pull request! You are awesome! 🤗 |
Thanks for this PR! I was surprised to see that we don't already have this, which means this is for sure useful :D Looking and thinking about the code now. Some immediate thoughts:
|
I think it'll be fine for non-k8s systems, we can ignore the limits 😄. However if we want to make it easier to implement limits in non-k8s backends it might be nice to tighten the specification. |
@betatim, @manics, thanks for feedback Tbh I was confused about units in JH/BH, seems like different approach from k8s is used (M, G, K instead of Mi, Gi, Ki), so I was not sure what is "right" here. Due to time pressure, just ended with k8s convention, I'm open to adjust. On one hand currently it feels unnatural to use convention different from k8s, otoh in this case probably is bad idea to use convention that differs from rest of JH ecosystem. My honest feeling is the whole idea of having these traits for extra validation/translation for different convention breaks least surprise principle, and is if not already a source of some confusion. If i'm an operator and deploy it to k8s, expectation is that requests and limits follow k8s conventions and be passed verbatim as specified in config to the pod spec, rather than being transformed via traits to some magic unexpected common denominators. Applies to JH as well. As for tests, I'd be more than glad to write some, as this PR is driven by some internal requirements and we're interested in merging it to upstream to avoid the cost of maintaining fork. Will raise this tomorrow internally. |
@noroutine I understand your point! There's no single right answer, some people expect things to follow a standard format that is independent of the underlying implementation, some expect it to follow a low level standard (K8s spec), others expect it to follow a different low-level standard (e.g. systemd uses percentages). I think the decision should come down to whether we'd like to support limits in future alternative backends. If we decide only K8s matters then using the K8s format is fine, otherwise we should consider how other developers could implement it. Regardless of the format I think it'd be nice to include a description of it in the help as well as a URL. That guarantees it's always known, even if the URL changes in future. |
The For CPU limits: the ones implemented in this PR follow the format kubernetes uses as well as what the docker CLI "natively" understands. This is also the format repo2docker uses. So unless someone has strong opinions on an alternative format I'd say we are consistent within our ecosystem and with kubernetes. So lets stick with this for now? It would be great to have some tests, even just simple ones that check that the basics aren't broken. Without this something will sooner or later break this feature and no one will notice. |
0395ca4
to
4a511f1
Compare
Hey guys I've just pushed some tests to cover that trait, let me know if i can do any more from my side to get this through! Kind regards |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice!
@manics @betatim tests has been added and the code looks good to me. I also think that the config is reasonable in a non-k8s context, it could be relevant to have such requests/limits there as well even though maybe it doesn't - and then it can be documented they will be ignored for such environments later. Go for merge? |
Not sure why the tests are failing :( |
If you go to the logs for the failing check: https://github.com/jupyterhub/binderhub/pull/1348/checks?check_run_id=3697325231
Work on #1318 went a lot faster than expected! The new parameter needs to be added to binderhub/binderhub/build_local.py Lines 65 to 93 in a8e3444
The alternative is to add this as a configurable Trailets on the Build class when it's refactored as discussed in #1318 (comment). That way you can avoid adding a parameter to BinderHub, the configurable should be passed down to the class, and since it can be a configurable on e.g. KubernetesBuilder it can use Kubernetes specific config. The downside is that'll delay the addition of this new feature. However there's another issue. The build pod runs repo2docker, which does the build in another Docker container (by communicating directly with the Docker daemon through a socket). Since the Docker build takes place outside K8S the K8S pod limits won't apply, they somehow need to be passed to repo2docker. Based on this comment: Lines 132 to 136 in a8e3444
a memory request is useful because it guarantees a certain amount of memory for the build pod. If I understand correctly the build pod won't use that memory so most of it will be free on the K8S node, which means the Docker container where the build occurs can use it. memory_limit has an effect because it's passed down to repo2docker, which then sets the memory limit using the Docker build API:Lines 193 to 195 in a8e3444
https://github.com/jupyterhub/repo2docker/blob/3eef69f7454ddb9086af38b2ece5f632af05920f/repo2docker/app.py#L161-L169 This means |
Hi gentlemen Unfortunate it failed again. I was a bit out on leave, and checking now above, trying to understand what best action on making this work again. Thank you |
@noroutine You'll need to add cpu limit support to repo2docker, then you can come back here and pass the appropriate flags when repo2docker is run. |
Thank you @manics |
This PR makes CPU request and limit for build pod to be configurable in config, just like memory settings
Trait was implemented to be flexible in taken values and accepts integer, floats, string representations of those, as well as millicore spec as per https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu