Idle culling will stop working with traefik proxy #831
Comments
The main reason we don't require jupyterhub-singleuser in repo2docker is that it would be a pain to put it in the containers, since it is sensitive to the version in the Hub pod. This is another case, I think, for the transparent
Since I think this is mostly useful in container-based deployments like Kubernetes, requiring Python so that the proxy can import the auth implementation from jupyterhub itself is probably the best way to go from a maintainability standpoint (as opposed to Go, which has benefits for portability, but is not something I think our team has the capacity to develop and maintain at this point). Then this proxy would go in a sidecar container, exerting no requirements on the user container beyond an HTTP endpoint that can run on a prefix. With that, we could preserve the assumption that the user pods are fully equipped jupyterhub pods (implementing auth, etc.), while also separating that from the user env. |
@minrk I think for activity tracking, we can say something like 'we will hit /activity, and it should return the timestamp of last activity' (or something like that) that then becomes a protocol that can be implemented by multiple server implementations. We can do that in a sidecar, in singleuser, etc as we see fit. If we know we're running notebook, this can just use the internal notebook activity tracking. Else it can rely on network or some other mechanism. I don't think tracking this through would be too slow - this is the same as prometheus's model. I think JupyterHub should do this tracking itself internally if possible... In general, I think it'll be great to explicitly define what a 'jupyterhub equipped pod' means, and we can go from there. I agree re: go. My ideal would be to find a maintained proxy from somewhere else that can give us the behavior we want purely through configuration rather than requiring us to write and maintain code. |
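A minimal sketch of the pull-style protocol suggested in this comment. The `/activity` path is taken from the comment. The JSON response shape (a single `last_activity` ISO 8601 timestamp) and the helper names are assumptions for illustration, not an existing JupyterHub API.

```python
import json
from datetime import datetime, timezone


def activity_response(last_activity: datetime) -> str:
    """Body a server might return from the hypothetical /activity endpoint.

    Expects a timezone-aware datetime; the timestamp is emitted in UTC.
    """
    ts = last_activity.astimezone(timezone.utc).isoformat()
    return json.dumps({"last_activity": ts})


def parse_activity(body: str) -> datetime:
    """Poller side: recover the timestamp from the response body."""
    return datetime.fromisoformat(json.loads(body)["last_activity"])


def idle_seconds(body: str, now: datetime) -> float:
    """Seconds elapsed between the reported last activity and `now`."""
    return (now - parse_activity(body)).total_seconds()
```

Any server implementation (notebook, sidecar, singleuser) could produce this body from whatever activity source it has, which is what would make it a protocol rather than a notebook-specific feature.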
I'm not sure what you mean by internally tracking here. The design of JupyterHub is that the Hub is completely not involved during normal user interaction with their own server(s). So it is 100% on the proxy and/or server to implement activity tracking. It is already the case that the Hub is responsible for storing the activity, so it's only one request for Hub API clients to check last_activity of all servers, if that's what you are referring to.
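To illustrate the "one request" point, here is a rough sketch of how a Hub API client might pick out idle servers from a single user listing. It assumes each user model carries a `name` and a `servers` mapping whose entries have a `last_activity` ISO 8601 timestamp; treat those field names and the helper names as assumptions to check against the Hub REST API docs for your version.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def idle_servers(users, now, timeout):
    """Return (user_name, server_name) pairs idle for longer than `timeout`.

    `users` is the decoded JSON list from one request to the Hub's user
    listing (field names assumed, see the lead-in).
    """
    idle = []
    for user in users:
        for server_name, server in user.get("servers", {}).items():
            last = parse_ts(server.get("last_activity"))
            if last is not None and now - last > timeout:
                idle.append((user["name"], server_name))
    return idle
```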
That's an interesting idea. This will have to be configurable, but we can do it. JupyterHub 1.0 reverses this - singleuser servers push activity rather than the Hub pulling it. The auth circumvention in binderhub makes pull a challenge, since jupyterhub's token auth to the API doesn't work.
Yeah, we can work on this. I believe what we have with JupyterHub 1.0 is:
The biggest challenge with activity tracking when using the jupyterhub-naïve notebook server, either push or pull, is that it requires server extensions and coordinated auth (jupyterhub can't make authenticated requests to binder notebooks). This is a challenge for Binder, where we don't have a good answer for installing deployment-sensitive extensions. |
Thanks for the detailed response, @minrk! By 'internally', I meant the code to make these external calls and keep a note of their last activity status should be in JupyterHub. However, if the tracking is 'push' in 1.0 (I didn't know this!) that sounds awesome and much more efficient. I guess this too can be part of the JupyterHub server protocol as you mention. Do you think we can formally write that up somewhere? |
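For the push direction, a server (or sidecar) would periodically report its own activity to the Hub. The sketch below only builds the request body. The payload shape (a top-level `last_activity` plus a `servers` mapping keyed by server name) is my understanding of what JupyterHub 1.0 accepts on its user activity endpoint and should be read as an assumption until it is written up in the docs; `activity_payload` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone


def activity_payload(last_activity, server_name=""):
    """Build a JSON body reporting activity for one server.

    `last_activity` must be timezone-aware. The empty string names the
    default (unnamed) server. Payload shape is an assumption.
    """
    ts = last_activity.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    return json.dumps(
        {
            "servers": {server_name: {"last_activity": ts}},
            "last_activity": ts,
        }
    )
```

The body would then be sent as an authenticated POST to the Hub, which is exactly the step that is hard for Binder, since its notebook servers do not hold a Hub API token.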
Yes, absolutely. I put the skeleton of what should be included above as a note because I only had a few minutes, but the detailed version of this definitely belongs in the jupyterhub docs. Probably a new page. |
Hi everyone, I'm currently implementing a JupyterHub and a BinderHub, and I'm facing some issues with culling.
We currently have these options:
Other options set how often activity is checked, how often culling runs, whether users are culled, etc. I focused on kernel and server culling. It looks like there is a lot to do, mainly at the notebook/JupyterLab level. I don't know how Jupyter handles priorities, consistency, and user experience across notebook/lab/hub/binder. Can you raise this point as active members of Jupyter? Feel free to complete or correct my understanding of the situation. I would be happy to contribute! |
All of the classic notebook's culling features are available in JupyterLab because those are server-side features and JupyterLab uses the same server (soon a fork of the same server with the same features, but still). JupyterHub's culling in general works just fine with JupyterLab, but can be hindered somewhat by JupyterLab's sometimes overzealous polling behavior (I believe this is the linked lab issue). I don't think there's necessarily a whole lot to do. Adding internal max-age is easy to do, even via a server extension:

    from tornado.ioloop import IOLoop
    from traitlets.config.application import Application

    max_age = 3600  # one hour

    def shutdown():
        # stop the running server application
        Application.instance().stop()

    # schedule the shutdown max_age seconds from now
    IOLoop.current().call_later(max_age, shutdown)

Culling terminals with similar parameters to kernels makes perfect sense. The JupyterLab polling is a recurring issue, and trying to get JupyterLab to do less with "idle" things (and what counts as idle?) is always a question. |
@minrk I think this can be closed, right? |
I don't think so. If the JupyterHub chart switched to traefik from chp, binderhub would have to disable the idle culler because it wouldn't work, as the Hub would have no sources of activity for binder pods (unless auth is enabled). jupyterhub/traefik-proxy#151 is the issue for activity tracking in traefik-proxy, which I think is doable (if we can assume prometheus), but a nontrivial amount of work and has some tricky decisions with tradeoffs to consider. |
Related to jupyterhub/zero-to-jupyterhub-k8s#1162 and jupyterhub/jupyterhub#2346
Binder currently relies on JupyterHub's activity tracking. With JupyterHub < 1.0, this information comes solely from configurable-http-proxy. JupyterHub 1.0 moves the onus for this to jupyterhub-singleuser because alternative proxies like traefik do not track activity. This is better for JupyterHub in general, but since binder launches vanilla notebook servers and not jupyterhub-singleuser, this activity is not tracked at the network level.
Additionally, we have learned on mybinder.org that the notebook's internal activity tracking is better and more reliable since it can make more fine-grained activity decisions (e.g. choosing to cull with idle but connected websockets).
So we have some facts:
cull_idle_servers.py assumes the behavior of jupyterhub-singleuser or configurable-http-proxy for activity tracking
I'm not sure exactly what we should do about this, but it's a pretty big issue and a blocker for adopting more resilient proxies in BinderHub.
If we continue to assume the notebook server in BinderHub, we can write a new idle-culler that talks directly to the notebook API, ignoring the Hub's activity data. This will be quite inefficient, as it requires lots more requests to notebook servers rather than a single request to the Hub (this can be scaled by sharding the culler). This is also getting closer and closer to not using JupyterHub for anything at all.
If we want to skip over that and remove the notebook server assumption, we need to get to work on a sidecar container that at least implements activity tracking (reintroducing the problem of network activity not being as good as internal activity tracking), and possibly also implements auth.
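As a rough sketch of the first option, a culler talking directly to the notebook server could base its decision on the server's status report. This assumes the notebook's `/api/status` response includes `last_activity` (ISO 8601) and `connections` fields; `should_cull` and its `cull_connected` flag are hypothetical names, and the request and auth handling are left out.

```python
from datetime import datetime, timedelta, timezone


def should_cull(status, now, idle_timeout, cull_connected=True):
    """Decide whether a notebook server looks idle enough to cull.

    `status` is the decoded JSON from the notebook's status endpoint
    (field names assumed, see the lead-in).
    """
    last = datetime.fromisoformat(status["last_activity"].replace("Z", "+00:00"))
    if now - last <= idle_timeout:
        return False  # recent activity, keep the server
    if status.get("connections", 0) > 0 and not cull_connected:
        return False  # idle, but clients are still connected
    return True
```

With `cull_connected=True` this matches the mybinder.org observation above that servers can be culled when idle even though websockets are still connected. The cost is one request per server per check, which is the inefficiency the paragraph above describes.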