[WIP] possible solution to propagate informative spawn failure messages from spawner to bhub ui #819
First of all, I need help here :) I don't fully understand what is happening (especially on the JupyterHub side), but here is what I understand so far:
- `slow_spawn_timeout` is 0 by default: a `TimeoutError` is raised and JupyterHub always responds with a 500 error (even if the spawner may have sent a 409 error): https://github.com/jupyterhub/jupyterhub/blob/e89836c035f79a44cb5ebc1126e53c6f605464c1/jupyterhub/handlers/base.py#L887-L924
- binderhub/binderhub/launcher.py, lines 75 to 84 in 1835d07
- binderhub/binderhub/launcher.py, line 197 in 1835d07
- `consecutive_failure_limit` is 5 by default (https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/8ed2f8111b5575dc5df29afb114a8ee5906f9a96/jupyterhub/values.yaml#L17)

The same process happens regardless of the type of error from the spawner.

In this PR, to solve this issue, `slow_spawn_timeout` is set to 10 seconds (JupyterHub's default value), so BinderHub gets the actual error from the spawner. I also set `consecutiveFailureLimit` to 0, so the hub doesn't restart after informative spawner failures. But this is actually not good, because the hub then also ignores real errors that it can only get rid of by restarting.

I updated the build and launch code too, so it now propagates 409 error messages from the JupyterHub API to the UI and doesn't retry the launch. As I wrote above, I need help here: this part may be wrong or incomplete, so I don't mind if we change it completely.
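To make the intended launch behavior concrete, here is a minimal sketch of "propagate 409, retry everything else". It is not the code in `launcher.py`: the real launcher is asynchronous and talks to the JupyterHub API over HTTP, while this version is synchronous and takes a plain callable. `SpawnError`, `launch_with_retries`, `start_server`, and the default of 3 attempts are hypothetical names and values chosen for illustration.

```python
class SpawnError(Exception):
    """Hypothetical error holding the HTTP status and message from the hub API."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message


def launch_with_retries(start_server, retries=3):
    """Call start_server() until it succeeds or attempts run out.

    A 409 is treated as an informative failure: it is re-raised at once
    so the caller can show its message in the UI. Any other SpawnError
    is retried, up to `retries` attempts in total (assumed value).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")

    last_error = None
    for _ in range(retries):
        try:
            return start_server()
        except SpawnError as error:
            if error.status == 409:
                # Informative failure from the spawner: do not retry.
                raise
            last_error = error

    # Every attempt failed with a retryable error; surface the last one.
    raise last_error
```

With this shape, the caller only needs one `except SpawnError` branch: if `status` is 409 the message goes to the UI as-is, otherwise it is reported as a generic launch failure after the retries are exhausted.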
related to #712 and #805