[k8s] JupyterHub addressing notebook containers per IP breaks installs behind HTTP proxy #872

Open
jmozd opened this issue Oct 25, 2024 · 4 comments

@jmozd

jmozd commented Oct 25, 2024

Bug description

Starting the actual notebook servers fails with a time-out message when running a JupyterHub installation on Kubernetes (image quay.io/jupyterhub/k8s-hub:3.3.8, installed via Helm chart jupyterhub:3.3.8) in a Kubernetes cluster that requires an external HTTP proxy to access resources outside the cluster.

Debugging shows that the hub addresses the server pods via (intra-cluster) IP addresses instead of DNS host names. NO_PROXY is set to exclude intra-cluster DNS domains from proxying, as well as the CIDR ranges of the intra-cluster IP addresses. But because the aio library used for the HTTP queries does not support CIDR entries in NO_PROXY, the HTTP requests that query the server pod status are sent to the external proxy instead of going to the notebook server. The HTTP proxy responds with an error code (it has no way to reach intra-cluster IPs, hence these are blocked by the proxy configuration), and because of the continuous errors while requesting the notebook status, the hub eventually reports that the server did not start successfully and kills the server instance.
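
To illustrate why the CIDR entries are ignored, here is a minimal sketch, assuming the proxy bypass check follows CPython's urllib (which the aio HTTP stack builds on): NO_PROXY entries are compared as host-name suffixes, so a literal pod IP never matches a CIDR range. The addresses below are example values only.

import os
from urllib.request import proxy_bypass_environment

# Example NO_PROXY with a DNS suffix and a CIDR range (as used in this setup)
os.environ["NO_PROXY"] = ".svc.cluster.local,10.42.0.0/16"

# Suffix match succeeds -> request goes directly to the target
print(proxy_bypass_environment("notebook.jhub.svc.cluster.local"))  # True

# A literal pod IP is compared as a string and never matches "10.42.0.0/16"
# -> the request is sent to the external HTTP proxy
print(proxy_bypass_environment("10.42.3.17"))  # False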

If the distinct IPs of server instances are added to NO_PROXY explicitly, then starting the servers succeeds, as the hub's queries are sent to the server instance directly. But as the allocation of server IPs is dynamic, all possible IPs would have to be added to NO_PROXY individually, which is neither practical to administer nor expected to be performant.

Checking the hub's admin page and looking at the user's server details, there is already an entry for the DNS name of the notebook server pod.

How to reproduce

  1. Define the extra variables "http_proxy" and "https_proxy" pointing to an HTTP proxy (e.g. a Squid instance)
  2. Define the extra variable "no_proxy" (or "NO_PROXY") to exclude cluster DNS names and the CIDRs of the cluster IP ranges
  3. Create the hub pod
  4. Try to start a new server instance and check Kubernetes for its IP
  5. Check the HTTP proxy access log for IP-based HTTP requests and correlate them with the IP of the server instance
  6. Observe the time-out error for the notebook server start reported on the hub web UI

Expected behaviour

The hub should communicate via the dynamically generated, cluster-internal DNS name of the server pod. Since that name is already displayed in the server details on the hub's admin page, it should already be available to the hub.

If the hub used the cluster-internal DNS name, the aio library would match the server pod's host name against the no_proxy entry that excludes intra-cluster DNS hosts from being sent to the HTTP proxy.

Actual behaviour

As can be deduced from the HTTP proxy log, JupyterHub uses the IP of the server pod to request status information. That traffic is therefore directed to the HTTP proxy, which cannot forward it.

Your personal set up

  • OS: Kubernetes cluster (K3s v1.26.15, managed via Rancher)
  • Version(s): JupyterHub 3.3.8 installed via Helm chart jupyterhub:3.3.8
  • Configuration: details of the values.yaml can be made available - no parameter obviously toggles IP vs. DNS usage for accessing the server pods...
@jmozd jmozd added the bug label Oct 25, 2024
@minrk
Member

minrk commented Nov 12, 2024

If you set:

hub:
  config:
    KubeSpawner:
      services_enabled: true

The URL for pods should then use the service DNS name, not the IP address.

You can also define a KubeSpawner.get_pod_url hook (requires writing Python code) to return a URL, given the Pod resource.
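
For illustration, a rough sketch of such a hook (the names here are hypothetical; it assumes services_enabled: true so that a Service carrying the pod's name exists, and the default cluster.local domain):

def pod_url_from_dns(spawner, pod):
    # spawner.pod_name, spawner.namespace and spawner.port are standard
    # KubeSpawner/Spawner attributes; adjust the domain if yours differs.
    return "http://{name}.{namespace}.svc.cluster.local:{port}".format(
        name=spawner.pod_name,
        namespace=spawner.namespace,
        port=spawner.port,
    )

c.KubeSpawner.get_pod_url = pod_url_from_dns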

But I'm guessing we should probably be using a DNS name all the time.

@minrk minrk transferred this issue from jupyterhub/jupyterhub Nov 12, 2024
@jmozd
Author

jmozd commented Nov 14, 2024

Thank you, updating the configuration made the KubeSpawner actually use the service DNS names.

Some component also accesses the Kubernetes API server by IP - I tested the proposed config change by removing all cluster-internal IP addresses from the NO_PROXY list, but it still wouldn't work. I then noticed in the Squid proxy log an access to 10.43.0.1, which in this case is:

> kubectl describe service/kubernetes
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.43.0.1
IPs:               10.43.0.1
Port:              https  443/TCP

Re-adding that IP to the NO_PROXY list got things working again (and thankfully it's just that single, static IP), but this might be addressed as well?

@minrk
Member

minrk commented Nov 14, 2024

Talking to the API server is done with the official Kubernetes API clients via the standard load_incluster_config(). I don't think we pick the URL to connect to; Kubernetes does. But maybe there's an arg that would work? If you can share Python code that connects a Kubernetes client the way you would expect from inside the containers, we can give it a try.

@jmozd
Author

jmozd commented Nov 15, 2024

Thank you for the pointer - it indeed looks like that code simply uses the host from the env var KUBERNETES_SERVICE_HOST, which Kubernetes provides and populates with the IP address instead of a DNS name.
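
For reference, a minimal sketch of where that host comes from (assuming only the standard env vars Kubernetes injects into every pod; the in-cluster config loader uses them verbatim):

import os

# Kubernetes injects these into every pod; KUBERNETES_SERVICE_HOST carries the
# ClusterIP (here 10.43.0.1), not a DNS name, so NO_PROXY has to list that IP.
host = os.environ["KUBERNETES_SERVICE_HOST"]
port = os.environ["KUBERNETES_SERVICE_PORT"]
api_server_url = f"https://{host}:{port}"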

We have a number of other pods running behind HTTP proxies, so my current guess is that using the aio library somehow influences this, while other pods may use other libraries that are capable of handling CIDR entries in no_proxy.

From my current point of view, it'd be too much of a hassle to try to fix this in code, especially since adding that single IP to no_proxy solves the issue with a very reasonable amount of "work". Both this and using "services_enabled: true" when running in an environment with external HTTP proxies might be added to the docs, as "services_enabled" in particular would never have caught my eye had you not mentioned it. Granted, I wouldn't know where to put these bits of information - maybe in the descriptions in values.yaml (since this is purely Kubernetes-related)?

Thank you again for your assistance - I'm unsure if you want to keep this issue open with regard to your comment above ("But I'm guessing we should probably be using a DNS name all the time.") - feel free to close, I'm a happy camper now.
