Ingress returning 503s when using Topology Aware Routing and the controller has no endpoints in the zone #11342

Open
LAMRobinson opened this issue May 3, 2024 · 4 comments
Labels
needs-kind, needs-priority, needs-triage

Comments

@LAMRobinson

LAMRobinson commented May 3, 2024

What happened:

The Ingress Controller returns a 503 in a multi-zone setup when the backend EndpointSlice has no endpoints in the same zone as the controller.

What you expected to happen:

Like kube-proxy, ingress-nginx should fall back to a random endpoint, because topology hints are meant to fail open rather than fail closed (unlike xTP).

My impression is that the testing and thinking around this feature have assumed that people are using topology-aware-routing:auto, which doesn't let you get into this situation. However, the hints feature is explicitly designed to separate the decision to enable topology routing for a service from the responsibility of implementing it. The dataplane's implementation of the hints therefore shouldn't make decisions based on assumptions about what is setting them.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

NGINX Ingress controller
  Release:       v1.8.4
  Build:         05adfe3ee56fab8e4aded7ae00eed6630e43b458
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

Note that this is still the current behavior in the latest commit of this repo, see this snippet:
https://github.com/kubernetes/ingress-nginx/blob/main/internal/ingress/controller/endpointslices.go#L144

Kubernetes version (use kubectl version):
1.27

Environment:
A multi-zone cluster, e.g.:

DC1
  NodeA
  NodeB

DC2
  NodeC
  NodeD

Then:

ingress-nginx-controller-1 NodeA
ingress-nginx-controller-2 NodeB
ingress-nginx-controller-3 NodeC
ingress-nginx-controller-4 NodeD

service-backend-pod-1 NodeD

Ingress controllers 3 and 4 (on NodeC and NodeD) populate their endpoint list with pod-1 and work.

Ingress controllers 1 and 2 (on NodeA and NodeB) do not populate their endpoint list, because pod-1 is marked as being in a different zone in the EndpointSlice.
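
To make the scenario above concrete, here is a minimal sketch that reproduces it outside a cluster. It is not the controller's actual code: the `endpoint` type, the `strictFilter` function, and the use of DC1/DC2 as zone names are simplified assumptions for illustration only.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified stand-in for an EndpointSlice entry.
type endpoint struct {
	name     string
	forZones []string // zones listed in the endpoint's topology hints
}

// strictFilter keeps only the endpoints hinted for the given zone.
// It models the "fail closed" behaviour described in this issue and is
// an assumption for illustration, not the controller's real function.
func strictFilter(eps []endpoint, zone string) []endpoint {
	var out []endpoint
	for _, ep := range eps {
		for _, z := range ep.forZones {
			if z == zone {
				out = append(out, ep)
				break
			}
		}
	}
	return out
}

func main() {
	// The single backend pod runs on NodeD, so it is hinted for DC2 only.
	backends := []endpoint{
		{name: "service-backend-pod-1", forZones: []string{"DC2"}},
	}

	controllers := []struct{ name, zone string }{
		{"ingress-nginx-controller-1", "DC1"},
		{"ingress-nginx-controller-2", "DC1"},
		{"ingress-nginx-controller-3", "DC2"},
		{"ingress-nginx-controller-4", "DC2"},
	}

	for _, c := range controllers {
		n := len(strictFilter(backends, c.zone))
		fmt.Printf("%s (%s): %d upstream endpoint(s)\n", c.name, c.zone, n)
	}
}
```

With this input, the two controllers in DC1 end up with zero upstream endpoints and the two in DC2 end up with one, which matches the split between working and failing controllers described above.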

Workaround:

Setting service-upstream and delegating the decision to kube-proxy makes this work, because kube-proxy handles this situation properly (it sends you to a random endpoint regardless of topology). It would be nice if ingress-nginx handled this itself, though, as there are lots of downsides to service-upstream, as I'm sure you folks know.

@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage, needs-kind, and needs-priority labels May 3, 2024
@longwuyuan
Contributor

The switch to endpointslices was a requirement AFAIK.

cc @tao12345666333

@LAMRobinson
Author

@longwuyuan - I may have misunderstood you, but I'm not suggesting that we stop using EndpointSlices. I'm asking for a change in how the Ingress Controller treats the endpoints in an EndpointSlice when:

  • All endpoints have their topology hints set
  • There is no endpoint running in the same zone as the controller

Right now the function returns an empty list of endpoints, so the controller returns a 503. I would like it to return the full list of endpoints instead, as a "fail open" style behaviour that matches how kube-proxy behaves.
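
A rough sketch of the fallback I have in mind is below. It is only an illustration under stated assumptions: `hintedEndpoint` and `selectEndpoints` are hypothetical names, the addresses are made up, and the real function in endpointslices.go works on Kubernetes API types rather than this simplified struct.

```go
package main

import "fmt"

// hintedEndpoint is a hypothetical, simplified view of an endpoint and the
// zones named in its topology hints.
type hintedEndpoint struct {
	Address  string
	ForZones []string
}

// selectEndpoints prefers endpoints hinted for the controller's zone, but
// falls back to the full list when none match ("fail open").
// In this sketch, an endpoint with no hint for the zone counts as non-local.
func selectEndpoints(eps []hintedEndpoint, zone string) []hintedEndpoint {
	local := make([]hintedEndpoint, 0, len(eps))
	for _, ep := range eps {
		for _, z := range ep.ForZones {
			if z == zone {
				local = append(local, ep)
				break
			}
		}
	}
	if len(local) == 0 {
		// No endpoint is hinted for this zone: return everything rather
		// than an empty list, so the controller can still route traffic.
		return eps
	}
	return local
}

func main() {
	// Hypothetical address for the single backend pod in DC2.
	eps := []hintedEndpoint{{Address: "10.0.1.4", ForZones: []string{"DC2"}}}

	for _, zone := range []string{"DC1", "DC2"} {
		fmt.Printf("controller in %s sees %d endpoint(s)\n",
			zone, len(selectEndpoints(eps, zone)))
	}
}
```

With the fallback in place, a controller in DC1 still gets the DC2 endpoint instead of an empty list, while a controller that does have same-zone endpoints keeps preferring them.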

@longwuyuan
Contributor

I assume kube-proxy returns what it does because that is kube-proxy's job. I am not sure whether an ingress controller can or should mimic kube-proxy's behaviour. There is no data on how this would affect the ingress controller's use of EndpointSlices.

Please wait for comments from others.


3 participants