-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
helm-chart: prevent Traefik from ignoring the backend ingress rule #7859
Conversation
WalkthroughThe update enhances the Helm chart by introducing a fix to prevent premature 200 OK responses from API endpoints before backend readiness. It increments the chart version from Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There is a condition that may occur during Kubernetes deployment, where the frontend service already has an endpoint (i.e. the frontend pod), but the backend service does not. For example, the backend pod may not have started yet or the service controller may not have had time to react to the backend pod. In this case, when Traefik serves a request with an `/api/...` path, it will see that it matches the `/api` rule, but since the corresponding service has no endpoints, it will _skip_ that rule and try other rules. And since the `/` rule matches everything, it will then route the request to the frontend. This is confusing and unhelpful, and more importantly, it makes health checks return the wrong result. Since the frontend will serve `index.html` to every request, a request to `/api/server/health/` or `/api/server/about` will return a 200 code, even though the server isn't actually up. Because of this bug, I have observed weird failures in the Helm workflow, where the "Wait for CVAT to be ready" step completes, but CVAT is not actually ready. (FYI: The failures I've seen are actually in a private repo, but the failure condition could occur in this repo too. It's just more likely in a private repo, because GitHub uses smaller runners in private repos.) The fix is simple: use the `allowEmptyServices` Traefik setting, which disables the rule skipping behavior. With this setting on, Traefik will return a 503 response for backend URLs until the backend service gains an endpoint.
8231ab2
to
17c8d7b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- changelog.d/20240507_140615_roman_allow_empty_services.md (1 hunks)
- helm-chart/Chart.yaml (1 hunks)
- helm-chart/values.yaml (1 hunks)
Files skipped from review due to trivial changes (3)
- changelog.d/20240507_140615_roman_allow_empty_services.md
- helm-chart/Chart.yaml
- helm-chart/values.yaml
Motivation and context
There is a condition that may occur during Kubernetes deployment, where the frontend service already has an endpoint (i.e. the frontend pod), but the backend service does not. For example, the backend pod may not have started yet or the service controller may not have had time to react to the backend pod.
In this case, when Traefik serves a request with an
/api/...
path, it will see that it matches the/api
rule, but since the corresponding service has no endpoints, it will skip that rule and try other rules. And since the/
rule matches everything, it will then route the request to the frontend.This is confusing and unhelpful, and more importantly, it makes health checks return the wrong result. Since the frontend will serve
index.html
to every request, a request to/api/server/health/
or/api/server/about
will return a 200 code, even though the server isn't actually up.Because of this bug, I have observed weird failures in the Helm workflow, where the "Wait for CVAT to be ready" step completes, but CVAT is not actually ready. (FYI: The failures I've seen are actually in a private repo, but the failure condition could occur in this repo too. It's just more likely in a private repo, because GitHub uses smaller runners in private repos.)
The fix is simple: use the
allowEmptyServices
Traefik setting, which disables the rule skipping behavior. With this setting on, Traefik will return a 503 response for backend URLs until the backend service gains an endpoint.How has this been tested?
I deployed the Helm chart, then ran a
kubectl delete deployments.apps cvat-backend-server
to simulate the server being unavailable. Then I curled the/api/server/health/
endpoint.Checklist
develop
branch[ ] I have updated the documentation accordingly[ ] I have added tests to cover my changes[ ] I have linked related issues (see GitHub docs)[ ] I have increased versions of npm packages if it is necessary(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
Bug Fixes
New Features
Documentation
0.12.0
to0.12.1
.