karpenter.azure.com/zone requirement causes continuous drift #205

Open
tallaxes opened this issue Mar 15, 2024 · 0 comments
Labels
area/availability-zones Issues or PRs related to availability zones area/drift Issues or PRs related to Drift area/nodeclaim Issues or PRs related to NodeClaim lifecycle management kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.


tallaxes commented Mar 15, 2024

Version

Karpenter Version: 99d1bb0 (current main)

Kubernetes Version: v1.27.9

Expected Behavior

One should be able to specify requirements with a karpenter.azure.com/zone constraint (for example, to provision nodes only in a specific zone) without adverse effects.

Actual Behavior

Specifying any kind of karpenter.azure.com/zone constraint in a NodePool currently triggers continuous drift.

Here is what I think is going on. Right now, we cannot (and do not) record this requirement/constraint as a label on the NodeClaim. This is because Karpenter will try to apply all of these as labels to the Node object, and topology.kubernetes.io/zone is a protected label in AKS. (It will be applied to a new Node correctly, but by a different component.) So for now, as a workaround, we use an alternative label, karpenter.azure.com/zone. I suspect that it is this discrepancy that causes Karpenter to detect Requirements Drift: based on the NodePool, the NodeClaim is expected to have the zone label, and it does not, so it is considered out of spec and gets replaced. I also suspect that, while we do have E2E tests in this area, they likely only test that the node gets provisioned and don't notice the subsequent drift.
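
To make the suspected loop concrete, here is a toy model in Python. It is not Karpenter's actual drift code; it only assumes that a requirements check compares NodePool requirements against NodeClaim labels, and that a drifted NodeClaim is replaced by one provisioned the same way. The names `Requirement`, `NodeClaim`, `unmet_requirements`, and `count_replacements` are hypothetical, and `example-zone-1` is a placeholder zone value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

ZONE_KEY = "karpenter.azure.com/zone"  # label key named in this issue
EXAMPLE_ZONE = "example-zone-1"        # placeholder value, not a real zone


@dataclass(frozen=True)
class Requirement:
    """Simplified 'In' requirement: the label must be one of `values`."""
    key: str
    values: Tuple[str, ...]


@dataclass
class NodeClaim:
    """Simplified NodeClaim: only the name and recorded labels are modeled."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)


def unmet_requirements(requirements: List[Requirement], claim: NodeClaim) -> List[Requirement]:
    """Return the requirements that the claim's recorded labels do not satisfy."""
    return [r for r in requirements if claim.labels.get(r.key) not in r.values]


def count_replacements(requirements: List[Requirement],
                       provision: Callable[[int], NodeClaim],
                       rounds: int) -> int:
    """Run `rounds` drift checks, replacing the claim whenever it looks drifted."""
    claim = provision(0)
    replacements = 0
    for generation in range(1, rounds + 1):
        if unmet_requirements(requirements, claim):
            # The replacement is provisioned the same way, so it inherits the gap.
            claim = provision(generation)
            replacements += 1
    return replacements


def provision_without_zone_label(generation: int) -> NodeClaim:
    """Models the suspected current behavior: the zone label is never recorded."""
    return NodeClaim(name=f"claim-{generation}")


def provision_with_zone_label(generation: int) -> NodeClaim:
    """Models the expected behavior: the zone label is recorded on the claim."""
    return NodeClaim(name=f"claim-{generation}", labels={ZONE_KEY: EXAMPLE_ZONE})


pool_requirements = [Requirement(ZONE_KEY, (EXAMPLE_ZONE,))]
print(count_replacements(pool_requirements, provision_without_zone_label, rounds=3))
print(count_replacements(pool_requirements, provision_with_zone_label, rounds=3))
```

In this model the unlabeled claim is replaced on every check, while the labeled one is never replaced. That matches the continuous drift described above and would also explain why a provision-only E2E test would pass.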

Steps to Reproduce the Problem

Use a NodePool with any kind of karpenter.azure.com/zone requirement.

Resource Specs and Logs

Continuous drift observed.

Workaround

Specify zone-based constraints (including topologySpreadConstraints, if needed) on the workload rather than on the NodePool.
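
As an illustration of this workaround, the sketch below builds a Deployment manifest that carries the zone constraints itself, using only standard Kubernetes fields (`nodeAffinity` and `topologySpreadConstraints`). It is a minimal sketch under assumptions: the helper `zonal_deployment` is hypothetical, the zone values and image are placeholders, and the default zone key `topology.kubernetes.io/zone` is assumed rather than confirmed by this issue, so the key is left as a parameter.

```python
import json
from typing import Dict, List


def zonal_deployment(name: str, image: str, zones: List[str],
                     zone_key: str = "topology.kubernetes.io/zone",
                     replicas: int = 3) -> Dict:
    """Build a Deployment whose pods are restricted to, and spread across, `zones`."""
    labels = {"app": name}
    pod_spec = {
        # Restrict scheduling to the listed zones.
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [{
                            "key": zone_key,
                            "operator": "In",
                            "values": list(zones),
                        }]
                    }]
                }
            }
        },
        # Spread replicas evenly across zones.
        "topologySpreadConstraints": [{
            "maxSkew": 1,
            "topologyKey": zone_key,
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": {"matchLabels": labels},
        }],
        "containers": [{"name": name, "image": image}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {"metadata": {"labels": labels}, "spec": pod_spec},
        },
    }


# Placeholder zones and image; substitute real values for your cluster.
manifest = zonal_deployment("demo", "registry.example/demo:1.0",
                            ["example-zone-1", "example-zone-2"])
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML. With the constraint on the workload, the NodePool needs no zone requirement.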

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@tallaxes tallaxes added kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on. area/availability-zones Issues or PRs related to availability zones area/nodeclaim Issues or PRs related to NodeClaim lifecycle management area/drift Issues or PRs related to Drift labels Mar 15, 2024
@tallaxes tallaxes changed the title karpenter.azure.com/zone requirement causes drift karpenter.azure.com/zone requirement causes continuous drift Mar 15, 2024