✨ Cancel instance refresh on any relevant change to ASG instead of blocking until previous one is finished (which may have led to failing nodes due to outdated join token) #5173

Open
wants to merge 1 commit into
base: s3-user-data

@AndiDog (Contributor) commented Oct 22, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

Changing any relevant spec.* field of an AWSMachinePool triggers a rolling replacement of nodes via an ASG instance refresh. If another change follows shortly afterwards, it currently has to wait until the first rollout has finished and then triggers a second instance refresh. Rolling all worker nodes twice in that case is neither necessary nor desired, and it is much slower. Instead, this PR cancels the pending instance refresh, waits until a new one can be started, and then applies the latest change as soon as possible with a second instance refresh.
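
For readers who want the intended behavior spelled out, here is a minimal, language-neutral sketch of the cancel, wait, restart decision in Python. It is not the code from this PR: `FakeASG`, `reconcile_refresh`, and the version strings are hypothetical names made up for illustration, and the status strings are assumptions modeled on ASG instance refresh states.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Statuses treated as "a refresh is still running" (assumed names).
ACTIVE = {"Pending", "InProgress"}


@dataclass
class FakeASG:
    """In-memory stand-in for an Auto Scaling group (hypothetical, no AWS calls)."""
    status: Optional[str] = None          # status of the most recent instance refresh
    target_version: Optional[str] = None  # spec version that refresh was started for
    calls: List[str] = field(default_factory=list)

    def cancel_refresh(self) -> None:
        self.calls.append("cancel")
        self.status = "Cancelling"

    def start_refresh(self, version: str) -> None:
        self.calls.append("start")
        self.status = "Pending"
        self.target_version = version


def reconcile_refresh(asg: FakeASG, desired_version: str) -> str:
    """One reconcile pass; returns the action taken."""
    if asg.target_version == desired_version:
        return "none"    # latest change is already being rolled out (or done)
    if asg.status in ACTIVE:
        asg.cancel_refresh()
        return "cancel"  # outdated refresh: cancel it instead of waiting for it
    if asg.status == "Cancelling":
        return "wait"    # a new refresh cannot be started yet; retry later
    asg.start_refresh(desired_version)
    return "start"       # roll nodes once, directly to the latest change
```

Called repeatedly, as a controller would on requeue, the sketch cancels the outdated refresh once, waits while the cancellation completes, and then starts a single refresh for the newest change, so the nodes are not rolled twice.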

This change has been running fine in Giant Swarm's CAPA fork for three months at the time of opening this PR.

Special notes for your reviewer:

This PR stacks on top of #5148, so please review and merge that PR first. Once that is done, this PR can be retargeted to main. I kept these otherwise independent changes stacked because separating them would mean dealing with merge conflicts.

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Cancel instance refresh on any relevant change to ASG instead of blocking until previous one is finished

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 22, 2024
@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from andidog. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-priority size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 22, 2024
@richardcase (Member) commented

For reviewers: this PR now stacks on #5172, not #5148.
