docs: iteration on aws k8s upgrade docs #4099
```{warning}
This upgrade will cause disruptions for users and trigger alerts for
[](uptime-checks). To help other engineers, communicate that your are starting a
cluster upgrade in the `#maintenance-notices` Slack channel and setup a [snooze](uptime-checks:snoozes)
```
We shouldn't need a snooze, so I removed that. I moved the note about communicating in Slack to a dedicated step.
```{warning}
We haven't yet established a policy for planning and communicating maintenance
procedures to users. So preliminary, only make a k8s cluster upgrade while the
cluster is unused or that the maintenance is communicated ahead of time.
```
We still haven't, but as long as the disruption is minimal (brief networking interruptions, etc.), we can probably upgrade while clusters are active anyway.
I opted to remove this rather than refine it. I acknowledge we should have a policy, but I'd like us to iterate and reach agreement based on collective experience, rather than have me declare and motivate one in a PR about how to perform a k8s upgrade technically.
Thanks for reviewing @GeorgianaElena!! I rebased and added a commit with some adjustments after trying them out myself.
Our docs previously assumed there were no users on the user nodes, so that a re-creation upgrade strategy could be used. This iteration removes that assumption.
I removed the prerequisite section "Consider changes to `template.jsonnet`", which I now consider too far out of scope to suggest in docs as something to do during a k8s upgrade.
Node upgrade strategies, AWS auth, notes on version skew, and maybe something more.
Added a step to get an overview of what is going on in the cluster before upgrading. This can, for example, be used to rule out that something broke because of the upgrade (because it was already broken).
Upgrading multiple clusters in parallel is reasonable, and I've now made the guide easier to follow when running several upgrades in parallel.
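To illustrate why the pre-upgrade overview step is useful, here is a minimal sketch of comparing a snapshot taken before the upgrade with one taken after. It is not part of the docs in this PR. The function name `classify_unhealthy`, the workload names, and the idea of recording each workload's status (as shown in the STATUS column of `kubectl get pods`) in a plain dict are all assumptions made for illustration.

```python
# Hypothetical helper: separates problems that existed before the upgrade
# from problems that only show up afterwards.

# Statuses treated as healthy here (an assumption; adjust as needed).
HEALTHY_STATUSES = {"Running", "Completed"}


def classify_unhealthy(before, after):
    """Split unhealthy workloads in `after` into pre-existing and new.

    `before` and `after` map a workload name to its observed status.
    Returns a tuple (pre_existing, new) of sorted name lists.
    """
    pre_existing, new = [], []
    for name, status in sorted(after.items()):
        if status in HEALTHY_STATUSES:
            continue
        previous = before.get(name)
        if previous is not None and previous not in HEALTHY_STATUSES:
            # Already broken before the upgrade, so not caused by it.
            pre_existing.append(name)
        else:
            # Healthy (or absent) before, unhealthy now.
            new.append(name)
    return pre_existing, new


# Example snapshots with made-up workload names and statuses.
before = {"hub": "Running", "proxy": "Running", "user-scheduler": "Pending"}
after = {"hub": "Running", "proxy": "CrashLoopBackOff", "user-scheduler": "Pending"}

pre_existing, new = classify_unhealthy(before, after)
print("already broken before upgrade:", pre_existing)
print("possibly broken by upgrade:", new)
```

With these example snapshots, `user-scheduler` is reported as already broken and `proxy` as a possible regression, which is the kind of distinction the overview step is meant to make possible.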
Related
This was done in preparation for #4009. Getting this done wasn't within the scope of #4009, but without it, #4009 would be harder to handle.
Review
I think the time-efficient approach is to review this in practice: merge it, use it, and iterate on it further if issues come up.