docs: iteration on aws k8s upgrade docs #4099

Merged 4 commits on May 21, 2024
3 changes: 2 additions & 1 deletion docs/Makefile
@@ -14,8 +14,9 @@ help:

.PHONY: help Makefile

+# keep --ignore entries in sync with content in noxfile.py
live:
-	sphinx-autobuild --ignore "_build/*" --ignore "tmp/*" --ignore "_static/hub-*.json" -b dirhtml -n . _build/dirhtml
+	sphinx-autobuild --ignore "*/_build/*" --ignore "*/tmp/*" --ignore "*/*.json" --ignore "*/*.csv" -b dirhtml -n . _build/dirhtml

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
335 changes: 150 additions & 185 deletions docs/howto/upgrade-cluster/aws.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions docs/howto/upgrade-cluster/index.md
@@ -50,6 +50,7 @@ some in shared clusters have been announced ahead of time.
:maxdepth: 1
:caption: Upgrading Kubernetes clusters
upgrade-disruptions.md
+node-upgrade-strategies.md
k8s-version-skew.md
aws.md
```
25 changes: 25 additions & 0 deletions docs/howto/upgrade-cluster/node-upgrade-strategies.md
@@ -0,0 +1,25 @@
(upgrade-cluster:node-upgrade-strategies)=

# About strategies to upgrade nodes

## About rolling upgrades

To upgrade a node group's nodes, we typically do *rolling upgrades*. In a rolling
upgrade, something new is added to replace something old before the old one is
removed.

A rolling upgrade of node groups can be done _fast and forcefully_ or _slowly
and patiently_: either the pods running on a node group's nodes get forcefully
stopped, or they get to stop on their own.

*Managed* node groups can do fast and forceful rolling upgrades, while
*unmanaged* node groups need to be re-created to get upgraded k8s software
(`kubelet` etc.).

Core nodes' workloads can typically be relocated forcefully, while user nodes'
workloads should be given time to stop on their own.

## About re-creation upgrades

With unmanaged node groups like on EKS, if disruption isn't a concern or if
there isn't anything running to disrupt, node groups can be deleted and
re-created to save time.
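
The guidance above can be condensed into a small decision helper. This is a minimal sketch and not tooling from this repository: the `NodeGroup` fields, the `Strategy` names, and the `disruption_ok` flag are all hypothetical, chosen only to mirror the wording of this page.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    ROLLING_FAST = "rolling upgrade, fast and forceful"
    ROLLING_SLOW = "rolling upgrade, slow and patient"
    RECREATE = "delete and re-create"


@dataclass(frozen=True)
class NodeGroup:
    name: str
    role: str                    # hypothetical label: "core" or "user"
    managed: bool                # managed vs unmanaged node group
    has_running_workloads: bool  # anything running that could be disrupted?


def pick_strategy(ng: NodeGroup, disruption_ok: bool = False) -> Strategy:
    """Pick an upgrade strategy following the rules of thumb on this page."""
    # Unmanaged groups can simply be deleted and re-created when
    # disruption isn't a concern or nothing is running on them.
    if not ng.managed and (disruption_ok or not ng.has_running_workloads):
        return Strategy.RECREATE
    # Core workloads are typically fine to relocate forcefully.
    if ng.role == "core":
        return Strategy.ROLLING_FAST
    # User workloads should get time to stop on their own.
    return Strategy.ROLLING_SLOW
```

For example, `pick_strategy(NodeGroup("nb-a", "user", False, True))` picks the slow and patient rolling upgrade, since user pods are still running on the group.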
9 changes: 7 additions & 2 deletions docs/hub-deployment-guide/new-cluster/new-cluster.md
@@ -35,8 +35,13 @@ This guide will assume you have already followed the guidance in [](/topic/infra

Mac users with homebrew can run `brew install eksctl`.

-Verify install and version with `eksctl version`. You typically need a very
-recent version of this CLI.
+Verify install and version with `eksctl version`. You should have *the latest
+version* of this CLI.
+
+```{important}
+Without the latest version, you may install an outdated version of `aws-node`
+because [it's hardcoded](https://github.com/eksctl-io/eksctl/pull/7756).
+```

4. Install [`jsonnet`](https://github.com/google/jsonnet)

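The `eksctl version` check in the hunk above could be made less manual with a small comparison helper. This is a sketch under assumptions: it assumes the CLI output contains a `major.minor.patch` number, the function names are hypothetical, and the version to compare against is one you look up yourself on the eksctl releases page, since the latest version can't be determined offline.

```python
import re
import subprocess


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from text such as '0.176.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_at_least(installed: str, required: str) -> bool:
    """Compare versions numerically, so that 0.10.0 counts as newer than 0.9.0."""
    return parse_version(installed) >= parse_version(required)


def installed_eksctl_version() -> str:
    """Run `eksctl version` and return its raw output (needs eksctl on PATH)."""
    result = subprocess.run(
        ["eksctl", "version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

A call such as `is_at_least(installed_eksctl_version(), required)` would then flag a stale install, where `required` is the release number you looked up.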
2 changes: 1 addition & 1 deletion eksctl/2i2c-aws-us.jsonnet
@@ -86,7 +86,7 @@ local daskNodes = [
nodeGroups: [
ng + {
namePrefix: 'core',
-nameSuffix: 'b',
+nameSuffix: 'a',
nameIncludeInstanceType: false,
availabilityZones: [nodeAz],
ssh: {
5 changes: 4 additions & 1 deletion noxfile.py
@@ -26,13 +26,16 @@ def docs(session):
session.posargs.pop(session.posargs.index("live"))

# Add folders to ignore
+# keep this in sync with Makefile
AUTOBUILD_IGNORE_DIRS = [
"_build",
"tmp",
]
# Add files to ignore
+# keep this in sync with Makefile
AUTOBUILD_IGNORE_FILES = [
-"_static/*.json",
+"*.json",
+"*.csv",
]

cmd = ["sphinx-autobuild"]
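To illustrate what "in sync with Makefile" means for the noxfile.py hunk above, here is a minimal sketch of how the two lists could expand into the same `--ignore` flags the Makefile passes. The list names and values come from the diff; the `build_autobuild_cmd` helper, its loops, and the `*/` glob prefix are assumptions chosen to reproduce the Makefile line, not the actual contents of noxfile.py.

```python
# Values as they appear after this change in noxfile.py.
AUTOBUILD_IGNORE_DIRS = ["_build", "tmp"]
AUTOBUILD_IGNORE_FILES = ["*.json", "*.csv"]


def build_autobuild_cmd(ignore_dirs, ignore_files):
    """Return a sphinx-autobuild argument list with one --ignore per pattern."""
    cmd = ["sphinx-autobuild"]
    # Assumed expansion: each folder becomes a "*/<folder>/*" glob.
    for folder in ignore_dirs:
        cmd.extend(["--ignore", f"*/{folder}/*"])
    # Assumed expansion: each file pattern gets a "*/" prefix.
    for pattern in ignore_files:
        cmd.extend(["--ignore", f"*/{pattern}"])
    # Builder, nitpicky mode, source dir, and output dir as in the Makefile.
    cmd.extend(["-b", "dirhtml", "-n", ".", "_build/dirhtml"])
    return cmd


if __name__ == "__main__":
    print(" ".join(build_autobuild_cmd(AUTOBUILD_IGNORE_DIRS, AUTOBUILD_IGNORE_FILES)))
```

Comparing this list against the flags in the Makefile's `live` target is one way to notice when the two files drift apart.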
2 changes: 1 addition & 1 deletion terraform/gcp/projects/qcl.tfvars
@@ -26,7 +26,7 @@ user_buckets = {
}

notebook_nodes = {
-# FIXME: tainted, to be deleted when empty, replaced by k8s upgraded variant
+# FIXME: tainted, to be deleted when empty, replaced by equivalent during k8s upgrade
"n2-highmem-4" : {
min : 0,
max : 100,