Commit 16af978

Ensure that users do not enable auto-upgrades in K8s guides (#1185)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent 7d6392a commit 16af978

4 files changed: +40 -41 lines changed

modules/deploy/pages/deployment-option/self-hosted/kubernetes/aks-guide.adoc

Lines changed: 10 additions & 9 deletions
@@ -11,18 +11,16 @@ Deploy a secure Redpanda cluster and Redpanda Console in Azure Kubernetes Servic
 
 == Prerequisites
 
-Before you begin, you must have the following:
-
-* You must satisfy the prerequisites listed in the https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-cli#prerequisites[AKS quickstart^]
+* Satisfy the prerequisites listed in the https://learn.microsoft.com/en-us/azure/aks/learn/quick-kubernetes-deploy-cli#prerequisites[AKS quickstart^]
 to get access to the Azure CLI.
-* https://kubernetes.io/docs/tasks/tools/[`kubectl`^]. Minimum required Kubernetes version: {supported-kubernetes-version}.
+* Install https://kubernetes.io/docs/tasks/tools/[`kubectl`^]. Minimum required Kubernetes version: {supported-kubernetes-version}.
 +
 [,bash]
 ----
 kubectl version --short --client
 ----
 
-* https://helm.sh/docs/intro/install/[Helm^]. Minimum required Helm version: {supported-helm-version}
+* Install https://helm.sh/docs/intro/install/[Helm^]. Minimum required Helm version: {supported-helm-version}
 +
 [,bash]
 ----
@@ -38,8 +36,6 @@ In this step, you create an AKS cluster with three nodes on https://learn.micros
 - 2 cores per worker node, which is a requirement for production.
 - Local NVMe disks, which is recommended for best performance.
 
-NOTE: The Helm chart configures default `podAntiAffinity` rules to make sure that only one Pod running a Redpanda broker is scheduled on each worker node. To learn why, see xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#number-of-workers[Number of workers].
-
 . Create a resource group for Redpanda:
 +
 [,bash]
@@ -56,10 +52,15 @@ az aks create -g redpandaResourceGroup -n <cluster-name> \
 --generate-ssh-keys \
 --enable-node-public-ip \
 --node-vm-size Standard_L8s_v3 \
---disable-file-driver
+--disable-file-driver \
+--node-os-upgrade-channel None <1>
 ----
 +
-TIP: For all available options, see the https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-create[AKS documentation^].
+<1> Set the https://learn.microsoft.com/en-us/azure/aks/auto-upgrade-node-os-image[OS upgrade channel^] to `None` to prevent AKS from automatically rebooting or upgrading nodes.
++
+For more details, see the xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#node-updates[requirements and recommendations] for deploying Redpanda in Kubernetes.
+
+For all available options, see the https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-create[AKS documentation^].
 
 include::deploy:partial$kubernetes/guides/create-storageclass.adoc[leveloffset=+2]
 

modules/deploy/pages/deployment-option/self-hosted/kubernetes/eks-guide.adoc

Lines changed: 11 additions & 7 deletions
@@ -13,7 +13,7 @@ Then, use `rpk` both as an internal client and an external client to interact wi
 
 == Prerequisites
 
-Before you begin, you must have the following prerequisites.
+Before you begin, you must meet the following prerequisites.
 
 === IAM user
 

@@ -252,8 +252,6 @@ In this step, you create an EKS cluster with three nodes on https://aws.amazon.c
 - 2 cores per worker node, which is a requirement for production.
 - Local NVMe disks, which is recommended for best performance.
 
-NOTE: The Helm chart configures default `podAntiAffinity` rules to make sure that only one Pod running a Redpanda broker is scheduled on each worker node. To learn why, see xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#number-of-workers[Number of workers].
-
 . Create an EKS cluster and give it a unique name. If your account is configured with OIDC, add the `--with-oidc` flag to the `create cluster` command.
 +
 [,bash,lines=4-6]
@@ -266,16 +264,22 @@ eksctl create cluster \
 --external-dns-access
 ----
 +
-[TIP]
+[IMPORTANT]
 ====
-To see all options:
+Do not enable https://docs.aws.amazon.com/eks/latest/userguide/automode.html[auto mode^] (`--enable-auto-mode`) on Amazon EKS clusters running Redpanda.
+
+Auto mode can trigger automatic reboots or node upgrades that disrupt Redpanda brokers, risking data loss or cluster instability. Redpanda requires manual control over node lifecycle events.
 
+For more details, see the xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#node-updates[requirements and recommendations] for deploying Redpanda in Kubernetes.
+====
++
+To see all options:
++
 ```bash
 eksctl create cluster --help
 ```
-
++
 Or, for help creating an EKS cluster, see the https://eksctl.io/usage/creating-and-managing-clusters/[Creating and managing clusters^] in the `eksctl` documentation.
-====
 
 . Make sure that your local `kubeconfig` file points to your EKS cluster:
 +

modules/deploy/pages/deployment-option/self-hosted/kubernetes/gke-guide.adoc

Lines changed: 10 additions & 8 deletions
@@ -11,17 +11,15 @@ Deploy a secure Redpanda cluster and Redpanda Console in Google Kubernetes Engin
 
 == Prerequisites
 
-Before you begin, you must have the following:
-
 * Complete the 'Before you begin' steps and the 'Launch Cloud Shell' steps of the https://cloud.google.com/kubernetes-engine/docs/deploy-app-cluster#before-you-begin[GKE quickstart^]. Cloud Shell comes preinstalled with the Google Cloud CLI, the `kubectl` command-line tool, and the Helm package manager.
-* https://kubernetes.io/docs/tasks/tools/[`kubectl`^]. Minimum required Kubernetes version: {supported-kubernetes-version}.
+* Ensure https://kubernetes.io/docs/tasks/tools/[`kubectl`^] is installed. Minimum required Kubernetes version: {supported-kubernetes-version}.
 +
 [,bash]
 ----
 kubectl version --short --client
 ----
 
-* https://helm.sh/docs/intro/install/[Helm^]. Minimum required Helm version: {supported-helm-version}
+* Ensure https://helm.sh/docs/intro/install/[Helm^] is installed. Minimum required Helm version: {supported-helm-version}
 +
 [,bash]
 ----
@@ -37,8 +35,6 @@ In this step, you create a GKE cluster with three nodes on https://cloud.google.
 - 2 cores per worker node, which is a requirement for production.
 - Local NVMe disks, which is recommended for best performance.
 
-NOTE: The Helm chart configures default `podAntiAffinity` rules to make sure that only one Pod running a Redpanda broker is scheduled on each worker node. To learn why, see xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#number-of-workers[Number of workers].
-
 Create a GKE cluster. Replace the `<region>` placeholder with your own region.
 
 [,bash]
@@ -50,12 +46,18 @@ gcloud container clusters create <cluster-name> \
 --region=<region>
 ----
 
-[TIP]
+[IMPORTANT]
 ====
+Do not enable https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades[node auto-upgrades^] (`--enable-autoupgrade`) on Google GKE clusters running Redpanda.
+
+Node auto-upgrades can trigger automatic reboots or node upgrades that disrupt Redpanda brokers, risking data loss or cluster instability. Redpanda requires manual control over node lifecycle events.
+
+For more details, see the xref:deploy:deployment-option/self-hosted/kubernetes/k-requirements.adoc#node-updates[requirements and recommendations] for deploying Redpanda in Kubernetes.
+====
+
 To see all options that you can specify when creating a cluster, see the https://cloud.google.com/sdk/gcloud/reference/container/clusters/create[Cloud SDK reference^].
 
 Or, for help creating a GKE cluster, see the https://cloud.google.com/kubernetes-engine/docs/deploy-app-cluster#create_cluster[GKE documentation^].
-====
 
 include::deploy:partial$kubernetes/guides/create-storageclass.adoc[leveloffset=+2]
 

modules/deploy/partials/requirements.adoc

Lines changed: 9 additions & 17 deletions
@@ -52,35 +52,27 @@ ifndef::env-kubernetes[]
 endif::[]
 
 [[node-updates]]
-== Node maintenance and operating system upgrades
+== Prevent automatic node upgrades
 
 Ensure that node and operating system (OS) upgrades are manually managed when running Redpanda in production. Manual control avoids unplanned reboots or replacements that disrupt Redpanda brokers, causing service downtime, data loss, or quorum instability.
 
-=== Limitations of automatic updates
+Common issues with automatic node upgrades include:
 
-Redpanda is stateful. Redpanda brokers manage partition data and leadership, making them sensitive to disruptions. Proper handling during maintenance is required to:
-
-- Avoid data loss, especially for nodes with ephemeral or local storage.
-- Ensure smooth leadership transitions by decommissioning brokers before removing a node.
-- Minimize service downtime by upgrading nodes one at a time during planned maintenance windows.
-
-However, automatic update mechanisms provided by cloud platforms may not meet Redpanda's stateful requirements. Common issues include:
-
-- Hard timeouts for graceful shutdowns that may not allow Redpanda brokers enough time to complete decommissioning or leadership transitions.
+- Hard timeouts for graceful shutdowns that do not allow Redpanda brokers enough time to complete decommissioning or leadership transitions.
 - Replacements or reboots without ensuring data has been safely migrated or replicated, risking data loss.
 - Parallel upgrades across multiple nodes, which can disrupt quorum or reduce cluster availability.
 
-*Recommendations*:
+*Requirements*:
 
 - Disable automatic node maintenance or upgrades.
 ifdef::env-kubernetes[]
 To prevent managed Kubernetes services from automatically rebooting or upgrading nodes:
-** **Azure AKS**: Set the OS upgrade channel to `None`. https://learn.microsoft.com/en-us/azure/aks/auto-upgrade-node-os-image[Azure Documentation^].
-** **Google GKE**: Disable GKE auto-upgrades for node pools. https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades[GCP Documentation^].
-** **Amazon EKS**: Avoid enabling EKS node auto-upgrades. https://docs.aws.amazon.com/eks/latest/userguide/worker.html[AWS Documentation^].
-- xref:upgrade:k-upgrade-kubernetes.adoc[Manually manage node upgrades].
-endif::[]
+** **Azure AKS**: https://learn.microsoft.com/en-us/azure/aks/auto-upgrade-node-os-image[Set the OS upgrade channel to `None`^].
+** **Google GKE**: https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-upgrades[Disable GKE auto-upgrades for node pools^].
+** **Amazon EKS**: https://docs.aws.amazon.com/eks/latest/userguide/automode.html[Disable EKS node auto-upgrades^].
 
+See also: xref:upgrade:k-upgrade-kubernetes.adoc[How to manually manage node upgrades].
+endif::[]
 
 == CPU and memory
 

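Reviewer note: one way to keep this change from regressing is to scan the guides for the two flags the new admonitions warn against (`--enable-autoupgrade` and `--enable-auto-mode`). The following is a minimal sketch and not part of the commit. The function name `find_auto_upgrade_flags` is hypothetical, and the pattern assumes that mentions inside warnings are wrapped in backticks, as they are in the admonitions in this diff, so that only bare uses in commands are reported.

```shell
# Hypothetical helper (not part of this commit): print every line that uses
# one of the flags this commit warns against as part of a command.
# A flag directly preceded by a backtick is treated as inline code inside a
# warning and is skipped. grep exits non-zero when nothing is found.
find_auto_upgrade_flags() {
  grep -nE '(^|[^`])--enable-(autoupgrade|auto-mode)' "$@"
}
```

For example, `find_auto_upgrade_flags modules/deploy/pages/deployment-option/self-hosted/kubernetes/*.adoc` would list `file:line:text` for each match across the guides touched here, and print nothing if no command enables either flag.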