Bare Metal Provisioning with High Availability

This section provisions a fresh Kubernetes cluster using kubeadm. These steps should work on both Debian- and RedHat-based distros. This configuration assumes you already have an HA API endpoint configured, as discussed in the previous section.

Optional

  1. Enable NTP: systemctl enable --now systemd-timesyncd
  2. Disable IPv6 if you're not using it. It just makes things easier to troubleshoot, in my opinion.
    Append the following lines to /etc/sysctl.conf
     net.ipv6.conf.all.disable_ipv6 = 1
     net.ipv6.conf.default.disable_ipv6 = 1
    
  3. Enable Secure Boot
    1. Verify secure boot is enabled: mokutil --sb-state
    2. Verify the kernel lockdown mode is integrity: cat /sys/kernel/security/lockdown
  4. Set journald max size with sed -i 's/#SystemMaxUse=/SystemMaxUse=1G/' /etc/systemd/journald.conf
  5. Network redundancy (note that bond-mode 4 is LACP and requires switch configuration as well; a verification command follows the config below)
    # apt install ifenslave
    # cat /etc/network/interfaces
    
    auto eno1
    iface eno1 inet manual
         bond-master bond0
         bond-mode 4
    auto eno2
    iface eno2 inet manual
         bond-master bond0
         bond-mode 4
    
    auto bond0
    iface bond0 inet static
         bond-slaves eno1 eno2
         bond-mode 4
         address ADDRESS/MASK
         gateway GATEWAY
         dns-nameservers DNS_SERVERS
         dns-search DNS_SEARCH_DOMAINS
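
     Once the bond is up, its negotiated mode and the state of each slave can be checked through the bonding driver's procfs interface:
     # verify the bond negotiated 802.3ad (LACP) and that both slaves are up
     cat /proc/net/bonding/bond0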
    

Prerequisites

  1. Disable swap
    1. Remove swap references from /etc/fstab
    2. Reboot, or deactivate the active swap with swapoff -a (see the sketch after this list)
  2. Install iptables/nftables and enable it to start on boot
    1. systemctl enable nftables --now
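
As a sketch, the swap change can be made in one pass, assuming the /etc/fstab swap entries contain the word swap:

    # comment out swap entries in fstab and disable any active swap
    sed -i '/\sswap\s/ s/^/#/' /etc/fstab
    swapoff -a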

Install a container runtime

With the dockershim now deprecated, containerd is a good CRI choice.

  1. Add the Docker repo (provides the containerd packages).
  2. Install containerd.io and enable it to start on boot.
  3. Cgroups Config:
    1. Kubernetes Cgroup Driver: Since kubeadm 1.22, the kubelet's cgroup driver defaults to systemd, but we'll specify it explicitly in provision.yaml as well.
    2. Systemd Cgroup Version: As of Debian 11, systemd defaults to using control groups v2.
    3. Containerd Cgroup Version: The default runtime type is io.containerd.runc.v2 (the v2 runc shim), which supports cgroups v2.
    4. Containerd Cgroup Driver: Set containerd to use the SystemdCgroup driver (a one-liner for this follows the verification output below).
      containerd config default > /etc/containerd/config.toml
      
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
        SystemdCgroup = true
      
      Starting with containerd 1.5, the cgroup driver can be verified as follows (a bug in versions < 1.5 produces the wrong output). The crictl command becomes available after installing the Kubernetes packages below.
      # crictl -r unix:///run/containerd/containerd.sock info | grep runtimes -A 29
      "runtimes": {
        "runc": {
          "runtimeType": "io.containerd.runc.v2",
          "runtimePath": "",
          "runtimeEngine": "",
          "PodAnnotations": [],
          "ContainerAnnotations": [],
          "runtimeRoot": "",
          "options": {
            "BinaryName": "",
            "CriuImagePath": "",
            "CriuPath": "",
            "CriuWorkPath": "",
            "IoGid": 0,
            "IoUid": 0,
            "NoNewKeyring": false,
            "NoPivotRoot": false,
            "Root": "",
            "ShimCgroup": "",
            "SystemdCgroup": true
          },
          "privileged_without_host_devices": false,
          "privileged_without_host_devices_all_devices_allowed": false,
          "baseRuntimeSpec": "",
          "cniConfDir": "",
          "cniMaxConfNum": 0,
          "snapshotter": "",
          "sandboxMode": "podsandbox"
        }
      },
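
      Rather than editing config.toml by hand, the flag can also be flipped with a one-liner after generating the default config, then restarting containerd (a sketch, assuming the generated default contains SystemdCgroup = false):

      sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
      systemctl restart containerd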
      

Make sure the required modules load on boot

Add the following modules to a conf file in /etc/modules-load.d. Ex: /etc/modules-load.d/k8.conf

overlay
br_netfilter
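
Files in /etc/modules-load.d are only processed at boot, so load the modules immediately as well:

    modprobe overlay
    modprobe br_netfilter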

Set sysctl parameters

Add the following parameters to a conf file in /etc/sysctl.d. Ex: /etc/sysctl.d/99-k8s.conf

net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1

Load the new parameters with sysctl --system

Install Kubernetes packages

Add the Kubernetes repo and install the following packages (an example install follows the list).

  1. kubelet: The component that runs on all of the nodes in the cluster and does things like starting pods and containers.
  2. kubeadm: The command to bootstrap the cluster.
  3. kubectl: The command-line utility to talk to your cluster.
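
On a Debian-based system, the install might look like this sketch (repo setup omitted); holding the packages prevents them from being upgraded outside of a planned cluster upgrade:

    apt-get install -y kubelet kubeadm kubectl
    apt-mark hold kubelet kubeadm kubectl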

Clone the VM

  1. Remove SSH keys.
    1. rm /etc/ssh/ssh_host*
  2. Shut down the system and clone. In vSphere environments, this VM could be converted into a template. The same image can be used for both the controllers and workers.
  3. On Debian distros, you'll need to regenerate SSH keys manually
    1. dpkg-reconfigure openssh-server
  4. Change the IPs and hostnames of the clones
  5. Verify /sys/class/dmi/id/product_uuid is unique on every host (the per-clone steps are consolidated in the sketch below)
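
Put together, the clone prep might look like the following sketch (node-XX is a placeholder hostname):

    # on the template, before cloning
    rm /etc/ssh/ssh_host*
    # on each clone, after first boot (Debian)
    dpkg-reconfigure openssh-server
    hostnamectl set-hostname node-XX
    cat /sys/class/dmi/id/product_uuid   # must be unique per host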

Initialize Kubernetes cluster
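
The init step below consumes a provision.yaml. A minimal sketch might look like the following, where cluster.example.com:6443 is a placeholder for the HA API endpoint from the previous section:

    apiVersion: kubeadm.k8s.io/v1beta2   # v1beta3 on newer kubeadm releases
    kind: ClusterConfiguration
    controlPlaneEndpoint: "cluster.example.com:6443"
    ---
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    cgroupDriver: systemd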

  1. Provision the cluster on a to-be master
    1. kubeadm init --config provision.yaml --upload-certs
    2. Copy the kubeconfig to the correct user account
    3. Install a network addon, paying attention to Network Policy support. Calico is a good option:
      1. Install the Operator: kubectl create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
      2. Download the custom resources: curl https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml -O
      3. Customize if necessary
      4. Create the manifest: kubectl create -f custom-resources.yaml
      5. Install calicoctl
      6. When it comes time to upgrade Calico, see the upgrade instructions in the Calico documentation.
    4. Approve the kubelet CSRs for the new nodes.
      1. kubectl get csr
      2. kubectl certificate approve <name>
    5. kubectl get nodes should now show the new master node as Ready
      NAME      STATUS   ROLES                  AGE   VERSION
      node-01   Ready    control-plane,master   1h   v1.20.5
      
  2. Join the other master nodes
    1. On the already-running master
      1. Re-upload the control plane certs and print the certificate key needed to retrieve them on the other master nodes.
        kubeadm init phase upload-certs --upload-certs
      2. Print the join command to use on the other master nodes.
        kubeadm token create --print-join-command
    2. Paste the join command, with --control-plane --certificate-key xxxx appended, on each to-be master (see the example after this list)
    3. Approve the CSRs for the new master nodes.
  3. Join the other worker nodes
    1. kubeadm token create --print-join-command
    2. Approve the CSRs for the new worker nodes.
  4. Verify
    1. kubectl get nodes should now show all nodes as Ready
      NAME      STATUS   ROLES                  AGE   VERSION
      node-01   Ready    control-plane,master   1h   v1.20.5
      node-02   Ready    control-plane,master   1h   v1.20.5
      node-03   Ready    control-plane,master   1h   v1.20.5
      node-04   Ready    <none>                 1h   v1.20.5
      node-05   Ready    <none>                 1h   v1.20.5
      node-06   Ready    <none>                 1h   v1.20.5
      
    2. kubectl get pods --all-namespaces should show all pods as Running
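
For reference, the assembled control-plane join from step 2 might look like this (all values are placeholders for what the two commands above print):

    kubeadm join cluster.example.com:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:<hash> \
        --control-plane --certificate-key <key>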

Run the Sonobuoy conformance test

  1. NOTE: If this exits within a couple of minutes, it most likely timed out connecting to the API or looking up a name in CoreDNS.
  2. Start the tests; they take a while: sonobuoy run --wait
  3. Watch the logs in another window: kubectl logs sonobuoy --namespace sonobuoy -f
  4. Get the results: results=$(sonobuoy retrieve)
  5. View the results: sonobuoy results $results
  6. Delete the tests: sonobuoy delete --wait

Run the kube-bench security conformance tests

  1. kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
  2. Wait until kubectl get pods | grep kube-bench shows Completed
  3. kubectl logs kube-bench-xxxxx | less
  4. kubectl delete job kube-bench
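
Steps 2 and 3 can be combined by addressing the job directly, which avoids looking up the generated pod name:

    kubectl wait --for=condition=complete job/kube-bench
    kubectl logs job/kube-bench | less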