Missing "cluster-init" option in config.yaml in the only control plane node. #1294

Open
mateuszlewko opened this issue Mar 26, 2024 · 4 comments


Description

According to the steps for restoring the cluster, one of the control plane nodes should have "cluster-init: true" set in /etc/rancher/k3s/config.yaml. My cluster has one control plane node, and its config.yaml contains no such option.
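A quick way to confirm this on a node is to grep the file for the key. The sketch below is only illustrative: the helper name `has_cluster_init` is hypothetical (it is not part of kube-hetzner or k3s), and it assumes the key sits on a single line, either quoted or unquoted.

```shell
#!/bin/sh
# Hypothetical helper (not part of kube-hetzner): succeeds when the given
# k3s config file sets cluster-init to true, with or without quotes around
# the key. -s keeps grep quiet if the file does not exist.
has_cluster_init() {
  grep -Eiqs '^[[:space:]]*"?cluster-init"?[[:space:]]*:[[:space:]]*true[[:space:]]*$' "$1"
}

# Path taken from the issue; adjust it if your config lives elsewhere.
CONFIG=/etc/rancher/k3s/config.yaml
if has_cluster_init "$CONFIG"; then
  echo "cluster-init is set in $CONFIG"
else
  echo "cluster-init is NOT set in $CONFIG"
fi
```

A missing file is reported the same way as a missing key, so check that the path exists if the result is surprising.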

Do you have any idea why that is? It's a freshly created cluster with the config below.

Kube.tf file

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = ...

  source  = "kube-hetzner/kube-hetzner/hcloud"
  version = "2.13.4"

  # For details on SSH see https://github.com/kube-hetzner/kube-hetzner/blob/master/docs/ssh.md
  ssh_public_key  = ...
  ssh_private_key = ...

  # For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
  network_region = "eu-central" # change to `us-east` if location is ash

  control_plane_nodepools = [
    {
      name        = "control-plane-fsn1",
      server_type = "cax11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1
    },
    {
      name        = "control-plane-nbg1",
      server_type = "cax11",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 0
    },
    {
      name        = "control-plane-hel1",
      server_type = "cax11",
      location    = "hel1",
      labels      = [],
      taints      = [],
      count       = 0
    }
  ]

  agent_nodepools = [
    {
      name        = "agent-cax21-hel1",
      server_type = "cax21",
      location    = "hel1",
      labels      = [],
      taints      = [],
      count       = 1
    },
    {
      name        = "agent-cax11-nbg1",
      server_type = "cax11",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 0
    },
  ]

  enable_wireguard = true

  # https://www.hetzner.com/cloud/load-balancer
  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"

  # See how to configure agent nodepools for longhorn here https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/discussions/373#discussioncomment-3983159
  # Also see Longhorn best practices here https://gist.github.com/ifeulner/d311b2868f6c00e649f33a72166c2e5b
  enable_longhorn = true

  # If you want to configure additional trusted IPs for traefik, enter them here as a list of IPs (strings).
  # Example for Cloudflare:
  traefik_additional_trusted_ips = [...]

  # For all options see: https://kured.dev/docs/configuration/
  kured_options = {
    "reboot-days" : "sa",
    "start-time" : "8am",
    "end-time" : "2pm",
    "time-zone" : "Local",
    "lock-release-delay" : "30m",
    "drain-grace-period" : 180,
  }

  enable_cert_manager = true

  dns_servers = [
   ...
  ]

  use_control_plane_lb = false
  create_kubeconfig    = true
  create_kustomization = false

  etcd_s3_backup = {
    ....
  }
}

Screenshots

No response

Platform

Mac

@mateuszlewko mateuszlewko added the bug Something isn't working label Mar 26, 2024
mateuszlewko commented Mar 26, 2024

I think the initial config created by null_resource.first_control_plane is overridden by null_resource.control_plane_config. Is this intended? Perhaps locals.k3s-config should set something like cluster-init: k == 0?

mysticaltech (Collaborator) commented

@mateuszlewko Yes, indeed, we override it so as not to make the first control plane special. Could you please explain your proposed solution for the restore flow? I'm not sure I follow.

mateuszlewko (Author) commented

Hey,

I was referring to the "Backup and restore a cluster" guide in https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner?tab=readme-ov-file#examples.

The postinstall_exec script there contains:

export CLUSTERINIT=$(cat /etc/rancher/k3s/config.yaml | grep -i '"cluster-init": true')
if [ -n "$CLUSTERINIT" ]; then
  echo indeed this is the first control plane node > /tmp/restorenotes

which effectively assumes that the first control plane node is special.
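To make the consequence concrete: when the key is absent, `grep` prints nothing, `CLUSTERINIT` is empty, and the restore branch is skipped without any message. The sketch below wraps the same check in a hypothetical function `restore_note` that takes the config path and the notes path as parameters. The function name, the parameters, and the `else` branch are assumptions added for illustration and are not in the README. It is not a proposed fix, only a way to make the skip visible.

```shell
#!/bin/sh
# Hypothetical wrapper around the README's check.
#   $1: k3s config file to inspect
#   $2: file that receives the note
# The else branch is an illustrative addition; the README has no such branch.
restore_note() {
  CLUSTERINIT=$(grep -i '"cluster-init": true' "$1" 2>/dev/null)
  if [ -n "$CLUSTERINIT" ]; then
    echo "indeed this is the first control plane node" > "$2"
  else
    echo "cluster-init not found in $1, restore steps skipped" > "$2"
  fi
}

# Example call with the paths used in the README snippet:
#   restore_note /etc/rancher/k3s/config.yaml /tmp/restorenotes
```

On a cluster like the one described here, where config.yaml has no "cluster-init" entry, the note would record the skip instead of leaving /tmp/restorenotes untouched.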

mysticaltech (Collaborator) commented

@mateuszlewko Ah yes, so that needs to change. A PR to correct this example is welcome, please.

@mysticaltech mysticaltech removed the bug Something isn't working label Mar 29, 2024
@mysticaltech mysticaltech changed the title [Bug]: Missing "cluster-init" option in config.yaml in the only control plane node. Missing "cluster-init" option in config.yaml in the only control plane node. Mar 29, 2024