clusterctl move misses objects, data in objects, and results in reboot of workload cluster #931

Open
magicite opened this issue Aug 23, 2022 · 0 comments


There appear to be several bugs in clusterctl move when it is used with Sidero. If I set up the bootstrap cluster (using talosctl cluster create + CAPI + providers) with the following:

  1. In the Environment CR named default, custom /spec/initrd/url and /spec/kernel/url values
  2. In the ServerClass named any, some configPatches (for disk selection)
  3. Hardware discovered through automatic BMC detection, which stores the BMC user/pass as k8s Secrets in the bootstrap cluster

then I can successfully create my workload cluster on bare metal nodes. However, once I run clusterctl move to pivot from the bootstrap cluster to the workload cluster, the following happens:

  1. Midway through the move, all of the Servers reboot. That includes the workload cluster to which we are pivoting.
  2. Once the servers have rebooted and my workload cluster has recovered, I can re-run the clusterctl move command and it completes.
  3. Investigating the objects that should have been moved over, I find that:
    1. My changes to the Environment CR named default are lost/reset
    2. My changes to the ServerClass CR named any are lost/reset
    3. The k8s Secrets that were holding the BMC user/pass were not copied over
    4. I think that, because of the missing Secrets, the controller(s) operating on the Server CRs can make almost no forward progress, including updating the clean state, allocated state, etc.
  4. It is not possible to provision new nodes or do anything else you'd want to do after pivoting. If I manually fix the issues from (3), things seem to work again.
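
For anyone trying to confirm the gaps from (3) on their own setup, here is a minimal sketch of the comparison, not anything clusterctl or Sidero provides. It assumes the objects from each cluster have already been exported as parsed JSON manifests; the function names (object_key, payload, find_move_gaps) are hypothetical and only for illustration. It reports objects that exist in the bootstrap cluster but not in the workload cluster, and objects whose spec or data differ.

```python
from typing import Any, Dict, List, Tuple

Manifest = Dict[str, Any]
Key = Tuple[str, str, str]


def object_key(obj: Manifest) -> Key:
    """Identify an object by (kind, namespace, name).

    Cluster-scoped objects have no namespace, so it defaults to "".
    """
    meta = obj.get("metadata", {})
    return (obj.get("kind", ""), meta.get("namespace", ""), meta.get("name", ""))


def payload(obj: Manifest) -> Manifest:
    """Return only the user-controlled content: spec for CRs, data for Secrets.

    Server-assigned metadata (uid, resourceVersion, ...) is ignored on purpose,
    because it legitimately differs between two clusters.
    """
    return {k: obj[k] for k in ("spec", "data", "stringData") if k in obj}


def find_move_gaps(source: List[Manifest], target: List[Manifest]) -> Dict[str, List[Key]]:
    """Compare bootstrap-cluster objects (source) with workload-cluster objects (target)."""
    target_by_key = {object_key(o): o for o in target}
    missing: List[Key] = []
    changed: List[Key] = []
    for obj in source:
        key = object_key(obj)
        other = target_by_key.get(key)
        if other is None:
            missing.append(key)   # e.g. a BMC Secret that was not copied
        elif payload(obj) != payload(other):
            changed.append(key)   # e.g. an Environment that was reset
    return {"missing": sorted(missing), "changed": sorted(changed)}
```

The two input lists could be the items arrays from kubectl get <kind> -A -o json run against each cluster, for the Environment, ServerClass, and Secret kinds. Anything listed under missing or changed is a candidate for the manual fix mentioned in (4).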