Secret controller does not handle TLS disablement correctly #808

Open
shreyas-s-rao opened this issue Jun 23, 2024 · 0 comments
Labels
area/quality Output qualification (tests, checks, scans, automation in general, etc.) related kind/bug Bug

Comments

@shreyas-s-rao
Contributor

How to categorize this issue?

/area quality
/kind bug

What happened:

Secret controller does not correctly handle the case where TLS was previously enabled for an Etcd resource (either etcd client TLS, peer TLS, or etcd-backup-restore TLS) and is then removed from the Etcd spec, but the change has not yet been reconciled by etcd-druid. In this case, the secret controller removes the finalizer from the previously referenced secrets because no Etcd resource spec references them any longer, even though the etcd statefulset continues to mount and use them until druid reconciles the Etcd resource. This leaves the etcd cluster in a vulnerable state, especially when druid is configured with auto reconciliation disabled.
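
To make the gap concrete, here is a minimal sketch of the finalizer decision, written with hypothetical types rather than etcd-druid's actual code. It assumes the controller could obtain two lists: the secrets referenced by Etcd specs (`specRefs`) and the secrets still mounted by the etcd statefulset (`mountedRefs`). The names `SecretRef`, `secretInUse`, and `contains` are illustrative only. The behaviour described above corresponds to consulting `specRefs` alone; one possible fix is to keep the finalizer while either list still contains the secret.

```go
package main

// SecretRef is a hypothetical namespaced reference to a Kubernetes Secret.
// It is not a type from etcd-druid.
type SecretRef struct {
	Namespace string
	Name      string
}

// secretInUse reports whether a secret should keep its finalizer.
// specRefs are secrets referenced by any Etcd spec; mountedRefs are secrets
// still mounted by the running etcd statefulset (assumed to be available).
// Checking only specRefs reproduces the behaviour reported in this issue.
func secretInUse(secret SecretRef, specRefs, mountedRefs []SecretRef) bool {
	return contains(specRefs, secret) || contains(mountedRefs, secret)
}

// contains reports whether target appears in refs.
func contains(refs []SecretRef, target SecretRef) bool {
	for _, r := range refs {
		if r == target {
			return true
		}
	}
	return false
}
```

With this check, a secret that was dropped from the spec but is still mounted keeps its finalizer until the statefulset no longer uses it.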

How to reproduce it (as minimally and precisely as possible):

  • Run druid with auto reconciliation disabled
  • Deploy an Etcd resource with any of the three TLS configs enabled (etcd client TLS, etcd peer TLS or etcd-backup-restore TLS)
  • Wait for, or trigger, reconciliation by druid
  • Remove the TLS config from the Etcd resource spec
  • Observe, from the druid logs and from the TLS secrets themselves, that the secret controller removes the finalizer from the TLS secrets even though the etcd cluster (statefulset) still uses them
  • Delete any of the TLS secrets for which finalizer was removed
  • Restart any of the etcd pods

This can lead to quorum loss if more than one pod fails or is rescheduled for any reason.
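
The quorum arithmetic behind this can be sketched as follows, assuming a three-member etcd cluster (the issue does not state the cluster size). The helper names `quorum` and `hasQuorum` are hypothetical; the majority rule itself (more than half of the members must be available) is standard etcd behaviour.

```go
package main

// quorum returns the minimum number of healthy members that an etcd
// cluster of the given size needs in order to make progress (a majority).
func quorum(clusterSize int) int {
	return clusterSize/2 + 1
}

// hasQuorum reports whether the cluster still has a majority after
// `failed` members have become unavailable, for example because their
// pods restarted and could not mount a deleted TLS secret.
func hasQuorum(clusterSize, failed int) bool {
	return clusterSize-failed >= quorum(clusterSize)
}
```

Under that assumption, `hasQuorum(3, 1)` is true and `hasQuorum(3, 2)` is false: a single restarted pod is tolerated, but a second one loses quorum.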

@gardener-robot gardener-robot added area/quality Output qualification (tests, checks, scans, automation in general, etc.) related kind/bug Bug labels Jun 23, 2024