What would you like to be added:
Please support operating ETCD with ephemeral persistent volumes (which sounds like a contradiction), e.g. hostpath or, better and safer, local. This would avoid network-attached persistent volumes, which are often a scarce machine resource (e.g. AWS can attach only 26 or 32 volumes on most machine types; Alicloud and Azure even fewer).
Why is this needed:
We observe that machines can rarely be fully utilised because of the high ratio of pods-with-volumes to pods-without-volumes in a Gardener-managed shoot cluster control plane. If the ETCD for events could be configured to avoid network-attached persistent volumes, we could improve machine utilisation considerably (at the expense of only limited additional network costs to "catch up" when a pod is moved to another node).
Considerations:
Losing 1 of 1 pods (non-HA) or 2 of 3 pods (HA) results in a permanent, unrecoverable quorum loss. Because this could happen more frequently without network-attached persistent volumes, ETCD druid should detect it and, in the case of ephemeral persistent volumes, discard the statefulset and recreate it from scratch. In the context of events this seems acceptable in many cases: the default events TTL is only 1h anyway, and events are not a critical resource for the operation of a cluster.
While backup and restore can be added later, they don't have to be available right from the start. Whoever uses ETCD druid should be free to opt for ephemeral persistent volumes.
To stick with stateful sets (we don't have to, but it would make things easier), we need to find a PV(C) type that works for us, e.g. local. We therefore need to experiment and check whether it works as expected, whether it can be configured dynamically (multiple ETCD pods would now need different local paths on the node), and whether cleanup works (data is deleted once the pod is descheduled from the node).
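
To make the quorum consideration above concrete, here is a minimal Go sketch of the arithmetic. It is not taken from ETCD druid: `quorumSize`, `quorumLost`, and `shouldRecreate` are hypothetical names, and the sketch assumes that a lost pod on an ephemeral volume never comes back with its data.

```go
package main

import "fmt"

// quorumSize returns the number of members that must be available for a
// cluster of the given size to keep a majority: floor(n/2) + 1.
func quorumSize(members int) int {
	return members/2 + 1
}

// quorumLost reports whether losing `lost` out of `members` pods leaves
// fewer members than a quorum requires.
func quorumLost(members, lost int) bool {
	return members-lost < quorumSize(members)
}

// shouldRecreate is a hypothetical decision helper: recreating the cluster
// from scratch is only considered when the volumes are ephemeral (so the
// lost data cannot return) and quorum is gone.
func shouldRecreate(members, lost int, ephemeral bool) bool {
	return ephemeral && quorumLost(members, lost)
}

func main() {
	fmt.Println(quorumLost(1, 1))           // non-HA, 1 of 1 lost
	fmt.Println(quorumLost(3, 1))           // HA, 1 of 3 lost
	fmt.Println(quorumLost(3, 2))           // HA, 2 of 3 lost
	fmt.Println(shouldRecreate(3, 2, true)) // ephemeral volumes, quorum gone
}
```

With these definitions, the two cases named above (1 of 1 and 2 of 3) report quorum loss, while losing 1 of 3 does not. How the actual detection would observe lost members is left open here.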