Hello everyone! I'm seeking advice on the most effective way to defragment an etcd cluster. Up to now, my approach has been to defragment one instance at a time so that clients are not interrupted, moving leadership away from an instance before defragmenting it. However, it seems that running the defrag command on just one non-leader instance still impacts all etcd operations. We've observed numerous etcd timeout errors across the cluster, especially failures to update lease locks.

If the entire cluster is expected to be unresponsive while one instance is defragmenting, then perhaps defragmenting the whole cluster at once would be the more logical approach. Any insights or recommendations would be greatly appreciated.
Replies: 1 comment
Hey @ugur99 - Thanks for your question. It is best to defragment one member at a time, leaving the current cluster leader until last. This should result in the least potential disruption to the cluster.

While an individual member is defragmenting, that member is blocked and unable to perform its normal duties until the defrag completes. For this reason it is best to defrag only when genuinely necessary. An etcd maintainer has created https://github.com/ahrtr/etcd-defrag#defragmentation-rule to assist with this by only running defrag when specific thresholds have been reached.
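To make the ordering concrete, here is a minimal Python sketch of the two ideas above: skip members that don't need a defrag, and put the leader last among those that do. Everything in it is illustrative. The `MemberStatus` type, the function names, and the two thresholds (200 MiB reclaimable and 20% of the database file) are assumptions for the example, not etcd defaults and not the rule that etcd-defrag applies. The size fields are meant to hold the values that `etcdctl endpoint status` reports for each member.

```python
from dataclasses import dataclass


@dataclass
class MemberStatus:
    """Hypothetical per-member snapshot, e.g. built from `etcdctl endpoint status`."""
    endpoint: str
    is_leader: bool
    db_size: int          # bytes allocated to the database file
    db_size_in_use: int   # bytes actually holding live data


def needs_defrag(member, min_free_bytes=200 * 1024 * 1024, min_free_ratio=0.2):
    """Return True when enough space is reclaimable to justify blocking the member.

    Both thresholds are assumed values for illustration; tune them for your cluster.
    """
    if member.db_size <= 0:
        return False
    reclaimable = member.db_size - member.db_size_in_use
    return (reclaimable >= min_free_bytes
            and reclaimable / member.db_size >= min_free_ratio)


def plan_defrag(members):
    """Return the endpoints to defragment, one at a time, with the leader last."""
    candidates = [m for m in members if needs_defrag(m)]
    # sorted() is stable and False sorts before True, so followers keep their
    # relative order and the leader (if it qualifies) moves to the end.
    ordered = sorted(candidates, key=lambda m: m.is_leader)
    return [m.endpoint for m in ordered]


if __name__ == "__main__":
    MiB = 1024 * 1024
    cluster = [
        MemberStatus("https://etcd-0:2379", True, 1000 * MiB, 400 * MiB),
        MemberStatus("https://etcd-1:2379", False, 1000 * MiB, 950 * MiB),
        MemberStatus("https://etcd-2:2379", False, 800 * MiB, 300 * MiB),
    ]
    for endpoint in plan_defrag(cluster):
        # Run these sequentially and wait for each one to finish.
        print(f"etcdctl defrag --endpoints={endpoint}")
```

With the made-up sizes above, `etcd-1` is skipped because only 50 MiB is reclaimable, and the leader `etcd-0` is scheduled after `etcd-2`. The sketch only builds the plan; waiting for each member to become healthy again before moving to the next one is left to whatever runs the commands.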