-
Thanks for sharing the data. Would you mind sharing more details, such as RAM, CPU, the number of members in the cluster, etc.?
I am not surprised by this. When a Go process runs up against its memory quota, the runtime will try to reclaim memory by running GC. Refer to https://tip.golang.org/doc/gc-guide
It's recommended to distribute the 2-3k watchers across different watch streams instead of having them all share the same watch stream (see the sketch after this comment). Refer to
There are some existing watcher-related performance issues known in the community. What we can do is try to optimize as much as we can. You also need to run benchmark tests to understand the limits of your system/cluster.
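A minimal sketch of one way to do that distribution, assuming the Go clientv3 API: spread watch registrations over a small pool of clients, each of which maintains its own gRPC watch stream(s). The pool-of-clients approach and the hash-based key assignment are illustrative choices here, not an etcd recommendation.

```go
// Minimal sketch: spread watches over a small pool of etcd clients so that
// they do not all share a single gRPC watch stream. The pool size and the
// hash-based assignment are illustrative choices.
package watchpool

import (
	"context"
	"hash/fnv"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

type WatchPool struct {
	clients []*clientv3.Client
}

// NewWatchPool dials `size` independent clients against the same endpoints.
// Each client maintains its own watch stream(s), so watches placed through
// different clients do not contend on one shared stream.
func NewWatchPool(endpoints []string, size int) (*WatchPool, error) {
	p := &WatchPool{}
	for i := 0; i < size; i++ {
		c, err := clientv3.New(clientv3.Config{
			Endpoints:   endpoints,
			DialTimeout: 5 * time.Second,
		})
		if err != nil {
			return nil, err
		}
		p.clients = append(p.clients, c)
	}
	return p, nil
}

// Watch hashes the key to pick a client, so watches on different keys are
// spread across the pool instead of piling onto one shared stream.
func (p *WatchPool) Watch(ctx context.Context, key string, opts ...clientv3.OpOption) clientv3.WatchChan {
	h := fnv.New32a()
	h.Write([]byte(key))
	c := p.clients[int(h.Sum32())%len(p.clients)]
	return c.Watch(ctx, key, opts...)
}
```

Depending on the client version, a lighter-weight variant is to create several Watcher instances via clientv3.NewWatcher on a single client, since each Watcher keeps its own set of gRPC streams.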
-
It could be system-related, but it's quite unlikely to be something as simple as the open fd limit or anything like that. I've added code to pin watchers to streams (up to 100 per stream) and will have some results once it's rolled out in a day or two.
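For anyone who wants to try something similar, a rough sketch of that kind of pinning, assuming the Go clientv3 API; the cap of 100 and the rollover policy are guesses at the approach described above, not the actual code:

```go
// Rough sketch: pin watches to streams with a per-stream cap. Once the
// current Watcher has maxPerStream watches on it, the next watch rolls
// over to a fresh Watcher, and therefore onto a fresh gRPC watch stream.
package watchpin

import (
	"context"
	"sync"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const maxPerStream = 100 // illustrative cap, mirroring the "up to 100 per stream" above

type PinnedWatcher struct {
	mu      sync.Mutex
	client  *clientv3.Client
	current clientv3.Watcher
	count   int
}

func NewPinnedWatcher(c *clientv3.Client) *PinnedWatcher {
	return &PinnedWatcher{client: c}
}

// Watch places the watch on the current stream, rolling over to a new
// Watcher once the cap is reached. Earlier Watchers are left open on
// purpose: closing them would cancel the watches already pinned to them.
func (p *PinnedWatcher) Watch(ctx context.Context, key string, opts ...clientv3.OpOption) clientv3.WatchChan {
	p.mu.Lock()
	if p.current == nil || p.count >= maxPerStream {
		p.current = clientv3.NewWatcher(p.client)
		p.count = 0
	}
	w := p.current
	p.count++
	p.mu.Unlock()
	return w.Watch(ctx, key, opts...)
}
```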
-
Hi!
We are hitting performance issues when running an etcd cluster with a relatively high number of watchers.
In a steady state the cluster serves 1M+ active watchers (roughly 10k clients; some have only a few watchers, some have up to 2-3k) without any observed issues.
Once any node in the cluster is restarted and clients need to re-establish their watchers on other nodes, cluster performance degrades.
CPU load spikes (interestingly, most of the time is spent in the Go GC) and read/write latencies degrade.
Any recommendations on how the cluster and/or clients could be reconfigured to defuse this kind of node-restart-triggered load spike?