Ideas to upper bound prometheus-server's memory consumption #2222
Comments
I am curious why this only seems to affect the pangeo-hubs cluster and not, say, the 2i2c cluster.
@yuvipanda I suspect a basic relation between the WAL and memory during startup, where the WAL size would depend on the amount of metrics collected, I assume. The amount of metrics is coupled to what's being scraped, and the amount of data scraped grows with the number of endpoints scraped, e.g. one node-exporter per node, including one per dask worker node.
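One rough way to see how the scraped endpoints translate into in-memory series (and therefore WAL size) is to ask Prometheus itself for a per-job series count. This is only a sketch, not something from this issue: it assumes `promtool` is shipped in the prometheus-server image and that the namespace, deployment, and container names match the `kubectl exec` command quoted later in this issue.

```
# Count active time series per scrape job; high counts point at the targets
# that contribute most to head memory and WAL replay cost.
# Assumes the server listens on localhost:9090 inside the pod.
kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- \
  promtool query instant http://localhost:9090 'count({__name__=~".+"}) by (job)'
```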
I think the approach of limiting the amount of metrics ingested is relevant, but I'll close this issue now; the other ideas were explored a bit.
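For the record, "limiting the amount of metrics ingested" usually maps to drop rules applied at scrape time, so unwanted series never reach the head or the WAL at all. A minimal sketch using plain Prometheus scrape-config syntax; the job name and metric regex below are illustrative placeholders, not taken from our actual config:

```yaml
scrape_configs:
  - job_name: node-exporter          # hypothetical job name
    metric_relabel_configs:
      # Drop metrics we never chart or alert on, before they are written
      # to the head/WAL; the regex is an example, not a recommendation.
      - source_labels: [__name__]
        regex: "node_scrape_collector_.*"
        action: drop
```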
I just slowly incremented prometheus-server to 20 GB of memory requests for the `pangeo-hubs` cluster. It appears that 18 GB wasn't sufficient, because memory peaked at close to 19 GB before it fell down to ~3-4 GB when `Head GC completed` was logged ~5 minutes after startup.

This prometheus-server had a `/data` folder mounted from the attached PVC that was 5.8 GB:

```
kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data
5.8G    /data
```
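Since the startup cost is driven by the WAL specifically rather than the whole TSDB, it may be worth measuring it on its own. A small sketch, assuming the default layout where the WAL lives in a `wal/` subdirectory of the data path:

```
# Size of the write-ahead log alone, versus the full data directory above.
kubectl exec -n support deploy/support-prometheus-server -c prometheus-server -- du -sh /data/wal
```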
The problem we have is that the write-ahead log (WAL) is read from disk during startup to rebuild the in-memory state of all collected metrics, as I understand it, and that takes a lot of memory. Actually, the real problem is that we can't know what this memory requirement is ahead of time, because it grows over time as more metrics are collected.
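Since we can't predict the requirement, the closest we can get is observing it during a restart: tail the startup logs for the WAL replay and `Head GC completed` messages while sampling the pod's actual memory usage. A sketch assuming metrics-server is installed (for `kubectl top`) and the same namespace, deployment, and container names as above:

```
# Follow startup logs for WAL replay / head GC related messages...
kubectl logs -n support -f deploy/support-prometheus-server -c prometheus-server | grep -iE 'wal|head gc'

# ...while sampling the pod's real memory usage in another terminal.
kubectl top pod -n support | grep prometheus-server
```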
Ideas
Example of logs from a successful startup
Related