Prometheus exporter high memory usage #658

Open · stibi opened this issue Nov 21, 2023 · 6 comments
Labels: metrics, question (Further information is requested)

Comments


stibi commented Nov 21, 2023

Hello,
we're having trouble with the Solr exporter: it's very hungry for memory, it needs around ~6G of RAM, which is a lot, and I can't figure out why.

Can I ask you for a hint?

It's pretty much a default setup, nothing custom. SolrCloud 9.3.0, and nothing too custom in the exporter deployment:

apiVersion: solr.apache.org/v1beta1
kind: SolrPrometheusExporter
metadata:
  name: solr-exporter
spec:
  customKubeOptions:
    podOptions:
      resources:
        requests:
          cpu: 500m
          memory: 3072Mi
        limits:
          cpu: 2000m
          memory: 6912Mi
      envVars:
        - name: JAVA_HEAP
          value: 6000m
  solrReference:
    cloud:
      name: "solr-cloud"
  numThreads: 6
[Screenshot: 2023-11-21 at 13:23:19]
radu-gheorghe (Contributor) commented

I think this tells it to allocate 6GB:

      envVars:
        - name: JAVA_HEAP
          value: 6000m

I assume it can do with much less than 6000m. Try a 10th of that and see how it goes.
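
For reference, a minimal sketch of what that could look like in the SolrPrometheusExporter spec posted above, with the heap trimmed to a tenth (the 600m value is purely illustrative, not a tested recommendation):

apiVersion: solr.apache.org/v1beta1
kind: SolrPrometheusExporter
metadata:
  name: solr-exporter
spec:
  customKubeOptions:
    podOptions:
      envVars:
        - name: JAVA_HEAP
          value: 600m   # illustrative only: a tenth of the original 6000m
  solrReference:
    cloud:
      name: "solr-cloud"
  numThreads: 6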


stibi commented Nov 21, 2023 via email


stibi commented Nov 22, 2023

Ouch, so maybe I wasn't so wrong about it... I removed the JAVA_HEAP env var, but the exporter started failing with java.lang.OutOfMemoryError: Java heap space. Here we go, full circle :D

So I had to put JAVA_HEAP back to see how much heap space it actually needs, and the number is 5G. With that much heap, the exporter runs without errors. But it takes quite some time to collect all the metrics, isn't that weird?

INFO  - 2023-11-22 09:53:39.225; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 09:54:39.226; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 09:55:15.506; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 09:56:15.506; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 09:56:53.088; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 09:57:53.088; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 09:58:29.369; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 09:59:29.369; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 10:00:06.842; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 10:01:06.842; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 10:01:41.788; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 10:02:41.788; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 10:03:22.174; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 10:04:22.174; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection
INFO  - 2023-11-22 10:04:57.249; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Completed metrics collection
INFO  - 2023-11-22 10:05:57.250; org.apache.solr.prometheus.collector.SchedulerMetricsCollector; Beginning metrics collection

I was able to take a heap dump using the jattach utility (awesome that it's packaged with the container image, thanks for that!), but I guess I don't really know how to read it properly... it says the heap size is only 23549096 B, which is about 23.5 MB? That's not so much.

[Screenshot: 2023-11-22 at 11:12:25]

radu-gheorghe (Contributor) commented

Yep, that's 23MB. Weird that it takes a while to collect metrics. Is that a symptom (e.g. the exporter is stuck in GC, so it doesn't have spare CPU to collect the metrics) or a cause (e.g. you have a ton of shards in the cluster, so collecting them takes a while and a lot of heap)?

Maybe G1 falls behind with garbage collection? You can verify this hypothesis by setting the GC_TUNE env var to -XX:+UseG1GC -XX:GCTimeRatio=2. Unless you have a ton of shards, I'm expecting something like JAVA_HEAP=1g to be enough. Or maybe we're both missing something...
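
For reference, a sketch of how both env vars could be set together on the exporter pod, reusing the customKubeOptions block from the spec above (the values are the ones suggested here, not verified for this cluster):

spec:
  customKubeOptions:
    podOptions:
      envVars:
        - name: JAVA_HEAP
          value: 1g
        - name: GC_TUNE
          # GCTimeRatio=2 lets G1 spend a larger share of time on GC (up to roughly 1/(1+2) of total)
          value: "-XX:+UseG1GC -XX:GCTimeRatio=2"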


stibi commented Nov 22, 2023

The cluster is not big at all, I think: 1 shard, 2 replicas, ~8753202 documents, taking ~22GB of memory...

Thanks for the hints, I'll take a look at the Java metrics and how GC performs.

radu-gheorghe (Contributor) commented

You're welcome.

If you need something to monitor GC/JVM metrics (and Solr metrics, for that matter), we have a tool that you might find useful.

HoustonPutman added the question (Further information is requested) and metrics labels on Nov 28, 2023