Solr-operator and prometheus-exporter #760
Comments
I spent some time this morning trying to trigger this using the
and
Neither one allowed me to reproduce, out of the box. @aloosnetmatch - could you share some more details about your solrcloud and solrprometheusexporter, that might help folks here reproduce? The output of commands like |
Yeah, the only way for |
Thanks for your reply. I looked up the requested info. This time I compared the PROD environment (which runs Solr Operator 0.8.1 and Solr 9.7.0).
We use the solrReference.
Attached are the files for the solrprometheusexporter, and also the files for the solrcloud specs: |
@aloosnetmatch - is this a transient error that goes away as the operator/pods retry, or does the error keep happening once it arises? Assuming it's transient, I could imagine a timing issue where solrprometheusexporter ("SPE") creates its deployment after the solrcloud exists but before the operator has tried to reconcile it and populated But that only makes sense if it's transient behavior... |
@aloosnetmatch can you provide the status of your solrcloud objects as well? |
Also could you provide your solr operator logs? The most likely culprit here is that there is an error in the SolrCloud reconciling and it cannot set its status. |
Here is the Solr Operator log. For which "solrcloud objects" would you like to have the status? |
If you look at those logs, the Solr Operator cannot create an Ingress for you because your ingress controller is rejecting it. Because of this, the SolrCloud cannot finish reconciling, and the SolrCloud status is not written. This is ultimately a shortcoming on our side: the status should probably still be written, but it is very difficult to know which errors to ignore and which not to ignore when writing the status. I would recommend fixing your ingress settings so the error doesn't happen. |
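One way to confirm this chain is to check whether the SolrCloud status carries a ZooKeeper connection string at all. The sketch below is hypothetical, not operator code: it assumes the object was dumped with `kubectl get solrcloud <name> -o json`, and it assumes the connection string lives under `status.zookeeperConnectionInfo.internalConnectionString`. Please verify that field path against the CRD for your operator version.

```python
import json
from typing import Optional


def zk_connection_string(solrcloud_json: str) -> Optional[str]:
    """Return the ZooKeeper connection string from a SolrCloud JSON dump, or None."""
    obj = json.loads(solrcloud_json)
    # Assumed field path; check the SolrCloud CRD for your operator version.
    status = obj.get("status") or {}
    info = status.get("zookeeperConnectionInfo") or {}
    conn = info.get("internalConnectionString") or ""
    # An empty or missing value means the status was never populated.
    return conn or None
```

If this returns `None` for your SolrCloud, the exporter has nothing to read its ZooKeeper address from, which would match the empty `ZK_HOST` seen in the exporter pod.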
Hi. I managed to fix the issue in our environment by switching to "ExternalDNS" addressability. We use an Azure Load Balancer to handle the external traffic. As soon as this was fixed, the prometheus exporter started working. Thanks for your support. |
We upgraded our test env from solr-operator 0.8.1 to 0.9.0 and from solr 9.7.0 to 9.8.0.
The prometheus exporter does not seem to work anymore.
The logging tells me this:
ERROR - 2025-02-26 12:42:40.917; org.apache.solr.prometheus.exporter.SolrExporter; Must provide either --base-url or --zk-host
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "org.apache.solr.prometheus.exporter.SolrScrapeConfiguration.getType()" because "configuration" is null
    at org.apache.solr.prometheus.exporter.SolrExporter.createScraper(SolrExporter.java:127)
    at org.apache.solr.prometheus.exporter.SolrExporter.<init>(SolrExporter.java:90)
    at org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:426)
I already found this link, which is similar to my issue:
https://issues.apache.org/jira/browse/SOLR-17638
In the prometheus-exporter pod, the env variable for "ZK_HOST" seems to have no value.
If I set the correct value for ZK_HOST in the Deployment, an additional container starts up which does seem to work.
What can I do to fix this issue?
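For anyone reading the log above: the first line says the exporter got neither a base URL nor a ZooKeeper host, and the NullPointerException is the follow-on failure. Here is a minimal sketch of that precondition, in Python for illustration only. It is not the exporter's actual code, and the function name `resolve_scrape_target` is made up.

```python
from typing import Tuple


def resolve_scrape_target(zk_host: str, base_url: str) -> Tuple[str, str]:
    """Pick the scrape mode, failing fast when neither value is set."""
    zk_host = (zk_host or "").strip()
    base_url = (base_url or "").strip()
    if zk_host:
        # Cloud mode: discover Solr nodes through ZooKeeper.
        return ("zk-host", zk_host)
    if base_url:
        # Standalone mode: scrape a single Solr base URL.
        return ("base-url", base_url)
    # Mirrors the message in the log above; raised here instead of continuing.
    raise ValueError("Must provide either --base-url or --zk-host")
```

With an empty `ZK_HOST` and no base URL, a check like this stops right away with the readable message instead of a later NullPointerException.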