sql query result and sql_exporter metrics are not equal #747
-
Describe the bug To Reproduce scraped results postgres@host-1 ~ $ curl localhost:9399/metrics 2>/dev/null | grep 'pg_settings_source_cnt'
# HELP pg_settings_source_cnt Number of settings
# TYPE pg_settings_source_cnt gauge
pg_settings_source_cnt{job="postgresql_common_queries",slave="False",sourcefile="/etc/postgresql-15/postgresql.conf",target="database1"} 82
pg_settings_source_cnt{job="postgresql_common_queries",slave="False",sourcefile="default",target="database1"} 280 query use by database1=# select coalesce(sourcefile, 'default') as sourcefile, count(*) as cnt from pg_settings group by sourcefile;
sourcefile | cnt
------------------------------------+-----
default | 273
/etc/postgresql-15/postgresql.conf | 84
(2 rows) Expected behavior scrape Configuration - metric_name: pg_settings_source_cnt
type: gauge
help: Number of settings
values: [cnt]
static_labels:
slave: False
key_labels: [sourcefile]
query_ref: pg_settings_source
.....
....
....
- query_name: pg_settings_source
query: |
select
coalesce(sourcefile, 'default') as sourcefile,
count(*) as cnt
from pg_settings group by sourcefile
global:
# If scrape_timeout <= 0, no timeout is set unless Prometheus provides one. The default is 10s.
scrape_timeout: 30s
# Subtracted from Prometheus' scrape_timeout to give us some headroom and prevent Prometheus from timing out first.
scrape_timeout_offset: 500ms
# Minimum interval between collector runs: by default (0s) collectors are executed on every scrape.
min_interval: 0s
# Maximum number of open connections to any one target. Metric queries will run concurrently on multiple connections,
# as will concurrent scrapes.
max_connections: 3
# Maximum number of idle connections to any one target. Unless you use very long collection intervals, this should
# always be the same as max_connections.
max_idle_connections: 3
# Maximum number of maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse.
# If 0, connections are not closed due to a connection's age.
max_connection_lifetime: 5m
# Data source name always has a URI schema that matches the driver name. In some cases (e.g. MySQL)
# the schema gets dropped or replaced to match the driver expected DSN format.
jobs:
- job_name: postgresql_common_queries
collectors: ['postgresql_common_queries']
static_configs:
- targets:
database1: 'postgresql://monuser@localhost:5432/database1?sslmode=disable'
# Collectors (referenced by name) to execute on the target.
# Glob patterns are supported (see <https://pkg.go.dev/path/filepath#Match> for syntax).
# collectors: [postgresql_*]
collector_files:
- "*.collector.yml" Additional context
|
Beta Was this translation helpful? Give feedback.
Replies: 5 comments 6 replies
-
Hey @ridvankorkmaz, thanks for your report. Interesting issue. There is no custom query handling on the sql_exporter side, but I'm happy to look into that. I believe I can reproduce the case on my end. 👍 In the meantime, there are some steps you might try:
|
Beta Was this translation helpful? Give feedback.
-
Hey @ridvankorkmaz, So far I'm unable to reproduce the case on my end (I've taken your provided configuration):
It's After inspecting your query, I also think it might be potentially related to permissions |
Beta Was this translation helpful? Give feedback.
-
Let's proceed in the discussion as it's unrelated to the exporter's functional part, at least so far. |
Beta Was this translation helpful? Give feedback.
-
Thank you @burningalchemist for looking into this. As I (I think you also) notice that only there is only one kind of setting ( Yet I would like to suggest you to test with: wal_keep_size = 64MB
wal_writer_delay = 300 and run this to set them in alter system set log_connections = on;
alter system set max_wal_size = 1GB;
alter system set log_destination = 'csvlog';
select pg_reload_conf(); Then run the query to see you at least 3 different |
Beta Was this translation helpful? Give feedback.
-
Hey @ridvankorkmaz, I was able to partially reproduce the issue. In my case, though, the distribution while using pgx driver matches the pgsql client result. lib/pq seems to prone to that issue and it's possible it wasn't resolved (lib/pq#80 (comment)). But it might not be related to your issue directly, so we need more details. What I've also learnt on my end is by adding
It's better to compare them in pairs: postgres vs pgx, and postgres vs psql cli, as these result sets reflect the different parameters grouped by source. It's important to note the differences are only present in the Once we got the differences, we might be interested to check what these parameters are. Here is the list per each driver: postgres:
pgx has no parameters registered as the client parameters (library), so we only check 'configuration file' source:
And here is the difference: while DateStyle parameter on pgx is not provided by the client, and hence remains coming from the configuration file, on postgres(libpq) driver it's provided by the client (regardless of the setting, which is equal for both). For the On the psql cli side, we also get extra parameters from the
pgx driver doesn't add Overall, I think that given different default parameters provided by the libraries and even the psql cli, it's not easy to identify the source of truth between these three, but you can declare one of them. It could be that just relying upon Alternatively, you might want to have a more sophisticated logic on when to add empty sourcefile to Ultimately, on my end, the number of parameters is generally the same, and is only affected by those parameters coming implicitly from the libpq driver under the Please have a look, and try the queries which I posted above, I think it should help. 👍 ref: https://www.postgresql.org/docs/17/view-pg-settings.html (pg_settings columns for the source and context values) |
Beta Was this translation helpful? Give feedback.
Hey @ridvankorkmaz, I was able to partially reproduce the issue. In my case, though, the distribution while using pgx driver matches the pgsql client result. lib/pq seems to prone to that issue and it's possible it wasn't resolved (lib/pq#80 (comment)). But it might not be related to your issue directly, so we need more details.
What I've also learnt on my end is by adding
source
andcontext
columns we may see how a particular driver adjusts or adds parameters. This is the listing that shows distribution differences between postgresql driver (libpq), pgx and psql client: