Open
prometheus/sigv4 #18
Description
What did you do?
The Prometheus logs show that Prometheus does not handle SigV4 token expiry gracefully: when the token expires, the samples are dropped and not retried.
The metric rate(prometheus_remote_storage_samples_failed_total[5m]) also increases at the same time, which indicates that the samples were not sent successfully.

Logs are attached.
What did you expect to see?
Prometheus should handle token expiry gracefully and shouldn't lose samples.
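To illustrate the behaviour I would expect, here is a rough Go sketch. It is not Prometheus's actual remote-write code: `isExpiredToken` and `sendWithRefresh` are hypothetical names, and the `send` and `refresh` callbacks stand in for the real HTTP request and credential refresh. The idea is only that a 403 carrying the expired-token message is treated as recoverable, the credentials are refreshed, and the same batch is sent again instead of being dropped.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// isExpiredToken reports whether a response looks like the expired-token
// 403 seen in the logs of this issue. Matching on the message text is an
// assumption made for this sketch.
func isExpiredToken(status int, body string) bool {
	return status == http.StatusForbidden &&
		strings.Contains(body, "security token included in the request is expired")
}

// sendWithRefresh (hypothetical) calls send up to maxAttempts times.
// When the response is the expired-token 403, it calls refresh and retries
// the same batch. Any other non-2xx response is returned as an error.
func sendWithRefresh(send func() (int, string, error), refresh func() error, maxAttempts int) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		status, body, err := send()
		if err != nil {
			return err
		}
		if status/100 == 2 {
			return nil
		}
		if !isExpiredToken(status, body) {
			return fmt.Errorf("non-recoverable error: HTTP %d: %s", status, body)
		}
		if err := refresh(); err != nil {
			return fmt.Errorf("refreshing credentials: %w", err)
		}
	}
	return fmt.Errorf("token still expired after %d attempts", maxAttempts)
}

func main() {
	// Simulated endpoint: the first request hits an expired token,
	// later requests succeed once the credentials have been refreshed.
	calls := 0
	send := func() (int, string, error) {
		calls++
		if calls == 1 {
			return 403, `{"message":"The security token included in the request is expired"}`, nil
		}
		return 200, "", nil
	}
	refresh := func() error { return nil }

	err := sendWithRefresh(send, refresh, 3)
	fmt.Println("send error:", err, "requests made:", calls)
}
```

With something along these lines, the batch that hit the expired token would be re-sent after the refresh rather than counted in prometheus_remote_storage_samples_failed_total.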
What did you see instead? Under which circumstances?
N/A
System information
Linux
Prometheus version
Prometheus/2.50.1
Prometheus configuration file
remoteWrite:
- url: https://aps-workspaces.<region>.amazonaws.com/<url>
  sigv4:
    region: <REGION>
    roleArn: arn:aws:iam::<ACCOUNT>:role/<ROLE>
Alertmanager version
No response
Alertmanager configuration file
No response
Logs
ts=2024-10-17T19:40:40.208Z caller=dedupe.go:112 component=remote level=error remote_name=e94106 url=https://aps-workspaces.<region>.amazonaws.com/<url> msg="non-recoverable error while sending metadata" count=914 err="server returned HTTP status 403 Forbidden: {\"message\":\"The security token included in the request is expired\"}"