Skip to content

Ruler reports an error EOF when sending alert to alertmanager #4958

Open
@humblebundledore

Description

@humblebundledore

Describe the bug
Cortex ruler logs are showing an error EOF when posting alert to Cortex alertmanager.

level=error caller=notifier.go:527 user=tenant-one alertmanager=http://cortex-alertmanager.cortex.svc.cluster.local:8080/api/prom/alertmanager/api/v1/alerts count=1 msg="Error sending alert" err="Post \"http://cortex-alertmanager.cortex.svc.cluster.local:8080/api/prom/alertmanager/api/v1/alerts\": EOF"

notifier.go is Prometheus code and could miss a req.Close = true as pointed out here.

Bug report exist in Prometheus repo :

The number of file descriptor used by alertmanager process is < default limit of alertmanager file descriptor in my case.
This bug does not seems to be tight to a specific alert and happen randomly.

To Reproduce
Steps to reproduce the behavior:

  1. setup ruler / alertmanager
  2. send alerts from ruler to alertmanager up to get EOF

Expected behavior
alertmanager should receive all POST alerts correctly

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: Helm Cortex (1.6.0), Cortex (v1.13.0)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions