Fixes #1276.
Currently, the way we update these node metrics is by removing all the
old ones, then adding back the current values. We do it that way so that
the old values can be cleaned up when there are label changes.
However: if metrics are scraped in between removing the old and adding
the new, we can end up with single-datapoint gaps for one node at a
time.
So to fix this, we should avoid removing the old metrics if and only if
the labels are unchanged -- which we can check just by storing the
previous labels we used.
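
As a rough illustration of that idea (not the actual change in this PR), here is a minimal sketch in Go. Everything named here is hypothetical: `gaugeVec` is a small in-memory stand-in for a labeled gauge collection such as a Prometheus gauge vector, and the `nodeLabels` fields are made-up examples of labels. The point is only the control flow: remember the labels last used for each node, delete the old series only when they differ, and otherwise overwrite the value in place so a scrape never sees the series missing.

```go
package main

import "sync"

// nodeLabels is a hypothetical label set for one node's metric series.
// The field names are assumptions made for this sketch.
type nodeLabels struct {
	Node      string
	NodeGroup string
	AZ        string
}

// gaugeVec is an in-memory stand-in for a labeled gauge collection.
// It is not a real library type; it exists to keep the sketch self-contained.
type gaugeVec struct {
	mu      sync.Mutex
	series  map[nodeLabels]float64
	deletes int // number of series removed, tracked for demonstration only
}

func newGaugeVec() *gaugeVec {
	return &gaugeVec{series: make(map[nodeLabels]float64)}
}

// Set creates the series or overwrites its value in place.
func (g *gaugeVec) Set(l nodeLabels, v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.series[l] = v
}

// Delete removes the series with exactly these labels, if present.
func (g *gaugeVec) Delete(l nodeLabels) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if _, ok := g.series[l]; ok {
		delete(g.series, l)
		g.deletes++
	}
}

// Get returns the current value and whether the series exists.
func (g *gaugeVec) Get(l nodeLabels) (float64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	v, ok := g.series[l]
	return v, ok
}

// nodeMetrics pairs a gauge with the labels last used for each node.
// It assumes a single updater goroutine, so prev is not locked.
type nodeMetrics struct {
	cpu  *gaugeVec
	prev map[string]nodeLabels // keyed by node name
}

func newNodeMetrics() *nodeMetrics {
	return &nodeMetrics{cpu: newGaugeVec(), prev: make(map[string]nodeLabels)}
}

// update removes the old series only if the node's labels changed;
// with unchanged labels the value is overwritten and never disappears.
func (m *nodeMetrics) update(labels nodeLabels, cpu float64) {
	if old, ok := m.prev[labels.Node]; ok && old != labels {
		m.cpu.Delete(old) // stale labels: clean up the old series
	}
	m.cpu.Set(labels, cpu)
	m.prev[labels.Node] = labels
}
```

With this shape, repeated updates for a node whose labels stay the same never call `Delete`, so there is no window in which a scrape could miss the series. A label change still removes the old series just before the new one is set, which matches the cleanup behavior described above.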
Environment
Production
Steps to reproduce
Put the scheduler under load (e.g., sustaining >60 reconcile operations per second).
Expected result
The scheduler plugin's node resource metrics should have no gaps.
Actual result
There are occasional, small gaps in the metrics.
Other logs, links