feat(quarantine): add configurable warning/critical thresholds to flaky tests report #4912

dhiller wants to merge 2 commits into
Conversation
…ky tests report

Add visual highlighting for flaky test impacts in the most-flaky-tests report with configurable thresholds via CLI flags:

- `--three-day-warning-threshold` (default 10%)
- `--three-day-critical-threshold` (default 20%)
- `--fourteen-day-warning-threshold` (default 3%)
- `--fourteen-day-critical-threshold` (default 5%)

Warning impacts are shown with a yellow background, critical with red. This addresses the issue where 3-day impacts below 20% were hidden from the report, making it hard to spot recently degraded tests.

Closes kubevirt#4906

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
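The warning/critical coloring described above can be sketched as a small Go helper. The type and function names here are illustrative stand-ins, not the PR's actual code:

```go
package main

import "fmt"

// Thresholds holds the warning/critical pair for one time range.
// Values are percentages, e.g. {10, 20} for the 3-day range.
type Thresholds struct {
	Warning  float64
	Critical float64
}

// Severity classifies an impact percentage against a threshold pair,
// returning "critical", "warning", or "" (no highlighting).
func Severity(impactPercent float64, t Thresholds) string {
	switch {
	case impactPercent >= t.Critical:
		return "critical"
	case impactPercent >= t.Warning:
		return "warning"
	default:
		return ""
	}
}

func main() {
	threeDay := Thresholds{Warning: 10, Critical: 20}
	fmt.Println(Severity(25, threeDay)) // critical → red background
	fmt.Println(Severity(12, threeDay)) // warning → yellow background
	fmt.Println(Severity(5, threeDay) == "") // no highlight → true
}
```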
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
FYI @cuiyingd |
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Daniel Hiller <dhiller@redhat.com>
Keywords which can automatically close issues and hashtag (#) mentions are not allowed in commit messages. The list of commits with invalid commit messages:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Hey - I've found 1 issue, and left some high level feedback:

- The thresholds map uses a bare `[2]float64` index in `aggregateMostFlakyTestsBySIG` without an `ok` check, which will panic if a new `TimeRange` is ever added without a corresponding entry; consider either validating the map up-front or handling the missing-key case explicitly.
- Using a plain `[2]float64` for thresholds makes the code harder to read and error-prone (indexes 0/1 are not self-describing); consider introducing a small struct (e.g. `{ Warning, Critical float64 }`) to make the thresholds usage clearer in both the map and the template helpers.
## Individual Comments
### Comment 1
<location path="pkg/searchci/searchci.go" line_range="174-179" />
<code_context>
+
+// ScrapeImpactsWithMinPercent works like ScrapeImpacts but uses a configurable minimum percentage threshold
+// instead of the default thresholds for each time range.
+func ScrapeImpactsWithMinPercent(testNameSubstring string, timeRange TimeRange, minPercent float64) ([]Impact, error) {
+ scrapeResultURL := NewScrapeURL(testNameSubstring, timeRange)
+ logger.Debugf("scraping search.ci for test %q with URL %q", testNameSubstring, scrapeResultURL)
+ resp, err := http.Get(scrapeResultURL)
+ if err != nil {
+ return nil, fmt.Errorf("failed to get search.ci results from %s: %w", scrapeResultURL, err)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider checking HTTP status codes before reading the body
Non-2xx responses (e.g. 404, 500, 503) will currently be treated as successful and passed into `ScrapeImpact`, which can lead to misleading parsing errors or empty results. Consider checking `resp.StatusCode` and returning a clear error for non-2xx responses so callers can distinguish genuine scrape failures from legitimate “no impacts” outcomes.
```suggestion
resp, err := http.Get(scrapeResultURL)
if err != nil {
return nil, fmt.Errorf("failed to get search.ci results from %s: %w", scrapeResultURL, err)
}
defer resp.Body.Close()
if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
return nil, fmt.Errorf("unexpected status code %d from search.ci at %s", resp.StatusCode, scrapeResultURL)
}
body, err := io.ReadAll(resp.Body)
```
</issue_to_address>
Summary
Flags
- `--three-day-warning-threshold`
- `--three-day-critical-threshold`
- `--fourteen-day-warning-threshold`
- `--fourteen-day-critical-threshold`

Critical thresholds align with quarantine criteria.
Fixes #4906
🤖 Generated with Claude Code