
feat(quarantine): add configurable warning/critical thresholds to flaky tests report#4912

Open
dhiller wants to merge 2 commits into kubevirt:main from dhiller:feat/add-warning-levels-for-report

Conversation

@dhiller
Contributor

dhiller commented Apr 1, 2026

Summary

  • Adds visual highlighting (warning/critical) for flaky test impacts in the most-flaky-tests report
  • Configurable thresholds via CLI flags for both 3-day and 14-day time ranges
  • Warning (yellow/🟡) and critical (red/🔴) styling based on failure percentage relative to quarantine criteria

Flags

| Flag | Default | Description |
| --- | --- | --- |
| `--three-day-warning-threshold` | 10% | Minimum 3-day failure % for warning |
| `--three-day-critical-threshold` | 20% | Minimum 3-day failure % for critical |
| `--fourteen-day-warning-threshold` | 3% | Minimum 14-day failure % for warning |
| `--fourteen-day-critical-threshold` | 5% | Minimum 14-day failure % for critical |

Critical thresholds align with quarantine criteria.
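The threshold logic described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `classify` and `Severity` are hypothetical names, and the numeric values are the documented 3-day defaults.

```go
package main

import "fmt"

// Severity buckets an impact for report styling, mirroring the
// warning (yellow) / critical (red) highlighting described in the PR.
type Severity int

const (
	None Severity = iota
	Warning
	Critical
)

// classify maps a failure percentage to a severity given the
// configured warning and critical thresholds.
func classify(failurePercent, warningThreshold, criticalThreshold float64) Severity {
	switch {
	case failurePercent >= criticalThreshold:
		return Critical
	case failurePercent >= warningThreshold:
		return Warning
	default:
		return None
	}
}

func main() {
	// 3-day defaults from the PR: warning at 10%, critical at 20%.
	fmt.Println(classify(12.5, 10, 20)) // 1 (Warning)
	fmt.Println(classify(25.0, 10, 20)) // 2 (Critical)
	fmt.Println(classify(5.0, 10, 20))  // 0 (None)
}
```

A value at or above the critical threshold is styled critical even though it also exceeds the warning threshold, so the two bands never overlap.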

Fixes #4906

🤖 Generated with Claude Code

feat(quarantine): add configurable warning/critical thresholds to flaky tests report

Add visual highlighting for flaky test impacts in the most-flaky-tests
report with configurable thresholds via CLI flags:

- --three-day-warning-threshold (default 10%)
- --three-day-critical-threshold (default 20%)
- --fourteen-day-warning-threshold (default 3%)
- --fourteen-day-critical-threshold (default 5%)

Warning impacts are shown with yellow background, critical with red.
This addresses the issue where 3-day impacts below 20% were hidden
from the report, making it hard to spot recently degraded tests.

Closes kubevirt#4906

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
@kubevirt-bot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot added the do-not-merge/work-in-progress, do-not-merge/invalid-commit-message, and dco-signoff: yes labels on Apr 1, 2026
@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tiraboschi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dhiller
Contributor Author

dhiller commented Apr 1, 2026

FYI @cuiyingd

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
@kubevirt-bot
Contributor

Keywords which can automatically close issues and hashtag(#) mentions are not allowed in commit messages.

The list of commits with invalid commit messages:

  • 2d27a00 feat(quarantine): add configurable warning/critical thresholds to flaky tests report

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

dhiller marked this pull request as ready for review April 1, 2026 16:22
@kubevirt-bot removed the do-not-merge/work-in-progress label on Apr 1, 2026
@kubevirt-bot requested review from enp0s3 and xpivarc April 1, 2026 16:22

sourcery-ai (bot) left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • The thresholds map uses a bare [2]float64 index in aggregateMostFlakyTestsBySIG without an ok check, which will panic if a new TimeRange is ever added without a corresponding entry; consider either validating the map up-front or handling the missing-key case explicitly.
  • Using a plain [2]float64 for thresholds makes the code harder to read and error-prone (indexes 0/1 are not self-describing); consider introducing a small struct (e.g. { Warning, Critical float64 }) to make the thresholds usage clearer in both the map and the template helpers.

## Individual Comments

### Comment 1
<location path="pkg/searchci/searchci.go" line_range="174-179" />
<code_context>
+
+// ScrapeImpactsWithMinPercent works like ScrapeImpacts but uses a configurable minimum percentage threshold
+// instead of the default thresholds for each time range.
+func ScrapeImpactsWithMinPercent(testNameSubstring string, timeRange TimeRange, minPercent float64) ([]Impact, error) {
+	scrapeResultURL := NewScrapeURL(testNameSubstring, timeRange)
+	logger.Debugf("scraping search.ci for test %q with URL %q", testNameSubstring, scrapeResultURL)
+	resp, err := http.Get(scrapeResultURL)
+	if err != nil {
+		return nil, fmt.Errorf("failed to get search.ci results from %s: %w", scrapeResultURL, err)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider checking HTTP status codes before reading the body

Non-2xx responses (e.g. 404, 500, 503) will currently be treated as successful and passed into `ScrapeImpact`, which can lead to misleading parsing errors or empty results. Consider checking `resp.StatusCode` and returning a clear error for non-2xx responses so callers can distinguish genuine scrape failures from legitimate “no impacts” outcomes.

```suggestion
	resp, err := http.Get(scrapeResultURL)
	if err != nil {
		return nil, fmt.Errorf("failed to get search.ci results from %s: %w", scrapeResultURL, err)
	}
	defer resp.Body.Close()

	if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices {
		return nil, fmt.Errorf("unexpected status code %d from search.ci at %s", resp.StatusCode, scrapeResultURL)
	}

	body, err := io.ReadAll(resp.Body)
```
</issue_to_address>



Labels

  • dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
  • do-not-merge/invalid-commit-message - Indicates that a PR should not merge because it has an invalid commit message.
  • size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Display 3-day > 10% failure rate in most-flaky-tests-report as Warning section

2 participants