@isabella-janssen (Member) commented Aug 20, 2025

This PR addresses two bugs to improve the stability and run time of the PinnedImageSet (PIS) tests.

OCPBUGS-60656:
This bug relates to instability seen in the "Invalid PIS leads to degraded MCN in a standard Pool" test on the virtualmedia test variant. To stabilize this test, the verification of the PIS degrade now uses a retrying wait to validate that the MachineConfigNode (MCN) conditions are correctly set.
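For readers unfamiliar with the pattern, a retrying wait of this kind can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from this PR: the `condition` type, the `waitForCondition` helper, the fake `fetch` function, and the condition name used in the demo are all assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// condition is a simplified, hypothetical stand-in for a status condition
// reported on a MachineConfigNode (MCN).
type condition struct {
	Type   string
	Status string
}

// hasCondition reports whether conds contains condType with the wanted status.
func hasCondition(conds []condition, condType, status string) bool {
	for _, c := range conds {
		if c.Type == condType && c.Status == status {
			return true
		}
	}
	return false
}

// waitForCondition calls fetch every interval until the wanted condition is
// observed or ctx expires. Fetch errors are treated as transient and retried.
func waitForCondition(ctx context.Context, interval time.Duration,
	fetch func() ([]condition, error), condType, status string) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	var lastErr error
	for {
		conds, err := fetch()
		if err == nil && hasCondition(conds, condType, status) {
			return nil
		}
		if err != nil {
			lastErr = err
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("condition %s=%s not observed: %w (last fetch error: %v)",
				condType, status, ctx.Err(), lastErr)
		case <-ticker.C:
		}
	}
}

// demo simulates an API that first fails, then reports the condition as
// False, and only reports it as True on the third call.
func demo() (int, error) {
	calls := 0
	fetch := func() ([]condition, error) {
		calls++
		switch calls {
		case 1:
			return nil, errors.New("transient API error")
		case 2:
			return []condition{{Type: "PinnedImageSetsDegraded", Status: "False"}}, nil
		default:
			return []condition{{Type: "PinnedImageSetsDegraded", Status: "True"}}, nil
		}
	}
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	err := waitForCondition(ctx, 10*time.Millisecond, fetch, "PinnedImageSetsDegraded", "True")
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println("fetch calls:", calls, "error:", err)
}
```

Compared with a single immediate check, the loop tolerates the delay between applying an invalid PIS and the condition being reported, and the returned error records the last fetch failure for easier debugging.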

OCPBUGS-60883:
This bug relates to flaws in the three PIS tests that use custom MachineConfigPools (MCPs): "All Nodes in a custom Pool should have the PinnedImages even after Garbage Collection", "All Nodes in a Custom Pool should have the PinnedImages in PIS", and "Invalid PIS leads to degraded MCN in a custom Pool". To address the issue, a wait was added after the worker node is labeled to ensure the node has joined the custom MCP before the test proceeds.
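The ordering described here (label the node, wait for the pool to pick it up, then continue) might look something like the sketch below. It is a hedged illustration only: `fakePool`, `waitForNodeInPool`, and the node names are hypothetical, and the real tests talk to the cluster API rather than an in-memory stand-in.

```go
package main

import (
	"fmt"
	"time"
)

// fakePool is a hypothetical in-memory stand-in for a custom MCP whose
// membership lags behind the moment a node is labeled.
type fakePool struct {
	node       string
	readyAfter time.Time
}

// members returns the nodes the pool currently reports.
func (p *fakePool) members() []string {
	if time.Now().Before(p.readyAfter) {
		return nil
	}
	return []string{p.node}
}

// waitForNodeInPool polls listMembers until node appears or timeout elapses.
func waitForNodeInPool(listMembers func() []string, node string,
	interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		for _, m := range listMembers() {
			if m == node {
				return nil
			}
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("node %q did not join the pool within %s", node, timeout)
		}
		time.Sleep(interval)
	}
}

// demo waits for a node that joins after a short delay, and for a node that
// never joins, returning the error from each wait.
func demo() (joined, missing error) {
	pool := &fakePool{node: "worker-0", readyAfter: time.Now().Add(30 * time.Millisecond)}
	joined = waitForNodeInPool(pool.members, "worker-0", 5*time.Millisecond, time.Second)
	missing = waitForNodeInPool(pool.members, "worker-1", 5*time.Millisecond, 20*time.Millisecond)
	return joined, missing
}

func main() {
	joined, missing := demo()
	fmt.Println("worker-0:", joined)
	fmt.Println("worker-1:", missing)
}
```

Failing with a descriptive error when the node never joins the pool makes a setup problem visible at the labeling step instead of surfacing later as an unrelated assertion failure.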

General:
Additional comments and more descriptive error messages were added to make these tests easier to maintain and monitor.

Verifying changes:
I was not able to reproduce the issue for OCPBUGS-60656 locally. However, the changes in this PR should make the failing test more stable. The following payload tests were run with the changes; the PinnedImages tests passed and completed in a more reasonable amount of time than before:

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 20, 2025
openshift-ci bot (Contributor) commented Aug 20, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 20, 2025
@isabella-janssen (Member, Author)

/payload-job periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2

openshift-ci bot (Contributor) commented Aug 20, 2025

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/edd38480-7d60-11f0-8469-01cddc550bca-0

@isabella-janssen (Member, Author)

/payload-aggregate periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2 6

openshift-ci bot (Contributor) commented Aug 20, 2025

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0050bf10-7dc0-11f0-8ae8-fa73f2d591b7-0

@isabella-janssen changed the title from "(WIP) OCPBUGS-60591" to "(WIP) OCPBUGS-60591: OCPBUGS-60883: Fix PinnedImageSet test instability & update flow for tests using custom MCPs" Aug 26, 2025
@openshift-ci-robot added labels Aug 26, 2025:
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60591, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested a review from sergiordlr August 26, 2025 19:36
@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60591, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

@isabella-janssen (Member, Author)

/payload-job periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

openshift-ci bot (Contributor) commented Aug 26, 2025

@isabella-janssen: trigger 6 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b4b2c220-82b9-11f0-9e68-434bb60f536a-0

@isabella-janssen (Member, Author)

/payload-job periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

openshift-ci bot (Contributor) commented Aug 27, 2025

@isabella-janssen: trigger 6 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a6c09f00-834c-11f0-8512-19b9cb7e79c6-0

@isabella-janssen changed the title from "(WIP) OCPBUGS-60591: OCPBUGS-60883: Fix PinnedImageSet test instability & update flow for tests using custom MCPs" to "(WIP) OCPBUGS-60656: OCPBUGS-60883: Fix PinnedImageSet test instability & update flow for tests using custom MCPs" Aug 27, 2025
@openshift-ci-robot changed labels Aug 27, 2025:
  • added jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
  • removed jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • removed jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60656, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@isabella-janssen (Member, Author)

/jira refresh

@openshift-ci-robot changed labels Aug 27, 2025:
  • added jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • removed jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60656, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60656, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@isabella-janssen isabella-janssen marked this pull request as ready for review August 27, 2025 19:48
@openshift-ci openshift-ci bot requested a review from dkhater-redhat August 27, 2025 19:50
@isabella-janssen isabella-janssen force-pushed the ocpbugs-60591 branch 2 times, most recently from 40f738e to c18f1c2 Compare August 27, 2025 19:53
@isabella-janssen (Member, Author)

/payload-job periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

openshift-ci bot (Contributor) commented Aug 27, 2025

@isabella-janssen: trigger 6 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8a4135a0-837f-11f0-8aac-7e01574679ba-0

@isabella-janssen changed the title from "(WIP) OCPBUGS-60656: OCPBUGS-60883: Fix PinnedImageSet test instability & update flow for tests using custom MCPs" to "OCPBUGS-60656: OCPBUGS-60883: Fix PinnedImageSet test instability & update flow for tests using custom MCPs" Aug 28, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 28, 2025
@isabella-janssen (Member, Author)

/payload-job periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2 periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

openshift-ci bot (Contributor) commented Aug 29, 2025

@isabella-janssen: trigger 6 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-1of2
  • periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-serial-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/9a070e20-84eb-11f0-9b76-4159ddbe3187-0

@pablintino (Contributor)

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 29, 2025
openshift-ci bot (Contributor) commented Aug 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: isabella-janssen, pablintino

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

@isabella-janssen: This pull request references Jira Issue OCPBUGS-60656, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

openshift-trt bot commented Aug 29, 2025

Job Failure Risk Analysis for sha: 6e3de92

Job Name Failure Risk
pull-ci-openshift-origin-main-e2e-aws-disruptive High
[bz-Etcd] clusteroperator/etcd should not change condition/Available
This test has passed 99.61% of 1289 runs on release 4.21 [Overall] in the last week.
---
[sig-cli][OCPFeatureGate:UpgradeStatus] oc amd upgrade status never fails
This test has passed 99.76% of 1255 runs on release 4.21 [Overall] in the last week.
---
[sig-node] node-lifecycle detects unexpected not ready node
This test has passed 99.61% of 1289 runs on release 4.21 [Overall] in the last week.

@isabella-janssen (Member, Author)

/label acknowledge-critical-fixes-only

This bug fix is necessary for stabilizing the MCO's component readiness. Additionally, these fixes have been validated in payload rehearsals, so this change should be safe to merge.

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Aug 31, 2025
@isabella-janssen (Member, Author)

/retest-required

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 87d58ca and 2 for PR HEAD 6e3de92 in total

openshift-ci bot (Contributor) commented Sep 1, 2025

@isabella-janssen: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit f90f90a into openshift:main Sep 1, 2025
40 of 47 checks passed
@openshift-ci-robot

@isabella-janssen: Jira Issue OCPBUGS-60656: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-60656 has been moved to the MODIFIED state.

@isabella-janssen isabella-janssen deleted the ocpbugs-60591 branch September 2, 2025 19:49
@isabella-janssen (Member, Author)

/cherrypick release-4.19

@openshift-cherrypick-robot

@isabella-janssen: new pull request created: #30205

Labels
  • acknowledge-critical-fixes-only: Indicates if the issuer of the label is OK with the policy.
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
4 participants