Upgrade test scenario "Fail when there is operator manifest is not applicable" runs for more than 15 minutes #1237

triffer · 2024-08-09T05:16:36Z

Description
The scenario "Fail when there is operator manifest is not applicable" runs into the maximum retry timeout. The cause is that the scenario applies an api-gateway-controller-deployment whose update fails with the following error:

Deployment.apps "api-gateway-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"control-plane":"controller-manager"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

The update is invoked in the function upgradeApiGateway. Its call to s.resourceManager.CreateOrUpdateResourcesGVR(s.k8sClient, manifestCrds...) internally calls the function UpdateResource, which always updates with the default retry settings. Because spec.selector is immutable, no retry can succeed, so the update keeps failing until the retry time is exhausted. This unintended behaviour surfaced when the retry time was recently increased from 5 to 15 minutes.

In general, we should avoid retries in lower-level functions like UpdateResource and retry in the scenario function instead. That said, I don't understand the upgrade scenario or what is tested in this case, since the only step in the scenario doesn't give a meaningful explanation:
Upgrade: API Gateway is upgraded to current branch version with "failing" manifest and should "fail"

I think we should also question whether we need this scenario at all, since we provide an incorrect YAML that is expected to fail on apply.
If we want to keep it, we should fix the retries and improve the scenario and step so that it is easier to understand what is tested.

Expected result
The upgrade integration tests execute in less than 10 minutes.

Actual result
Due to the retries the upgrade integration test runs for ~20 minutes.

Steps to reproduce
Run the upgrade integration tests.

Troubleshooting

triffer added the kind/bug Categorizes issue or PR as related to a bug. label Aug 9, 2024
