Description
What steps did you take and what happened?
CAPZ runs the ClusterctlUpgradeSpec to validate upgrades of CAPZ:
cluster-api/test/e2e/clusterctl_upgrade.go, line 193 in 79465fd:
    func ClusterctlUpgradeSpec(ctx context.Context, inputGetter func() ClusterctlUpgradeSpecInput) {
Since the check to verify resource versions was added in #12546, CAPZ's tests have been flaking with a mysterious error at the "Check resourceVersions are stable" step as described in kubernetes-sigs/cluster-api-provider-azure#6058.
I found that the test only fails when its management cluster runs in the australiaeast region, which is much farther from the Prow cluster running our e2e tests than the other regions where we test.
For clusters in that region, the Eventually block that gathers resource versions for the cluster's objects spent its entire 1-minute budget gathering the initial view of the objects, so it never attempted the second look that the test assumes will always happen.
cluster-api/test/framework/resourceversion_helpers.go, lines 41 to 42 in 79465fd:
    Eventually(func(g Gomega) {
        objectsWithResourceVersion, objects, err := getObjectsWithResourceVersion(ctx, proxy, namespace, ownerGraphFilterFunction)
Even with #12848 ported onto CAPI v1.11.5, I still see this flake in CAPZ.
What did you expect to happen?
The test is resilient to higher latency or allows the timeouts to be adjusted.
Cluster API version
v1.11.5
Kubernetes version
No response
Anything else you would like to add?
No response
Label(s) to be applied
/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.