Skip to content

Add s390 prow e2e job#4832

Open
Davo911 wants to merge 1 commit into
kubevirt:mainfrom
Davo911:add-s390x-prow-job
Open

Add s390 prow e2e job#4832
Davo911 wants to merge 1 commit into
kubevirt:mainfrom
Davo911:add-s390x-prow-job

Conversation

@Davo911
Copy link
Copy Markdown
Member

@Davo911 Davo911 commented Mar 16, 2026

What this PR does / why we need it:
Adds pull-containerized-data-importer-e2e-s390x, a presubmit job that runs CDI functional e2e tests against an external s390x OpenShift cluster, similar to how kubevirt runs periodic-kubevirt-e2e-test-S390X.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Related:
#4785

Release note:

NONE

@kubevirt-bot
Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot kubevirt-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/M labels Mar 16, 2026
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Mar 17, 2026

/cc: @dhiller @cfilleke

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from d2314e3 to ebec68f Compare March 17, 2026 14:33
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Mar 19, 2026

So I created a single ceph pull job now for s390x.
Would this be sufficient or do we need to test across the board (except destructive)?

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from ebec68f to b30fb05 Compare March 24, 2026 12:51
@Davo911 Davo911 marked this pull request as ready for review March 24, 2026 12:51
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2026
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch 3 times, most recently from 603ff09 to dfda279 Compare March 24, 2026 17:48
@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from dfda279 to a034f10 Compare March 25, 2026 01:27
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Mar 26, 2026

Is there a mechanism of test-running this job before merging?

@dollierp
Copy link
Copy Markdown
Contributor

dollierp commented Mar 26, 2026

Is there a mechanism of test-running this job before merging?

/rehearse pull-containerized-data-importer-e2e-s390x

@kubevirt-bot
Copy link
Copy Markdown
Contributor

Rehearsal jobs created for this PR:

rehearsal-pull-containerized-data-importer-e2e-s390x
Further information on rehearsals

A rehearsal can be triggered for all jobs by commenting either /rehearse or /rehearse all on this PR.

A rehearsal for a specific job can be triggered by commenting /rehearse {job-name}.

Commenting /rehearse ? triggers a comment with a list of jobs that can be rehearsed.

A pull request can be rehearsed if either the user is authorized to rehearse or the pull
request has the ok-to-rehearse label.

Authorized users are the group of users that are members of the KubeVirt GitHub
organization AND either are approvers[1] for all files in the pull request or are
top-level approvers[1] in the project-infra project.

[1]: see OWNERS file definition for reference.

@kubevirt-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from dhiller. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@awels
Copy link
Copy Markdown
Member

awels commented Apr 23, 2026

I see, thanks for the PR, I though line 77 would fix it, but we overrode the value passed in.

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 24, 2026

kubevirt/containerized-data-importer#4100 got merged
Could we give it another try @dollierp

@awels
Copy link
Copy Markdown
Member

awels commented Apr 27, 2026

it looks like it is not using the random cdi namespace anymore, but something else is still wrong.

@dollierp
Copy link
Copy Markdown
Contributor

it looks like it is not using the random cdi namespace anymore, but something else is still wrong.

It sees the name resolution issue for the cluster fqdn is still present.

Also the service account prow-workloads-cluster-automation needs to be granted more permissions because the jobs is using the kubevirtci external provider.

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 27, 2026

Added the permissions missing, thanks @dollierp , but I'm unsure if the connection failure to the cdi-uploadproxy-cdi.apps.odf.cdi.ci is a cluster issue or a prow connection issue...

In my local enviroment a simple hosts entry fixed this error

@dollierp
Copy link
Copy Markdown
Contributor

I'm unsure if the connection failure to the cdi-uploadproxy-cdi.apps.odf.cdi.ci is a cluster issue or a prow connection issue...

IIUC, tests are using this URL because it is not overridden for external clusters (https://github.com/kubevirt/containerized-data-importer/blob/main/cluster-sync/sync.sh#L322-L328).

Usually OpenShift clusters have their *.apps."${cluster_name}"."${cluster_domain}" publicly resolvable.

If that's not the case for your cluster I guess you can simply modify the Prow job to add this record to the /etc/hosts file inside the Prow pod before running the automation/test.sh script.

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from c965861 to 333721d Compare April 27, 2026 18:28
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 27, 2026

Well I added that, let's try another run please @dollierp

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 27, 2026

There seemed to be an issue setting up cdi, the cluster looks good though, there was one stuck ressource i cleaned up

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 28, 2026

I successfully kicked off a few runs manually, I'd give it another go, before considering to reinstall the whole cluster..

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from 333721d to 0a0d820 Compare April 28, 2026 22:13
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 28, 2026

Ah found the problem and was able to replicate it.

My manual tests deployed latest, while the Prow job's upgrade test tries to install v1.65.0 first — which is a downgrade the operator seems to rejects (Reconciling to error state, illegal phase).
Properly cleaning up fixed it for me locally, so I added that step to the job..

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 29, 2026

ERROR: Post "https://cdi-uploadproxy-cdi.apps.odf.cdi.ci/v1beta1/upload": dial tcp 172.16.41.35:443: connect: connection timed out

at least it finds the host now, but times out connecting..

@dollierp
Copy link
Copy Markdown
Contributor

ERROR: Post "https://cdi-uploadproxy-cdi.apps.odf.cdi.ci/v1beta1/upload": dial tcp 172.16.41.35:443: connect: connection timed out

at least it finds the host now, but times out connecting..

The ginkgo test program is executed in a Pod on the KubeVirt Prow Control Plane cluster.

The IP address 172.16.41.35 is private and can't be reached from this cluster.

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 29, 2026

Yes, I just talked t an admin about this.
For one thats a internal IP, I'll adjust it to an external, but also there is a port mapping happening, so trying to reach https://cdi-uploadproxy-cdi.apps.<cluster>:443 won't work, we have to speak to a different port.

I'm adjusting this with iptables, but in future it would be more elegant to be able to set the uploadproxURL also for external clusters IMO

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from 0a0d820 to 702d086 Compare April 29, 2026 11:51
@awels
Copy link
Copy Markdown
Member

awels commented Apr 29, 2026

I'm adjusting this with iptables, but in future it would be more elegant to be able to set the uploadproxURL also for external clusters IMO

You can set the uploadProxyURL In the CDI CR.

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented Apr 29, 2026

@awels I guess through
kubectl patch cdi ${CR_NAME} --type=merge -p '{"spec": {"config": {"uploadProxyURLOverride": <url>}}}'
But how would this be done when using the /automation/test.sh flow?

@awels
Copy link
Copy Markdown
Member

awels commented Apr 29, 2026

You already found the code that sets it for internal providers in sync.sh. We should allow setting an ENV variable to override that. Then in sync.sh see if that variable exists, and set it.

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from 702d086 to 4078c56 Compare April 30, 2026 10:38
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented May 5, 2026

Fix is merged and parameter in place, could we give it another go? @dollierp

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented May 5, 2026

Still seeing

  ERROR: Post "https://cdi-uploadproxy-cdi.apps.odf.cdi.ci/v1beta1/upload-async": dial tcp 195.212.74.206:443: connect: connection timed out

even though we succesfully patched

+ kubectl patch cdi cdi --type=merge -p '{"spec": {"config": {"uploadProxyURLOverride": "https://cdi-uploadproxy-cdi.apps.odf.cdi.ci:6005"}}}'
cdi.cdi.kubevirt.io/cdi patched

@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented May 6, 2026

Ok as far as I understand automation/test.sh auto-sets UPGRADE_FROM=v1.65.0, so the job does a v1.65.0 -> latest upgrade before patching.
The config controller doesn't seem to pick up the override after that upgrade reconciliation.
This is just a guess.

I'm adding a

- name: RANDOM_CR
  value: "true"

just like the amd64 ceph jobs

Could we run this again @dollierp
After this iptables approach would be my only other option

@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from 4078c56 to fe47bc8 Compare May 6, 2026 14:37
@kubevirt-bot
Copy link
Copy Markdown
Contributor

@Davo911: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
rehearsal-pull-containerized-data-importer-e2e-s390x fe47bc8 link false /test pull-containerized-data-importer-e2e-s390x
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Add s390 prow e2e job

Signed-off-by: Thomas-David Griedel griedel911@gmail.com

remove node selector

add hosts entry

Signed-off-by: Thomas-David Griedel <griedel911@gmail.com>
Co-Authored-By: Daniel Hiller <daniel.hiller.1972@googlemail.com>
@Davo911 Davo911 force-pushed the add-s390x-prow-job branch from fe47bc8 to 1381615 Compare May 11, 2026 15:10
@Davo911
Copy link
Copy Markdown
Member Author

Davo911 commented May 11, 2026

I tried to pinpoint where the override was and posted a PR for this, I'd be happy if I'd get a review.
Until then I'd like to try the port mapping approach to see a successful rehearsal job and see if there are any other issues. @dollierp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants