Test upgrade from latest release to current branch image in CI #576

Draft
wants to merge 13 commits into
base: main

sjpb
Collaborator

@sjpb sjpb commented Feb 13, 2025

No description provided.

@sjpb sjpb force-pushed the ci/test-compute-init branch 6 times, most recently from cace350 to bf1ceed Compare February 13, 2025 14:36
@sjpb sjpb changed the title WIP: Use latest release for initial CI cluster setup Test upgrade from latest release to current branch image in CI Feb 13, 2025
@sjpb sjpb force-pushed the ci/test-compute-init branch from 63c35b7 to d504363 Compare February 14, 2025 09:34
Collaborator Author

sjpb commented Feb 14, 2025

First CI run above: RL8 worked, RL9 didn't. Both compute nodes got rebooted (not rebuilt, as there was no image change), then they froze up. Could ping them but couldn't ssh in. Rescued -1, set a root password, unrescued, and then it started working (!), but it then got OOM-killed on HPL.

Just trying the RL9 one again to see if we got unlucky with the cloud.
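
As an aside on the "rebooted (not rebuilt - as no image change)" behaviour described above, here is a minimal sketch of that decision. The function name and the image identifiers are hypothetical and are not taken from the appliance's code; it only assumes that a node is rebuilt when its target image differs from its current one, and rebooted otherwise.

```python
def action_after_update(current_image_id: str, target_image_id: str) -> str:
    """Pick the action for a compute node (hypothetical helper).

    Assumption: a changed image means the node must be rebuilt onto the
    new image; an unchanged image means a plain reboot is enough.
    """
    if current_image_id != target_image_id:
        return "rebuild"
    return "reboot"


if __name__ == "__main__":
    # Same image before and after: reboot only.
    print(action_after_update("image-a", "image-a"))
    # Image changed: rebuild.
    print(action_after_update("image-a", "image-b"))
```

Under this assumption, a CI run where the image reference is unchanged only exercises the reboot path, so the rebuild path is not tested until the image changes.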


sjpb commented Feb 14, 2025

Ok it passed the 2nd time!

@sjpb sjpb force-pushed the ci/test-compute-init branch from 1355b5f to 0ac9de5 Compare February 14, 2025 16:45

sjpb commented Feb 14, 2025

Cancelled tests after rebase, building image: https://github.com/stackhpc/ansible-slurm-appliance/actions/runs/13333602095

NB: once bumped, this should trigger a rebuild rather than reimage
