Releases: stackhpc/ansible-slurm-appliance

v1.139

18 Jan 09:45
2d78fc9

What's Changed

  • Pull containers by @sjpb in #351:
    - Container images pulled before service start and during fat image build.
    - Fixes an issue where podman commands failed after reboot.
    See PR for full details.
  • Use most-recent image in skeleton terraform if multiple found by @sjpb in #350

Full Changelog: v1.138...v1.139

Deployment notes

No changes to galaxy roles/collections.

Image Details

New fat image openhpc-240116-1156-aa8dba7d, requiring 12GB root disk.
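
The image names in these notes appear to embed a build timestamp, which gives one way to tell which of several images is newest. The sketch below assumes the layout `openhpc-<YYMMDD>-<HHMM>-<short commit hash>`, inferred only from the names listed here; the helper names `build_time` and `most_recent` are hypothetical. It is not how the skeleton terraform selects an image (see #350 for that).

```python
from datetime import datetime


def build_time(image_name: str) -> datetime:
    """Parse the build timestamp embedded in an image name.

    Assumption: names look like 'openhpc-240116-1156-aa8dba7d', i.e.
    '<prefix>-<YYMMDD>-<HHMM>-<short commit hash>'.
    """
    _prefix, date_part, time_part, _commit = image_name.split("-")
    return datetime.strptime(date_part + time_part, "%y%m%d%H%M")


def most_recent(image_names: list[str]) -> str:
    """Return the name with the latest embedded build timestamp."""
    return max(image_names, key=build_time)


# Image names taken from the release notes on this page.
images = [
    "openhpc-231206-1648-9d6aa4e4",
    "openhpc-240116-1156-aa8dba7d",
    "openhpc-240102-1025-e533fd70",
]
print(most_recent(images))
```

Sorting on the parsed timestamp rather than the raw string keeps the comparison correct even if the prefix or hash differs between images.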

CI changes

  • FIP now used for build on Arcus to avoid docker.io rate limits.

v1.138

02 Jan 13:43
bfa719f

What's Changed

  • Don't ignore image changes in skeleton terraform lifecycle by @sjpb in #334
  • Cope with stale NFS file handles by @sjpb in #332
  • Update fatimage base to RL8.9 with robust volume mounts by @sjpb in #341
  • Remove cve-2023-41914 role by @sjpb in #337
  • Avoid prompting user to accept hostkey in OOD shell by @sjpb in #331
  • Fix removal of packer bundled ansible plugin by @sjpb in #346

Full Changelog: v1.137...v1.138

Deployment notes

The stackhpc.nfs role has changed. To update this, run:

dev/setup-env.sh

Image Details

New image openhpc-240102-1025-e533fd70, requiring 10GB root disk.

v1.137

07 Dec 14:07
6b0cc36

What's Changed

  • Fat image update by @sjpb in #340:
    • Updated fat image (now RockyLinux 8.9 with OpenHPC v2.7)
    • Updates Open OnDemand to v3.0.3
    • Fixes regression where basic_users functionality ran in fat image build (note this did not affect any released images)

Deployment Notes

No changes to galaxy roles/collections.

CI changes

None

Image Details

New image openhpc-231206-1648-9d6aa4e4, requiring 10GB root disk.

Full Changelog: v1.136...v1.137

v1.136

01 Dec 17:25
1b453af

What's Changed

  • Fix testuser password in CI image build by @sjpb in #335
  • Fix #73: Fails late if no secrets defined by @sjpb in #329
  • Use new TurboVNC repofile by @sjpb in #339. NB: the TurboVNC repofile has moved, so this PR is required to deploy the appliance as of this date.

Full Changelog: v1.135...v1.136

Deployment notes

No changes, no galaxy updates required.

Image Details

No new image provided at this time

v1.135

24 Nov 12:22
ea3155a

What's Changed

  • Merge caas slurm appliance into slurm appliance by @sjpb in #325

Full Changelog: v1.134...v1.135

This should not affect existing deployments.

v1.134

10 Nov 14:13
6f31af4

What's Changed

  • Updates to OpenHPC role and source image by @sjpb in #324
  • Development quality-of-life improvements by @sjpb in #316
  • Add support for freeipa clients by @sjpb in #241

Full Changelog: v1.133...v1.134

Deployment notes

The stackhpc.openhpc role has changed. To update this, run:

dev/setup-env.sh

Image Details

New image openhpc-231027-0916-893570de

v1.133

24 Oct 09:24
f03c89f

What's Changed

CI changes

  • Use local container image registry for CI to avoid docker.io ratelimits by @sjpb in #318

Deployment notes

The osc.ood role has changed. To update this, run:

ansible-galaxy role install --force -r requirements.yml -p ansible/roles

Image Details

New image openhpc-231020-1357-b5d8b056, requiring 10GB root disk.

Full Changelog: v1.132...v1.133

v1.132

26 Sep 10:05
7eb855e

What's Changed

  • Fix ssh ControlPath in skeleton by @sjpb in #297
  • Fix issues when using GenericCloud image by @sjpb in #313

Full Changelog: v1.131...v1.132

CI changes

  • "Fat" image build can now be done either on Arcus (using volume-backed instances, 10GB virtual disk) or SMS-labs (using non-volume-backed instances, 12GB virtual disk)

Deployment notes

No galaxy-installed roles/collections have changed.

Image Details

Built a new image openhpc-230922-0940-434e190f

Now only requires a 10GB root disk.

v1.131

06 Sep 14:20
b13b98d

What's Changed

New features

  • Support for CUDA by @sjpb in #253 and #283 (see #253 for full details and configuration)

Fixes and Enhancements

  • Make etc_hosts role more flexible by @sjpb in #277
  • Update prometheus-slurm-exporter version by @m-bull in #280
  • Install out of tree openstack builder plugin by @m-bull in #285
  • Remove warn parameter for ansible>=2.14 by @mkjpryor in #286
  • Fix opensearch grafana plugin at last working version by @sjpb in #292
  • Fix query type in the Slurm jobs Grafana dashboard by @mkarpiarz in #293
  • Use Python3.9 for jupyter notebook server by @sjpb in #294
  • Pin Terraform in CI to MPL licenced version by @sjpb in #302
  • Update opensearch to 2.9.0 by @sjpb in #299

CI changes

  • Make CI cloud selectable between SMSlabs and Arcus by @sjpb in #288
  • Disable EESSI tests in CI and make them debuggable by @sjpb in #295
  • Fix SMS ssh by @sjpb in #296
  • Use portal-internal network (with normal-mode ports) for Arcus CI by @sjpb in #306

Deployment notes

Galaxy role/collection versions have changed, so after merging use ansible-galaxy {role,collection} install -f ... to force-update them.

Image Details

Full Changelog: v1.130...v1.131

v1.130

12 May 14:43
999cfc8

What's Changed

New functionality/roles/groups

Changes to Packer build functionality

  • Allow Packer base images to be specified by either UUID or name by @m-bull in #266
  • Support attaching a floating IP to the fatimage builder instance by @m-bull in #267
  • Support using volume-backed instances for building and selecting the output image format by @m-bull in #269
  • Allow specifying the packer manifest output path by @m-bull in #268
  • Allow use of ephemeral SSH keys when building Packer images by @m-bull in #274

Other changes

  • Support changing the podman user's uid by @sjpb in #264
  • Fix to proxy role: now defaults to including localhost in no_proxy by @sjpb in #270
  • Add debug logging options for opensearch & filebeat by @sjpb in #271
  • The UCX device to use for hpctests can now be defined per partition by @sjpb in #275
  • Always delete resources on deploy failure in CI by @sjpb in #272

Full Changelog: v1.129...v1.130

Deployment notes

  • No galaxy reinstalls required since last release.

Image details