Error when trying to use the latest-gpu container inside a GitHub Actions workflow #1428

Open
daavoo opened this issue Sep 26, 2023 · 1 comment
Labels
bug Something isn't working ci-github p0-critical Max priority (ASAP)

daavoo commented Sep 26, 2023

Workflow is here:

https://github.com/iterative/example-get-started-experiments/blob/main/.github/workflows/dvc-studio.yml

Example failure is here:

https://github.com/iterative/example-get-started-experiments/actions/runs/6310277365/job/17131981606

  Status: Downloaded newer image for iterativeai/cml:latest-gpu
  docker.io/iterativeai/cml:latest-gpu
  /usr/bin/docker create --name d36559e92e4847fcb5d0a04521f541f1_iterativeaicmllatestgpu_f5f4d0 --label 70c3d0 --workdir /__w/example-get-started-experiments/example-get-started-experiments --network github_network_5dc1361cfcd641c69071c53a762bc452 --gpus all --ipc host -e "HOME=/github/home" -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work":"/__w" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/externals":"/__e":ro -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp":"/__w/_temp" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_actions":"/__w/_actions" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_tool":"/__w/_tool" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp/_github_home":"/github/home" -v "/tmp/tmp.cjcLZAuplu/.cml/cml-r5thu1mwal-1c6kgrd3-2k2cbdhp/_work/_temp/_github_workflow":"/github/workflow" --entrypoint "tail" iterativeai/cml:latest-gpu "-f" "/dev/null"
  e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
  /usr/bin/docker start e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
  Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
  nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown
  Error: failed to start containers: e45972c2305532a031a11328f784587dd0ec9b98581fdfab529d350955e6a2ba
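
The last lines of the log suggest that the NVIDIA kernel driver was not loaded on the runner host when Docker tried to start the container with `--gpus all`. Below is a minimal sketch of a pre-flight check that could run on the host before the job container is created. It assumes a Linux host where the loaded NVIDIA module exposes `/proc/driver/nvidia/version`. The function names are hypothetical and are not part of CML or the GitHub runner.

```python
import os

# Marker string copied from the docker stderr in the log above.
DRIVER_NOT_LOADED = "nvml error: driver not loaded"


def is_driver_not_loaded(stderr: str) -> bool:
    """Return True if the docker stderr contains the NVML 'driver not loaded' marker."""
    return DRIVER_NOT_LOADED in stderr


def host_has_nvidia_driver(proc_path: str = "/proc/driver/nvidia/version") -> bool:
    """Heuristic (assumption): this file exists only while the NVIDIA kernel module is loaded."""
    return os.path.exists(proc_path)


if __name__ == "__main__":
    if host_has_nvidia_driver():
        print("NVIDIA driver appears to be loaded on this host")
    else:
        print("NVIDIA driver not detected; '--gpus all' will likely fail")
```

Running this on the runner host before the container step would help tell a missing or not-yet-loaded driver apart from a problem in the `iterativeai/cml:latest-gpu` image itself.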
@daavoo daavoo added bug Something isn't working ci-github labels Sep 26, 2023
@omesser omesser added the p0-critical Max priority (ASAP) label Sep 26, 2023
omesser (Contributor) commented Sep 26, 2023

cc @iterative/cml
