Official Support for NVIDIA GPU Acceleration in Bacalhau's Docker-in-Docker (dind) Image #4851
Labels
request/new
type/enhancement
The problem:
Bacalhau doesn't support GPU workloads when deployed using the bacalhau-dind image.
Description:
Currently, Bacalhau's official dind image is based on docker:dind, which uses Alpine Linux. This limits its compatibility with the NVIDIA Container Toolkit (nvidia-container-toolkit), which does not officially support Alpine. Users who require GPU acceleration must manually modify the Bacalhau image or switch to a different base image, which leads to compatibility issues, missing dependencies (dockerd-entrypoint.sh), and increased complexity.
This feature request proposes adding official support for NVIDIA GPUs in Bacalhau's dind image by:
Providing a variant of the Bacalhau dind image built on an Ubuntu/Debian base (possibly https://github.com/prasad89/dind-ubuntu-nvidia, which already includes the necessary NVIDIA toolkit).
Ensuring the image includes NVIDIA Container Toolkit for GPU passthrough.
Retaining Bacalhau’s existing functionalities while making it GPU-capable.
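As a rough illustration of how the second point could be verified, the sketch below checks whether a Docker daemon lists an NVIDIA runtime. It assumes the runtime is registered under the name "nvidia" (the name the toolkit's Docker configuration normally uses) and that `docker info --format '{{json .Runtimes}}'` is available inside the image. The helper names `has_nvidia_runtime` and `query_docker_runtimes` are hypothetical and not part of Bacalhau.

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json: str) -> bool:
    """Return True if the JSON object of Docker runtimes has an "nvidia" entry.

    Assumption: the NVIDIA Container Toolkit registers its runtime under
    the key "nvidia"; adjust the key if the image uses another name.
    """
    runtimes = json.loads(runtimes_json)
    return isinstance(runtimes, dict) and "nvidia" in runtimes


def query_docker_runtimes() -> str:
    """Ask the local Docker daemon for its registered runtimes as JSON."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
        check=True,  # raise if the docker CLI reports an error
    )
    return result.stdout.strip()
```

Inside a running dind container, `has_nvidia_runtime(query_docker_runtimes())` would be expected to return False on the current Alpine-based image and True on a GPU-capable variant, though this only confirms that the runtime is registered, not that a GPU is actually reachable.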
Why This is Needed:
Lack of GPU Support in Alpine-based dind
The NVIDIA Container Toolkit does not officially support the Alpine-based docker:dind image that Bacalhau currently uses, which makes it difficult to run Bacalhau workloads on GPUs.
Workarounds Are Complex & Unstable
Users must manually modify Bacalhau’s Dockerfile to base it on an Ubuntu/Debian dind image, copy missing dependencies (dockerd-entrypoint.sh), and install NVIDIA-related packages. This is prone to failure and requires ongoing maintenance.
Expanding Bacalhau’s Use Cases
Official GPU support in dind would allow Bacalhau to be used in machine learning, AI training, and other GPU-intensive workloads more effectively.