[Feature] Update GPU Docker image to CUDA 12.x for RTX 50-series support #168

@slomus

Description

The current GPU Docker image pluja/whishper:latest-gpu uses CUDA 11.8, which is incompatible with newer NVIDIA GPUs from the RTX 50-series (e.g. RTX 5090).

When attempting to use GPU acceleration, transcription fails with the following error:

RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED

Environment
- GPU: NVIDIA GeForce RTX 5090
- Driver Version: 580.82.07
- Host CUDA Version: 13.0
- Docker Image: pluja/whishper:latest-gpu
- Image CUDA Version: 11.8 (PyTorch: torch==2.4.1+cu118)

Analysis
This appears to be caused by the image’s CUDA version (11.8), which predates the RTX 50-series (Blackwell architecture, compute capability 12.0) and therefore includes no GPU code for it.
Other containers using CUDA 12.6 on the same host work correctly, confirming this issue is image-specific rather than a host configuration problem.
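To make the mismatch easier to confirm from inside the container, here is a minimal diagnostic sketch. It compares the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The helper names `parse_arch` and `classify_support` are hypothetical, and the check only covers PyTorch: the cuBLAS error may well originate in another component of the stack, so treat this as one data point rather than a full diagnosis.

```python
def parse_arch(tag):
    """Split a tag such as 'sm_90' or 'compute_90' into (kind, major, minor)."""
    kind, _, digits = tag.partition("_")
    # Drop any trailing letter suffix (e.g. 'sm_90a') before parsing the number.
    digits = digits.rstrip("abcdefghijklmnopqrstuvwxyz")
    return kind, int(digits[:-1]), int(digits[-1])


def classify_support(capability, arch_list):
    """Return 'native', 'ptx-jit', or 'unsupported' for a device capability.

    'native'  : a compiled binary exists for the same major version and an
                equal or lower minor version.
    'ptx-jit' : no matching binary, but PTX for an older architecture is
                embedded and could be JIT-compiled by the driver.
    """
    major, minor = capability
    has_ptx = False
    for tag in arch_list:
        kind, a_major, a_minor = parse_arch(tag)
        if kind == "sm" and a_major == major and a_minor <= minor:
            return "native"
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            has_ptx = True
    return "ptx-jit" if has_ptx else "unsupported"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("PyTorch CUDA build:", torch.version.cuda)
            print("Device capability:", cap)
            print("Compiled architectures:", archs)
            print("Support:", classify_support(cap, archs))
        else:
            print("CUDA is not available to PyTorch")
```

On an RTX 5090 the reported capability should be `(12, 0)`. If the compiled architecture list contains neither an `sm_12x` entry nor any `compute_*` entry, the result is `unsupported`, which would be consistent with the failure described above.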

Suggested solution
To restore GPU compatibility and performance:

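  1. Update the base image from nvidia/cuda:11.8 to nvidia/cuda:12.6.0-runtime-ubuntu22.04 (or newer if possible; CUDA 12.8 is the first toolkit release with native support for RTX 50-series GPUs).
  2. Upgrade PyTorch to a build compiled against CUDA 12.x (e.g. torch>=2.5.0 with a +cu12x wheel instead of +cu118).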
  3. Verify faster-whisper compatibility with the updated CUDA stack.
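
For step 3, a small smoke test run inside the rebuilt image could look like the sketch below. It assumes faster-whisper and NumPy are installed and that the `tiny` model can be downloaded. The helper `gpu_smoke_test` is a hypothetical name, not part of whishper or faster-whisper. It loads a model on the GPU, transcribes one second of silence, and reports any error instead of crashing.

```python
from typing import Any, Callable, Tuple


def gpu_smoke_test(load_model: Callable[[], Any], audio: Any) -> Tuple[bool, str]:
    """Load a model and transcribe `audio`; report success or the raised error."""
    try:
        model = load_model()
        segments, _info = model.transcribe(audio)
        # faster-whisper yields segments lazily, so iterate to force the GPU work.
        count = sum(1 for _ in segments)
    except (RuntimeError, ValueError) as exc:
        return False, f"GPU inference failed: {exc}"
    return True, f"GPU inference succeeded ({count} segment(s))"


if __name__ == "__main__":
    try:
        import numpy as np
        from faster_whisper import WhisperModel
    except ImportError as exc:
        print(f"Skipping smoke test, missing dependency: {exc}")
    else:
        silence = np.zeros(16000, dtype=np.float32)  # one second at 16 kHz
        ok, message = gpu_smoke_test(
            lambda: WhisperModel("tiny", device="cuda", compute_type="float16"),
            silence,
        )
        print(message)
```

If the updated image is still incompatible with the GPU, the message should contain the same kind of cuBLAS error quoted in this issue. A success message suggests that model loading and at least one inference pass work on the device.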

GPU acceleration would significantly improve transcription performance, especially on high-end GPUs like the RTX 5090.

Labels

enhancement: New feature or request
