Description
The current GPU Docker image pluja/whishper:latest-gpu uses CUDA 11.8, which is incompatible with newer NVIDIA GPUs from the RTX 50-series (e.g. RTX 5090).
When attempting to use GPU acceleration, transcription fails with the following error:
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_NOT_SUPPORTED
Environment
- GPU: NVIDIA GeForce RTX 5090
- Driver Version: 580.82.07
- Host CUDA Version: 13.0
- Docker Image: pluja/whishper:latest-gpu
- Image CUDA Version: 11.8 (PyTorch: torch==2.4.1+cu118)
Analysis
This appears to be caused by the image’s CUDA version: CUDA 11.8 predates the RTX 50-series (Blackwell, compute capability 12.0), so the libraries bundled in the image do not support these GPUs.
Other containers using CUDA 12.6 on the same host work correctly, confirming this issue is image-specific rather than a host configuration problem.
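The mismatch described above can be checked from inside a container. The following is a minimal sketch, assuming PyTorch is installed in the image: it compares the GPU's compute capability with the architectures the PyTorch build was compiled for. The helper names `arch_supported` and `describe_local_gpu` are hypothetical, and the check covers only the PyTorch build; faster-whisper runs on CTranslate2, which carries its own compiled architecture list, so treat the result as an indicator rather than proof.

```python
def arch_supported(capability, arch_list):
    """Return True if a build with `arch_list` can run on a GPU with `capability`.

    capability: (major, minor) tuple, e.g. (12, 0) for an RTX 5090.
    arch_list:  entries such as "sm_86" (binary kernels) or "compute_90" (PTX).
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip architecture-specific variants such as "sm_90a"
        built_major, built_minor = divmod(int(digits), 10)
        # Binary kernels run only on the same major version, same or newer minor.
        if kind == "sm" and built_major == major and built_minor <= minor:
            return True
        # PTX can be JIT-compiled by the driver for any same-or-newer GPU.
        if kind == "compute" and (built_major, built_minor) <= (major, minor):
            return True
    return False


def describe_local_gpu():
    """Summarise GPU 0 and whether the installed PyTorch build targets it."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch cannot see a CUDA device"
    cap = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    verdict = "supported" if arch_supported(cap, archs) else "NOT supported"
    return (
        f"{torch.cuda.get_device_name(0)}: sm_{cap[0]}{cap[1]} is {verdict} "
        f"by the CUDA {torch.version.cuda} build (archs: {archs})"
    )
```

Running `print(describe_local_gpu())` in both the whishper image and one of the working CUDA 12.6 containers should show whether the two builds differ in the architectures they target.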
Suggested solution
To restore GPU compatibility and performance:
- Update the base image from nvidia/cuda:11.8 to nvidia/cuda:12.6.0-runtime-ubuntu22.04 (or newer if possible).
- Upgrade PyTorch to a CUDA 12.x compatible version (e.g. torch>=2.5.0).
- Verify faster-whisper compatibility with the updated CUDA stack.
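For the last step, a short smoke test may be enough to confirm that faster-whisper works on the rebuilt image. The sketch below assumes faster-whisper is installed and that the `tiny` model can be downloaded. The function name `gpu_smoke_test` and the `model_factory` parameter are hypothetical, added so the logic can be exercised without a GPU.

```python
def gpu_smoke_test(audio_path, model_factory=None):
    """Transcribe one file on the GPU.

    Returns a (status, detail) tuple where status is "ok", "failed" or "skipped".
    """
    if model_factory is None:
        try:
            from faster_whisper import WhisperModel
        except ImportError:
            return ("skipped", "faster-whisper is not installed")

        def model_factory():
            # "tiny" keeps the download small; float16 exercises the GPU kernels.
            return WhisperModel("tiny", device="cuda", compute_type="float16")

    try:
        model = model_factory()
        segments, _info = model.transcribe(audio_path)
        # `segments` is lazy: the transcription only runs while iterating,
        # so it must be consumed inside the try block.
        text = " ".join(segment.text.strip() for segment in segments)
    except (RuntimeError, ValueError) as exc:
        # The cuBLAS error quoted in this issue surfaces as a RuntimeError.
        return ("failed", str(exc))
    return ("ok", text)
```

Calling `gpu_smoke_test("sample.wav")` inside the container (the file name is a placeholder) should return `"failed"` along with the cuBLAS message on the current image, and `"ok"` once the CUDA stack supports the GPU.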
Restoring GPU acceleration would significantly improve transcription performance, especially on high-end GPUs like the RTX 5090.