Hello, great work! I am trying to run Full Training from Start, but I am running out of GPU memory. How much GPU memory is needed for training?
The repository states: "At least 4*A6000 GPUs or 2*A100 GPUs will be enough for the training."
I am training on 2*A100 GPUs, each with 80 GB of memory. However, I still get out-of-memory errors:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.82 GiB (GPU 1; 79.15 GiB total capacity; 71.88 GiB already allocated; 3.40 GiB free; 74.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
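To check whether the fragmentation hint in that message applies here, I pulled the numbers out of the error text. This is only a minimal sketch: `parse_oom` is a hypothetical helper I wrote for this post, not part of PyTorch, and it assumes the message keeps the wording shown above.

```python
import re

# Error text copied from the report above (only the part that carries the numbers).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 3.82 GiB (GPU 1; 79.15 GiB total capacity; "
    "71.88 GiB already allocated; 3.40 GiB free; 74.46 GiB reserved in total by PyTorch)"
)


def parse_oom(message: str) -> dict:
    """Extract the GiB figures from a CUDA OOM message (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {name: float(re.search(p, message).group(1)) for name, p in patterns.items()}


stats = parse_oom(OOM_MESSAGE)

# Memory held by PyTorch's caching allocator that is not backing live tensors.
slack = round(stats["reserved"] - stats["allocated"], 2)

# How far the request exceeds the memory reported as free.
shortfall = round(stats["requested"] - stats["free"], 2)

print(f"reserved - allocated = {slack} GiB")
print(f"requested - free     = {shortfall} GiB")
```

Reserved memory is only about 2.58 GiB above allocated memory, so this does not look like an obvious "reserved >> allocated" case. I may still try `max_split_size_mb`, but I suspect the run simply needs more memory than 2 GPUs provide at the default settings.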