Prerequisite
Task
I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
Branch
main branch https://github.com/open-mmlab/mmdetection3d
Environment
sys.platform: linux
Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.7, V11.7.64
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.10.1
PyTorch compiling details: PyTorch built with:
TorchVision: 0.11.2
OpenCV: 4.9.0
MMEngine: 0.10.4
MMDetection: 3.3.0
MMDetection3D: 1.4.0+fe25f7a
spconv2.0: True
Reproduces the problem - code sample
I'm trying to reproduce VoxelNeXt in mmdetection3d. I ported the VoxelNeXt code from pull request 2692 to the main branch of mmdetection3d and fixed the bugs in that code. The code now runs successfully, but I noticed that GPU memory usage is quite unbalanced across the GPUs. Is this behavior normal?
When I train CenterPoint on the same machine, GPU memory usage is balanced across all GPUs.
Do you know what factors could cause this imbalance?
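To quantify the imbalance, a minimal sketch of a per-rank memory logger could look like the following (plain PyTorch only; the helper `log_peak_memory` is hypothetical and not part of PR 2692 or mmdetection3d):

```python
# Hypothetical diagnostic helper (not part of PR 2692): logs this rank's peak
# allocated GPU memory so the imbalance can be compared across GPUs.
import torch
import torch.distributed as dist


def log_peak_memory(prefix: str = "") -> None:
    """Print this rank's peak allocated GPU memory in MiB, then reset the counter."""
    rank = dist.get_rank() if dist.is_initialized() else 0
    peak_mib = torch.cuda.max_memory_allocated() / 1024 ** 2
    print(f"{prefix}rank {rank}: peak GPU memory {peak_mib:.0f} MiB")
    # Reset so the next call reports only the peak of the following interval.
    torch.cuda.reset_peak_memory_stats()
```

Calling this at the end of each training iteration (for example from a custom hook) would show whether the gap between GPUs is constant or driven by a few outlier batches.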
Reproduces the problem - command or script
```shell
./tools/dist_train.sh ./configs/voxelnext/voxelnext_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py 8
```
Reproduces the problem - error message
Additional information
No response