Fix the model parallelism error in DeepSeek-VL2 #91

DeadLining · 2025-02-20T08:34:29Z

Fix Model Parallelism Error: KVCache CUDA Device Switching Issue

The RuntimeError:
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument tensors in method wrapper_CUDA_cat)
error mentioned in #issues8 was caused by KVCache failing to switch CUDA devices correctly during inference.

This change is intended to fix the issue by ensuring proper device switching during inference.

krishnateja95 · 2025-02-24T19:20:22Z

Thanks for the PR. I am able to run across multiple GPUs now.

Fix the model parallelism error in DeepSeek-VL2

8ed2d7f

krishnateja95 mentioned this pull request Feb 26, 2025

can't run python web_demo.py with 8 4090 (24GB) cards #86

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix the model parallelism error in DeepSeek-VL2 #91

Fix the model parallelism error in DeepSeek-VL2 #91

Uh oh!

DeadLining commented Feb 20, 2025

Uh oh!

krishnateja95 commented Feb 24, 2025

Uh oh!

Uh oh!

Fix the model parallelism error in DeepSeek-VL2 #91

Are you sure you want to change the base?

Fix the model parallelism error in DeepSeek-VL2 #91

Uh oh!

Conversation

DeadLining commented Feb 20, 2025

Uh oh!

krishnateja95 commented Feb 24, 2025

Uh oh!

Uh oh!