
VRAM requirement for full sft deepseek VL 7B #860

Open
SinanAkkoyun opened this issue May 1, 2024 · 0 comments
SinanAkkoyun commented May 1, 2024

Describe the bug
How much VRAM is needed to fully fine-tune the 7B VL model?

# Experimental Environment: A100
# GPU Memory Requirement: 80GB
# Runtime: 2.5 hours
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model_type qwen1half-7b-chat \
    --dataset blossom-math-zh \
    --num_train_epochs 5 \
    --sft_type full \
    --output_dir output \
    --eval_steps 500 \

The docs say 80 GB is enough for a normal 7B model, but when I try to train DeepSeek VL 7B on our research rig with a single A100, I get an OOM. When I try to split the model across 4 GPUs (one A100 and three 4090s), the A100 is not utilized and the three 4090s OOM before training even starts.
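For reference, a sketch of how the same docs recipe might be adapted for a multi-GPU run (untested on this hardware; the `deepseek-vl-7b-chat` model type and the `coco-mini-en` dataset name are assumptions taken from ms-swift's model/dataset listings, and the split relies on ms-swift sharding the model across all visible GPUs via its default device map):

```shell
# Hedged sketch, not a verified configuration: expose all four GPUs so
# ms-swift can shard the model across them instead of using GPU 0 only.
# Model type and dataset are assumed names; substitute your own dataset.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
    --model_type deepseek-vl-7b-chat \
    --dataset coco-mini-en \
    --num_train_epochs 5 \
    --sft_type full \
    --output_dir output \
    --eval_steps 500
```

Note that with heterogeneous cards (80 GB A100 plus 24 GB 4090s), an even layer split can still OOM on the smaller cards, which may explain the behavior described above.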

@SinanAkkoyun changed the title from "VRAM requirement for full sft deepseek 7B" to "VRAM requirement for full sft deepseek VL 7B" on May 1, 2024