[Bugfix] Fix deepseekv3 grouped topk error #13474

Merged
merged 3 commits into from
Feb 20, 2025

Conversation

@Chen-XiaoBing (Contributor) commented Feb 18, 2025

Fix the grouped top-k computation logic in fused MoE.

There is a slight difference between vLLM's grouped_topk and the official implementation. When the newly introduced bias term (e_score_correction_bias in vLLM) is not None, we should first take the top-2 scores within each group and use their sum to select the top-k groups. You can also check the compute logic in DeepSeek V3's official inference code.

In addition, the masked scores are currently set to 0. If the scores in the unmasked groups are negative, experts from masked groups can then be selected, which leads to incorrect or suboptimal routing in some scenarios. DeepSeek V3's official code resets masked scores to -inf instead.
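
For reference, here is a minimal sketch of the selection logic described above, assuming sigmoid gating scores and a per-expert e_score_correction_bias as in vLLM. The function name, argument names, and shapes are illustrative only, not the actual vLLM or DeepSeek implementation.

```python
import torch

def grouped_topk_sketch(
    scores: torch.Tensor,                   # [num_tokens, num_experts] gating scores
    e_score_correction_bias: torch.Tensor,  # [num_experts]
    num_expert_group: int,
    topk_group: int,
    topk: int,
):
    num_tokens, num_experts = scores.shape

    # The bias-corrected scores are used only for *selection*;
    # the original scores are still used as routing weights.
    scores_for_choice = scores + e_score_correction_bias

    # Rank groups by the sum of each group's top-2 corrected scores,
    # rather than by the single best score per group.
    group_scores = (
        scores_for_choice.view(num_tokens, num_expert_group, -1)
        .topk(2, dim=-1).values
        .sum(dim=-1)
    )  # [num_tokens, num_expert_group]
    group_idx = torch.topk(group_scores, k=topk_group, dim=-1, sorted=False).indices

    # Build an expert-level mask for the selected groups and fill the rest with
    # -inf, so experts in masked groups can never win the final top-k even when
    # all unmasked scores are negative.
    group_mask = torch.zeros_like(group_scores)
    group_mask.scatter_(1, group_idx, 1)
    score_mask = (
        group_mask.unsqueeze(-1)
        .expand(num_tokens, num_expert_group, num_experts // num_expert_group)
        .reshape(num_tokens, -1)
        .bool()
    )
    masked_scores = scores_for_choice.masked_fill(~score_mask, float("-inf"))

    topk_ids = torch.topk(masked_scores, k=topk, dim=-1, sorted=False).indices
    topk_weights = scores.gather(1, topk_ids)
    return topk_weights, topk_ids
```

With a DeepSeek-V3-style routing configuration (e.g. 256 routed experts, 8 groups, 4 groups selected, 8 experts per token), the sketch would be called with num_expert_group=8, topk_group=4, topk=8; the exact values should be taken from the model config.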


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they run only fastcheck CI, which covers a small, essential subset of tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mgoin (Member) commented Feb 18, 2025

Thank you for your contribution!

I ran a quick eval on GSM8K with DeepSeek R1 and saw that this PR seems to regress performance slightly, though the difference is within stderr.

(vllm) ➜  vllm git:(main) lm_eval --model vllm --model_args pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8 --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto
vllm (pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.956|±  |0.0056|
|     |       |strict-match    |     5|exact_match|↑  |0.956|±  |0.0056|

(vllm) ➜  vllm git:(fix-dsv3-grouped-topk) lm_eval --model vllm --model_args pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8 --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto
vllm (pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.9507|±  | 0.006|
|     |       |strict-match    |     5|exact_match|↑  |0.9507|±  | 0.006|

I will try to run harder evals to determine improvement. Would you have an example of bad performance from DeepSeek on main due to this issue?

@simon-mo (Collaborator)

@mgoin, we might need to test the V3 model, as it is trained more for GSM and MMLU.

@mgoin (Member) commented Feb 18, 2025

Here is MMLU for R1, with similar small drops in each category. I'll try to get V3 going on another machine.

(vllm) ➜  vllm git:(main) lm_eval --model vllm --model_args pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,max_model_len=2048,gpu_memory_utilization=0.8 --trust_remote_code --tasks mmlu --batch_size 16
vllm (pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,max_model_len=2048,gpu_memory_utilization=0.8,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.8514|±  |0.0029|
| - humanities     |      2|none  |      |acc   |↑  |0.7845|±  |0.0057|
| - other          |      2|none  |      |acc   |↑  |0.8822|±  |0.0055|
| - social sciences|      2|none  |      |acc   |↑  |0.9227|±  |0.0047|
| - stem           |      2|none  |      |acc   |↑  |0.8513|±  |0.0062|

(vllm) ➜  vllm git:(fix-dsv3-grouped-topk) lm_eval --model vllm --model_args pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,max_model_len=2048,gpu_memory_utilization=0.8 --trust_remote_code --tasks mmlu --batch_size 16
vllm (pretrained=/home/vllm-dev/DeepSeek-R1,tensor_parallel_size=8,max_model_len=2048,gpu_memory_utilization=0.8,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16
|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.8492|±  |0.0029|
| - humanities     |      2|none  |      |acc   |↑  |0.7824|±  |0.0057|
| - other          |      2|none  |      |acc   |↑  |0.8806|±  |0.0055|
| - social sciences|      2|none  |      |acc   |↑  |0.9194|±  |0.0048|
| - stem           |      2|none  |      |acc   |↑  |0.8493|±  |0.0062|

@simon-mo simon-mo added the ready ONLY add when PR is ready to merge/full CI is needed label Feb 20, 2025
@simon-mo simon-mo enabled auto-merge (squash) February 20, 2025 06:46
@simon-mo (Collaborator)

I'll release v0.7.3 after this PR is merged.

auto-merge was automatically disabled February 20, 2025 06:54

Head branch was pushed to by a user without write access

@Chen-XiaoBing Chen-XiaoBing force-pushed the fix-dsv3-grouped-topk branch 2 times, most recently from 1dbfe7b to cb53786 Compare February 20, 2025 07:08
@simon-mo simon-mo enabled auto-merge (squash) February 20, 2025 07:09
auto-merge was automatically disabled February 20, 2025 07:14

Head branch was pushed to by a user without write access

@Chen-XiaoBing (Contributor, Author)

@simon-mo Some checks in the CI pipeline have failed. Would you kindly assist with merging the code?

Wei-Lin-Intel added a commit to yangulei/vllm-fork that referenced this pull request Feb 20, 2025
Wei-Lin-Intel added a commit to yangulei/vllm-fork that referenced this pull request Feb 20, 2025
yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request Feb 20, 2025
jikunshang added a commit to jikunshang/vllm that referenced this pull request Feb 20, 2025
Signed-off-by: Kunshang Ji <[email protected]>
@simon-mo simon-mo merged commit ed6e907 into vllm-project:main Feb 20, 2025
41 of 44 checks passed
xuechendi pushed a commit to xuechendi/vllm-fork that referenced this pull request Feb 20, 2025
@Chen-XiaoBing Chen-XiaoBing deleted the fix-dsv3-grouped-topk branch February 20, 2025 16:27
kerthcet pushed a commit to kerthcet/vllm that referenced this pull request Feb 21, 2025