Merged
2 changes: 1 addition & 1 deletion docs/source/Instruction/命令行参数.md
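@@ -356,7 +356,7 @@ Vera uses the three parameters `target_modules`, `target_regex`, and `modules_to_save`,
- Note: In "ms-swift<3.7" this parameter is named `gpu_memory_utilization`. The same applies to the `vllm_` parameters below. If you run into parameter mismatches, see the [ms-swift 3.6 documentation](https://swift.readthedocs.io/zh-cn/v3.6/Instruction/%E5%91%BD%E4%BB%A4%E8%A1%8C%E5%8F%82%E6%95%B0.html#vllm).
- 🔥vllm_tensor_parallel_size: Tensor parallelism (TP) size. Default is `1`.
- vllm_pipeline_parallel_size: Pipeline parallelism (PP) size. Default is `1`.
- vllm_data_parallel_size: Data parallelism (DP) size. Default is `1`; takes effect in the `rollout` command.
- vllm_data_parallel_size: Data parallelism (DP) size. Default is `1`; takes effect in the `swift deploy/rollout` commands.
- In `swift infer`, use `NPROC_PER_NODE` to set the DP size. See the [example](https://github.com/modelscope/ms-swift/blob/main/examples/infer/vllm/mllm_ddp.sh) here.
- vllm_enable_expert_parallel: Enable expert parallelism. Default is False.
- vllm_max_num_seqs: Maximum number of sequences processed in a single iteration. Default is `256`.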
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Command-line-parameters.md
@@ -362,7 +362,7 @@ Parameter meanings can be found in the [vllm documentation](https://docs.vllm.ai
- Note: For ms-swift versions earlier than 3.7, this parameter is named `gpu_memory_utilization`. The same applies to the following `vllm_` parameters. If you encounter parameter mismatch issues, please refer to the [ms-swift 3.6 documentation](https://swift.readthedocs.io/en/v3.6/Instruction/Command-line-parameters.html#vllm-arguments).
- 🔥vllm_tensor_parallel_size: Tensor parallelism size. Default is `1`.
- vllm_pipeline_parallel_size: Pipeline parallelism size. Default is `1`.
- vllm_data_parallel_size: Data parallelism size, default is 1, effective in the infer and rollout commands.
- vllm_data_parallel_size: Data parallelism size, default is `1`, effective in the `swift deploy/rollout` command.
medium

For better clarity and conciseness, consider rephrasing this line. The current wording is a bit verbose.

Suggested change
- vllm_data_parallel_size: Data parallelism size, default is `1`, effective in the `swift deploy/rollout` command.
- vllm_data_parallel_size: Number of data parallelism (DP) replicas. Default is `1`, effective in the `swift deploy/rollout` command.

- In `swift infer`, use `NPROC_PER_NODE` to set the data parallelism (DP) degree. See the example [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/vllm/mllm_ddp.sh).
- vllm_enable_expert_parallel: Enable expert parallelism. Default is False.
- vllm_max_num_seqs: Maximum number of sequences to be processed in a single iteration. Default is `256`.
File renamed without changes.
File renamed without changes.
File renamed without changes.
22 changes: 22 additions & 0 deletions examples/deploy/vllm_dp.sh
@@ -0,0 +1,22 @@
CUDA_VISIBLE_DEVICES=0,1 swift deploy \
    --model Qwen/Qwen2.5-VL-7B-Instruct \
    --infer_backend vllm \
    --served_model_name Qwen2.5-VL-7B-Instruct \
    --vllm_max_model_len 8192 \
    --vllm_gpu_memory_utilization 0.9 \
    --vllm_data_parallel_size 2

# After the server-side deployment above is successful, use the command below to perform a client call test.

# curl http://localhost:8000/v1/chat/completions \
# -H "Content-Type: application/json" \
# -d '{
# "model": "Qwen2.5-VL-7B-Instruct",
# "messages": [{"role": "user", "content": [
# {"type": "image", "image": "http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/cat.png"},
# {"type": "image", "image": "http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/animal.png"},
# {"type": "text", "text": "What is the difference between the two images?"}
# ]}],
# "max_tokens": 256,
# "temperature": 0
# }'
medium

It's a good practice for shell scripts and other text files to end with a newline character. This can prevent potential issues with file processing tools. Please add a newline at the end of the file.

Suggested change
# }'
}'
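
The commented-out `curl` call in `examples/deploy/vllm_dp.sh` sends a single request. A rough sketch of a follow-up smoke test that keeps several requests in flight at once might look like the script below. It is not part of this PR: the helper names (`build_payload`, `send_request`, `smoke_test`), the `BASE_URL` and `RUN_SMOKE_TEST` variables, and the text-only prompt are all assumptions made for illustration. Only the endpoint, port, served model name, `max_tokens`, and `temperature` come from the script above.

```shell
#!/bin/sh
# Hypothetical smoke test for the server started by examples/deploy/vllm_dp.sh.
# BASE_URL, RUN_SMOKE_TEST and all function names are illustrative assumptions.

BASE_URL="${BASE_URL:-http://localhost:8000}"
MODEL="${MODEL:-Qwen2.5-VL-7B-Instruct}"

# Print the JSON body for one text-only chat request.
# The prompt is inserted verbatim, so it must not contain double quotes
# or backslashes (no JSON escaping is done here).
build_payload() {
    printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": 256, "temperature": 0}' \
        "$MODEL" "$1"
}

# Send one request to the chat completions endpoint and print the response.
send_request() {
    curl -sS "$BASE_URL/v1/chat/completions" \
        -H "Content-Type: application/json" \
        -d "$(build_payload "$1")"
}

# Start N requests in the background (default 4), then wait for all of them.
smoke_test() {
    smoke_n="${1:-4}"
    smoke_i=1
    while [ "$smoke_i" -le "$smoke_n" ]; do
        send_request "Request $smoke_i: say hello." &
        smoke_i=$((smoke_i + 1))
    done
    wait
}

# Contact the server only when explicitly asked to.
if [ "${RUN_SMOKE_TEST:-0}" = "1" ]; then
    smoke_test "${1:-4}"
fi
```

Saved under a name of your choosing, it could be run as `RUN_SMOKE_TEST=1 sh smoke_test.sh 8` once the deployment is up. Whether the requests are actually spread over both data-parallel replicas depends on the server, so this only checks that concurrent requests are answered.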
