Your current environment
The output of python collect_env.py
Docker image vllm/vllm-openai:v0.11.0
🐛 Describe the bug
Server
Docker Compose command used to deploy the server:
command: --model /data/model --served-model-name demo-gpt --host 0.0.0.0 --port 8000 -tp 2 --tokenizer-mode auto --trust-remote-code --gpu-memory-utilization 0.95 --max-model-len 8192 --max-num-seqs=100 --safetensors-load-strategy eager --enable-log-requests --enable-log-outputs --reasoning-parser qwen3
Request with continue_final_message left at its default (false)
{
  "model": "qwen3-32b",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI"
    },
    {
      "role": "user",
      "content": "1+1=?"
    }
  ],
  "chat_template_kwargs": {"enable_thinking": true}
}
Prompt from the server log:
prompt: '<|im_start|>system\nYou are an AI<|im_end|>\n<|im_start|>user\n1+1=?<|im_end|>\n<|im_start|>assistant\n'
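For completeness, a minimal Python reproduction of this first request (a sketch only; it assumes the server is reachable at http://localhost:8000 and that the requests package is installed):

import requests

BASE_URL = "http://localhost:8000"  # assumed address of the vLLM container

payload = {
    "model": "qwen3-32b",
    "messages": [
        {"role": "system", "content": "You are an AI"},
        {"role": "user", "content": "1+1=?"},
    ],
    "chat_template_kwargs": {"enable_thinking": True},
}

# Send the request to the OpenAI-compatible chat completions endpoint.
resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"])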
Request with continue_final_message set to true
{
  "model": "qwen3-32b",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI"
    },
    {
      "role": "user",
      "content": "1+1=?"
    },
    {
      "role": "assistant",
      "content": "<think>\nGood Question"
    }
  ],
  "add_generation_prompt": false,
  "continue_final_message": true,
  "chat_template_kwargs": {"enable_thinking": true}
}
Prompt from the server log:
prompt: '<|im_start|>system\nYou are an AI<|im_end|>\n<|im_start|>user\n1+1=?<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n<think>\nGood Question'
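The corresponding Python call with the partial assistant message (same assumptions as the sketch above):

import requests

BASE_URL = "http://localhost:8000"  # assumed address of the vLLM container

payload = {
    "model": "qwen3-32b",
    "messages": [
        {"role": "system", "content": "You are an AI"},
        {"role": "user", "content": "1+1=?"},
        {"role": "assistant", "content": "<think>\nGood Question"},
    ],
    "add_generation_prompt": False,
    "continue_final_message": True,
    "chat_template_kwargs": {"enable_thinking": True},
}

# The server should continue the final assistant message instead of starting a new turn.
resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"])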
The Bug
When {"enable_thinking": true} and "continue_final_message": true are both set, an extra <think>\n\n</think>\n\n is inserted into the prompt before the content of the final assistant message.
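One way to narrow this down is to render the same conversation offline and see whether the extra empty think block also appears outside of vLLM. The sketch below is only an illustration; it assumes a recent transformers release that supports continue_final_message in apply_chat_template and local access to the Qwen/Qwen3-32B tokenizer with its bundled chat template.

from transformers import AutoTokenizer

# Assumed model repo; use whatever checkpoint the server is actually loading.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")

messages = [
    {"role": "system", "content": "You are an AI"},
    {"role": "user", "content": "1+1=?"},
    {"role": "assistant", "content": "<think>\nGood Question"},
]

prompt = tok.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=False,
    continue_final_message=True,  # requires a transformers version that supports this flag
    enable_thinking=True,         # forwarded to the chat template as a template kwarg
)
# Check whether '<think>\n\n</think>\n\n' shows up in the offline rendering as well.
print(repr(prompt))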