fix: omit null delta fields in streaming chat completions (issue #2082) #2092
Merged
Evanev7
approved these changes
May 14, 2026
Member
Evanev7
left a comment
not certain this is a complete solution, but it certainly doesn't hurt. 3/3!
ed6fb98 to
ac4c027
Compare
Motivation
Streaming /v1/chat/completions responses emitted null for tool_calls,
function_call, name, and tool_call_id in every delta chunk. The OpenAI
streaming spec marks these fields as non-nullable: they must either carry a
real value or be absent entirely. Because the key is present, a spec-correct
client calling delta.get("tool_calls", []) receives None instead of the
default, and crashes with 'NoneType' object is not iterable as soon as it
iterates over the result.
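The failure mode described above can be reproduced with the standard library alone. This is a minimal sketch: the two payloads are simplified stand-ins for real delta chunks, and collect_tool_calls is a hypothetical client helper, not code from this repository.

```python
import json

# A delta as emitted before the fix: the key is present with an explicit null.
delta_with_null = json.loads(
    '{"role": "assistant", "content": "hi", "tool_calls": null}'
)
# A delta as emitted after the fix: the key is absent.
delta_without_key = json.loads('{"role": "assistant", "content": "hi"}')


def collect_tool_calls(delta: dict) -> list:
    """Hypothetical client helper that iterates over a delta's tool calls."""
    # dict.get only falls back to the default when the key is missing,
    # so an explicit null comes back as None.
    return [call for call in delta.get("tool_calls", [])]


# Key absent: the default [] is used and iteration is safe.
absent_result = collect_tool_calls(delta_without_key)

# Key present but null: iterating None raises TypeError.
try:
    collect_tool_calls(delta_with_null)
    null_error = None
except TypeError as exc:
    null_error = str(exc)

print(absent_result)
print(null_error)
```

The second call fails only because the key exists; dropping the key on the server side lets the client's default take effect.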
Root cause: the streaming serialisation path called model_dump_json() without
exclude_none=True, while the request-parsing path already used it correctly.
Three call sites in chat_completions.py and two in responses.py were affected.
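The difference between the two serialisation calls can be sketched with a small Pydantic v2 model. The Delta class below is a hypothetical, trimmed-down stand-in for the project's real response model, which is not shown in this PR description; only model_dump_json and its exclude_none parameter are taken from the text above.

```python
import json
from typing import Optional

from pydantic import BaseModel


class Delta(BaseModel):
    # Hypothetical stand-in: every field is optional and defaults to None.
    role: Optional[str] = None
    content: Optional[str] = None
    tool_calls: Optional[list] = None
    tool_call_id: Optional[str] = None


delta = Delta(role="assistant", content=" the")

# Default serialisation keeps unset optional fields as explicit nulls.
with_nulls = json.loads(delta.model_dump_json())

# exclude_none=True drops every field whose value is None.
without_nulls = json.loads(delta.model_dump_json(exclude_none=True))

print(sorted(with_nulls))
print(sorted(without_nulls))
```

Note that exclude_none drops every None-valued field, not just the four named in the motivation, which matches the "after" output where logprobs and service_tier also disappear.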
Testing
Before — every delta carries explicit nulls:
$ curl -sN -X POST http://localhost:52415/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model":"mlx-community/Qwen3.5-2B-MLX-8bit","messages":[{"role":"user","content":"hi"}],"max_tokens":3,"stream":true}' \
    | grep "^data: "
data: {"id":"7c4dae10-...","choices":[{"index":0,"delta":{"role":"assistant","content":null,"reasoning_content":"Okay","name":null,"tool_calls":null,"tool_call_id":null,"function_call":null},"logprobs":null,"finish_reason":null,"usage":null}],"usage":null,"service_tier":null}
data: {"id":"7c4dae10-...","choices":[{"index":0,"delta":{"role":"assistant","content":null,"reasoning_content":",","name":null,"tool_calls":null,"tool_call_id":null,"function_call":null},"logprobs":null,"finish_reason":null,"usage":null}],"usage":null,"service_tier":null}
data: {"id":"7c4dae10-...","choices":[{"index":0,"delta":{"role":"assistant","content":" the","reasoning_content":null,"name":null,"tool_calls":null,"tool_call_id":null,"function_call":null},"logprobs":null,"finish_reason":"length","usage":{"prompt_tokens":11,...}}],"usage":null,"service_tier":null}
data: [DONE]
After — only populated fields are emitted:
data: {"id":"demo","object":"chat.completion","created":...,"model":"mlx-community/Qwen3.5-2B-MLX-8bit","choices":[{"index":0,"delta":{"role":"assistant","reasoning_content":"Okay"}}]}
data: {"id":"demo","object":"chat.completion","created":...,"model":"mlx-community/Qwen3.5-2B-MLX-8bit","choices":[{"index":0,"delta":{"role":"assistant","reasoning_content":","}}]}
data: {"id":"demo","object":"chat.completion","created":...,"model":"mlx-community/Qwen3.5-2B-MLX-8bit","choices":[{"index":0,"delta":{"role":"assistant","content":" the"},"finish_reason":"length"}],"usage":{"prompt_tokens":11,"completion_tokens":3,"total_tokens":14,...}}
data: [DONE]
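The before/after comparison could also be checked mechanically rather than by eye. The sketch below is an assumption about how such a check might look, not a test from this PR: null_delta_fields is a hypothetical helper, and the sample lines are shortened, hand-written chunks because the captured output above contains elided values that are not valid JSON.

```python
import json

# Fields the motivation lists as non-nullable in streaming deltas.
NON_NULLABLE = ("tool_calls", "function_call", "name", "tool_call_id")


def null_delta_fields(sse_lines):
    """Return (line_index, field) pairs where a delta carries an explicit null."""
    problems = []
    for index, line in enumerate(sse_lines):
        if not line.startswith("data: "):
            continue  # skip comments, blank lines, and other SSE fields
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            continue  # stream terminator, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for field in NON_NULLABLE:
                if field in delta and delta[field] is None:
                    problems.append((index, field))
    return problems


# Simplified sample streams (illustrative only).
before = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant",'
    '"content":" the","name":null,"tool_calls":null}}]}',
    "data: [DONE]",
]
after = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" the"}}]}',
    "data: [DONE]",
]

print(null_delta_fields(before))
print(null_delta_fields(after))
```

Feeding the helper the lines captured from curl would flag any chunk that still carries one of the four fields as null.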