Version
v1.6.0-beta2-24-g998d535f
Which installation method(s) does this occur on?
Source
Describe the bug.
When using streaming endpoints with ReAct agent (either HTTP or WebSocket), the full response is displayed all at once instead of being streamed out token-by-token. This results in significantly increased perceived latency before the user sees any response, even if TTFT is low.
Relevant log output
Click here to see error details
[Paste the error here, it will be hidden by default]
Other/Misc.
The Tool-Calling Agent streams tokens correctly under the same configs.
Code of Conduct
Version
v1.6.0-beta2-24-g998d535f
Which installation method(s) does this occur on?
Source
Describe the bug.
When using streaming endpoints with ReAct agent (either HTTP or WebSocket), the full response is displayed all at once instead of being streamed out token-by-token. This results in significantly increased perceived latency before the user sees any response, even if TTFT is low.
Relevant log output
Click here to see error details
Other/Misc.
The Tool-Calling Agent streams tokens correctly under the same configs.
Code of Conduct