Skip thinking section in Claude tool call response #226
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Skip the
<thinking>
/ chain-of-thought section when parsing the response from Claude.This seems to always be included when tools are provided in the request.The models do not reliably start their answers with the<thinking>
tag, so the changes here are only a partial fix for this. Seems like the only full solution is to not use streaming responses so the whole response can inform how to parse.This is only an issue for return type annotations that union
str
/StreamedStr
with a structured object orFunctionCall
/ParallelFunctionCall
, because the potential for string output means a tool call cannot be forced.Potential full solution: if a union of tool call and string return type is given, then stream the whole response before determining how to parse it. This would essentially disable streaming in this case, which means
StreamedStr
/Iterable[T]
/ParallelFunctionCall
would arrive all at once instead of as the parts are generated.Issue #220
In future, an
AnthropicAssistantMessage
could be added with athinking: str
attribute. This could be registered withmessage_to_anthropic_message
so thethinking
string is persisted and serialized back to the model. This could be exposed to users for use in@chatprompt
(non-anthropic models would treat it likeAssistantMessage
and ignorethinking
).