-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Claude Sonnet 3.7 as a reasoning tool #11445
Conversation
84de25f
to
c5b5e2a
Compare
c5b5e2a
to
bd5b92a
Compare
fc8dd59
to
54866db
Compare
a52f838
to
e9a7671
Compare
…s named thinking_delta and not thinking)
4c79228
to
cc04550
Compare
ea9281f
to
6109159
Compare
71213ba
to
c1696c1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directionally LGTM but we don't want to change the ChatMessage interface. Why other solutions do we have?
Good to math options through extras for now.
core/src/providers/chat_messages.rs
Outdated
@@ -78,6 +78,10 @@ pub struct AssistantChatMessage { | |||
#[serde(skip_serializing_if = "Option::is_none")] | |||
pub reasoning_content: Option<String>, | |||
#[serde(skip_serializing_if = "Option::is_none")] | |||
pub thinking: Option<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too overfitted on Anthropic for our general API. We already have things in place to transmit reasoning tokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have reasoning_content
that was used for deepseek r1 (only other model to expose CoTs AFAIK) but my issue was that I really needed 2 fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can use reasoning_content
for the regular CoTs and add a field additional_reasoning_content
for the non human-parsable part, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand a bit on why you need two fields? What is the non_parseable part?
A possibility is to use content with that we then parse front-side to dissociate between different types of streams (as we do with reasoning tokens from claude 3.5 afaict?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they seem to parse back the reasoning tokens (they mention smth about "preserving the reasoning flow and conversation integrity" in the doc) in a pretty strict manner. AFAICT the non parsable part is just encrypted CoTs.
is parsing on the front-end the delimiter on the <thinking>
blocks? if so I don't get <thinking>
, I really get different structs in the messages streamed by anthropic and want to pass them back accurately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to tell this: we get a 400 from the api if we try to use thinking but don't pass the thinking
and/or redacted_thinking
blocks (it's actually the main drive behind this change), so if we were to use it as a reasoning tool instead of like this we would need to swallow the thinking tokens probably
core/src/providers/mistral.rs
Outdated
@@ -238,6 +238,8 @@ impl TryFrom<&MistralChatMessage> for AssistantChatMessage { | |||
Ok(AssistantChatMessage { | |||
content, | |||
reasoning_content: None, | |||
thinking: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought so aha
@@ -369,6 +369,8 @@ impl TryFrom<&OpenAICompletionChatMessage> for AssistantChatMessage { | |||
Ok(AssistantChatMessage { | |||
content, | |||
reasoning_content, | |||
thinking: None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope nope ;-)
a5775e6
to
afaf519
Compare
afaf519
to
18a9874
Compare
StreamContent::AnthropicStreamThinking(content)) => { | ||
content.thinking.push_str(delta.thinking.as_str()); | ||
if delta.thinking.len() > 0 { | ||
let _ = event_sender.send(json!({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gets turned into text? I'm pretty sure we do something else for r1/o1-o3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per IRL. Out of consistency we want to inject thinking delimiters here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
If you commit to moving to delimiters for reasoning stuff THIS WEEK
Description
thinking
andredacted_thinking
blocks + deltas (AssistantChatMessage
has 2 new fields that are optional and can be omitted).output-128k-2025-02-19
(increases the length of the output) + pass thethinking
field in extras (ref), with a hardcoded token budget.We choose to expose it as a reasoning tool instead of using its thinking mode in the regular multi step process for one main reason:
thinking
andredacted_thinking
tokens, the latter being encrypted data.AssistantChatMessage
in its current state (we would need an additional field that would store the encrypted data, which does not make a lot of sense).Tests
Risk
Deploy Plan
front
.core
.