Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip thinking section in Claude tool call response #226

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

jackmpcollins
Copy link
Owner

@jackmpcollins jackmpcollins commented May 23, 2024

Skip the <thinking> / chain-of-thought section when parsing the response from Claude. This seems to always be included when tools are provided in the request. The models do not reliably start their answers with the <thinking> tag, so the changes here are only a partial fix for this. Seems like the only full solution is to not use streaming responses so the whole response can inform how to parse.

This is only an issue for return type annotations that union str/StreamedStr with a structured object or FunctionCall/ParallelFunctionCall, because the potential for string output means a tool call cannot be forced.

Potential full solution: if a union of tool call and string return type is given, then stream the whole response before determining how to parse it. This would essentially disable streaming in this case, which means StreamedStr/Iterable[T]/ParallelFunctionCall would arrive all at once instead of as the parts are generated.

Issue #220


In future, an AnthropicAssistantMessage could be added with a thinking: str attribute. This could be registered with message_to_anthropic_message so the thinking string is persisted and serialized back to the model. This could be exposed to users for use in @chatprompt (non-anthropic models would treat it like AssistantMessage and ignore thinking).

@jackmpcollins
Copy link
Owner Author

When using tools, Claude will often show its “chain of thought”, i.e. the step-by-step reasoning it uses to break down the problem and decide which tools to use. The Claude 3 Opus model will do this if tool_choice is set to auto (this is the default value, see Forcing tool use), and Sonnet and Haiku can be prompted into doing it.

It’s important to note that while the tags are a common convention Claude uses to denote its chain of thought, the exact format (such as what this XML tag is named) may change over time. Your code should treat the chain of thought like any other assistant-generated text, and not rely on the presence or specific formatting of the tags.

https://docs.anthropic.com/en/docs/tool-use#error-handling

@mnicstruwig
Copy link
Contributor

mnicstruwig commented May 28, 2024

Not sure if it's related, but I'm also unable to manually handle function calls (although as you'll see below, it would also likely apply to prompt_chain if it were working correctly):

from magentic import ParallelFunctionCall, StreamedStr, prompt, prompt_chain, chatprompt, SystemMessage, UserMessage, AssistantMessage, FunctionResultMessage, FunctionCall
from magentic.chat_model.anthropic_chat_model import AnthropicChatModel

def get_weather(city: str) -> str:
    return f"The weather in {city} is 20°C."


function_call = FunctionCall(function=get_weather, city="Cape Town")

messages = [
    SystemMessage("You are helpful."),
    UserMessage("What's the weather like in Cape Town?"),
    AssistantMessage(function_call),
    FunctionResultMessage(function_call=function_call, content="The weather in Cape Town is 20°C.")
]

@chatprompt(
    *messages,
    functions=[get_weather],
    model=AnthropicChatModel(
        model="claude-3-opus-20240229",
        temperature=0.2,
    )
)
def _llm() -> FunctionCall | StreamedStr: ...

response = _llm()
response

Which leads to the following Anthropic API error:

anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages.2.content.0.tool_result.content.0: Input should be a valid dictionary or object to extract fields from'}}

What's bizarre is that if I modify the way the function results get handled, to return a string, rather than converting into an object / dict:

@message_to_anthropic_message.register(FunctionResultMessage)
def _(message: FunctionResultMessage[Any]) -> ToolsBetaMessageParam:
    function_schema = function_schema_for_type(type(message.content))

    return {
        "role": AnthropicMessageRole.USER.value,
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": message.function_call._unique_id,
                "content": function_schema.serialize_args(message.content) #json.loads(function_schema.serialize_args(message.content)),
            }
        ],
    }

Then it works. The Anthropic docs seems to suggest that the answers can now be specified as a string, or as a list of nested content blocks.

I'm wondering if this changed with the public beta of tool use? Either way, it seems to work this way now.

Would probably be a good test case to include in the future!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants