
[BUG] Open researcher gets 400 error from litellm after planning step when using deepseek-r1 (CodeAgent) #646

Open
lianggecm opened this issue Feb 14, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@lianggecm

Describe the bug
I'm using open researcher with deepseek-r1 via LiteLLMModel. After the planning step, the CodeAgent creates an input that contains two consecutive messages from "assistant". See the following screenshot from OpenTelemetry.

(Screenshot: OpenTelemetry trace showing the two consecutive assistant messages.)

The first assistant message comes from the "initial_facts" step and the second from the "initial_planning" step.

This input causes the deepseek-r1 model to return a 400 error code.
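One generic workaround (not part of the smolagents API; the function name here is illustrative, and plain-string `content` is assumed for simplicity) is to merge runs of consecutive assistant messages into a single message before calling the model:

```python
def merge_consecutive_assistant_messages(messages):
    """Merge runs of consecutive 'assistant' messages into one message,
    for APIs that reject back-to-back assistant turns."""
    merged = []
    for msg in messages:
        if merged and msg["role"] == "assistant" and merged[-1]["role"] == "assistant":
            # Fold this message's text into the previous assistant message.
            merged[-1] = {
                "role": "assistant",
                "content": merged[-1]["content"] + "\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged
```

Applied to the trace above, the "initial_facts" and "initial_planning" outputs would be collapsed into a single assistant turn, which most chat APIs accept.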

Code to reproduce the error
open researcher with deepseek-r1 as model

Error logs (if any)

Error in generating model output:
litellm.BadRequestError: OpenAIException - Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param':
None, 'message': '<400> InternalError.Algo.InvalidParameter: An unknown error occurred due to an unsupported input
format.', 'type': 'invalid_request_error'}, 'id': 'chatcmpl-df620c7e-749a-999e-b873-bc32d7a6112a', 'request_id':
'df620c7e-749a-999e-b873-bc32d7a6112a'}

Package version:
1.8.1


@lianggecm lianggecm added the bug Something isn't working label Feb 14, 2025
@callanwu

Same issue here. Did you solve it?

@zhengmzong

zhengmzong commented Feb 20, 2025

The main issue is that the output message of the plan step does not follow the thought/code structure the agent expects, which leads to execution failure.

I modified the original code in src/smolagents/agent.py, and it works.

Original code (class CodeAgent, step function, lines 1256-1260):

```python
chat_message: ChatMessage = self.model(
    self.input_messages,
    stop_sequences=["<end_code>", "Observation:"],
    **additional_args,
)
```

Modified code:

````python
last_message = self.input_messages[-1]
if last_message['role'] == 'assistant' and last_message['content'][0]['text'].startswith('[PLAN]'):
    # Convert the plan message into the expected thought/code structure
    # instead of calling the model again.
    last_message_content = last_message['content'][0]['text']
    responses_content = (
        "Thought: Here is the plan of action that I will follow to solve the task\n\n"
        f"Code:\n```py\nplans_info=\"\"\"{last_message_content}\"\"\"\nprint(plans_info)\n```"
    )
    chat_message: ChatMessage = ChatMessage.from_dict({
        'role': 'assistant',
        'content': responses_content,
    })
else:
    chat_message: ChatMessage = self.model(
        self.input_messages,
        stop_sequences=["<end_code>", "Observation:"],
        **additional_args,
    )
````

So it works!

@callanwu

thx a lot!

@callanwu

After double-checking, another cause seems to be that the last role cannot be assistant, as with Qwen models on the Alibaba Cloud API (https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api).

@Jovines

Jovines commented Feb 22, 2025

> After double-checking, another cause seems to be that the last role cannot be assistant, as with Qwen models on the Alibaba Cloud API (https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api).

I encountered the same problem, and I customized a QwenOpenAIServerModel to resolve it.

```python
from typing import Dict, List

from smolagents import OpenAIServerModel
from smolagents.models import ChatMessage, MessageRole


class QwenOpenAIServerModel(OpenAIServerModel):
    def __call__(
        self,
        messages: List[Dict[str, str]],
        **kwargs,
    ) -> ChatMessage:
        # Preprocess the messages: copy them to avoid mutating the originals.
        processed_messages = [message.copy() for message in messages]
        # If the last message's role is assistant, append a new user message,
        # since the API requires the conversation to end with a user turn.
        if processed_messages and processed_messages[-1]['role'] == MessageRole.ASSISTANT:
            processed_messages.append({
                'role': MessageRole.USER,
                'content': [{'type': 'text', 'text': 'Continue'}],
            })
        # Call the parent __call__ with the preprocessed messages.
        return super().__call__(messages=processed_messages, **kwargs)
```
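The preprocessing step above can be sketched standalone (simplified to plain dicts; the function name is illustrative, and the "Continue" filler text follows the comment's choice):

```python
def ensure_trailing_user_message(messages, filler_text="Continue"):
    """If the conversation ends with an assistant message, append a short
    user message, for APIs that require the last turn to be from the user.
    Returns a new list; the input is not mutated."""
    out = [dict(m) for m in messages]
    if out and out[-1]["role"] == "assistant":
        out.append({"role": "user", "content": [{"type": "text", "text": filler_text}]})
    return out
```

Note that only the final message needs this treatment; inserting a filler after every assistant message would change the conversation mid-history.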
   
