
[BUG] Open researcher gets 400 error from litellm after planning step when using deepseek-r1 (CodeAgent) #646

Open
lianggecm opened this issue Feb 14, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@lianggecm

Describe the bug
I'm using open researcher with deepseek-r1 via LiteLLMModel. After the planning step, the CodeAgent creates an input that contains two consecutive messages from "assistant". See the following screenshot from OpenTelemetry.

(Screenshot: OpenTelemetry trace showing the two consecutive assistant messages.)

The first assistant message comes from the "initial_facts" step and the second from the "initial_planning" step.

This input causes the deepseek-r1 model to return a 400 error code.
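One generic workaround (not part of the smolagents API; the function name here is illustrative, and plain-string `content` is assumed for simplicity) is to merge runs of consecutive assistant messages into a single message before calling the model:

```python
def merge_consecutive_assistant_messages(messages):
    """Merge runs of consecutive 'assistant' messages into one message,
    for APIs that reject back-to-back assistant turns."""
    merged = []
    for msg in messages:
        if merged and msg["role"] == "assistant" and merged[-1]["role"] == "assistant":
            # Fold this message's text into the previous assistant message.
            merged[-1] = {
                "role": "assistant",
                "content": merged[-1]["content"] + "\n" + msg["content"],
            }
        else:
            merged.append(dict(msg))
    return merged
```

Applied to the trace above, the "initial_facts" and "initial_planning" outputs would be collapsed into a single assistant turn, which most chat APIs accept.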

Code to reproduce the error
open researcher with deepseek-r1 as model

Error logs (if any)

Error in generating model output:
litellm.BadRequestError: OpenAIException - Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param':
None, 'message': '<400> InternalError.Algo.InvalidParameter: An unknown error occurred due to an unsupported input
format.', 'type': 'invalid_request_error'}, 'id': 'chatcmpl-df620c7e-749a-999e-b873-bc32d7a6112a', 'request_id':
'df620c7e-749a-999e-b873-bc32d7a6112a'}

Package version:
1.8.1


@lianggecm lianggecm added the bug Something isn't working label Feb 14, 2025
@callanwu

Same issue here. Did you solve it?

@zhengmzong

zhengmzong commented Feb 20, 2025

The main issue is that the output message of the plan step does not follow the thought/code structure the agent expects, which leads to execution failure.

I modified the original code in src/smolagents/agent.py, and it works.

Original code (class CodeAgent, step function, lines 1256-1260):

```python
chat_message: ChatMessage = self.model(
    self.input_messages,
    stop_sequences=["<end_code>", "Observation:"],
    **additional_args,
)
```

Modified code:

````python
last_message = self.input_messages[-1]
if last_message['role'] == 'assistant' and last_message['content'][0]['text'].startswith('[PLAN]'):
    # Convert the plan message into the expected thought/code structure
    # instead of calling the model again.
    last_message_content = last_message['content'][0]['text']
    responses_content = (
        "Thought: Here is the plan of action that I will follow to solve the task\n\n"
        f"Code:\n```py\nplans_info=\"\"\"{last_message_content}\"\"\"\nprint(plans_info)\n```"
    )
    chat_message: ChatMessage = ChatMessage.from_dict({
        'role': 'assistant',
        'content': responses_content,
    })
else:
    chat_message: ChatMessage = self.model(
        self.input_messages,
        stop_sequences=["<end_code>", "Observation:"],
        **additional_args,
    )
````

So it works!

@callanwu

thx a lot!

@callanwu

After double-checking, another cause seems to be that the last role cannot be assistant, as with Qwen models on the Alibaba Cloud API (https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api).

@Jovines

Jovines commented Feb 22, 2025

> After double-checking, another cause seems to be that the last role cannot be assistant, as with Qwen models on the Alibaba Cloud API (https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api).

I encountered the same problem, and I customized a QwenOpenAIServerModel to resolve it.

```python
from typing import Dict, List

from smolagents import OpenAIServerModel
from smolagents.models import ChatMessage, MessageRole


class QwenOpenAIServerModel(OpenAIServerModel):
    def __call__(
        self,
        messages: List[Dict[str, str]],
        **kwargs,
    ) -> ChatMessage:
        # Preprocess the messages: copy them to avoid mutating the originals.
        processed_messages = [message.copy() for message in messages]
        # If the last message's role is assistant, append a new user message,
        # since the API requires the conversation to end with a user turn.
        if processed_messages and processed_messages[-1]['role'] == MessageRole.ASSISTANT:
            processed_messages.append({
                'role': MessageRole.USER,
                'content': [{'type': 'text', 'text': 'Continue'}],
            })
        # Call the parent __call__ with the preprocessed messages.
        return super().__call__(messages=processed_messages, **kwargs)
```
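The preprocessing step above can be sketched standalone (simplified to plain dicts; the function name is illustrative, and the "Continue" filler text follows the comment's choice):

```python
def ensure_trailing_user_message(messages, filler_text="Continue"):
    """If the conversation ends with an assistant message, append a short
    user message, for APIs that require the last turn to be from the user.
    Returns a new list; the input is not mutated."""
    out = [dict(m) for m in messages]
    if out and out[-1]["role"] == "assistant":
        out.append({"role": "user", "content": [{"type": "text", "text": filler_text}]})
    return out
```

Note that only the final message needs this treatment; inserting a filler after every assistant message would change the conversation mid-history.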
   
