How to handle conversation history limits with multiple agents?
Replies: 1 comment
I hit this same issue. The trick is understanding how AutoGen handles message context. Here's what worked for me:

For automatic context sharing in sequential conversations, AutoGen passes previous messages along by default, so you have to manage the token budget yourself. The simplest lever is `max_consecutive_auto_reply`, which controls conversation depth. Setting it to something like 10-15 exchanges prevents runaway token usage (see the first sketch below).

If you need more control, you can implement custom message trimming: keep the system message plus the last 10-15 messages and drop everything in between. There's a rough helper for that further down.

I also found that using GPT-3.5 for some agents instead of GPT-4 helps a lot, since it has better rate limits and lower token costs. The key is being strategic about which agents really need GPT-4-level reasoning.

For production I use `max_consecutive_auto_reply=15` combined with keeping prompts concise, and it works great. You can also disable caching if memory is an issue by setting `cache_seed` to `None` in your `llm_config`. Hope this helps!
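Here's a minimal sketch of the `max_consecutive_auto_reply` setup, using the classic two-agent pattern. The agent names, the message, and the API key are placeholders, and I'm assuming the pyautogen v0.2-style API:

```python
import autogen

llm_config = {
    "config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}],  # placeholder key
}

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
    # Cap the back-and-forth at 15 auto-replies so a stuck loop
    # can't burn through the token budget.
    max_consecutive_auto_reply=15,
)

user_proxy.initiate_chat(assistant, message="Draft a summary of the report.")
```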
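For the trimming part, a plain helper applied before handing history to the model is enough. This is my own sketch, not an AutoGen API; `trim_history` and the assumed message format (OpenAI-style dicts with a `"role"` key, system message first) are assumptions on my end:

```python
def trim_history(messages, keep_last=12):
    """Keep the system message plus the last `keep_last` messages.

    Assumes OpenAI-style messages: a list of dicts with a "role" key,
    with the system message (if any) at position 0.
    """
    if not messages:
        return messages
    # Preserve the system message so the agent keeps its instructions.
    system = [m for m in messages[:1] if m.get("role") == "system"]
    rest = messages[len(system):]
    # Drop everything except the most recent exchanges.
    return system + rest[-keep_last:]
```

I believe newer pyautogen releases also ship a built-in message-transform capability for this, but the plain-function approach works anywhere you control the message list.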
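And for mixing models and disabling the cache, the per-agent `llm_config` is where both live. Another sketch (the model split and agent roles are just examples):

```python
import autogen

# Cheaper model for routine agents: better rate limits, lower token cost.
gpt35_config = {
    "config_list": [{"model": "gpt-3.5-turbo", "api_key": "YOUR_API_KEY"}],
    "cache_seed": None,  # disable response caching if memory is an issue
}

# Reserve GPT-4 for the agent that actually needs the reasoning.
gpt4_config = {
    "config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}],
    "cache_seed": None,
}

researcher = autogen.AssistantAgent(name="researcher", llm_config=gpt35_config)
planner = autogen.AssistantAgent(name="planner", llm_config=gpt4_config)
```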