Conversation
> Now synthesize the findings from multiple junior workers (LLMs).
> Your task is to finalize an answer to the question below **if and only if** you have sufficient, reliable information.
there's a lot of noise in this diff because my code editor removes trailing whitespace. (the main change is lines 747-58).
happy to revert this noise for easier code review; if we'd like to keep it in just to clean things up, that's fine too.
minions/minions_deep_research.py (Outdated)
```python
local_usage = Usage()

# 1. [REMOTE] CONTEXT --- Read the query with big model and generate web-search context
web_preview_client = OpenAIClient(model_name="gpt-4o-mini",
```
doesn't seem like GPT-4.5 preview supports web search anymore (https://discord.com/channels/@me/1351994271699566622/1351994706988498986)
Results

Query: Can you explain how Anthropic's MCP works?
Query: What are the principles of GRPO?
```python
if self.callback:
    self.callback("supervisor", None, is_final=False)

advice_response, usage = self.remote_search_client.chat(
```
could perhaps use the non-search model here for advice if search is extra-expensive
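A minimal sketch of how that fallback could look. The client class and the selection helper here are hypothetical stand-ins, not the actual minions client API:

```python
# Hypothetical sketch: prefer a cheaper non-search client for the advice
# step when the search-capable model is expensive or unavailable.
class FakeClient:
    def __init__(self, name, supports_search):
        self.name = name
        self.supports_search = supports_search

    def chat(self, messages):
        # Return a (response, usage) pair, mimicking an LLM client.
        return f"{self.name} reply", {"tokens": len(str(messages))}

def pick_advice_client(search_client, plain_client, search_is_expensive):
    # Advice generation doesn't need live web results, so skip the
    # search-capable model when it is costly or lacks search support.
    if search_is_expensive or not search_client.supports_search:
        return plain_client
    return search_client
```

Usage: `pick_advice_client(search, plain, search_is_expensive=True)` returns the plain client, so the advice prompt never hits the pricier search endpoint.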
danbider left a comment:
this looks good. we acknowledge that the task was straightforward though.
@ayush-alag how would you see a complete version of deep research running local? what will the remote do and what will the local model do?
just getting the docs from remote doesn't mean we are doing deep research. for deep research, I'd expect running multiple queries to grab context, analyzing them locally, sending the results back up to remote, and iterating. i.e., local model does document processing, and remote model does reasoning. but each deep research might have many minion calls on many retrieved documents, interleaved by remote reasoning.
@danbider great point. I'd envision something similar to what you mentioned, where we could have a remote search model, a remote reasoning model, and many local minions.
Repeat steps 3-5 as necessary. Again, in keeping with the general Minions philosophy, we avoid having the remote models process long-context documents; these are handled by the local models. Happy to discuss this more in depth or make any changes to the current PR!
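The iterate-until-done loop described in this thread (remote search grabs documents, local minions process them, remote reasoning decides whether to answer or refine) can be sketched as follows. All three "models" here are stub functions invented for illustration, not the real minions clients:

```python
# Hypothetical sketch of the iterative deep-research loop: remote model
# retrieves context, local models do document processing, remote model
# does the reasoning, and the cycle repeats until an answer is final.

def remote_search(query):
    # Stub for the remote search model: fetch candidate documents.
    return [f"doc about {query} #{i}" for i in range(3)]

def local_minion(doc, query):
    # Stub for a local model: digest one long document into a short note.
    return f"summary of '{doc}' w.r.t. '{query}'"

def remote_reason(query, notes):
    # Stub for the remote reasoning model: answer or issue a refined query.
    if len(notes) >= 6:
        return {"final": True,
                "answer": f"answer to '{query}' from {len(notes)} notes"}
    return {"final": False, "next_query": query + " (refined)"}

def deep_research(query, max_rounds=5):
    notes = []
    for _ in range(max_rounds):
        docs = remote_search(query)                       # remote grabs context
        notes += [local_minion(d, query) for d in docs]   # locals process docs
        decision = remote_reason(query, notes)            # remote reasons
        if decision["final"]:
            return decision["answer"]
        query = decision["next_query"]
    return "no answer within budget"
```

The key property, matching the Minions philosophy above, is that only the short local summaries (not the long documents) ever reach the remote reasoning step.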
This feature implements a "deep research" version of minions that generates the context for a query via a GPT-4o web search. The standard minions protocol is then applied: sub-tasks are generated from this context and handled by the local models.
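A rough sketch of that flow, with the web-search step, chunking, and local workers all stubbed out (none of these helpers are the real minions API):

```python
# Hypothetical sketch: a remote web search produces context for the query,
# which is split into chunks and fanned out to local workers as sub-tasks.

def web_search_context(query):
    # Stub for the GPT-4o web-search step; returns one long context string.
    return " ".join(f"fact-{i} about {query}" for i in range(20))

def chunk(text, size):
    # Split the context into fixed-size word windows.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def local_worker(subtask, chunk_text):
    # Stub for a local model answering one sub-task over one chunk.
    return f"[{subtask}] first token: {chunk_text.split()[0]}"

def run(query):
    context = web_search_context(query)
    chunks = chunk(context, size=15)
    subtask = f"extract facts relevant to: {query}"
    return [local_worker(subtask, c) for c in chunks]
```

Each local worker sees only its own chunk, so the long web-search context never has to fit in a single local model's window.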
Example 1 (MCP)