[BUG] Agents can be registered with Text Embedding models #4558

@rithin-pullela-aws

Description

What is the bug?
This is an extension of #4540.
Agent registration succeeds even when a text embedding model is supplied as the agent's LLM.
An agent created this way then fails at execution time with an unreadable error:

{
    "status": 500,
    "error": {
        "type": "ClassCastException",
        "reason": "System Error",
        "details": "class org.opensearch.ml.common.dataset.remote.RemoteInferenceInputDataSet cannot be cast to class org.opensearch.ml.common.dataset.TextDocsInputDataSet (org.opensearch.ml.common.dataset.remote.RemoteInferenceInputDataSet and org.opensearch.ml.common.dataset.TextDocsInputDataSet are in unnamed module of loader java.net.FactoryURLClassLoader @1f0e2bdc)"
    }
}

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create a local text embedding model:
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/paraphrase-mpnet-base-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}

  2. Deploy the model:
POST /_plugins/_ml/models/<Model ID>/_deploy

  3. Create an agent with this model:
POST /_plugins/_ml/agents/_register
{
    "name": "Agent with MCP tools",
    "type": "conversational",
    "description": "Some Agent",
    "llm": {
        "model_id": "<Model ID from Step 1 and 2>",
        "parameters": {
            "max_iteration": 15
        }
    },
    "memory": {
        "type": "conversation_index"
    },
    "parameters": {
        "_llm_interface": "openai/v1/chat/completions"
    },
    "tools": [
        {
            "type": "ListIndexTool"
        },
        {
            "type": "IndexMappingTool"
        }
    ],
    "app_type": "os_chat"
}

Observe that this step succeeds and the agent is registered, even though the model is a text embedding model.

  4. Execute the agent:
POST /_plugins/_ml/agents/<Agent ID>/_execute
{
    "parameters": {
        "system_prompt": "helper",
        "question": "hello",
        "verbose": true
    }
}

The response looks like this:

{
    "status": 500,
    "error": {
        "type": "ClassCastException",
        "reason": "System Error",
        "details": "class org.opensearch.ml.common.dataset.remote.RemoteInferenceInputDataSet cannot be cast to class org.opensearch.ml.common.dataset.TextDocsInputDataSet (org.opensearch.ml.common.dataset.remote.RemoteInferenceInputDataSet and org.opensearch.ml.common.dataset.TextDocsInputDataSet are in unnamed module of loader java.net.FactoryURLClassLoader @1f0e2bdc)"
    }
}

What is the expected behavior?
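Option 1: Improve validation during agent registration so that the referenced models and tools are checked, and registration is rejected when the model cannot act as an LLM.

Option 2: Return a clearer, more readable error message during agent execution instead of a raw ClassCastException.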
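
To make the two options concrete, here is a minimal sketch of the kind of check that could back either one. It is not the ml-commons implementation: the function name `llm_model_error`, the `algorithm` field on the model document, and the assumption that only `REMOTE` models can serve as an agent LLM are all hypothetical and used only for illustration.

```python
from typing import Mapping, Optional

# Assumption (hypothetical): only remote, connector-backed models can act
# as an agent's LLM. The real set of acceptable model types may differ.
LLM_CAPABLE_ALGORITHMS = frozenset({"REMOTE"})


def llm_model_error(model_id: str, model_doc: Mapping[str, object]) -> Optional[str]:
    """Return a readable error message if the model cannot be used as an
    agent LLM, or None if it looks acceptable.

    `model_doc` stands in for the stored model metadata; the "algorithm"
    key is an assumed field name for the model's function type.
    """
    algorithm = str(model_doc.get("algorithm", "")).upper()
    if algorithm in LLM_CAPABLE_ALGORITHMS:
        return None
    shown = algorithm or "UNKNOWN"
    allowed = ", ".join(sorted(LLM_CAPABLE_ALGORITHMS))
    return (
        f"Model '{model_id}' has type {shown} and cannot be used as an agent LLM; "
        f"expected one of: {allowed}"
    )
```

Called at registration time, a non-None result would reject the request up front (Option 1). Called just before the LLM is invoked during execution, the same message could replace the ClassCastException (Option 2).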
