Implement sampling #6
The sampling message types were generated from the official schema, but the actual functionality has not been implemented.
I looked at the official documentation. The difficulty I see at this point is:
I can see why the author held off on adding this feature. I wrote a little code and looked at the architecture, and sampling turns out to be hard to add without changing the architecture of the server, which is what I am going to do.
The processInputStream function may need to change. Can the author give me some guidance? What changes are allowed? I think it should be written like this, that is, a request
The Model Context Protocol (MCP) provides a standardized way for servers to request LLM sampling ("completions" or "generations") from language models via clients. This flow allows clients to maintain control over model access, selection, and permissions while enabling servers to leverage AI capabilities—with no server API keys necessary. Servers can request text or image-based interactions and optionally include context from MCP servers in their prompts.
User Interaction Model
Sampling in MCP allows servers to implement agentic behaviors, by enabling LLM calls to occur nested inside other MCP server features.
Implementations are free to expose sampling through any interface pattern that suits their needs—the protocol itself does not mandate any specific user interaction model.
{{< callout type="warning" >}}
For trust & safety and security, there SHOULD always be a human in the loop with the ability to deny sampling requests.
Applications SHOULD:
{{< /callout >}}
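One way a client could keep a human in the loop is to put an approval step in front of the model call. The sketch below is only an illustration: `gate_sampling_request`, `approve`, and `run_model` are hypothetical names, not part of any MCP SDK, and the approval UI is reduced to a callback.

```python
from typing import Callable


def gate_sampling_request(
    params: dict,
    approve: Callable[[dict], bool],
    run_model: Callable[[dict], dict],
) -> dict:
    """Run a sampling request only if a human approves it (hypothetical helper).

    `approve` stands in for whatever UI shows the request to the user;
    `run_model` stands in for the client's actual LLM call.
    """
    if not approve(params):
        # Denied: the model is never called.
        raise PermissionError("User rejected sampling request")
    return run_model(params)
```

The caller decides how a denial is reported back to the server, for example as a JSON-RPC error.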
Capabilities
Clients that support sampling MUST declare the `sampling` capability during [initialization]({{< ref "/specification/basic/lifecycle#initialization" >}}):
Protocol Messages
Creating Messages
To request a language model generation, servers send a `sampling/createMessage` request:
Request:
Response:
Message Flow
Data Types
Messages
Sampling messages can contain:
Text Content
Image Content
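The examples for the two content types are missing above. As far as I understand the schema, text content carries a `text` field and image content carries base64 `data` plus a `mimeType`. A small sketch under that assumption, with hypothetical helper names:

```python
import base64


def text_content(text: str) -> dict:
    """Text content block for a sampling message."""
    return {"type": "text", "text": text}


def image_content(raw: bytes, mime_type: str) -> dict:
    """Image content block; the raw bytes are base64-encoded for JSON transport."""
    return {
        "type": "image",
        "data": base64.b64encode(raw).decode("ascii"),
        "mimeType": mime_type,
    }


def sampling_message(role: str, content: dict) -> dict:
    """Wrap a content block in a message with a `user` or `assistant` role."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    return {"role": role, "content": content}
```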
Model Preferences
Model selection in MCP requires careful abstraction since servers and clients may use different AI providers with distinct model offerings. A server cannot simply request a specific model by name since the client may not have access to that exact model or may prefer to use a different provider's equivalent model.
To solve this, MCP implements a preference system that combines abstract capability priorities with optional model hints:
Capability Priorities
Servers express their needs through three normalized priority values (0-1):
- `costPriority`: How important is minimizing costs? Higher values prefer cheaper models.
- `speedPriority`: How important is low latency? Higher values prefer faster models.
- `intelligencePriority`: How important are advanced capabilities? Higher values prefer more capable models.
Model Hints
While priorities help select models based on characteristics, `hints` allow servers to suggest specific models or model families:
For example:
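The original example is missing here, so the following is a sketch and not the spec's text. It shows a `modelPreferences` value with hints and priorities, plus one way a client might resolve it. The model catalogue, its scores, the `HINT_EQUIVALENTS` table, and `select_model` are all hypothetical. Treating a hint as a substring of the model name is how I understand the spec, but that is an assumption to verify.

```python
model_preferences = {
    "hints": [{"name": "claude-3-sonnet"}, {"name": "claude"}],  # tried in order
    "costPriority": 0.3,
    "speedPriority": 0.8,
    "intelligencePriority": 0.5,
}

# Hypothetical client-side catalogue: each trait is scored 0..1, higher is better
# (so a high "cost" score means a cheaper model).
AVAILABLE = {
    "gemini-1.5-pro": {"cost": 0.4, "speed": 0.5, "intelligence": 0.9},
    "gemini-1.5-flash": {"cost": 0.9, "speed": 0.9, "intelligence": 0.6},
}

# Hypothetical mapping from a hint fragment to a model the client considers equivalent.
HINT_EQUIVALENTS = {"sonnet": "gemini-1.5-pro"}


def select_model(prefs: dict, available: dict) -> str:
    """Pick a model: hints first (in order), then a priority-weighted score."""
    for hint in prefs.get("hints", []):
        # 1. Direct match: the hint is a substring of an available model name.
        for name in available:
            if hint["name"] in name:
                return name
        # 2. Equivalent model from another provider.
        for fragment, equivalent in HINT_EQUIVALENTS.items():
            if fragment in hint["name"] and equivalent in available:
                return equivalent

    # 3. No hint resolved: weight each model's traits by the stated priorities.
    def score(name: str) -> float:
        traits = available[name]
        return (prefs.get("costPriority", 0) * traits["cost"]
                + prefs.get("speedPriority", 0) * traits["speed"]
                + prefs.get("intelligencePriority", 0) * traits["intelligence"])

    return max(available, key=score)
```

With the values above, the `claude-3-sonnet` hint has no direct match, so the equivalence table resolves it to `gemini-1.5-pro`. Without any hints, the speed-heavy priorities favor `gemini-1.5-flash`.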
The client processes these preferences to select an appropriate model from its available options. For instance, if the client doesn't have access to Claude models but has Gemini, it might map the sonnet hint to `gemini-1.5-pro` based on similar capabilities.
Error Handling
Clients SHOULD return errors for common failure cases:
Example error:
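The example itself is missing above. A sketch of such an error, assuming the usual JSON-RPC 2.0 error envelope: the `-1` code and the message text are my assumptions, not values confirmed by this thread, and `sampling_error` is a hypothetical helper.

```python
USER_REJECTED = -1  # assumed error code; check the spec before relying on it


def sampling_error(request_id: int, code: int, message: str) -> dict:
    """Build a JSON-RPC 2.0 error response for a failed sampling request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # must echo the id of the request being answered
        "error": {"code": code, "message": message},
    }


example_error = sampling_error(1, USER_REJECTED, "User rejected sampling request")
```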
Security Considerations