-
Notifications
You must be signed in to change notification settings - Fork 47
feat: Add OpenTelemetry integration design proposal #597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This proposal outlines a comprehensive approach to add OpenTelemetry observability to ToolHive's MCP server proxies through middleware-based instrumentation. Key features: - Leverages existing middleware system for clean integration - Supports both SSE and stdio transport modes - Provides traces, metrics, and structured logging - Includes MCP-specific instrumentation beyond HTTP metrics - Supports multiple OTEL backends and Prometheus integration - Maintains backward compatibility with zero performance regression The design includes detailed examples of traces and metrics for tools/call operations, showing rich observability into MCP protocol interactions. Related-to: #474 Signed-off-by: Juan Antonio Osorio <[email protected]>
6fabc17
to
2ef2ba2
Compare
## Non-Goals | ||
|
||
- Instrumenting MCP servers themselves (only the proxy layer) | ||
- Custom telemetry formats or proprietary backends |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! as long as they're OTLP traces and not something custom, it should be fine. a custom collector is fine, it's the format I referred to.
Environment variable support: | ||
```bash | ||
TOOLHIVE_OTEL_ENABLED=true | ||
TOOLHIVE_OTEL_ENDPOINT=https://api.honeycomb.io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! that's the idea; or a command line flag for this would be good too.
|
||
Here's how a `tools/call` MCP method would appear in traces and metrics: | ||
|
||
**Distributed Trace:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be limited, no. This is correct.
├── http.method: POST | ||
├── http.url: http://localhost:8080/messages?session_id=abc123 | ||
├── http.status_code: 200 | ||
├── mcp.server.name: github |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's a good point! I can make that configurable. I think we don't want to add that by default because of things like GDPR.
This proposal outlines a comprehensive approach to add OpenTelemetry observability to ToolHive's MCP server proxies through middleware-based instrumentation.
Key Features
What's Included
The design follows KISS and DRY principles by leveraging existing patterns and infrastructure, making it maintainable and extensible for future observability needs.
Related-to: #474