Overview
Implement comprehensive OpenTelemetry (OTel) tracing and logging for py-shiny, providing observability into Shiny application behavior with minimal performance overhead. This implementation follows the R Shiny approach but adapts it to Python idioms (async/await, context managers, decorators, contextvars).
Goals
- 5-level collection granularity: none, session, reactive_update, reactivity, all
- Async-aware span propagation using Python's contextvars
- Lazy initialization with minimal performance impact (< 50μs per span)
- Dual configuration: Environment variables (SHINY_OTEL_COLLECT) and programmatic API (otel_collect context manager)
- Source location attribution for reactive computations
- Error sanitization and proper exception recording
- Standard OTel compatibility with backends like Jaeger, Zipkin, etc.
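A minimal sketch of how the five levels and the SHINY_OTEL_COLLECT variable could fit together. The names CollectLevel, collect_level_from_env, and should_collect are hypothetical, and the default of "all" when the variable is unset or unrecognized is an assumption, not a decision made in this issue.

```python
import os
from enum import IntEnum


class CollectLevel(IntEnum):
    """Hypothetical ordering of the five levels, least to most detailed."""

    NONE = 0
    SESSION = 1
    REACTIVE_UPDATE = 2
    REACTIVITY = 3
    ALL = 4


def collect_level_from_env(default: CollectLevel = CollectLevel.ALL) -> CollectLevel:
    """Read SHINY_OTEL_COLLECT; fall back to `default` if unset or unrecognized."""
    raw = os.environ.get("SHINY_OTEL_COLLECT", "").strip().upper()
    try:
        return CollectLevel[raw]
    except KeyError:
        # Assumption: unknown values are ignored rather than raising.
        return default


def should_collect(required: CollectLevel, current: CollectLevel) -> bool:
    """A span is emitted only when the current level is at least `required`."""
    return current >= required
```

Ordering the levels as integers keeps the per-span check to a single comparison, which matters for the overhead target.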
Architecture
New public module shiny/otel/ containing:
- Core tracer/logger initialization
- Collection level management
- Context propagation utilities
- Span creation helpers
- Attribute extraction (source refs, HTTP metadata)
- Error handling and sanitization
- User-facing decorators and context managers
Dependencies
Phase 1: Add opentelemetry-api>=1.20.0 as a required dependency
Phase 10: Evaluate an optional dependency group instead (pip install shiny[telemetry])
Success Criteria
- OTel integration works with all 5 collection levels
- Session lifecycle spans include correct attributes (session ID, HTTP metadata)
- Reactive execution spans nest correctly under reactive update spans
- Async context propagation maintains correct parent-child relationships
- Users can control collection via environment variable and context managers
- Performance overhead < 50μs per span
- Test coverage > 90% for OTel code
- Complete documentation and example applications
- Compatible with standard OTel backends
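One way the per-span overhead criterion might be checked is a small timeit micro-benchmark. The sketch below uses a hypothetical no-op context manager (noop_span) as a stand-in; a real measurement would substitute whatever span helper shiny/otel/ ends up exposing.

```python
import timeit
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def noop_span(name: str) -> Iterator[None]:
    """Hypothetical stand-in for the real span helper; does no work."""
    yield


def per_span_overhead_us(number: int = 100_000) -> float:
    """Estimate the added cost per span in microseconds."""

    def with_span() -> None:
        with noop_span("reactive"):
            pass

    def baseline() -> None:
        pass

    # Take the minimum of several repeats to reduce scheduling noise.
    spanned = min(timeit.repeat(with_span, number=number, repeat=5))
    base = min(timeit.repeat(baseline, number=number, repeat=5))
    return (spanned - base) / number * 1e6
```

The result is machine-dependent, so a CI check against the 50μs target would likely need a generous margin.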
Sub-Issues
This epic is broken down into 10 phases, each corresponding to a sub-issue:
1. Foundation (Core OTel Infrastructure)
Set up basic OTel infrastructure with tracer/logger initialization and collection level management.
2. Session Lifecycle Instrumentation
Add OTel spans for session start/end and HTTP/WebSocket connections.
3. Reactive Flush Instrumentation
Add "reactive update" spans that wrap each flush cycle, serving as parent for all reactive spans.
4. Reactive Execution Instrumentation
Instrument individual reactive computations (calcs, effects, outputs) with descriptive labels and source attribution.
5. Value Updates and Logging
Log reactive value updates as OTel log events.
6. Error Handling and Sanitization
Record exceptions in spans with proper sanitization for sensitive information.
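As a rough illustration of sanitization, the sketch below builds span attributes for an exception and swaps the message for a generic placeholder unless the exception is explicitly marked safe. SafeError, GENERIC_MESSAGE, and span_error_attributes are hypothetical names; only the exception.type and exception.message keys come from the OTel semantic conventions.

```python
class SafeError(Exception):
    """Hypothetical marker: messages of this type may be exported verbatim."""


# Placeholder text; the actual wording is an assumption.
GENERIC_MESSAGE = "An error has occurred."


def span_error_attributes(exc: BaseException, sanitize: bool) -> dict[str, str]:
    """Build the attributes to attach to a span for `exc`.

    When `sanitize` is true, the message of any exception that is not a
    SafeError is replaced, so user data embedded in the message does not
    reach the telemetry backend.
    """
    if sanitize and not isinstance(exc, SafeError):
        message = GENERIC_MESSAGE
    else:
        message = str(exc)
    return {
        "exception.type": type(exc).__qualname__,
        "exception.message": message,
    }
```

The exception type is kept even when sanitizing, on the assumption that a class name alone is rarely sensitive and is useful for grouping errors.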
7. User-Facing API
Provide context managers and decorators for user control over OTel collection.
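A possible shape for the otel_collect context manager, sketched with a contextvar so that the override is scoped to the current task. The level names come from the Goals above; current_level, the validation behavior, and the default of "all" are assumptions for illustration.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterator

LEVELS = ("none", "session", "reactive_update", "reactivity", "all")

# Assumption: collection defaults to "all" when nothing overrides it.
_collect_level: contextvars.ContextVar[str] = contextvars.ContextVar(
    "otel_collect_level", default="all"
)


@contextmanager
def otel_collect(level: str) -> Iterator[None]:
    """Temporarily override the collection level for the enclosed code."""
    if level not in LEVELS:
        raise ValueError(f"unknown collection level: {level!r}")
    token = _collect_level.set(level)
    try:
        yield
    finally:
        # Restore whatever level was active before entering.
        _collect_level.reset(token)


def current_level() -> str:
    """Hypothetical accessor used by span helpers to decide what to emit."""
    return _collect_level.get()


# Usage as a context manager:
with otel_collect("session"):
    inside = current_level()


# Functions built with @contextmanager also work as decorators for
# synchronous callables:
@otel_collect("none")
def quiet() -> str:
    return current_level()
```

The decorator form shown here wraps only the call itself, so an async function would need a dedicated decorator that holds the override across the await.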
8. Testing Infrastructure
Add comprehensive tests for OTel functionality with in-memory exporters and fixtures.
9. Documentation
Document OTel features for users including API docs, user guide, and examples.
10. Follow-up Evaluation
Evaluate whether an optional dependency group is a better fit than a required dependency.
See linked sub-issues below for detailed implementation plans for each phase.
Implementation Notes
Non-Goals (Out of Scope)
- ASGI middleware for automatic HTTP tracing
- Metrics collection (only traces and logs)
- Custom span processors or exporters
- Integration with specific observability platforms
- Automatic instrumentation of third-party libraries
Open Questions
- Extended Tasks: Should we instrument shiny/reactive/_extended_task.py?
  - Decision: Add in Phase 4 if time permits, otherwise defer
- Bookmark Operations: Should bookmark save/restore be instrumented?
  - Decision: Defer to follow-up, focus on reactive core
- Module Namespacing: Use existing _current_namespace contextvar from shiny/_namespaces.py
- Performance Impact: Target < 50μs per span overhead
Critical Files Modified
Core:
- pyproject.toml
- shiny/__init__.py
- shiny/otel/* (all new files)
Integration Points:
- shiny/session/_session.py (lines ~599, ~719, ~1821)
- shiny/reactive/_core.py (line ~175)
- shiny/reactive/_reactives.py (lines ~180, ~235, ~305, ~575)
Testing:
- tests/pytest/test_otel_*.py (all new files)