Description
Objective
Improve WebSocket session rollover detection to eliminate multi-second dead connection periods and establish clear architectural guidelines for client reconnection responsibilities.
Origin Document
Discussion arising from an analysis of the WebSocket reconnection logic. Currently, a session rollover leaves client connections dead for several seconds, until the ping/pong timeout detects the failure. This creates poor UX and raises the broader question of which reconnection responsibilities belong to the server and which to the client.
Goals
- Eliminate Dead Connection Delays: Reduce session rollover detection time from 30+ seconds to near-instantaneous
- Establish Reconnection Architecture: Define clear boundaries between server and client responsibilities
- Align with Industry Standards: Follow established WebSocket patterns used by major providers
Deliverables
PATH Implementation Improvements
- Proactive Session Rollover Detection: Integrate existing sessionRolloverState monitoring into WebSocket bridge to immediately drop client connections when endpoints become unresponsive
- Configurable Ping/Pong Timeouts: Make WebSocket timeouts configurable and optimize for faster dead connection detection
- Enhanced Connection Monitoring: Add observability for session rollover-related disconnections
Architectural Decision & Documentation
- Reconnection Responsibility ADR: Document the decision that client-side reconnection is PATH's architectural standard, aligned with industry practices (Infura, Alchemy, QuickNode)
- WebSocket Connection Guide: Create developer documentation clarifying client reconnection expectations and providing implementation examples
- Subscription Re-establishment Patterns: Document best practices for clients handling Ethereum subscriptions across reconnections
Non-goals / Non-deliverables
- Server-Side Reconnection Logic: Will not implement transparent failover or server-managed reconnection
- Subscription State Tracking: Will not track/restore client subscriptions server-side
- Breaking API Changes: Will not modify existing WebSocket contracts
Technical Context
The current implementation uses a 30-second pong timeout, so when endpoints become unresponsive during a session rollover, the connection stays dead until that timeout expires. The existing sessionRolloverState in path/protocol/shannon/fullnode_session_rollover.go already monitors rollover windows, but it is not integrated with WebSocket connection management.
General deliverables
- Testing: Add unit tests for session rollover detection scenarios
- Configuration: Add timeout configuration options
- Documentation: Update WebSocket connection management docs
Creator: @commoddity
Co-Owners: @Olshansk @adshmh