Replies: 1 comment
Relates to #1478.
Background
For paritytech/polkadot#1532 I am investigating why connections initiated through Kademlia are kept alive for longer than the expected 10-second idle timeout. As far as I can tell, there is no way to learn why a connection is kept alive without intrusive changes to the libp2p source code. Ideally, I would like to expose this information as a Prometheus metric in Substrate.
Hacky solution
To help with debugging I extended `KeepAlive` with a protocol id indicating the protocol that would like to keep the connection alive. `ProtocolHandler`s that delegate to other `ProtocolHandler`s aggregate the `KeepAlive` values and pass the protocol id of the highest `KeepAlive` upwards.

Within `node_handler.rs` I could then log the id of the protocol that keeps the connection alive.

This led to #1698, which triggered the investigation for #1700.
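For readers who want a concrete picture of the aggregation described above, here is a minimal, self-contained sketch. It is not the actual patch: `KeepAlive` is re-declared locally as a simplified stand-in for the libp2p type, and `TaggedKeepAlive` and `aggregate` are hypothetical names introduced only for illustration.

```rust
use std::time::Instant;

/// Simplified local stand-in for libp2p's `KeepAlive` (an assumption, not the
/// real type). Variants are declared so that the derived ordering is
/// `No < Until(earlier) < Until(later) < Yes`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum KeepAlive {
    No,
    Until(Instant),
    Yes,
}

/// Hypothetical: a keep-alive wish tagged with the protocol that expressed it.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TaggedKeepAlive {
    keep_alive: KeepAlive,
    protocol: &'static str,
}

/// What a delegating handler might do: pick the highest `KeepAlive` among its
/// children and pass it upwards together with that child's protocol id.
/// Returns `None` if there are no children.
fn aggregate(children: impl IntoIterator<Item = TaggedKeepAlive>) -> Option<TaggedKeepAlive> {
    children.into_iter().max_by_key(|child| child.keep_alive)
}
```

With a ping handler reporting `KeepAlive::No` and a Kademlia handler reporting `KeepAlive::Until(..)`, `aggregate` would return the Kademlia entry, so the top-level handler could log its protocol id.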
Way forward
First of all: Do people feel the need to surface keep-alive information to the user? Or is the on-demand debugging through log lines good enough?
If we do want to expose that information, we need to find a consistent way to do so. One suggestion from my side would be to bubble up something like a `KeepAlive` event from the `NodeHandler` for each connection at a regular interval, recording the id of the protocol that is keeping the connection alive.
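To make the suggestion a bit more tangible, the sketch below shows one way such a periodic event could be produced. Everything in it is hypothetical: `KeepAliveEvent`, `KeepAliveReporter`, the `u64` connection id and the `poll` signature are illustrative assumptions, not existing libp2p APIs.

```rust
use std::time::{Duration, Instant};

/// Hypothetical event bubbled up from a connection's node handler.
#[derive(Debug, Clone, PartialEq, Eq)]
struct KeepAliveEvent {
    /// Illustrative connection identifier (assumed to be a plain integer here).
    connection: u64,
    /// Id of the protocol currently keeping the connection alive.
    protocol: &'static str,
}

/// Hypothetical helper that rate-limits reporting to one event per interval.
struct KeepAliveReporter {
    connection: u64,
    interval: Duration,
    next_report: Instant,
}

impl KeepAliveReporter {
    fn new(connection: u64, interval: Duration, now: Instant) -> Self {
        Self { connection, interval, next_report: now + interval }
    }

    /// Returns an event at most once per interval, and only if some protocol
    /// currently wants to keep the connection alive.
    fn poll(&mut self, now: Instant, keeping_alive: Option<&'static str>) -> Option<KeepAliveEvent> {
        if now < self.next_report {
            return None;
        }
        self.next_report = now + self.interval;
        keeping_alive.map(|protocol| KeepAliveEvent { connection: self.connection, protocol })
    }
}
```

A consumer such as Substrate could then turn these events into a metric labelled by protocol id. Passing `now` in explicitly keeps the sketch independent of any particular async runtime or timer.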