Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make gossip routines use TrySend, not Send #3186

Open
Tracked by #3245
ValarDragon opened this issue Jun 4, 2024 · 1 comment
Open
Tracked by #3245

Make gossip routines use TrySend, not Send #3186

ValarDragon opened this issue Jun 4, 2024 · 1 comment
Assignees
Labels
consensus enhancement New feature or request

Comments

@ValarDragon
Copy link
Collaborator

Feature Request

Currently our gossip routines for vote and block gossip use Send not TrySend. Send is worse than TrySend in terms of our net CPU overhead due to instantiating multiple timers. The way select statements in golang work, is that every argument gets evaluated, and timer.NewTimer takes overhead to instantiate, and significant overhead to our scheduler to create. Its not clear to me we are ever benefitting in the gossip routine by using Send though. Lets think through the cases:

  • Send does not block - Pretty clear to me that we wasted a timer allocation, and would have been better off doing a TrySend.
  • Send blocks (channel buffer full, and we wait) - We would be better off not waiting, and retrying our loop on a more "refreshed" peer state, processing new inbound has votes. We will be more likely to get something helpful to the peer. Channel buffers are per peer x channel combo, so with modest capacity (which both votes and block parts have), we should not be hitting buffers. In the off chance we did, it should clear quickly.

Problem Definition

Here is where we see it adding overhead to us:
image
(About 10% of the current gossip routine CPU time, but other concurrent work lowering overhead)

evidence of the go scheduler being high overhead right now. So likely we'd benefit from a few less timers.
image

Proposal

Change gossip routines to use TrySend and not Send.

Is there any suggested QA process for this?

@ValarDragon ValarDragon added enhancement New feature or request needs-triage This issue/PR has not yet been triaged by the team. labels Jun 4, 2024
@cason cason added consensus and removed needs-triage This issue/PR has not yet been triaged by the team. labels Jun 11, 2024
@cason cason self-assigned this Jun 11, 2024
@cason
Copy link
Contributor

cason commented Jun 11, 2024

Hey, I have in my backlog checking the usage of Send and TrySend in the consensus reactor.

In particular because when Send() fails, the connection is broken (the timeout is like 10s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
consensus enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants