Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kgo: broadcast batch finishes in one big blast #878

Merged
merged 1 commit into from
Jan 20, 2025
Merged

kgo: broadcast batch finishes in one big blast #878

merged 1 commit into from
Jan 20, 2025

Conversation

twmb
Copy link
Owner

@twmb twmb commented Dec 25, 2024

Previously, we would broadcast after every individual record finished in a batch. If we produced 5000 records concurrently but only allowed 1000 in flight, we would first send a few large batches, and then once they started finishing, we would:

  • Finish a record
  • Wake up a waiting record
  • That record would race into producing
  • The producing wouldn't linger even if we wanted to, because other records are blocked
  • That record would be produced alone in one batch immediately
  • Goto start

This forced unnecessary synchronization and caused things to produce slowly.

By doing one broadcast at the end of a batch, we actually give more "space" for the client to buffer before waking up everything waiting. The first goroutine awoken will still produce a small batch, but we will reach a point where the client is buffering larger batches.

Overall, this speeds things up for a niche case and is not detrimental in any way to the non-niche case.

Closes #876.

Previously, we would broadcast after every individual record finished in
a batch. If we produced 5000 records concurrently but only allowed 1000
in flight, we would first send a few large batches, and then once they
started finishing, we would:
* Finish a record
* Wake up a waiting record
* That record would race into producing
* The producing wouldn't linger even if we wanted to, because other
  records are blocked
* That record would be produced alone in one batch immediately
* Goto start

This forced unnecessary synchronization and caused things to produce
slowly.

By doing one broadcast at the end of a batch, we actually give more
"space" for the client to buffer before waking up everything waiting.
The first goroutine awoken will still produce a small batch, _but_ we
will reach a point where the client is buffering larger batches.

Overall, this speeds things up for a niche case and is not detrimental
in any way to the non-niche case.
if broadcast {
p.c.Broadcast()
}
}()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we validate this through a test?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's to validate? This is a change from doing a broadcast many times to doing it once -- the test is that things still work (which existing tests cover).

What can't really be tested is if this improves the problem you're running into.

Copy link

@StarpTech StarpTech Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me. I was able to verify the fix so I thought it can be verified as a test as well. A test could validate if producing is actually fast enough.

@twmb twmb added the patch label Jan 8, 2025
@twmb twmb mentioned this pull request Jan 15, 2025
10 tasks
@twmb twmb merged commit 9d27aac into master Jan 20, 2025
8 checks passed
@twmb twmb deleted the 876 branch January 20, 2025 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unusual behaviour when batching
2 participants