Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PQ: avoid blocking writer when precisely full #16176

Merged
merged 2 commits into from
May 22, 2024

Conversation

yaauie
Copy link
Member

@yaauie yaauie commented May 21, 2024

Release notes

Fixes an issue in which an input writing to a PQ could become blocked indefinitely after writing an event that precisely fit the remaining capacity of a PQ configured to handle a single PQ-page of events.

What does this PR do?

A PQ is considered full (and therefore needs to block before releasing the
writer) when its persisted size on disk exceeds its queue.max_bytes
capacity.

This removes an edge-case preemptive block when the persisted size after
writing an event meets its queue.max_bytes precisely AND its current
head page has insufficient room to also accept a hypothetical future event.

A block formed in this way would only be broken the next time a tail page
is fully-acked, which cannot occur when the queue.max_bytes is configured
to match queue.page_capacity.

Why is it important/What is the impact to the user?

Users who configure their queue.max_bytes to match queue.page_capacity do so to eliminate buffering, and this fix eliminates an edge-case that could cause their PQ to block indefinitely.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

How to test this PR locally

See: #16172

Related issues

Fixes: #16172

A PQ is considered full (and therefore needs to block before releasing the
writer) when its persisted size on disk _exceeds_ its `queue.max_bytes`
capacity.

This removes an edge-case preemptive block when the persisted size after
writing an event _meets_ its `queue.max_bytes` precisely AND its current
head page has insufficient room to also accept a hypothetical future event.

Fixes: elastic#16172
Copy link

elastic-sonarqube bot commented May 21, 2024

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

Copy link
Member

@jsvd jsvd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, tested manually and confirmed the patch fixes the issue, and that the test needs the patch to Queue.java to pass.

@yaauie yaauie merged commit ea93086 into elastic:main May 22, 2024
6 of 7 checks passed
@yaauie yaauie deleted the pq-exact-capacity-full branch May 22, 2024 15:23
@yaauie
Copy link
Member Author

yaauie commented May 22, 2024

@logstashmachine backport 8.14

github-actions bot pushed a commit that referenced this pull request May 22, 2024
* pq: avoid blocking writer when queue is precisely full

A PQ is considered full (and therefore needs to block before releasing the
writer) when its persisted size on disk _exceeds_ its `queue.max_bytes`
capacity.

This removes an edge-case preemptive block when the persisted size after
writing an event _meets_ its `queue.max_bytes` precisely AND its current
head page has insufficient room to also accept a hypothetical future event.

Fixes: #16172

* docs: PQ `queue.max_bytes` cannot be less than `queue.page_capacity`

(cherry picked from commit ea93086)
yaauie added a commit that referenced this pull request May 22, 2024
* pq: avoid blocking writer when queue is precisely full

A PQ is considered full (and therefore needs to block before releasing the
writer) when its persisted size on disk _exceeds_ its `queue.max_bytes`
capacity.

This removes an edge-case preemptive block when the persisted size after
writing an event _meets_ its `queue.max_bytes` precisely AND its current
head page has insufficient room to also accept a hypothetical future event.

Fixes: #16172

* docs: PQ `queue.max_bytes` cannot be less than `queue.page_capacity`

(cherry picked from commit ea93086)

Co-authored-by: Ry Biesemeyer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pq writer can become stuck after precisely filling queue max events
4 participants