
reporting: Serialize batches as they're constructed #1227

Merged
merged 3 commits into main from sharnoff/reporting-streaming-compression on Mar 7, 2025

Conversation

@sharnoff (Member) commented Jan 29, 2025

Part of #1220.

For scaling events, the size of a gzip-compressed batch is much less than the uncompressed version, so it's actually worth it to compress the JSON-encoded contents of the batch as we go, instead of buffering the events as a []ScalingEvent in memory.

Implementation

This change primarily adds the reporting.BatchBuilder[E] type (and some implementations), which abstracts over the process of gradually building the serialized bytes from a batch of events.

Two implementations of BatchBuilder are provided in pkg/reporting: one for large JSON arrays, and one for the JSON lines format.
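As a rough sketch (the exact signatures live in pkg/reporting and may differ; Add here matches the panic-on-error behavior discussed in the review below):

```go
// Hedged sketch of the interface, not the exact pkg/reporting definition.
type BatchBuilder[E any] interface {
	// Add appends one event to the in-progress serialized batch.
	Add(event E)
	// Finish completes the batch and returns the serialized bytes:
	// a single JSON array like [{...},{...}], or JSON lines with one
	// object per line, depending on the implementation.
	Finish() []byte
}
```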

To allow continuous gzip compression, both of these JSON batch builders are constructed with a new IOBuffer type, which is just an io.Writer with a method to collect the bytes at the end. There are implementations of IOBuffer for plain bytes and for a gzip-compressed buffer.
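Sketched out, that type might look like this (the method name Collect is a guess at "a method to collect the bytes at the end"):

```go
// Hedged sketch; the collection method's real name may differ.
type IOBuffer interface {
	io.Writer
	// Collect finalizes the buffer (for gzip, flushing and closing the
	// compressor) and returns everything written so far.
	Collect() []byte
}
```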

All that together means that batches of billing and scaling events should now be gzip-compressed (where possible) as events are added to the batches, dramatically reducing the memory footprint.
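To make the flow concrete, here is a self-contained toy version of the idea (gzipBuffer, jsonLinesBuilder, and scalingEvent are illustrative names, not the pkg/reporting code):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// gzipBuffer plays the role of a gzip-backed IOBuffer: every write is
// compressed immediately, so the batch never sits uncompressed in memory.
type gzipBuffer struct {
	buf bytes.Buffer
	gz  *gzip.Writer
}

func newGzipBuffer() *gzipBuffer {
	b := &gzipBuffer{}
	b.gz = gzip.NewWriter(&b.buf)
	return b
}

func (b *gzipBuffer) Write(p []byte) (int, error) { return b.gz.Write(p) }

func (b *gzipBuffer) Collect() []byte {
	if err := b.gz.Close(); err != nil {
		panic(fmt.Sprintf("unexpected error closing gzip writer: %v", err))
	}
	return b.buf.Bytes()
}

// jsonLinesBuilder plays the role of a BatchBuilder: each event is
// JSON-encoded and written straight through the compressor, one per line.
type jsonLinesBuilder[E any] struct {
	w *gzipBuffer
}

func (b *jsonLinesBuilder[E]) Add(event E) {
	line, err := json.Marshal(event)
	if err == nil {
		_, err = b.w.Write(append(line, '\n'))
	}
	if err != nil {
		// Writes target an in-memory buffer and aren't expected to fail;
		// see the review discussion below about panicking here.
		panic(fmt.Sprintf("unexpected error adding event to batch: %v", err))
	}
}

func (b *jsonLinesBuilder[E]) Finish() []byte { return b.w.Collect() }

type scalingEvent struct {
	VMName string `json:"vm_name"`
	CPU    int    `json:"cpu"`
}

func main() {
	builder := &jsonLinesBuilder[scalingEvent]{w: newGzipBuffer()}
	builder.Add(scalingEvent{VMName: "vm-1", CPU: 2})
	builder.Add(scalingEvent{VMName: "vm-2", CPU: 4})
	fmt.Printf("compressed batch: %d bytes\n", len(builder.Finish()))
}
```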


Note: This PR builds on #1226, and must not be merged before it.

github-actions bot commented Jan 29, 2025

No changes to the coverage.


@sharnoff sharnoff force-pushed the sharnoff/reporting-streaming-compression branch from a4afcb3 to 5fdd4ad Compare January 29, 2025 11:56
@Omrigan Omrigan self-requested a review February 10, 2025 17:30
@sharnoff sharnoff force-pushed the sharnoff/reporting-batching branch from 95a12a0 to b6d3c65 Compare February 13, 2025 11:57
@sharnoff sharnoff force-pushed the sharnoff/reporting-streaming-compression branch from 5fdd4ad to 5956ccd Compare February 13, 2025 11:57
@stradig stradig requested a review from petuhovskiy February 18, 2025 15:34
@sharnoff sharnoff force-pushed the sharnoff/reporting-batching branch from b6d3c65 to a8e7dfe Compare February 26, 2025 17:58
@sharnoff (Member Author) commented

Needs to be updated after rebasing #1226. Final commit is still ok-ish to review in the meantime, but ok to leave until after I'm done.

@sharnoff sharnoff self-assigned this Feb 26, 2025
Base automatically changed from sharnoff/reporting-batching to main February 26, 2025 19:46
@sharnoff sharnoff force-pushed the sharnoff/reporting-streaming-compression branch from 4a2bfd6 to 4062128 Compare March 3, 2025 12:29
@sharnoff sharnoff removed their assignment Mar 3, 2025
@petuhovskiy (Member) left a comment

(non-blocking) I don't really like panics when handling errors from io.Writer, but it should be fine as long as we know that the byte buffer and gzip implementations shouldn't return errors.

@petuhovskiy petuhovskiy assigned sharnoff and unassigned petuhovskiy Mar 4, 2025
@sharnoff (Member Author) commented Mar 4, 2025

> I don't really like panics when handling errors from io.Writer (non-blocking), but should be fine as long as we know that byte buffer and gzip implementation should not return errors.

That's fair. My thinking is that there isn't really anything we can do if we get an error there: it's a buffer in memory, so retrying won't help. Should we just drop the events? IMO it's better to panic so that the error bubbles up into something noisier, in the hope that restarting the autoscaler-agent resolves the issue.

Probably worthwhile to at least clarify that in the code. WDYT?
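(For instance, a comment along these lines; just a sketch of the clarification being discussed, not the actual diff:)

```go
if err != nil {
	// The write targets an in-memory buffer, so retrying can't help and
	// there's nothing useful to do with the error locally. Panic so the
	// failure is loud; restarting the autoscaler-agent is the expected fix.
	panic(fmt.Sprintf("unexpected error writing to batch buffer: %v", err))
}
```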

@petuhovskiy (Member) commented

> there isn't really anything we can do if we get an error there

There are always two options:

  1. propagate the error up
  2. handle the error

In terms of handling, it's true that we can't do much here. The only thing we can do is to drop the event and collect context about the error (what was the event, what was the write that failed, etc).

In this particular case, I'd probably change the definition of BatchBuilder to this:

```go
type BatchBuilder[E any] interface {
	// Add appends an event to the in-progress batch. If it fails, there are no guarantees
	// about the state of the batch and it should be dropped.
	Add(event E) error
	// Finish completes the in-progress batch, returning the events serialized as bytes.
	Finish() []byte
}
```

Then I'd propagate the error up, probably to EventSink. The error handling there would be to drop the batch, log the error, and maybe increase a metric for the errors. I don't like the panic in JSONLinesBuilder and JSONArrayBuilder because it might actually get triggered if someone adds a new IOBuffer implementation in a future PR. And if it gets triggered, it might break things unexpectedly if there is no explicit recover().

But for now we know that there are only ByteBuffer and GZIPBuffer, so I'm OK with merging as is.
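(For illustration, a hedged sketch of that alternative, building on the error-returning Add above; EventSink's fields and the zap/prometheus usage here are assumptions, not the actual code:)

```go
// Sketch only: drop the broken batch, log, and count it instead of panicking.
func (s *EventSink[E]) add(event E) {
	if err := s.builder.Add(event); err != nil {
		// After a failed Add, the batch's state is undefined, so it must
		// be dropped; record enough context to debug the failure.
		s.logger.Error("dropping batch: failed to add event", zap.Error(err))
		s.droppedBatches.Inc() // hypothetical prometheus.Counter
		s.builder = s.newBuilder()
	}
}
```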

@sharnoff sharnoff enabled auto-merge (squash) March 7, 2025 09:21
@sharnoff sharnoff merged commit 0f8030a into main Mar 7, 2025
35 checks passed
@sharnoff sharnoff deleted the sharnoff/reporting-streaming-compression branch March 7, 2025 09:43
Omrigan pushed a commit that referenced this pull request Mar 7, 2025