
feat: Significantly decrease startup times for WAL #25643

Open · wants to merge 2 commits into main from mgattozzi/startup-times
Conversation

@mgattozzi (Contributor) commented on Dec 11, 2024:

feat: add startup time to logging output

This change adds a startup time counter to the output when starting up
a server. The main purpose is to verify whether changes actually speed
up the loading of the server.

feat: Significantly decrease startup times for WAL

This commit does a few important things to speed up startup times:

1. We avoid converting an Arc<str> to a String for the series key, as
   the From<String> impl calls with_column, which turns it back into an
   Arc<str>. Instead we can call `with_column` directly and pass in the
   iterator without also collecting into a Vec<String>.
2. We switch to bitcode as the serialization format for the WAL. This
   significantly reduces startup time, as this format is much faster
   than the JSON we used before, which was eating up massive amounts of
   time. Part of this change involves not using serde's tag feature, as
   it is currently not supported by bitcode.
3. We parallelize reading and deserializing the WAL files before
   applying them in order. This reduces time spent waiting on IO, and we
   eagerly evaluate each spawned task in order as much as possible.

This gives us about a 189% speedup over what we were doing before.

Closes #25534
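The first point can be sketched in isolation. The `TableDefinition` and `with_column` names follow the PR description, but the simplified signatures below are assumptions for illustration, not the actual influxdb3 types:

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for the real table-definition type.
#[derive(Default)]
struct TableDefinition {
    columns: Vec<Arc<str>>,
}

impl TableDefinition {
    // Accepting any iterator of items convertible to Arc<str> lets callers
    // pass values straight through without an intermediate Vec<String>.
    fn with_column<I, S>(mut self, names: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<Arc<str>>,
    {
        self.columns.extend(names.into_iter().map(Into::into));
        self
    }
}

fn main() {
    let series_key: Vec<Arc<str>> = vec!["host".into(), "region".into()];

    // Slow path (what the PR removes): Arc<str> -> String -> Arc<str>,
    // one extra allocation per column plus a collected Vec<String>.
    let strings: Vec<String> = series_key.iter().map(|k| k.to_string()).collect();
    let slow = TableDefinition::default().with_column(strings);

    // Fast path: pass cheap Arc clones straight through the iterator.
    let fast = TableDefinition::default().with_column(series_key.iter().cloned());

    assert_eq!(slow.columns, fast.columns);
    println!("{:?}", fast.columns); // → ["host", "region"]
}
```

The key design point is the generic `S: Into<Arc<str>>` bound: it makes the extra `String` round trip unnecessary rather than merely avoided at one call site.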
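The third point — read and deserialize in parallel, apply in order — can be sketched with std threads and a fake line-based WAL format; the real code uses influxdb3's async runtime and serialized WAL contents, so everything below is a simplified assumption:

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;

// Spawn one task per WAL file so reads and parsing overlap, then join the
// handles in their original order so entries are still applied sequentially.
fn replay(paths: Vec<PathBuf>) -> Vec<String> {
    let handles: Vec<_> = paths
        .into_iter()
        .map(|p| thread::spawn(move || fs::read_to_string(p).expect("read WAL file")))
        .collect();

    let mut applied = Vec::new();
    for h in handles {
        // Joining in order eagerly consumes each task's result as soon as
        // it is ready, while later files are still being read.
        for line in h.join().expect("task panicked").lines() {
            applied.push(line.to_string()); // "apply" = collect, in this sketch
        }
    }
    applied
}

fn main() {
    // Write two tiny fake WAL files into a temp dir for demonstration.
    let dir = std::env::temp_dir().join("wal_demo");
    fs::create_dir_all(&dir).unwrap();
    let a = dir.join("000001.wal");
    let b = dir.join("000002.wal");
    fs::write(&a, "write m1\nwrite m2\n").unwrap();
    fs::write(&b, "write m3\n").unwrap();

    let ops = replay(vec![a, b]);
    assert_eq!(ops, ["write m1", "write m2", "write m3"]);
    println!("{:?}", ops);
}
```

Ordering is preserved by the join loop, not the threads: tasks may finish in any order, but their results are consumed file-by-file.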

@mgattozzi force-pushed the mgattozzi/startup-times branch from 5e83ae9 to a32801b on December 11, 2024 18:49
@mgattozzi mgattozzi changed the title feat: Significantly increase startup times for WAL feat: Significantly decrease startup times for WAL Dec 11, 2024
@hiltontj (Contributor) commented:

Curious why you chose bitcode. One of their project non-goals is:

"Stable format across major versions"

i.e., they do not guarantee stability of the format across major versions. I know we have a bit of leeway with WAL files, since they are only kept around for a relatively short amount of time, but given that statement I think we would need some kind of versioning and upgrade process in place if we ever need to update our bitcode version.

If any of the other serde-compatible formats offer stability and similar performance gains, it might be worth considering those (Avro, MessagePack, etc.).

@mgattozzi (Contributor, Author) replied:

Bitcode was the fastest according to these benchmarks, which was the main draw. I think we can get around versioning issues by adding a version to the header that we already check to verify the WAL is okay, and then using that version to choose how to deserialize the data. We would need this anyway if we ever wanted to make breaking changes to the file format and handle them without the user needing to know.
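The version-in-header idea could look roughly like this; the magic bytes, version numbers, and layout below are hypothetical illustrations, not influxdb3's actual WAL header:

```rust
// Hypothetical header constants for the sketch.
const MAGIC: &[u8; 4] = b"WAL\0";
const VERSION_JSON: u8 = 1;
const VERSION_BITCODE: u8 = 2;

fn write_header(buf: &mut Vec<u8>, version: u8) {
    buf.extend_from_slice(MAGIC);
    buf.push(version);
}

// Validates the magic bytes and returns the version so the caller can
// pick the matching deserializer (JSON for old files, bitcode for new).
fn check_header(bytes: &[u8]) -> Result<u8, String> {
    let (magic, rest) = bytes.split_at(4.min(bytes.len()));
    if magic != MAGIC {
        return Err("not a WAL file: bad magic".into());
    }
    match rest.first().copied() {
        Some(v @ (VERSION_JSON | VERSION_BITCODE)) => Ok(v),
        Some(v) => Err(format!("unsupported WAL version {v}")),
        None => Err("truncated header".into()),
    }
}

fn main() {
    let mut file = Vec::new();
    write_header(&mut file, VERSION_BITCODE);
    // ...serialized WAL contents would follow the 5-byte header...
    assert_eq!(check_header(&file), Ok(VERSION_BITCODE));
    assert!(check_header(b"oops").is_err());
    println!("version = {}", check_header(&file).unwrap());
}
```

With a scheme like this, bumping the bitcode dependency across a format-breaking release just means allocating a new version number and keeping the old decoder around for one release cycle.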

@hiltontj (Contributor) replied:

Bitcode was the fastest according to these benchmarks which was the main draw.

It seems like the others that perform similarly in those benchmarks are a bit too obscure.

We would need this also for breaking changes to the file format anyways if we wanted to do that and handle it without the user needing to know

That is true. Also, looking at some of the others, e.g., bincode and apache_avro, it seems like they have similar statements about stability, so it's something we probably need to handle regardless of the format we choose.

@hiltontj (Contributor) left a review:

I had one comment on the profile change. Otherwise, the only thing to mention is that it would be good to track an issue for versioning the WAL file in the header.

@@ -182,6 +183,7 @@ inherits = "release"
codegen-units = 16
lto = false
incremental = true
debug = 1
@hiltontj commented on the diff:
I think that the quick-bench profile below provides this, i.e., quick-release with debug = 1; would you be okay leaving this out in favour of that?

Successfully merging this pull request may close these issues.

Make WAL load files in parallel on startup
2 participants