feat: Significantly decrease startup times for WAL #25643
base: main
Conversation
This change adds a startup time counter to the output when starting up a server. The main purpose is to verify whether changes actually speed up the loading of the server.
This commit does a few important things to speed up startup times:

1. We avoid converting an Arc<str> to a String for the series key, as the From<String> impl calls with_column, which turns it back into an Arc<str>. Instead we can call with_column directly and pass in the iterator without also collecting into a Vec<String>.
2. We switch to bitcode as the serialization format for the WAL. This significantly reduces startup time, as this format is much faster than JSON, which was eating up massive amounts of time. Part of this change involves not using serde's tag feature, as it is currently not supported by bitcode.
3. We parallelize reading and deserializing the WAL files before applying them in order. This reduces time spent waiting on IO, and we eagerly evaluate each spawned task in order as much as possible.

This gives us about a 189% speedup over what we were doing before.

Closes #25534
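The parallel replay in point 3 can be sketched as follows. This is an illustrative sketch only, not the actual influxdb3 implementation (which uses its own WAL types and async runtime): work is spawned per file so reads and deserialization overlap, but results are joined in spawn order so entries are still applied sequentially.

```rust
use std::thread;

// Sketch: read and deserialize each WAL file on its own thread, then apply
// the results strictly in file order. `replay_in_order` and the string
// payloads are hypothetical stand-ins for the real WAL contents.
fn replay_in_order(files: Vec<String>) -> Vec<String> {
    // Spawn one reader per file; `handles` preserves file order.
    let handles: Vec<_> = files
        .into_iter()
        .map(|path| thread::spawn(move || format!("decoded:{path}")))
        .collect();

    // Join in spawn order so WAL entries are applied sequentially, even
    // though the expensive read + deserialize work ran concurrently.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let applied = replay_in_order(vec!["wal.0".into(), "wal.1".into(), "wal.2".into()]);
    assert_eq!(applied, vec!["decoded:wal.0", "decoded:wal.1", "decoded:wal.2"]);
}
```

The key property is that ordering is recovered for free by joining handles in the order they were spawned, so no extra sorting or sequencing machinery is needed.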
Force-pushed from 5e83ae9 to a32801b (compare)
Curious why you chose bitcode, i.e., they do not guarantee stability of the format across major versions. I know we have a bit of leeway with WAL files since they are only kept around for a relatively short amount of time, but I think we would need to have some kind of versioning and upgrade process in place if we ever need to update our serialization format.
Bitcode was the fastest according to these benchmarks, which was the main draw. I think we can get around versioning issues by adding a version to the header that we already check to verify the WAL is okay, and then checking which version to use to deserialize. We would need this anyway for any breaking change to the file format, if we wanted to handle it without the user needing to know.
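The version-in-header idea above could look something like this. A minimal sketch with hypothetical names and a hypothetical header layout (the real WAL header format is not shown in this thread): a version byte is read first and selects the decoder, so an old format keeps working while new files use the faster one.

```rust
// Hypothetical version-gated WAL decoding; `decode_wal` and `WalError`
// are illustrative names, not the influxdb3 API.
#[derive(Debug, PartialEq)]
enum WalError {
    UnsupportedVersion(u8),
    TooShort,
}

fn decode_wal(bytes: &[u8]) -> Result<String, WalError> {
    // The version byte is checked before any payload is interpreted.
    let (&version, payload) = bytes.split_first().ok_or(WalError::TooShort)?;
    match version {
        1 => Ok(format!("legacy JSON payload, {} bytes", payload.len())),
        2 => Ok(format!("bitcode payload, {} bytes", payload.len())),
        v => Err(WalError::UnsupportedVersion(v)),
    }
}

fn main() {
    assert!(decode_wal(&[2, 0xde, 0xad]).is_ok());
    assert_eq!(decode_wal(&[9]), Err(WalError::UnsupportedVersion(9)));
}
```

Rejecting unknown versions explicitly, rather than guessing the format, is what makes later breaking changes safe to roll out without user intervention.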
Seems like others that perform similarly in their benchmarks are a bit too obscure.
That is true. Also, looking at some others, e.g.,
I had one comment on the profile change. Otherwise, the only thing to mention is that it would be good to track an issue for versioning the WAL file in the header.
@@ -182,6 +183,7 @@ inherits = "release"
 codegen-units = 16
 lto = false
 incremental = true
+debug = 1
I think that the quick-bench profile below provides this, i.e., quick-release + debug = 1. Would you be okay to leave this out in favour of that?
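For context, such a profile could look roughly like the following in Cargo.toml. This is a sketch assuming the quick-release settings shown in the diff above; the actual quick-bench profile in the repository may differ.

```toml
# Fast-compiling release-like profile (settings taken from the diff above).
[profile.quick-release]
inherits = "release"
codegen-units = 16
lto = false
incremental = true

# Hypothetical sketch: same profile plus line-level debug info for profiling,
# instead of adding `debug = 1` to quick-release itself.
[profile.quick-bench]
inherits = "quick-release"
debug = 1
```

Keeping the debug info in a separate inheriting profile leaves quick-release builds smaller for everyone not profiling.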
feat: add startup time to logging output
This change adds a startup time counter to the output when starting up
a server. The main purpose is to verify whether changes actually speed
up the loading of the server.
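The startup-time counter described above boils down to timing initialization and logging the elapsed duration. A minimal sketch using std::time::Instant; the actual log call and message in influxdb3 may differ.

```rust
use std::time::Instant;

fn main() {
    // Record a timestamp before any server initialization work begins.
    let startup = Instant::now();

    // ... server initialization would happen here (catalog load, WAL replay, ...) ...

    // Log how long startup took once the server is ready to serve requests.
    let elapsed = startup.elapsed();
    println!("Server started in {elapsed:?}");
    assert!(elapsed.as_secs() < 60);
}
```

Measuring from a single Instant taken as early as possible is what makes the number trustworthy for before/after comparisons like the 189% figure below.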
feat: Significantly decrease startup times for WAL
This commit does a few important things to speed up startup times:

1. We avoid converting an Arc<str> to a String for the series key, as
   the From<String> impl calls with_column, which turns it back into
   an Arc<str>. Instead we call with_column directly and pass in the
   iterator without also collecting into a Vec<String>.
2. We switch to bitcode as the serialization format for the WAL.
   This significantly reduces startup time, as this format is much
   faster than JSON, which was eating up massive amounts of time.
   Part of this change involves not using serde's tag feature, as
   it is currently not supported by bitcode.
3. We parallelize reading and deserializing the WAL files before
   applying them in order. This reduces time spent waiting on IO, and
   we eagerly evaluate each spawned task in order as much as possible.

This gives us about a 189% speedup over what we were doing before.
Closes #25534
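The Arc<str> point in item 1 is worth illustrating. A sketch with a hypothetical with_column_direct helper (not the real influxdb3 API): cloning an Arc<str> only bumps a reference count, whereas round-tripping through String copies the bytes and allocates a fresh Arc.

```rust
use std::sync::Arc;

// Hypothetical stand-in for passing an iterator of keys straight through,
// mirroring calling with_column directly instead of collecting Vec<String>.
fn with_column_direct(cols: impl Iterator<Item = Arc<str>>) -> Vec<Arc<str>> {
    // Cloning an Arc<str> only increments a refcount; no string data is copied.
    cols.collect()
}

fn main() {
    let key: Arc<str> = Arc::from("series_key");

    // Wasteful path: Arc<str> -> String -> Arc<str> copies the bytes and
    // ends up with a distinct allocation.
    let round_trip: Arc<str> = Arc::from(key.to_string());
    assert!(!Arc::ptr_eq(&key, &round_trip));

    // Direct path: the original allocation is shared unchanged.
    let cols = with_column_direct(std::iter::once(key.clone()));
    assert!(Arc::ptr_eq(&key, &cols[0]));
}
```

On a hot startup path replaying many WAL entries, skipping that per-key copy adds up, which is why item 1 matters even though it looks like a small refactor.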