(Alternative build) Possible performance regression in v24.3 #63500
Comments
cc @al13n321
Didn't you mean v24.3?
@jorisgio yes, thanks for pointing that out!
Did you happen to collect any profiles? From the CH profiler or perf records, whatever.
@nickitat yes, we collected the top trace from the CH profiler: top_cpu_trace_before_upgrade.txt I also collect
The page cache should not be enabled by default. We will disable it and backport this change.
Describe the situation
After upgrading our baseline ClickHouse version to v24.3, we observe a significant increase in P95 and P99 latency. It is possibly caused by the new userspace page cache feature.
This is not an official ClickHouse build. Nevertheless, we didn't add any new code in this release; we only upgraded the baseline from v24.2 to v24.3.2.1 and observed the performance regression.
We checked the profiles but could not find any concrete issue (the queries just run longer, with more traces recorded by the profiler). After testing some concrete queries, we suspect that under a highly concurrent workload, queries on v24.3 more often do not read from the OS page cache.
That led us to #53770. We tried reverting the PR, and query latency went back to normal.
Maybe someone on the core team can look at this PR if you have time.
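For anyone who wants to compare tail latency between two builds in a similar way, here is a minimal sketch of a P95/P99 comparison. It assumes you have per-query latencies exported as plain lists of numbers; the nearest-rank percentile method and the function names are illustrative choices, not what our monitoring actually uses.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number percentiles stay exact.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def tail_latency(samples):
    """Return the P95 and P99 latency of one set of query timings."""
    return {"p95": percentile(samples, 95), "p99": percentile(samples, 99)}


def regression_ratio(before, after):
    """Ratio of tail latency after the upgrade to tail latency before it.
    Values well above 1.0 suggest a regression at that percentile."""
    b, a = tail_latency(before), tail_latency(after)
    return {key: a[key] / b[key] for key in b}
```

Running `regression_ratio(before, after)` on samples taken under the same concurrent workload on each build gives one number per percentile, which is easier to track across a revert than eyeballing a graph.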