New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of magit-insert-recent-commits
#5075
base: main
Are you sure you want to change the base?
Conversation
From my debugging, `range' here is usually something like "HEAD~10..HEAD" (up to the value of `magit-log-section-commit-count'), which of course means 'find me everything reachable from HEAD that isn't reachable from HEAD~10'. We also pass "-n10" as a further limiter. This appears to be duplicative, so prefer "-n10" alone. "HEAD~10..HEAD" not only requires Git to parse HEAD~10, but also requires reachability logic at each step when walking the graph. Performance of this reachability logic is improved with tools like `git-commit-graph', but it is not what I'd consider 'solved'. On a branch with ~3400000 commits (~3700000 commits in --all), this is a significant improvement (tested with `magit-refresh-verbose'): current implementation: magit-insert-recent-commits 0.884695 !! magit-insert-recent-commits 0.508423 !! magit-insert-recent-commits 0.357324 !! magit-insert-recent-commits 0.360272 !! magit-insert-recent-commits 0.356748 !! new implementation: magit-insert-recent-commits 0.125902 !! magit-insert-recent-commits 0.125966 !! magit-insert-recent-commits 0.125386 !! magit-insert-recent-commits 0.125143 !! magit-insert-recent-commits 0.125195 !! This change has also been tested on a repository with <10 commits and had no issues that I could see.
Now that we're not using `range' in the `magit--insert-log' call, we don't need to have a value for `range' until its needed as the default value of the section. This saves a process-start and nominal compute for the string manipulation in `format'/`concat'.
magit-insert-recent-commits
We've added the range in 55a61d1. |
Fascinating! Your explanation in that commit message makes good sense. Should this optimization be made conditional on whether |
Maybe, though such a solution might depend on the user knowing that this is a bottleneck and that disabling that argument is needed. I'll experiment with this eventually, but currently I trying to get some other things done. I don't have access to a humongous monorepo, so maybe you could run some benchmarks. (Using |
To be specific, 'all the variants' here means at least the following, right?
I'll have that shortly. Based on how long I've been in It struck me (while watching the christmas tree that is |
Turns out it wasn't Results are in. Old, one-off numbers:
Cached (ran the script twice in quick succession; this is the second run):
Averaged numbers (10 iterations) #!/usr/bin/zsh
function time_it {
for i in {1..10}
do
git --no-pager log --oneline $@
done
}
for args in "HEAD~10..HEAD" "--graph HEAD~10..HEAD" "-n10" "-n10 HEAD" "-n10 HEAD~10..HEAD" "-n10 --graph" "-n10 --graph HEAD" "-n10 --graph HEAD~10..HEAD" "HEAD~20..HEAD" "--graph HEAD~20..HEAD" "-n20" "-n20 HEAD" "-n20 HEAD~20..HEAD" "-n20 --graph" "-n20 --graph HEAD" "-n20 --graph HEAD~20..HEAD"
do
echo +$args
time (time_it ${=args})
done Processed output below. Divide timings by 10 to get the per-run timings.
Sorted by
100 iterations. script (note #!/usr/bin/zsh
function time_it {
for i in {1..$1}
do
git --no-pager log --oneline ${@:2} >/dev/null
done
}
# time git commit-graph write --changed-paths
for args in "HEAD~10..HEAD" "--graph HEAD~10..HEAD" "-n10" "-n10 HEAD" "-n10 HEAD~10..HEAD" "-n10 --graph" "-n10 --graph HEAD" "-n10 --graph HEAD~10..HEAD" "HEAD~20..HEAD" "--graph HEAD~20..HEAD" "-n20" "-n20 HEAD" "-n20 HEAD~20..HEAD" "-n20 --graph" "-n20 --graph HEAD" "-n20 --graph HEAD~20..HEAD"
do
echo -n $args
time (time_it 100 ${=args})
done Output:
Sorted:
The fact that |
Please include the results for 100 and 1000 commits. You can skip the |
Still including the HEAD variants because, well, I've seen weirder things happen with this migration project. Things that are conceptually identical can go down different codepaths. Providing HEAD for example requires Git to effectively rev-parse it rather than having the hard-coded HEAD already understood. The consistent differences between different calling patterns below seem a fitting testament to how weird it can get.
|
Any other metrics I can add here? The numbers above seem to support the idea that removing the range improves performance. It's been a number of years since 55a61d1; perhaps Git's rev-walking logic has changed in that time? |
Sorry for not following up. I'm rather busy getting Forge into a releasable state, so that I can finally release "everything". |
Related to my investigation in #2982.
See commits for details.