I noticed that Mooncake is significantly slower for simple affine expressions that involve large matrices. For instance:
using LinearAlgebra

n = 1000
A = LowerTriangular(randn(n, n))
b = randn(n)
function f(x)
    y = A * x + b
    sum(abs2, y)
end
For this function, Mooncake is more than an order of magnitude slower than Zygote, which is a bit surprising to me:
julia> prep = prepare_gradient(f, AutoMooncake(; config=nothing), randn(1000))
julia> @benchmark DifferentiationInterface.value_and_gradient(f, prep, AutoMooncake(; config=nothing), randn(1000))
BenchmarkTools.Trial: 499 samples with 1 evaluation per sample.
 Range (min … max):  8.292 ms … 15.307 ms  ┊ GC (min … max): 0.00% … 10.96%
 Time  (median):     9.601 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   10.017 ms ± 1.128 ms  ┊ GC (mean ± σ):  1.36% ± 4.24%
▁▁▂█▅▁▃
▂▁▁▂▃▄▄▄▆███████▅▅▃▄▃▂▂▁▂▃▃▂▃▃▃▄▃▄▃▃▂▃▃▂▃▃▂▂▃▂▃▃▂▁▁▃▂▁▁▁▁▁▂ ▃
8.29 ms Histogram: frequency by time 14 ms <
Memory estimate: 15.34 MiB, allocs estimate: 108.
julia> prep = prepare_gradient(f, AutoZygote(), randn(1000))
julia> @benchmark DifferentiationInterface.value_and_gradient(f, prep, AutoZygote(), randn(1000))
BenchmarkTools.Trial: 7491 samples with 1 evaluation per sample.
 Range (min … max):  490.286 μs … 3.431 ms  ┊ GC (min … max): 0.00% … 75.51%
 Time  (median):     604.661 μs             ┊ GC (median):    0.00%
 Time  (mean ± σ):   664.965 μs ± 262.917 μs  ┊ GC (mean ± σ):  7.18% ± 12.41%
▅█▆▄▃▃▃▁ ▁
▃▅▇████████▇█▆▆▆▆▃▄▅▅▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▇▆▇██▇▇█▇▆ █
490 μs Histogram: log(frequency) by time 1.96 ms <
Memory estimate: 7.67 MiB, allocs estimate: 48.
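For anyone wanting to confirm that both backends return the same gradient before comparing timings, the function above has a simple closed form, ∇f(x) = 2 Aᵀ(A x + b). The snippet below is a minimal sketch of that reference gradient, checked against a central finite difference. The names `grad_f`, `g`, `v`, `h`, and `fd` are introduced here for illustration and are not part of the original report.

```julia
using LinearAlgebra

n = 1000
A = LowerTriangular(randn(n, n))
b = randn(n)

# Same objective as in the report, written as a one-liner.
f(x) = sum(abs2, A * x + b)

# Closed-form gradient of the quadratic: 2 * A' * (A * x + b).
grad_f(x) = 2 .* (A' * (A * x + b))

x = randn(n)
g = grad_f(x)

# Directional check along the normalized gradient, so the expected
# directional derivative equals norm(g) and cannot be close to zero.
v = g ./ norm(g)
h = 1e-4
fd = (f(x + h .* v) - f(x - h .* v)) / (2h)

println("analytic: ", dot(g, v), "  finite difference: ", fd)
```

The output of either AD backend can then be compared against `grad_f(x)` with `isapprox`.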
Thanks for opening this issue -- I've just done a quick benchmark locally, and it looks like my rule for trmv! is actually the culprit -- it appears to be really quite slow, even though it's type-stable. I'll take a proper look tomorrow.
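One way to check from the outside whether the triangular path is responsible is to run the same objective with a dense copy of the matrix, which should go through a different matrix-vector routine than trmv!. The following is only a sketch under that assumption. It reuses the DifferentiationInterface calls from the report; `make_f`, `A_tri`, `A_dense`, and the timing variables are hypothetical names, and it assumes Mooncake, DifferentiationInterface, and BenchmarkTools are installed.

```julia
using LinearAlgebra
using BenchmarkTools
using DifferentiationInterface
import Mooncake

n = 1000
A_tri = LowerTriangular(randn(n, n))
A_dense = Matrix(A_tri)   # same values, stored as a plain dense matrix
b = randn(n)

# Build the objective as a closure so the matrix type is the only difference.
make_f(A, b) = x -> sum(abs2, A * x + b)
f_tri = make_f(A_tri, b)
f_dense = make_f(A_dense, b)

backend = AutoMooncake(; config=nothing)
x = randn(n)

prep_tri = prepare_gradient(f_tri, backend, x)
prep_dense = prepare_gradient(f_dense, backend, x)

# Minimum elapsed time in seconds for each variant.
t_tri = @belapsed value_and_gradient($f_tri, $prep_tri, $backend, $x)
t_dense = @belapsed value_and_gradient($f_dense, $prep_dense, $backend, $x)

println("LowerTriangular: ", t_tri, " s, dense Matrix: ", t_dense, " s")
```

If the dense variant is much faster, that would be consistent with the slowdown being specific to the triangular rule rather than to the rest of the function.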