Performance issue with sum #238

Open
keflavich opened this issue Jun 1, 2021 · 4 comments

Comments

@keflavich

Working on a modest-size cube, I found that scipy.ndimage.sum is ~100-300x faster than dask_image.ndmeasure.sum_labels.

import numpy as np, scipy.ndimage
blah = np.random.randn(19,512,512)
msk = blah > 3
lab, ct = scipy.ndimage.label(msk)
%timeit scipy.ndimage.sum(msk, labels=lab, index=range(1, ct+1))
# 117 ms ± 2.85 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

vs

from dask_image import ndmeasure
rslt = ndmeasure.sum_labels(msk, label_image=lab, index=range(1, ct+1))
rslt
# dask.array<getitem, shape=(6667,), dtype=float64, chunksize=(1,), chunktype=numpy.ndarray>
rslt.compute()
# [########################################] | 100% Completed | 22.9s

Note also that building the task graph alone takes nontrivial time:

%timeit ndmeasure.sum_labels(msk, label_image=lab, index=range(1, ct+1))
# 15.4 s ± 2.02 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

While I understand that there ought to be some cost to running this processing through a graph with dask, this seems excessively slow. Is there a different approach I should be taking, or is this a bug?
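For reference, one rough way to gauge how much of this is sheer graph size (a sketch using the arrays from the snippet above, assuming a dask version where collections expose __dask_graph__(); not part of the original report):

rslt = ndmeasure.sum_labels(msk, label_image=lab, index=range(1, ct+1))
# materialize the high-level graph and count tasks; the count grows with the number of labels,
# which would line up with both the slow build and the slow compute
n_tasks = len(dict(rslt.__dask_graph__()))
print(n_tasks)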

@GenevieveBuckley
Collaborator

For modest-size cubes, the recommended approach is to use numpy/scipy directly, instead of Dask. This is the first point on the Dask best practices page.

Because all the individual tasks are very fast to run, the overhead involved with Dask becomes overwhelming. First, the task graph itself is very time-consuming to build. Then, the scheduler can't distribute the tasks onto the workers efficiently, so you'll see lots of blank white space between teeny tiny tasks on the task stream graph in the dask dashboard.
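If anyone wants to reproduce that task stream view, a minimal sketch (assumes dask.distributed is installed and rslt is the dask array from the example above; the Client here is just a local cluster):

from dask.distributed import Client
client = Client()                 # local cluster
print(client.dashboard_link)      # open this in a browser, then watch the Task Stream panel
rslt.compute()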

I've done some very haphazard profiling for the example you give above, and it seems like some of the slowness might be due to slicing. There is work currently being done on high-level graphs in dask that might help with this problem, but I couldn't say how much it would impact this specific circumstance.
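A rough way such profiling of the graph-construction step could be done (not necessarily how it was done here; assumes the names from the example above are in the session):

import cProfile
# profile only building the graph (no compute); look for slicing/getitem entries near the top
cProfile.run("ndmeasure.sum_labels(msk, label_image=lab, index=range(1, ct+1))", sort="cumtime")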

Also, I notice that labeled_comprehension has a dask.array.stack call in it, which isn't a very efficient operation. So that might also be worth looking into (or I might just be unreasonably suspicious of this one thing).
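For context, the chunksize=(1,) in the output above is what you would get from a per-label-reduction-then-stack pattern along these lines (an illustration of the structure only, not the actual dask-image implementation):

import dask.array as da
d_msk = da.from_array(msk, chunks=msk.shape)   # single chunk, just for illustration
d_lab = da.from_array(lab, chunks=lab.shape)
# one small reduction per label id, then everything stacked into a 1-d array with chunksize (1,)
per_label = [da.sum(d_msk * (d_lab == i)) for i in range(1, ct + 1)]
stacked = da.stack(per_label)                  # shape (ct,); with ct ~ 6667 this means thousands of graph branches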

@keflavich
Author

Thanks for checking. Of course it makes sense that dask would be somewhat slower than scipy for a cube that fits in memory; I was just shocked that the difference was this big.

The graph building seems to be pretty heavy on its own. I ran a much smaller example and got the graph below. It doesn't look unreasonable, but I guess it's costly to assemble, and for the ~6000-label example above it was very unwieldy.
[attached image: task graph for the smaller example]

@GenevieveBuckley
Collaborator

Thanks for checking. Of course it makes sense that dask would be somewhat slower than scipy for a cube that fits in memory; I was just shocked that the difference was this big.

That's pretty reasonable - it is a very big difference!

There is a lot of work happening on high-level graphs in Dask, to try to reduce how costly it is to build the task graphs (or at least, shift some of that cost onto the workers instead of the client). I'll bring this use case up as an example of something we could try to tackle after slicing and array overlaps.

@sommerc
Contributor

sommerc commented Nov 26, 2024

Hi @keflavich,

I was running into a similar performance issue with the sum_labels function. In addition, my arrays are multi-channel and big...

Here's a little workaround using dask + numba.jit, in case this is still of interest to you.

Cheers,
Chris

https://gist.github.com/sommerc/e72b59bee50a8b01efb53f57b95a1c97
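For readers who can't follow the link, one possible shape of such a dask + numba workaround is sketched below. This is an assumption-laden illustration, not necessarily what the gist does; label_sums and _block_label_sums are hypothetical names. The idea: compute per-chunk partial sums with a jitted loop (one task per chunk instead of one or more per label), then add the partial vectors together.

import numpy as np
import dask
import dask.array as da
from numba import njit

@njit
def _block_label_sums(values, labels, n_labels):
    # accumulate the sum of `values` for each label id within one chunk
    out = np.zeros(n_labels + 1, dtype=np.float64)
    v = values.ravel()
    l = labels.ravel()
    for i in range(v.size):
        out[l[i]] += v[i]
    return out

def label_sums(values, labels, n_labels):
    # one task per chunk, rather than a graph that scales with the number of labels
    d_vals = da.asarray(values).astype(np.float64)
    d_labs = da.asarray(labels).rechunk(d_vals.chunks)
    partials = [
        dask.delayed(_block_label_sums)(v, l, n_labels)
        for v, l in zip(d_vals.to_delayed().ravel(), d_labs.to_delayed().ravel())
    ]
    total = dask.delayed(sum)(partials)   # elementwise sum of the per-chunk vectors
    return dask.compute(total)[0][1:]     # drop the background label 0

# usage with the arrays from the original example:
# sums = label_sums(msk, lab, ct)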

(tagging also @m-albert, as we discussed this at I2K)
