-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallel bulk loading #107
Comments
Parallel bulk loading is something I've actually implemented a long while ago. I then discarded the changes as I thought that the break-even point (number of elements where parallel load is faster than sequential load) was too high and that the maintenance burden wasn't worth the additional code. Reading through my old code, I've written that the break-even point would be about 40000 elements... I guess I never considered back then that people would use rtrees with millions of elements. Or maybe this is just a typo. In any case, the parallel bulk load got removed in commit eae7354 . I have reverted that commit and have fixed all compilation issues. Would you be interested in testing the implementation? If this feature benefits your use case, it may be worthwhile to keep it. For testing, checkout this branch: https://github.com/Stoeoef/rstar/tree/bulk-load-parallel Then, enable the "threadpool" feature and use Note that there's still some work to do, even if the parallel algorithm works just fine - e.g. checking the test coverage of the parallel algorithms and probably renaming the feature to something more descriptive. I can look into that if the changes look promising. Performance wise, the threading-split is fairly coarse grained - only the root node is loaded in parallel. However, since the root will have a lot of children with the config you are using, you may still see some good speed up. |
I know its much to ask for, but I think it would be nice if this used Rayon's thread pool to better integrate into code bases already using it for data parallel processing. I am not sure, but the algorithm might even be amenable to using Rayon's |
I'm not bound to the I also tried using I think a real good speed-up could be gained by performing selection in parallel. I believe that that is, with some margin, the part where the implementation spends most time in. Currently, this is done using https://doc.rust-lang.org/stable/std/primitive.slice.html#method.select_nth_unstable . I'm not aware of any library that implements a parallel implementation for that unfortunately. |
FWIW, there's a rayon issue about |
just gave the parallel-bulk-loading branch a test drive and:
gives quite a nice speedup even if it didn't even get close to saturating all 32 threads of my CPU |
Thanks for testing this! Good to see that this still gives a nice speed up. I think that's enough justification to warrant a change, even if the algorithm isn't ideal yet. For me, the remaining TODOs would be:
@adamreichold : Is there anything else you would consider to be necessary? |
In my opinion, the most important part of using Rayon would be to use same thread pool as other parts of a program that already uses Rayon. Of course, if we just want to spawn tasks onto Rayon's thread pool, a dependency on I think this change would be minimal but I could certainly try to provide it as a follow-up PR since I asked for it if there is no reason against it.
For such algorithms, I do like property-based tests of the sequential versus the parallel implementation. I am not sure though how powerful such tests would be as I suspect some non-deterministic ordering of the results of the parallel variant? |
Ah, that's a good argument for using rayon - color me convinced! I agree, the change will be fairly minimal.
I think property test should still be doable. The "only" difference between the outcomes of the parallel and the sequential solution should be the order of the tree's root nodes. That should hardly be noticeable. For this ticket I would suggest to adapt |
Hello,
for a project of mine, a long distance route plotter for the game Elite: Dangerous, I'm using an R*-Tree as my main data structure for nearest-neighbor queries, my data set contains about 87 million 3D-Points with some metadata attached, and constructing an R*-Tree with the following parameters:
takes around 30 seconds on my machine, query performance is excellent at ~23M queries per second distributed across 32 threads
i have done a quick skim over the bulk-loading code and I've also skimmed the paper it is based on bit it goes a bit over my head. my main question is: would it be feasible to parallelize the bulk loading using something like rayon to improve performance?
I've experimented with splitting my data up and constructing multiple trees i can query in parallel but didn't have too much success with that.
best regards,
Earthnuker
The text was updated successfully, but these errors were encountered: