-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to use RTree as a set? #82
Comments
Yeah,
The use-case for
Pls. do provide a PR if possible for |
We could provide a bulk_load_unique method which uses e.g. a HashSet to dedupe the provided vec (elements have to be Copy, and it's an O(n) op) and we wouldn't have to tweak the insertion logic otherwise. It's growing the API, but I tend to think of things in terms of OGC geoms where we don't tend to see much duplication, but that's actually a narrow use case, so if ppl are OK with a new method (no dupes potentially produces a higher-quality tree too)… |
In this case, wouldn't it be preferable to keep the deduplication outside of the |
I think offering a deduplication method based on sorting would be a nice addition. It should also be fairly straight forward to implement. I've also looked briefly into the bulk loading code and don't see a good way to do the de-duplication "on the fly". The algorithm really expects that all elements of the source collection will become an element in the final r-tree. |
One open question for me: Should we implement The latter would give the caller a few more options, e.g.
I would also be fine with an |
I think an entry-style API is definitely preferable but would possibly also imply more work. I think there are also a few complications for set-style in contrast to map-style usage, e.g. you write Also, the standard library's and hashbrown's |
For the record, I went with the simple solution where I just hacked stuff together using the existing API, and I'm getting quite good performance for my use case. For the union we can remove duplicates in fn union(lval: RTree, mut rval: RTree) -> RTree {
lval.iter().for_each(|p| {
rval.remove(p);
});
RTree::bulk_load(lval.iter().chain(rval.iter()).copied().collect());
} And for inserting I just check if it's already a member as I mentioned above. After looking at the code for |
I am trying to do the following operations on an RTree so that I can use it as a set:
RTree
s without duplicatesFor the first operation this can be done with existing methods on
RTree
:However, while I don't know the internals of
RTree
I'm guessing this approach traverses the tree twice when it's only necessary to traverse it once. Am I right that we could do this operation more efficiently if we worked with the internals of RTree, for example adding aninsert_unique
method? The above approach is probably good enough for my use case anyways.The second operation is a little harder. Here is my code for merging two trees if I didn't care about duplicates:
From there I guess I could iterate over the tree checking whether each element equals it's second nearest neighbor to remove duplicates, but this feels hacky. I'd imagine that if we had some sort of
insert_unique
method, then we could rewritebulk_load
usinginsert_unique
in place ofinsert
to get a method that makes a tree without duplicates.Another alternative is I can just accept duplicates in my set. Then in order to pop / remove an element I would have to remove all instances like this:
which wouldn't be too costly if there's only two or three of each element at most.
So I think my options for using RTree as a set are:
RTree
, probably getting a half or a third the performance of what's possible.RTree
to do these operations more efficiently. I could either structure this as a wrapper struct aroundRTree
in my own crate, or contribute this to rstar. I'd imagine I'd have to be the one to contribute this to rstar, since I think this use case is not a very common one.So what are people's thoughts? Is anyone else interested in having an
RTreeSet
in rstar? How much worse will my performance be if I continue using the hacky lazy approach?The text was updated successfully, but these errors were encountered: