Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Large Persisted Trees #50

Open
anlumo opened this issue Aug 26, 2020 · 2 comments
Open

Support for Large Persisted Trees #50

anlumo opened this issue Aug 26, 2020 · 2 comments

Comments

@anlumo
Copy link

anlumo commented Aug 26, 2020

Hello,

My use case involves a data structure that can be 1GB or more in size and has to be persisted in permanent storage (aka a database). Since it's millions of small objects, using bulk insert is probably not a good idea with O(n*log n) complexity.

I could just serialize the whole tree into a single blob and write that to disk. However, besides doubling the memory requirements, every single change would also make it necessary to serialize everything again.

A typical solution for B-trees is to store the non-leaf nodes as separate objects on disk and load them on demand. This should also be possible with R*trees, and ParentNode even implements Serialize and Deserialize. However, I don't see a way to construct an RTree object from this, and this would also need accessors for adding, removing and modifying nodes (and it even has to be async since it involves I/O).

Is there a chance that this feature might be implemented in this crate?

@rmanoka
Copy link
Contributor

rmanoka commented Aug 1, 2021

The RTree struct implements serde traits when "serde" feature is enabled. I suppose you're suggesting a more fine-grained control over which RTreeNode to serialize, and to be able to assemble / modify the tree dynamically? While not impossible, this sounds complicated as many guarantee / invariants would have to be verified or assumed on the pieces to be put together.

For starters, can you refer us to a B-Tree or another RTree library that allows such fine-grained modification of the subtrees? The requirements sounds more like a filesystem data-structure, which tends to be a very specialized implementation.

You may also want to explore PostGIS if you haven't; it is more of a database solution for spatial indexing, and seems more tuned to solve your use-case.

@anlumo
Copy link
Author

anlumo commented Aug 16, 2021

Thank you for the response! I think you understood my idea correctly, and yes, sqlite is an example of this kind of implementation for a B-Tree.

Unfortunately, we cannot ship PostgreSQL with our desktop application, but thank you for the pointer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants