-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NeighborList vs Index + Distance Matrices #25
Comments
IIRC the reason was just ergonomics. In some downstream applications, I had to re-sort by indices instead of distance, and it was easier to do so with a vector of pairs. Also it's easier to ensure that users supply a valid index and distance pair, whereas with separate containers, you always have to validate that each index has an associated distance. Come to think of it, a few releases ago, I had switched to I'm not sure how much improvement to cache locality there might be. In most situations, you would want both the index and the distance, so you end up having to pull them both into cache anyway. With a pair, you'd get them both at once, whereas it might incur two separate look-ups if they were in separate containers. (But I don't really know for sure.) Anyway, if you think that a pair of matrices is better for your application, you can easily just roll your own loop. I do that myself in the R package bindings, at least when Edit: just realized this is an issue on the umappp repo, in which case you might as well know that the resort-by-index occurs in |
Ergonomics makes a lot of sense looking at |
I think so, but I translated it from the uwot C++ code, which in turn was translated from the original Python code; I've never actually looked at the latter myself. If I had to guess, it would have been translated from Also, FWIW, the |
Hi Aaron,
I was curious regarding the choice of the struct
Which is a
std::vector<std::vector<std::pair<Index_, Float_>>>
Over a simple
std::pair<Matrix<Index_>, Matrix<Float_>>
of appropriate dimensions.Maybe it was as result of porting the code, but I would expect the latter option to have better cache locality and maybe even slightly better memory usage.
The text was updated successfully, but these errors were encountered: