-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
distance_2 is problematic for integer types #48
Comments
Having an associated value can help for some cases, but wouldn't that be "postponing" the problem? Let's say that we do something like this: let point_a: [i32; 2] = [-2147483648, -2147483648]; // (-2^31, -2^31)
let point_b: [i32; 2] = [2147483647, 2147483647]; // (2^31-1, 2^31-1)
let points = vec![point_a, point_b];
let t: RTree<[i32; 2]> = RTree::bulk_load(points);
// Setting the return type to i64 to fit the intermediate result
let nn: Option<[i64; 2]> = t.nearest_neighbor(&point_a); The intermediate distance computation wouldn't fit either, and at this point we've run out of options since Disclaimer from my part: dealing with integers is a pain when you have arbitrarily large values.
|
One could use i128, but I see the problem. Btw. maybe my issue is invalid? After all, I could have used i64 to begin with. (And in my use case values never go beyond 64k). |
Re: removing integers, I meant generally throughout the library. That would allow users to handle arbitrarily large or small numbers without caring for the types of any intermediate calculations. But going through this route can be risky, there's a lot of things to consider. As you said, the thing users can do now is use the data type that can handle both the data itself and the intermediate computations that the library performs. But this can be improved IMHO (after all, users should not have to deal with the internals of the library), so I wouldn't flag this issue as invalid. |
The problem with using even f64 for internal calculations is, that there's a loss of precision: I can imagine that for some use cases this bad. And num_bigint might affect performance? |
Yeah, there are several cases in which floating point manifest their weaknesses. @Stoeoef mentioned in spade that the project wasn't focused on performance but rather "on a rich and high quality feature set". Maybe that's the focus for this library as well? If so, we might be allowed to consider other libraries to handle numeric types and abstract the details away from users. We could also allow users to provide their own |
In my use case, performance is quite important as our spatially explicit simulations often spend almost a third of their CPU time on spatial index look-ups. Hence, I am personally rather keen on making the choice of the appropriate data type, e.g. double or single precision floating-point in addition to integer or floating point, and would very probably not be able to stomach the performance hit of using arbitrary precision integers internally.
We already do via the fn distance_2(
&self,
point: &Point
) -> f64 {
self.r.powi(2) + point.r.powi(2) - 2. * self.r * point.r * (self.phi - point.phi).cos()
} |
Interesting... |
The support for integer types is something I'd like to keep - they work really well with rtrees those don't require any multiplication or division operations. However, I do acknowledge that this is problematic for large values. Something like For now, the most cost efficient solution would IMO be:
|
Is it enough for distance_2 to return a type whose ord implementation agrees with Pythagorean distance? X is the bits, I.e. 32 or size you can do this without worrying about overflow by doing a single iteration of karatsuba multiplication (like this https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3fa5db12977a74edc7c4d79feb70ef8b) Then you impl ord by comparing the high, mid, low of self and other in sequence. this can be done completely generic over the number type |
Ie. a distance of
1,000,000
fits intoi32
, but squared it will not. Maybe have an associated type for the return value, defaulting toPoint::Scalar
?The text was updated successfully, but these errors were encountered: