Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cone_coverage_approx - > unexpected results #14

Open
Theodlz opened this issue Feb 8, 2025 · 6 comments
Open

cone_coverage_approx - > unexpected results #14

Theodlz opened this issue Feb 8, 2025 · 6 comments

Comments

@Theodlz
Copy link

Theodlz commented Feb 8, 2025

Hi! I've been trying to use this crate for cone-search like operations. Here is an issue I ran into: I define my Layer at depth = 4, then use the cone_coverage_approx() methods to find the healpix that overlap with a cone at ra=291.319724, dec=-82.309285, radius = 300.0/3600.0 (300 arcsec). However, the pixels returned are erroneous, and not at the right depth.

Here's a quick example:

use cdshealpix::nested::get;
    let n = get(4 as u8);
    let (ra, dec) = (291.3197242629231_f64, -82.30928537900044_f64);
    let radius: f64 = 300.0 / 3600.0;
    let ipixes = n.cone_coverage_approx(ra.to_radians(), dec.to_radians(), radius.to_radians());
    println!("{:?}", ipixes);
    for healpix in ipixes.iter() {
        println!("healpix: {}", healpix);
        let (ra, dec) = n.center(*healpix);
        println!("ra: {}, dec: {}", ra.to_degrees(), dec.to_degrees());
    }

    // compute the healpix of the ra/dec
    let ipix = n.hash(ra.to_radians(), dec.to_radians());
    println!("ra: {}, dec: {}, healpix: {}", ra, dec, ipix);

This yields the error Wrong hash value: too large., when I try to compute the ra/dec of the resulting healpix, which makes sense as the result of the cone search is [11274, 11298], which clearly isn't at depth 4! even if the result of the cone search says that the max depth is 4.

I also tried doing let n = get(3 as u8) (3 instead of 4) to see if I could get the overlapping pixels at depth 4 this way, and I get [2818,2826], when the results should be [2818, 2824]. So it is indeed at depth + 1, but the results don't look quite right.

I compared these results with healpy and astropy_healpix, and the cdshealpix ones appear incorrect, and to be at the requested depth + 1. Any idea what might be happening? I wonder if this error forward propagates into mocpy as well?

@Theodlz
Copy link
Author

Theodlz commented Feb 8, 2025

Ok so it looks like if I take these results that appear to be at depth+1, get their ra/dec, then recompute the healpix at the requested depth, they are correct.

So the main problem here is that one requests the healpix at order N, and gets them at order N+1. If I request them at N-1 to hopefully get the results I would expect for N, they are incorrect though.

@Theodlz
Copy link
Author

Theodlz commented Feb 8, 2025

I extracted the logic from the cone search approx method and run that, I do find the right pixels that way. I wonder if it's once the resulting pixels gets passed as a BMOC builder object that something funky or undocumented happens?

@fxpineau
Copy link
Member

@Theodlz ,
I think the problem is that cdshealpix is not compliant with the "principle of least astonishment" for people used to other libraries.

A BMOC stores HEALPix cells at various depth/order. The method iter returns the raw values stored in the BMOC. A raw value encodes 3 values:

  • the cell depth/order;
  • the cell hash/ipix value;
  • a flag telling if the cell is fully covered for sure or not by the cone.

I have to check whether the methods to extract the 3 values from a raw BMOC value are public or not (and update the doc accordingly).

The method to get the list of cell number at the BMOC maximum depth (the depth used in the cone_coverage method), is
flat_iter. See the example in the cone_coverage_approx doc.

Thus, if you replace iter by flat_iter in your code, you should get the expected result.
(To lower the number of false positives, you may use cone_coverage_approx_custom)

@Theodlz
Copy link
Author

Theodlz commented Feb 11, 2025

@Theodlz , I think the problem is that cdshealpix is not compliant with the "principle of least astonishment" for people used to other libraries.

A BMOC stores HEALPix cells at various depth/order. The method iter returns the raw values stored in the BMOC. A raw value encodes 3 values:

  • the cell depth/order;
  • the cell hash/ipix value;
  • a flag telling if the cell is fully covered for sure or not by the cone.

I have to check whether the methods to extract the 3 values from a raw BMOC value are public or not (and update the doc accordingly).

The method to get the list of cell number at the BMOC maximum depth (the depth used in the cone_coverage method), is flat_iter. See the example in the cone_coverage_approx doc.

Thus, if you replace iter by flat_iter in your code, you should get the expected result. (To lower the number of false positives, you may use cone_coverage_approx_custom)

thanks a lot for the detailed answer @fxpineau. And I see the flat_iter, that provides the healpix at the max depth right? That said I still don't get why just printing the bmoc shows me the healpix at max depth + 1. I edited the cdshealpix rust code to print the healpix that are being added to the bmoc, and they aren't at max depth + 1, they are indeed at the max depth. So, in that case, why printing them at a depth that isn't the one they were calculated at? Doesn't that mean you need to run one more computation just for that print? Might make sense that just printing the bmoc shows the data up to the max depth or at least show the data that was used to fill said bmoc object?

@fxpineau
Copy link
Member

@Theodlz ,
If you try with a larger radius (leading to cells having a depth < depth_max) and/or different positions, you should see that the cells printed by printing a BMOC do not necessarily fit with the requested cone's cells at order depth_max + 1.

Internally a BMOC is made of two attributes: its maximum depth and the list of raw encoded cells (at possibly various depth, see previous answer for more details).
And cdshealpix so far uses the default Debug display automatically implemented using the macro #[derive(Debug)].

In your specific case, the raw encoded cell numbers are simply made of the cell numbers left shifted of two bits, with the LSB encoding the flag (0 in your case) and the 2nd LSB being a sentinel bit (set to 1) encoding the depth.
It look's very similar to the cell number at depth +1 (also a left shift of two bits, the two bits encoding the 4 possible sub-cell number), hence the confusion I guess.

@Theodlz Theodlz changed the title cone_coverage_approx - > erroneous results cone_coverage_approx - > unexpected results Feb 11, 2025
@Theodlz
Copy link
Author

Theodlz commented Feb 11, 2025

I see, the Debug trait makes sense. What confused me @fxpineau is https://github.com/cds-astro/cds-healpix-rust/blob/master/src/nested/mod.rs#L3559, which converts the healpix found to the original depth, and then pushes the healpix to the bmoc builder https://github.com/cds-astro/cds-healpix-rust/blob/master/src/nested/mod.rs#L3566. By this logic, the healpix that are in there would be at the requested depth (I added a print in that loop adding them to the bmoc builder, and I can confirm that they are). So, is it somewhere in the bmoc builder that they are converted to healpix of another depth?

Perhaps this is what is happening here?

fn build_raw_value(depth: u8, hash: u64, is_full: bool, depth_max: u8) -> u64 {
// Set the sentinel bit
let mut hash = (hash << 1) | 1_u64;
// Shift according to the depth and add space for the flag bit
hash <<= 1 + ((depth_max - depth) << 1);
// Set the flag bit if needed
hash | (is_full as u64) // see https://doc.rust-lang.org/std/primitive.bool.html
}
That is where the left shifting you mentioned happens right? If so, then yes this makes sense to me.

Perhaps, the real "issue" here is that it is not immediately clear when reading the rust documentation that the flat_iter() is necessary, to get the expected values. It's easy to just do an iter() or into_iter() since those are rusts's usual iterators, and to get u64 values that arent't those expected. Maybe just mentioning that it should be used to read the output of said method (say, in the docstring of the method itself) might help? I know it's mentioned here (https://docs.rs/cdshealpix/0.7.3/cdshealpix/nested/bmoc/struct.BMOC.html#method.flat_iter), but if you're just learning how to use the package, but a gently nudge in that direction would avoid a lot of confusion I believe!

Feel free to close that issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants