Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference frames (GEO => gcrs, ICRS and ITRF realizations, etc.) #371

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

FedeMPouzols
Copy link
Collaborator

Fixes #339.


Note well:

This contribution is made under the current ALMA software agreements. 
(c) European Southern Observatory, 2024, 2025
Copyright by ESO (in the framework of the ALMA collaboration)

@FedeMPouzols FedeMPouzols linked an issue Mar 13, 2025 that may be closed by this pull request
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@FedeMPouzols FedeMPouzols mentioned this pull request Mar 13, 2025
@kettenis
Copy link
Contributor

Hi @FedeMPouzols,

I agree the "GEO" -> "gcrs" translation is correct. The Kaplan 2006 paper referenced by the astropy docs explicitly states that

VLBI observations are unique in that they exist only after data from various individual antennas
are combined; therefore they are referred to the GCRS ab initio.
and we currently do set the "spectral frame" to "GEO" for VLBI observations in CASA (in importfitsidi).

As for adding optional ICRS/ITRS frame realizations, this may be "nice to have" but it will be difficult to properly populate this information since I don't think VLBI correlators keep track of the relevant metadata in a systematic way. Note that ITRF relates to ITRS as ICRF relates to ICRS, and astropy uses "itrs" in its API. Doesn't that mean that MSv4 should use "itrs" instead of "ITRF"?

Somewhat related to this discussion, we recently discovered that VLBI datasets in MSv2 format are mislabeled as "J2000" in CASA, resulting in FITS images that are incorrectly labelled as using the "fk5/J2000" reference frame. In reality modern VLBI is all done in ICRS and we're fixing that on the CASA side. @Jan-Willem, that may mean we need to tweak the reference VLBI data sets to reflect this.

This also makes me question the use as "fk5" as the default reference frame in a lot of the MSv4 documentation and code. I realize that the difference between "fk5/J2000" and ICRS are small for most instruments that aren't VLBI arrays. But modern radio instruments really should be using a reference frame based on radio sources (even the IAU says so). The VLA is excused because it pre-dates the establishment of the ICRS, but that isn't true for ALMA. I certainly think that we as a communuty didn't do our job properly if we see datasets coming out of SKA that use the "fk5" reference frame.

@FedeMPouzols
Copy link
Collaborator Author

Are we using "fk5" as default in some places? At least in the schema the defaults seem "icrs" (although those only play a role if one uses those classes to construct an MSv4). One place where I can see a default value being assumed, when no metadata is found in the input MSv2 is here: https://github.com/casangi/xradio/blob/main/src/xradio/measurement_set/_utils/_msv2/create_field_and_source_xds.py#L205 and they assume "ICRS".

Nobody is perfect. Apparently ALMA made a very similar mislabellng in cycles < 3. There is this ticket: https://jira.alma.cl/browse/SACM-2214 (within the "Archive correction and repair effort") about fixing that in the Archive:

Before cycle 3, coordinates in the Source and Field sub-tables in the ASDM display J2000 as the "directionCode" keyword. This keyword shows the reference frame of the source's direction, according to the SDM description.

This is a labeling error, since the phase calibrator catalogued reference frame has been ICRS since the start of ALMA observations. The solution provided to users is simply change J2000 to ICRS in the metadata of ms-files and header of fits images (cf. https://help.almascience.org/kb/articles/how-do-i-combine-cycle-1-and-or-2-measurementsets-and-images-with-ones-from-other-cycles-j2000)
...

As far as I know ALMA uses ICRS in the field/source tables.

@FedeMPouzols
Copy link
Collaborator Author

FedeMPouzols commented Mar 14, 2025

As for adding optional ICRS/ITRS frame realizations, this may be "nice to have" but it will be difficult to properly populate this information since I don't think VLBI correlators keep track of the relevant metadata in a systematic way. Note that ITRF relates to ITRS as ICRF relates to ICRS, and astropy uses "itrs" in its API. Doesn't that mean that MSv4 should use "itrs" instead of "ITRF"?

So perhaps we leave the frame_realization out for now? It can always be added at a later time, when at least one user is ready to populate it with something else than "Unknown"?

About ITRF there is something I must be missing. We currently have in the lists of possible location frame values:

AllowedLocationFrames = Literal["ITRF", "GRS80", "WGS84", "WGS72", "Undefined"]

But in practice we translate casacore ITRF to astropy GRS80 (https://github.com/casangi/xradio/blob/main/src/xradio/_utils/schema.py#L200).
Could we just drop "ITRF" and keep only the 3 astropy "ellipsoid" options?
(this might have been just my oversight when adding those lists of literals, I must have copied the ITRF from docstrings and/or the spreadsheet without realizing that we are translating the "ITRF" name always, it seems)

@kettenis
Copy link
Contributor

Are we using "fk5" as default in some places? At least in the schema the defaults seem "icrs" (although those only play a role if one uses those classes to construct an MSv4). One place where I can see a default value being assumed, when no metadata is found in the input MSv2 is here: https://github.com/casangi/xradio/blob/main/src/xradio/measurement_set/_utils/_msv2/create_field_and_source_xds.py#L205 and they assume "ICRS".

In
5e40825#diff-1bd394b1e0d5fffc69cb3619ea87d24529adc9c93c767f69750655ac74f16f90
you're changing:

frame: Attr[AllowedSkyCoordFrames] = ""

into

frame: Attr[AllowedSkyCoordFrames] = "fk5"

doesn't that result in "fk5" as the default in the schema docs for SkyCoordArray?

Same for PointingBeamArray and LocalSkyCoordArray (where the default already is "fk5" in the schema docs).

Nobody is perfect. Apparently ALMA made a very similar mislabellng in cycles < 3. There is this ticket: https://jira.alma.cl/browse/SACM-2214 (within the "Archive correction and repair effort") about fixing that in the Archive:

Before cycle 3, coordinates in the Source and Field sub-tables in the ASDM display J2000 as the "directionCode" keyword. This keyword shows the reference frame of the source's direction, according to the SDM description.
This is a labeling error, since the phase calibrator catalogued reference frame has been ICRS since the start of ALMA observations. The solution provided to users is simply change J2000 to ICRS in the metadata of ms-files and header of fits images (cf. https://help.almascience.org/kb/articles/how-do-i-combine-cycle-1-and-or-2-measurementsets-and-images-with-ones-from-other-cycles-j2000)
...

As far as I know ALMA uses ICRS in the field/source tables.

Good that this was fixed. The ALMA data set in the ps_vis_full_dataset.ipynb tutorial does seem to have fk5 coordinates though. Maybe that uses an old Cycle 0-2 dataset?

@kettenis
Copy link
Contributor

As for adding optional ICRS/ITRS frame realizations, this may be "nice to have" but it will be difficult to properly populate this information since I don't think VLBI correlators keep track of the relevant metadata in a systematic way. Note that ITRF relates to ITRS as ICRF relates to ICRS, and astropy uses "itrs" in its API. Doesn't that mean that MSv4 should use "itrs" instead of "ITRF"?

So perhaps we leave the frame_realization out for now? It can always be added at a later time, when at least one user is ready to populate it with something else than "Unknown"?

I think that is a sensible approach. I thing the "multuple realizations" thing came up in a discussion when we were trying to sort out the ellipsoid thing below. But I don't think we really thought of it as a requirement.

About ITRF there is something I must be missing. We currently have in the lists of possible location frame values:

AllowedLocationFrames = Literal["ITRF", "GRS80", "WGS84", "WGS72", "Undefined"]

But in practice we translate casacore ITRF to astropy GRS80 (https://github.com/casangi/xradio/blob/main/src/xradio/_utils/schema.py#L200). Could we just drop "ITRF" and keep only the 3 astropy "ellipsoid" options? (this might have been just my oversight when adding those lists of literals, I must have copied the ITRF from docstrings and/or the spreadsheet without realizing that we are translating the "ITRF" name always, it seems)

I'm pretty confident that mapping casacore ITRF to astropy GRS80 is just wrong. Apparently there is an offset between the origin of the two of more than 2m! ITRF should stay ITRF (but maybe named itrs since this is what astropy uses?). And ITRF locations should always be expressed using cartesian coordinates (so using [cartesian_pos_label] as dimension). I think that also means it would always geocentric.

The others (GRS80, WGS84, WGS72) only really make sense for positions relative to an ellipsoid (so using [ellipsoid_pos_label] as dimension). And these are therefore always geodetic? Not entirely confident here as we always use ITRF for our antenna positions in VLBI.

@FedeMPouzols
Copy link
Collaborator Author

Ok, thanks, then we'll leave the frame_realization for a future extension.

About ITRF and GRS80 if I'm not mistaken, the assumption is that in astropy all earth location objects are in ICRS. What is put in the frame field of the MSv4 location metadata is meant as what one then passes to the ellipsoid parameter of EarthLocation: https://docs.astropy.org/en/latest/api/astropy.coordinates.EarthLocation.html. As long as one passes the three x,y,x with the MSv2 / casacore ITRF values those will be interpreted as geocentric. Then the ellipsoid parameter is "The default ellipsoid used to convert to geodetic coordinates.", although in this case it looks a bit confusing. Not sure if it would make sense to leave frame empty or "Undefined" in the casacore:ITRF=>astropy:ICRS case?

But probably I'm confusing things and that's not the whole story, I'd wait for Jan-Willem to clarify.

@kettenis
Copy link
Contributor

About ITRF and GRS80 if I'm not mistaken, the assumption is that in astropy all earth location objects are in ICRS.

I suspect you meant to write ITRS here? And yes, it does seem that astropy's EarthLocation always is ITRS with geocentric cartesian coordinates. It is possible to convert to/from geodetic coordinates (lon, lat, height) by specifying an ellipsoid (WGS84 by default).

What is put in the frame field of the MSv4 location metadata is meant as what one then passes to the ellipsoid parameter of EarthLocation: https://docs.astropy.org/en/latest/api/astropy.coordinates.EarthLocation.html. As long as one passes the three x,y,x with the MSv2 / casacore ITRF values those will be interpreted as geocentric. Then the ellipsoid parameter is "The default ellipsoid used to convert to geodetic coordinates.", although in this case it looks a bit confusing. Not sure if it would make sense to leave frame empty or "Undefined" in the casacore:ITRF=>astropy:ICRS case?

At this point I think calling 'WGS84', 'GRS80' or 'WGS72' a frame is just wrong. The ellipsoid to use to convert from geodetic to geocentric coordinates should be a separate attribute. And for locations on earth the only reasonable frame is 'itrs'. Not even sure it makes sense for the schema to cater of locations to be specified using geodetic coordinates. Can't we make sure those get converted to geocentric when the MSv4 gets created?

Having 'planetodetic' coordinates might still be needed. But for those 'WGS84', 'GRS80' or 'WGS72' are not relevant. I suppose ALMA imposes some requirements here?

But probably I'm confusing things and that's not the whole story, I'd wait for Jan-Willem to clarify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reference frames
3 participants