DAS-2278 - Handle spatial subsetting in products with all fills in lat/lon coordinates #26

joeyschultz · 2025-01-29T18:29:19Z

Description

Update hoss_config.json to add resolution-specific master geotransform attributes to each grid_mapping polar reference.
Move the logic that gets the grid mapping attributes out of get_variable_crs into a new function, get_grid_mapping_attributes.
Add a new function, create_dimension_arrays_from_geotransform, to create the dimension arrays from the master geotransform.

Jira Issue ID

DAS-2303
DAS-2278

Local Test Steps

Test steps will be detailed when branch is ready for formal PR

PR Acceptance Checklist

Jira ticket acceptance criteria met.
CHANGELOG.md updated to include high level summary of PR changes.
docker/service_version.txt updated if publishing a release.
Tests added/updated and passing.
Documentation updated (if needed).


joeyschultz · 2025-01-29T18:33:19Z

I've opened this draft PR to get feedback on the analysis completed in DAS-2303 (the analysis ticket that is being worked in preparation of DAS-2278). Unit tests are expected to fail at this moment.

flamingbear

Joey, here are a bunch of comments. I think you're on the right track and I like a lot of this. Feel free to respond inline or reach out and we can talk through anything. I'll approve the analysis ticket now and leave this as a comment on the draft PR.

flamingbear · 2025-01-29T20:16:33Z

hoss/coordinate_utilities.py

+    column_dimensions = [
+        col_row_to_xy(geotranform, i, 0) for i in range(lat_arr.shape[1])
+    ]
+    row_dimensions = [col_row_to_xy(geotranform, 0, i) for i in range(lat_arr.shape[0])]


This is ~~probably~~ a massive nit, but can you use different temp vars in your comprehensions? I would use i and j and follow the standard conventions, or I'd probably just use row and col.

I should go look and see if this was in my code too :awkwardsockmonkey:

Also, just so I understand: this routine doesn't need lat_arr, an array of the latitudes; it just needs the shape of that variable? Is there a way to get that without reading the whole array?

I'm not aware of a current way to get the shape of the variable without reading the whole array. DAS-2287 is addressing this.

Updated to use 'row' and 'col' in 7d5e34c
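
For reference, a minimal sketch of the pattern discussed above, with `row` and `col` as the loop variables. The `col_row_to_xy` helper here is a hypothetical stand-in (the real HOSS helper may differ); it assumes a GDAL-style six-element geotransform and cell-centre coordinates. Only the shape of the latitude array is used, not its values. Plain lists keep the sketch dependency-free, whereas the PR code wraps the values in np.array.

```python
def col_row_to_xy(geotransform, col, row):
    """Hypothetical stand-in: projected (x, y) at the centre of cell (col, row).

    Assumes GDAL ordering:
    (x_origin, x_resolution, x_skew, y_origin, y_skew, y_resolution).
    """
    x_origin, x_res, x_skew, y_origin, y_skew, y_res = geotransform
    x = x_origin + (col + 0.5) * x_res + (row + 0.5) * x_skew
    y = y_origin + (col + 0.5) * y_skew + (row + 0.5) * y_res
    return x, y


def dimension_values_from_geotransform(geotransform, shape):
    """Return (y_values, x_values) for a 2-D grid of the given (rows, cols) shape."""
    n_rows, n_cols = shape
    # x varies along columns, so walk the columns of the first row.
    x_values = [col_row_to_xy(geotransform, col, 0)[0] for col in range(n_cols)]
    # y varies along rows, so walk the rows of the first column.
    y_values = [col_row_to_xy(geotransform, 0, row)[1] for row in range(n_rows)]
    return y_values, x_values


# Illustrative values only: a 2 x 3 grid with 3000 m cells and no rotation.
example_geotransform = (-9000.0, 3000.0, 0.0, 9000.0, 0.0, -3000.0)
y_values, x_values = dimension_values_from_geotransform(example_geotransform, (2, 3))
```

Naming the loop variables after what they index makes it harder to swap the column and row arguments by accident.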

flamingbear · 2025-01-29T20:23:18Z

hoss/coordinate_utilities.py

+    # pull out dimension values
+    x_values = np.array([x for x, y in column_dimensions], dtype=np.float64)
+    y_values = np.array([y for x, y in row_dimensions], dtype=np.float64)
+    projected_y, projected_x = tuple(projected_dimension_names)


I'm guessing you copied this from the other code, but why not use * notation?
Or, if you know that projected_dimension_names always has 2 values, just do direct unpacking:

projected_y, projected_x = projected_dimension_names

Agree, updated in 7d5e34c
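
A short illustration of the two options mentioned in this thread. The dimension names are hypothetical placeholders, not values taken from the PR.

```python
# Hypothetical dimension names, for illustration only.
projected_dimension_names = ['/y', '/x']

# Direct unpacking works on any iterable with exactly two items,
# so wrapping the list in tuple() first adds nothing.
projected_y, projected_x = projected_dimension_names

# Star notation tolerates extra leading dimensions (for example a time
# dimension on a 3-D variable) and still picks out the last two names.
*leading_names, y_name, x_name = ['/time', '/y', '/x']
```

If the list does not hold exactly two names, direct unpacking raises a ValueError, which surfaces the unexpected input early.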

flamingbear · 2025-01-29T20:44:08Z

hoss/hoss_config.json

@@ -119,13 +119,27 @@
    {
      "Applicability": {
        "Mission": "SMAP",
-        "ShortNamePath": "SPL3FT(P|P_E)",
+        "ShortNamePath": "SPL3FTP",
        "VariablePattern": "(?i).*polar.*"


This may need to be

Suggested change

"VariablePattern": "(?i).*polar.*"

"VariablePattern": "(?i).*Polar.*"

Well, I look dumb: the (?i) is the case-insensitive flag.

That aside, if this is only for SPL3FTP, the group is defined as Freeze_Thaw_Retrieval_Data_Polar, and I'm assuming (from looking at the file) that the variables don't have polar in their names except in their fully qualified paths.

e.g.
Variable full name: Freeze_Thaw_Retrieval_Data_Polar/open_water_body_fraction
So why not just be explicit and make that

"VariablePattern": "Freeze_Thaw_Retrieval_Data_Polar.*"

Would that work and be clearer?
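
To compare the candidate patterns, here is a small sketch. It assumes fully qualified variable paths begin with a slash and that patterns are applied with `re.match`, which anchors at the start of the string; both are assumptions about how the configuration is evaluated, not something confirmed in this thread. The second variable path is hypothetical.

```python
import re

polar_variable = '/Freeze_Thaw_Retrieval_Data_Polar/open_water_body_fraction'
# Hypothetical non-polar path, used only to show what should not match.
other_variable = '/Freeze_Thaw_Retrieval_Data_Global/open_water_body_fraction'

patterns = {
    'case_insensitive': r'(?i).*polar.*',
    'explicit_no_slash': r'Freeze_Thaw_Retrieval_Data_Polar.*',
    'explicit_with_slash': r'/Freeze_Thaw_Retrieval_Data_Polar/.*',
}

# re.match only succeeds if the pattern matches from the first character.
polar_results = {
    name: re.match(pattern, polar_variable) is not None
    for name, pattern in patterns.items()
}
other_results = {
    name: re.match(pattern, other_variable) is not None
    for name, pattern in patterns.items()
}
```

Under these assumptions the explicit pattern without a leading slash fails on the polar path, because the path itself starts with a slash.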

I agree this would make it clearer. I've updated this in 7d5e34c. A leading slash is required, so I decided to use:

"VariablePattern": "/Freeze_Thaw_Retrieval_Data_Polar/.*"

flamingbear · 2025-01-29T20:45:14Z

hoss/hoss_config.json

+      "Applicability": {
+        "Mission": "SMAP",
+        "ShortNamePath": "SPL3FTP_E",
+        "VariablePattern": "(?i).*polar.*"


Same comment as before.

Updated in 7d5e34c

flamingbear · 2025-01-29T21:01:07Z

hoss/projection_utilities.py

@@ -70,7 +77,7 @@ def get_variable_crs(variable: str, varinfo: VarInfoFromDmr) -> CRS:
                cf_attributes = varinfo.get_missing_variable_attributes(grid_mapping)

            if cf_attributes:
-                crs = CRS.from_cf(cf_attributes)
+                return cf_attributes


I was looking at this and going to complain that the complexity is a little out of hand, but I see that none of it was you... So this seems fine.

flamingbear · 2025-01-29T21:05:35Z

hoss/projection_utilities.py

+def get_variable_crs(cf_attributes: str) -> CRS:
+    """Create a `pyproj.CRS` object from the grid mapping variable metadata
+    attributes.
+
+    """
+    return CRS.from_cf(cf_attributes)
+


What if, instead, you left this signature as it was and moved a call to get_grid_mapping_attributes into here? Then you wouldn't need to make two calls below, and you'd still have exposed a function to get the cf_attributes.

And then, actually, you could have another get_master_geotransform() function that you could use down near L259, where you are switching how you get your dimension arrays.

Let me know if that makes sense or sounds stupid.
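
One possible shape for this suggestion, as a rough sketch only. The nested dictionary standing in for `varinfo`, the attribute values, and the `choose_dimension_source` helper are all hypothetical; the real code works with a VarInfoFromDmr object and its own lookup logic.

```python
def get_grid_mapping_attributes(variable, varinfo):
    """Return the CF grid mapping attributes for a variable.

    `varinfo` is a hypothetical dict stand-in, not the real VarInfoFromDmr API.
    """
    grid_mapping = varinfo['grid_mappings'][variable]
    return varinfo['attributes'][grid_mapping]


def get_master_geotransform(variable, varinfo):
    """Return the master geotransform for a variable, or None if not configured."""
    return get_grid_mapping_attributes(variable, varinfo).get('master_geotransform')


# get_variable_crs(variable, varinfo) would keep its original signature and
# could end with:
#     return CRS.from_cf(get_grid_mapping_attributes(variable, varinfo))


def choose_dimension_source(variable, varinfo):
    """Hypothetical caller: branch once, without touching raw attributes."""
    if get_master_geotransform(variable, varinfo) is not None:
        return 'geotransform'
    return 'coordinates'


# Illustrative configuration; names and numbers are placeholders.
fake_varinfo = {
    'grid_mappings': {'/polar/var': '/grid_polar', '/global/var': '/grid_global'},
    'attributes': {
        '/grid_polar': {
            'grid_mapping_name': 'lambert_azimuthal_equal_area',
            'master_geotransform': [-9000000.0, 36000.0, 0.0, 9000000.0, 0.0, -36000.0],
        },
        '/grid_global': {'grid_mapping_name': 'lambert_cylindrical_equal_area'},
    },
}
```

With this layout the caller makes a single call per need, and the attribute lookup stays in one place.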

flamingbear · 2025-01-29T21:19:48Z

hoss/spatial.py

-        crs,
-        projected_dimension_names,
-    )
+    if 'master_geotransform' in grid_mapping_attributes:


I think we talked in a huddle about this. I'm seeing that it's used specifically for getting ranges, and this is sorta at the level of thought process in the function, so I don't think it buys you anything to bury this in another function just to hide this if statement.

I do still think you could have a dedicated get_master_geotransform() function that would hide all calls to get_grid_mapping_attributes from this level of abstraction (like I mentioned above). Also, why not primary_geotransform instead of master_geotransform?

I'm going to implement a dedicated get_master_geotransform() function as you suggest. As for master_geotransform vs primary_geotransform, I went with the naming suggested by @D-Auty.

FYI - "master geotransform" here refers to the geotransform for a "whole earth" grid, e.g., what is defined in the GPD files and for the EASE-GRID standard grids. A geotransform stored as a granule attribute could by itself be considered redundant with the dimension variables, which provide the "coordinate" in meters for the data arrays, but it is specific to the granule and the array sizes. A "master geotransform" would be a collection-level attribute, applicable across many granules of different sizes (e.g. tiles) and likely across many collections. Hopefully the term "master geotransform" avoids confusion with the specific extents of the granule itself. "Master" seemed a better reference than "primary" in this case.


joeyschultz added 4 commits January 27, 2025 14:32

Add master_geotransform attribute to polar grid_mapping variables

dc28ad8

Update master_geotransform values

7b581ec

DAS-2278 - Create dimension arrays from geotransform

70ac694

DAS-2278 - Add --platform linux/amd64 to Docker commands for compatib…

2296395

…ility

joeyschultz requested a review from flamingbear January 29, 2025 18:30

flamingbear reviewed Jan 29, 2025


joeyschultz added 8 commits February 3, 2025 14:59

Merge branch 'main' into DAS-2278

8fb7cf6

Update ICESat2 config item to align with recent trajectory subsetter …

e0b43b6

…config updates

Updates for draft PR comments

7d5e34c

Modify for 3D variable cases

a43c291

Modify functions related to getting crs and geotransform

1b23ebd

Updates to varinfo config for errors discovered in testing

2443ed8

Updated unit tests for new and updated functions

5a02020

Update CHANGELOG and docker service version

69fbe69

flamingbear left a comment

flamingbear Jan 29, 2025

flamingbear Jan 29, 2025

flamingbear Jan 29, 2025

joeyschultz Feb 3, 2025

joeyschultz Feb 3, 2025

flamingbear Jan 29, 2025

joeyschultz Feb 3, 2025

flamingbear Jan 29, 2025

joeyschultz Feb 3, 2025

flamingbear Jan 29, 2025

joeyschultz Feb 3, 2025

flamingbear Jan 29, 2025

flamingbear Jan 29, 2025

flamingbear Jan 29, 2025

joeyschultz Feb 3, 2025

D-Auty Feb 4, 2025

