-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2278 - Handle spatial subsetting in products with all fills in lat/lon coordinates #26
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Joey, here's a bunch of comments, I think you're on the right track and like a lot of this. feel free to respond inline or reach out and we can talk through anything. I'll approve the analysis ticket now. and leave this as a comment on the draft pr
hoss/coordinate_utilities.py
Outdated
column_dimensions = [ | ||
col_row_to_xy(geotranform, i, 0) for i in range(lat_arr.shape[1]) | ||
] | ||
row_dimensions = [col_row_to_xy(geotranform, 0, i) for i in range(lat_arr.shape[0])] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a massive nit. but can you use different temp vars in your comprehensions? I would use i, and j and follow the standard conventions, or I'd probably just use row and col.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should go look and see if this was in my code too :awkwardsockmonkey:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also just so I understand, this routine doesn't need lat_arr, an array of the latitudes, it just needs the shape of that variable? is there a way to get that without reading the whole array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a current way I'm aware of to get the shape of the variable without reading the whole array. DAS-2287 is addressing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use 'row' and 'col' in 7d5e34c
hoss/coordinate_utilities.py
Outdated
# pull out dimension values | ||
x_values = np.array([x for x, y in column_dimensions], dtype=np.float64) | ||
y_values = np.array([y for x, y in row_dimensions], dtype=np.float64) | ||
projected_y, projected_x = tuple(projected_dimension_names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing you copied this from the other code, but why not use * notation?
of if you know that projected_dimension_names is alway 2 values, just do direct unpacking
projected_y, projected_x = projected_dimension_names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, updated in 7d5e34c
hoss/hoss_config.json
Outdated
@@ -119,13 +119,27 @@ | |||
{ | |||
"Applicability": { | |||
"Mission": "SMAP", | |||
"ShortNamePath": "SPL3FT(P|P_E)", | |||
"ShortNamePath": "SPL3FTP", | |||
"VariablePattern": "(?i).*polar.*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need to be
"VariablePattern": "(?i).*polar.*" | |
"VariablePattern": "(?i).*Polar.*" |
Well, I look dumb, the (?i) is the case insensitive flag..
That aside, if this is only for SPL3FTP, the group is defined as Freeze_Thaw_Retrieval_Data_Polar
and I'm assuming (from looking at the file) the variables don't have the name polar except in their fully qualified path.
e.g.
Variable full name: Freeze_Thaw_Retrieval_Data_Polar/open_water_body_fraction
So why not just be explicit and make that
"VariablePattern": "Freeze_Thaw_Retrieval_Data_Polar.*"
Would that work and be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this would make it more clear. I've updated this in 7d5e34c. A leading slash is required so I decided to use:
"VariablePattern": "/Freeze_Thaw_Retrieval_Data_Polar/.*"
hoss/hoss_config.json
Outdated
"Applicability": { | ||
"Mission": "SMAP", | ||
"ShortNamePath": "SPL3FTP_E", | ||
"VariablePattern": "(?i).*polar.*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 7d5e34c
@@ -70,7 +77,7 @@ def get_variable_crs(variable: str, varinfo: VarInfoFromDmr) -> CRS: | |||
cf_attributes = varinfo.get_missing_variable_attributes(grid_mapping) | |||
|
|||
if cf_attributes: | |||
crs = CRS.from_cf(cf_attributes) | |||
return cf_attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at this and going to complain that the complexity is a little out of hand, but I see that none of it was you... So this seem fine.
hoss/projection_utilities.py
Outdated
def get_variable_crs(cf_attributes: str) -> CRS: | ||
"""Create a `pyproj.CRS` object from the grid mapping variable metadata | ||
attributes. | ||
|
||
""" | ||
return CRS.from_cf(cf_attributes) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if instead, you left this signature as it was, and moved a call to get_grid_mapping_attributes into here, then you wouldn't need to make two calls below and you've still exposed a function to get the cf_attributes.
And then actually, you could have another get_master_geotransform() function you could use down near L259 where you are switching how you get your dimension arrays.
let me know if that makes sense or sounds stupid.
hoss/spatial.py
Outdated
crs, | ||
projected_dimension_names, | ||
) | ||
if 'master_geotransform' in grid_mapping_attributes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked in a huddle about this. I'm seeing that it's used specifically for getting ranges, and this is sorta at the level of thought process in the function, so I don't think it buys you anything to bury this in another function just to hide this if statement.
I do still think you might have a dedicated get_master_geotransform() function that would hide all calls to get_grid_mapping_attributes from this level of abstraction (like I mentioned above) Also why not primary_geotransform
instead of master_geotransform
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to implement a dedicated get_master_geotransform() function as you suggest. As for master_geotransform
vs primary_geotransform
I went with the naming suggested by @D-Auty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI - "master geotransform" here refers to the notion that the geotransform for a "whole earth" grid - e.g., what is defined in the GPD files, and for the EASE-GRID standard grids. A geotransform as a granule attribute by itself could be considered redundant with the dimension variables - which provide the "coordinate" in meters for the data arrays, but is specific to the granule and the array sizes. A "master geotransform" would be a collection level attribute, applicable across many granules of different sizes (e.g. tiles) and likely, many collections even. Hopefully the reference of master geotransform avoids the confusion with the specific extents of the granule itself. It seemed that master was a better reference than primary in this case.
Description
Jira Issue ID
DAS-2303
DAS-2278
Local Test Steps
Test steps will be detailed when branch is ready for formal PR
PR Acceptance Checklist
CHANGELOG.md
updated to include high level summary of PR changes.docker/service_version.txt
updated if publishing a release.