-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC: Handle std names and aliases (#5257) #5313
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5313 +/- ##
==========================================
- Coverage 89.31% 89.19% -0.13%
==========================================
Files 89 90 +1
Lines 22375 22430 +55
Branches 5368 5383 +15
==========================================
+ Hits 19985 20007 +22
- Misses 1640 1670 +30
- Partials 750 753 +3 ☔ View full report in Codecov by Sentry. |
https://github.com/cf-convention/discuss/issues/229 Standard names: hard to spot "typo" in `surface_upward_mass_flux_of_methane_due_to_emission_from_fires`
* new optional argument -d/--descr for including standard name descriptions * including the description of all standard names * including standard name table version and related info. * now contains several variables: - variables (starting with underscore) used in the processing - VERSION (dict) - CONVENTIONS_STRING (str) - STD_NAME (dict) - ALIASES (dict) - optionally DESCRIPTION (dict) harmonize quotation marks to single quotes in description string
57fed3f
to
d2216e7
Compare
@SciTools/peloton is this still alive issue, and if so could you sign the CLA @larsbarring ? |
I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then as I am not sure whether it is a reasonable approach in the context of Iris. |
Ok thanks!
Well, I guess we'll take a look + see about it |
Yes, I think that this would still be useful. In the context of CF an aliased standard name is typically regarded as deprecated. Software should be able to read data having an aliased standard name, but new data should use the replacement name. Obviously, there are judgements to be made here regarding how to deal with this in practice. But and aliased standard name should not be considered just as an alternative at the same level as standard name. Also, note that there will likely be some minor changes to the standard name xml file format as of next version, possibly also backported to all previous versions. |
A proper way to handle standard name aliases would also be useful for ESMValTool, see ESMValGroup/ESMValCore#1985. One issue we currently face is merging cubes with different standard names, which is (in the current iris version) not even allowed if the standard names are aliases of each other. |
@LisaBock this may be of interest to you. |
@SciTools/peloton We are planning to talk about this on the 3rd July. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SciTools/peloton We are planning to talk about this on the 3rd July.
@pp-mo and @trexfeathers met on 3rd July. This review is the summary of that meeting.
@larsbarring we really appreciate you taking the time to make this suggestion. It's obviously bad that it's taken us this long to give it any attention. If your priorities have shifted and you are unable to action our comments then please let us know and we'll pick it up.
To prompt more attention from the team, we're adding this to the 3.11 milestone.
'_ACCEPT = "accept"\n' | ||
'_WARN = "warn"\n' | ||
'_REPLACE ="replace"\n' | ||
'_ALTERNATIVE_MODES = [_ACCEPT, _WARN, _REPLACE]\n' | ||
'_DEFAULT = "warn"\n' | ||
'_MODE = _DEFAULT\n\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can see what you are trying to achieve here, but we would like to achieve it using Iris' existing structure instead:
FUTURE
flag
Replacement is introduced as an opt-in setting. In future - typically the next major release - it will switch to being opt-out.
Always warn if replacing
Since you made the proposal, we have agreed a global approach to Iris warnings: Filtering Warnings — Iris 3.9.0 documentation (scitools-iris.readthedocs.io). The idea is that Iris should warn for EVERYTHING that might be a problem for a user, and the user applies warning filters if they don't wish to see a specific type of warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider whether the rename()
method needs a switch to accept aliases, or some other equivalent granular user control.
@pp-mo I can't recall why using a FUTURE
flag would not be enough user control on its own?
version["table_name"] = "USER-StdNameTable" | ||
prettydict(outfile, "VERSION", version) | ||
version_string = "-".join(version[k] for k in ["table_name", "version_number"]) | ||
outfile.write(f'CONVENTIONS_STRING = "{version_string}"\n\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this was sorted in #5423, albeit in a different way.
Presumably no longer necessary in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with existing Iris code, we believe a new attribute should be introduced: alias_standard_name
, used in a similar way to invalid_standard_name
.
This gives users the level of information and control that they are used to.
invalid_standard_name
example
iris/lib/iris/fileformats/_nc_load_rules/helpers.py
Lines 463 to 470 in 8604bb8
if standard_name is not None: | |
try: | |
cube.standard_name = _get_valid_standard_name(standard_name) | |
except ValueError: | |
if cube.long_name is not None: | |
cube.attributes["invalid_standard_name"] = standard_name | |
else: | |
cube.long_name = standard_name |
🚀 Pull Request
Description
This is a "proof-of-concept" PR to address how to better handle standard name aliases. It consists of the following elements:
tools/generate_std_names.py
).It now includes some table version information (mentioned in CF global attribute "Conventions": expose to users, and include CF standard name table version #5255), and a separate dict for the standard name descriptions (optional when generated).
get_convention
-- return a tentative Conventions stringset_alias_processing
-- define how to handle aliases:"keep" - current behaviour, treat aliases in the same way as currently valid standard names,
"warn" - issue a warning (default), otherwise as "keep",
"replace" - silently update aliases to current standard names.
get_description
-- return the standard name description if availablecheck_valid_standard_name
-- check if a name is a standard name or an alias, and do the translation if requested as defined byset_alias_processing
iris.__init__
common/mixin._get_valid_standard_name
is modified to usecheck_valid_std_name
No units test have been added (would be good to first get some feedback whether this POC is a reasonable approach .... )
Consult Iris pull request check list