Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid try/except block in scan_grib? #446

Closed
kmpaul opened this issue Apr 9, 2024 · 8 comments · Fixed by #450
Closed

Invalid try/except block in scan_grib? #446

kmpaul opened this issue Apr 9, 2024 · 8 comments · Fixed by #450

Comments

@kmpaul
Copy link
Contributor

kmpaul commented Apr 9, 2024

In the kerchunk.grib2.scan_grib() function, there is a loop when processing the coordinate variables for a message, and there is a try/except block in this loop:

kerchunk/kerchunk/grib2.py

Lines 258 to 267 in a0c4f3b

try:
x = m.get(coord2)
except eccodes.WrongStepUnitError as e:
logger.warning(
"Ignoring coordinate '%s' for varname '%s', raises: eccodes.WrongStepUnitError(%s)",
coord2,
varName,
e,
)
continue

that appears to be intended to skip coordinate variables that produce errors when retrieving the coordinate information with cfgrib (the m object is a cfgrib.cfmessage.CfMessage object). However, exceptions produced in this try block never get caught because cfgrib only ever raises ValueError or AssertionError exceptions (at least with the recent versions of cfgrib).

Should this try block be modified to catch ValueError exceptions instead?

@martindurant
Copy link
Member

@emfdavid ?

@emfdavid
Copy link
Contributor

emfdavid commented Apr 9, 2024

We added this in #364
I see these errors when reading HRRR SUB Hourly grib2 files.
Are you seeing other CFGRIB errors that are somehow recoverable/fixable in Kerchunk?
This is somewhat unqiue in that we can repair it with care by taking the difference between the runtime and the valid time to get the step.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 10, 2024

(Sorry for the delay. I am currently in GTC+2.)

@emfdavid: Thanks for the reply. Let me understand you completely. When you say, "I see these errors when reading HRRR SUB Hourly grib2 files," are you saying you see ValueErrors or eccodes.WrongStepUnitErrors?

I am also running with subhourly GRIB2 files, and I see ValueErrors arising from cfgrib due to the fact that the step attribute of the CfMessage has a str data type, which you expect with subhourly data. In the m.get("step") call, this produces a ValueError when cfgrib tries to compute int(message[step_key]) in cfgrib.cfmessage.from_grib_step(). Full traceback below:

Traceback
In [15]: m.get("step")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 m.get("step")

File ~/Software/miniforge3/envs/myenv/lib/python3.10/_collections_abc.py:824, in Mapping.get(self, key, default)
    822 'D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.'
    823 try:
--> 824     return self[key]
    825 except KeyError:
    826     return default

File ~/Software/miniforge3/envs/myenv/lib/python3.10/site-packages/cfgrib/messages.py:213, in ComputedKeysMessage.__getitem__(self, item)
    211 if item in self.computed_keys:
    212     getter, _ = self.computed_keys[item]
--> 213     return getter(self)
    214 else:
    215     return super(ComputedKeysMessage, self).__getitem__(item)

File ~/Software/miniforge3/envs/myenv/lib/python3.10/site-packages/cfgrib/cfmessage.py:97, in from_grib_step(message, step_key, step_unit_key)
     95     raise ValueError("unsupported stepUnit %r" % step_unit)
     96 assert isinstance(to_seconds, int)  # mypy misses this
---> 97 return int(message[step_key]) * to_seconds / 3600.0

(Note, this is with Python 3.10, cfgrib 0.9.11.0, and eccodes 1.7.0.)

I never see eccodes.WrongStepUnitErrors.

(Incidentally, I know that there is a PR currently in to cfgrib to fix this: ecmwf/cfgrib#371, but I don't know when it will be merged and released. Note that this is not a bug in eccodes, but a feature of the newest version of eccodes that is not fully accepted in cfgrib yet.)

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 10, 2024

Are you seeing other CFGRIB errors that are somehow recoverable/fixable in Kerchunk?

I only ever see the ValueErrors, but they can be avoided in 2 ways:

  1. hack cfgrib.cfmessage.from_grib_step() to change the default values of the step_key parameter from "endStep" to "endStep:int" (and for completeness, do the same for the step_unit_key parameter and make the same changes to cfgrib.cfmessage.to_grib_step()).

  2. modify scan_grib to deal with the special edge case of coord2 == "step" and change it to coord2 = "step:int".

The first option pushes the changes to cfgrib (which is coming in ecmwf/cfgrib#371, but who knows when), and the second option hacks kerchuck to deal with the issue temporarily while the upstream issues are resolved.

If you like, I can put together a PR that essentially deals with this edge case, something like:

 coord2 = {"latitude": "latitudes", "longitude": "longitudes", "step": "step:int"}.get(coord, coord)
 try: 
     x = m.get(coord2) 
 except eccodes.WrongStepUnitError as e: 
     logger.warning( 
         "Ignoring coordinate '%s' for varname '%s', raises: eccodes.WrongStepUnitError(%s)", 
         coord2, 
         varName, 
         e, 
     ) 
     continue 

@emfdavid
Copy link
Contributor

That is cool - I poked around for a while and gave up.
This looks much better.

If I understand correctly, we can put your patch in place in kerchunk now, and it should be forward compatible with cfgrib after the problem is corrected there. Then we can leave your patch in kerchunk in place until most users are up to date with cfgrib and eccodes updates.

@martindurant
Copy link
Member

Sounds like a plan!

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

I just create PR #450.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

See comments in #450 for updates.

TL;DR: This bug only comes up because of changes in the most recent versions of eccodes (2.34+).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants