Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add step:int remapping for coords computation #450

Merged
merged 14 commits into from
Apr 12, 2024

Conversation

kmpaul
Copy link
Contributor

@kmpaul kmpaul commented Apr 11, 2024

This is meant to address #446.

closes #446

@martindurant
Copy link
Member

Again, it would be lovely to include a test file, if you have a small one.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

I'll add a test. Don't merge until it's there. (I just got some tests to fail on my laptop, and I wasn't sure why. I thought I would see if they failed on CI.)

@dcherian
Copy link
Contributor

(hi kevin!)

Nice. Does this fix

theirs = cfgrib.open_dataset(fn)
if "hrrr" in url:
# for some reason, cfgrib reads `step` as 7.25 hours
# while grib_ls and kerchunk reads `step` as 425 hours.
ours = ours.drop_vars("step")
theirs = theirs.drop_vars("step")

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

(Hi, Deepak! 😄)

That I do not know! That is a very weird bug! Which value for step is correct, cfgrib or kerchunk?

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

@dcherian: Sadly, it does not appear to fix the problem. Sorry!

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

When trying to test this PR, I discovered that my change wasn't fixing anything because in the CI environment (ci/environment-py310.yml) the scan_grib function was producing an eccodes.WrongStepUnitError instead of a ValueError, as I was seeing in my recent runs. I took a look at the difference between the environments I was using, and it essentially boiled down to a difference in eccodes versions. For some reason, some dependency in the CI environment is preventing the download/installation of the most recent version of eccodes (version 2.34.1). And in the most recent version of eccodes, the eccodes.WrongStepUnitError is not raised, and a ValueError is raised instead!

If you force install an update of eccodes in the CI environment, the changes in this PR do what they should. However, with older versions of eccodes (which is installed in the CI environment by default), the change in this PR effectively does nothing because a eccodes.WrongStepUnitError is raised that is caught in the try block.

Sigh. So, I leave it to you to decide if the changes in this PR are worth it. I guess you could argue that these changes allow for someone to use the most recent version of eccodes with kerchunk, which I think is nice. And, again, I think these changes will be made irrelevant when ecmwf/cfgrib#371 is merged.

@dcherian
Copy link
Contributor

Just catch both and log whichever exception was caught? eccodes is a real pain, so I think most users would appreciate kerchunk being flexible with regard to eccodes version

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

I've also verified that with the latest version of eccodes, the kerchunk tests fail when they encounter subhourly steps. So, this would make the tests a bit more robust.

Edit: So, you could argue that there already is a test for this PR. 😄

Edit 2: Although to truly test this, you need to create an environment in CI that is getting the most recent version of eccodes.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

I added another ci/environment-py*.yml file that explicitly installs the most recent version of eccodes for testing. I can back this out, but I'm curious if these tests pass on CI.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

Incidentally, the reason eccodes is not being installed with the most recent version in the CI environment is because of the h5py<3.9 dependency. Can this restriction be removed (in one of the environment files, for example) for testing purposes?

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 11, 2024

I may not be able to work on this any more today. I might have to return to this tomorrow.

@martindurant
Copy link
Member

I might have to return to this tomorrow.

No worries

h5py<3.9 dependency. Can this restriction be removed

Let's try! I don't recall where this comes from.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

I've decided to deal with updating the tests in a separate PR so as not to entangle those changes with the changes specific to this PR. I'll get back to this in a moment.

Actually, I take it back. I think all of the changes needed to the test environments will cause tests to fail without the changes made in this PR. So, I'm not separating the PRs.

This was referenced Apr 12, 2024
@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

Note: I have removed the h5py<3.9 restriction from the py310 and py311 CI environments for testing.

Things have gotten a bit more complicated. Let me explain:

  1. With the latest versions of eccodes (2.34+), cfgrib no longer even produces step or valid_time in the resulting Xarray Dataset for subhourly data, so the test_grib.py::test_archives[s3://noaa-hrrr-bdp-pds/hrrr.20140730/conus/hrrr.t23z.wrfsubhf08.grib2] test fails because the step and valid_time coordinates are not in the Dataset for comparison in the test assert. This will be fixed when cfgrib merges Explicitly request value for 'endStep' as integer ecmwf/cfgrib#371, but for now I believe that kerchunk's scan_grib function is actually more correct than cfgrib. So, I am tempted to just skip this test for now. But I am loathe to do that...

  2. Also with the newest version of eccodes, the order of the variable attribute "coordinates" is not the same as it was before because (I presume) the step coordinate was added ad hoc last. But with the changes to eccodes and the change in this PR, the step coordinate doesn't have to be treated as "special", resulting in the step coordinate being added to the "coordinate" attribute in a new order. This causes the assert in test_grib.py::test_grib_tree to fail, but it is easily fixed by changing the assert from a comparison of strings to a comparison of sets, where the sets are constructed from the split() of the strings.

  3. The latest update to Xarray causes all time coordinates to be dtype M8[ns], which causes the test_combine.py::test_cftimes_to_normal test to fail at the dtype == "M8[s]" assert. To account to old and new versions of Xarray, I've changed the assert to dtype.kind == "M", per @martindurant's recommendation. But in the future this should probably be made more firm, and the coo_dtypes may need to be revisited if it doesn't do as expected because of these changes to Xarray.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

UPDATE: I have marked the test_grib.py::test_archives[s3://noaa-hrrr-bdp-pds/hrrr.20140730/conus/hrrr.t23z.wrfsubhf08.grib2] test to be skipped if the version of eccodes >= 2.34. This may need to be revisited.

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

At this point, I have verified locally that all tests pass (or are skipped as expected) with both the old CI environments (i.e., with the h5py<3.9 restriction) and the new CI environments (i.e., without the h5py<3.9 restriction). If the CI tests pass here on GitHub, and the changes are accepted, I think this PR is done, and it should close #446 when merged.

@martindurant
Copy link
Member

Thanks for all the work here, hoping for greens

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

Thanks for all the work here, hoping for greens

Me, too!

@martindurant
Copy link
Member

Didn't we have the h5py versions labeled as separate runs?

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

I didn't make them separate. I just removed the restriction in the py310 and py311 environment files. Do you want me to make a separate environment?

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

I didn't want to add too many test runs for fear of using too many resources, but I can add whatever environments you think are best.

@martindurant
Copy link
Member

OK, that's fine. The 3.9 env seems to be failing to build, so it may need further restrictions on xarray and pandas (+cfgrib, numpy??)

@martindurant
Copy link
Member

(might be a good time to revert micromamba->miniconda, now that libmamba is a thing, perhaps that helps)

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

Hey, they all passed!

@kmpaul
Copy link
Contributor Author

kmpaul commented Apr 12, 2024

Do you still want me to make any further changes?

@martindurant
Copy link
Member

Oh, well that's good. 20min is still too long for 3.9, though!
I'll merge this, unblock the other PR, and we may choose to come back and add more pins to the env to make the build faster.

@martindurant martindurant merged commit 95f333c into fsspec:main Apr 12, 2024
5 checks passed
@kmpaul kmpaul deleted the kpaul/issue-446 branch April 12, 2024 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invalid try/except block in scan_grib?
3 participants