Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PP Proleptic Gregorian #5138

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Conversation

rcomer
Copy link
Member

@rcomer rcomer commented Jan 18, 2023

🚀 Pull Request

Description

A proposal to resolve #3561, hopefully without creating breaking changes for the user. Quoting myself from that issue:

Noting that

  • 1st January 1970 is the same date in the Proleptic Gregorian and Standard calendars
  • Any time point expressed as hours since 1st January 1970 is also therefore the same in the Proleptic Gregorian and Standard calendars

So, once we have converted our pp time headers into a coordinate with unit "hours since 1970-01-01 00:00:00", it is somewhat arbitrary whether we label it "proleptic_gregorian" or "standard". Would it therefore be reasonable to use the Proleptic Gregorian calendar to do the conversion from pp-header to coordinate, but still label the coordinate with the "standard" calendar? It would be a bit of a fudge, but would mean any change in behaviour would be limited to dates before 15th October 1582, and therefore a bugfix.

Similarly, when saving we can convert a standard calendar unit to Proleptic Gregorian with the cf_units.Unit.change_calendar method, and then use that new unit to do the num2date conversion.

I've done a fair bit of test wrangling. I think the last few failures reflect that the years 0 and 1 are genuinely different in the two calendars, so the expected values could just be updated. Leaving that for now and opening as draft to see what people think.


Consult Iris pull request check list

@trexfeathers
Copy link
Contributor

📢 @SciTools/iris-devs 📢

Need to make sure we get this one right. If you have any experience with calendars / PP-format it would be great to have your informed opinion ❤

@rcomer
Copy link
Member Author

rcomer commented Jan 19, 2023

Full disclosure: this is my first time doing anything in fileformats so, um 😨

@trexfeathers
Copy link
Contributor

@nhsavage are you able to test this branch against any of your scripts, to see if it addresses your concerns?

@nhsavage
Copy link
Contributor

nhsavage commented Jan 25, 2023

I am struggling a bit to do this - the problem is that the bug mainfested when using ANTS not directly in Iris. trying to mimic it just in Iris doesn't seem to work and at the moment, ANTS is at Iris 2.3.0. Would it be useful to see if I can replicate the problem with ANTS 1.0 and go from there?

@nhsavage
Copy link
Contributor

nhsavage commented Jan 25, 2023

Would it be useful to see if I can replicate the problem with ANTS 1.0 and go from there?
OK. I have been able to come up with a simple piece of ANTs code which demonstrates the problem. However, now reading the ANTS code it is clear that fixing this issue is simply a way to unblock the changes needed in ANTS. The ANTS code which Generates pp fields from a cube. has:

if time_coords[0].units.calendar in ["proleptic_gregorian", "standard"]:
       time_coord = ants.utils.cube.find_time_coordinates(cube)[0]
       new_units = cf_units.Unit(time_coord.units.name, calendar="gregorian")
       time_coord.units = new_units

This willl be wrong for proleptic gregorian calendars before 1582 (more specifically, it changes the calendar of the base time "days since 0001-01-01" without changing the time points to ensure that they match. If this PR passes, then some careful refactoring of this part of the ANTS code would then be needed - specifically to raise an error if a standard or gregorian calendar has data before 1582. I probably need to now make a pure Iris version of this to see if I can replicate the bug without ANTS and understand impacts of this change

Our current workaround is to update the basis time of the NetCDF file prior to running the ants.save

new=cf_units.Unit('days since 1850-1-1', calendar='proleptic_gregorian')
sst.coord('time').convert_units(new)

which avoids the bug in ANTS as then gregrian and proleptic gregorian are the same

@nhsavage
Copy link
Contributor

nhsavage commented Jan 25, 2023

so I've boiled the Iris part of the problem down to something simple. Take any netCDF file with a proleptic gregorian calendar and run:

cube=iris.load_cube(infile)
# save to a pp file
iris.save(cube,'sst_out_fix.pp')

if you look at the output of this, then the calendar is set to 0 which is not a valid. Header from pumf:

(13) lbtim : 20

it should be 21 as the calendar is proleptic gregorian.

@trexfeathers
Copy link
Contributor

trexfeathers commented Jan 26, 2023

if you look at the output of this, then the calendar is set to 0 which is not a valid. Header from pumf:

(13) lbtim : 20

it should be 21 as the calendar is proleptic gregorian.

I can't replicate this, I get the below, both on this PR and with v3.4.0:

  (13) lbtim   :                       0
Here's my code
from cf_units import CALENDAR_PROLEPTIC_GREGORIAN, Unit
import numpy as np
from numpy import random

import iris
from iris.coords import AuxCoord, DimCoord
from iris.cube import Cube

dim_names_lens = [("longitude", 2), ("latitude", 3)]
dim_coords = [DimCoord(np.arange(l), standard_name=n) for n, l in dim_names_lens]

time_unit = Unit("days since 2001-01-01", calendar=CALENDAR_PROLEPTIC_GREGORIAN)
time_coord = AuxCoord(1, standard_name="time", units=time_unit)

prep_cube = Cube(
    random.rand(*[c.shape[0] for c in dim_coords]),
    standard_name="sea_surface_temperature",
    units="K",
    dim_coords_and_dims=[(c, ix) for ix, c in enumerate(dim_coords)],
    # Make this a scalar coord.
    aux_coords_and_dims=[(time_coord, None)],
)

infile = "sst_in.nc"
iris.save(prep_cube, infile)

cube=iris.load_cube(infile)
# save to a pp file
iris.save(cube,'sst_out_fix.pp')

@rcomer
Copy link
Member Author

rcomer commented Jan 26, 2023

Looks like I missed a bit 👀

@nhsavage
Copy link
Contributor

nhsavage commented Jan 26, 2023

First of all LBTIM is defined thus (in my well thumbed copy of UMDP F03, excellent reading if you can't sleep)
BTIM is coded as (100IA + 10IB + IC) where:
for time aggregated fields IA is the time interval in hours between the individual fields from which
the mean was computed [zero in both our cases. yours isn't a time mean, the netCDF I read doesn't include this info

IB = 2 if the field is a time mean between T1 and T2 [my data is time mean yours isn't]

IC (the only relevant part for this PR)

= 0 if ’model time’ is used for T1 and T2 (i.e. only day number, hour and minute are set).
– = 1 if the Proleptic Gregorian calendar is used for T1 and T2. This calendar is equivalent to the
Gregorian calendar for dates after 1582. Applications producing data using the Proleptic Grego-
rian calendar should extend back in time to BC years in a way that make the date continuous:
year 1 is 1 AD, year 0 is 1 BC, year -1 is 2 BC etc. (Note some ancillary files use year 0 in this
calendar for ancillary fields that do not vary with time.)
– = 2 if the ’360-day’ calendar (i.e. 12 30-day months) is used for T1 and T2.
– = 3 if ’model time’ is used for T1 and T2 (i.e. only day number, hour and minute are valid; year,
month and day in month are to be ignored if set).
– = 4 if the ’365-day’ (no leap year) calendar is used for T1 and T2

so ignore whether IB is set or not and only consider the IC - this should be 1 in your case and my case.

To do this all in memory, it should be possibel to use the Iris routine iris.fileformats.pp.as_fields adn check the the LBTIM value in that, which should make it a more useful unit test

@nhsavage
Copy link
Contributor

this code can replace the save and pumf steps

fields = iris.fileformats.pp.as_fields(prep_cube)

for field in fields:
    print(field.lbtim)

currently gives 0 (no calendar defined) which is wrong

@rcomer
Copy link
Member Author

rcomer commented Jan 26, 2023

My latest commit gives lbtim of 1 with @trexfeathers' code.

@trexfeathers
Copy link
Contributor

My latest commit gives lbtim of 1 with @trexfeathers' code.

I can replicate this. Thanks @nhsavage for the help debugging.

I'm concerned about whether I would have spotted this even if I were to carefully read UMDP F03 (and any other applicable resources); it's just not where my talents lie. I can do my best (resource allowing), but it would be great to have more users trying this out too (it would be mean to lean exclusively on Nick).

I will see if I can apply the changes here to an older version of Iris to allow more testing by people close to the ANTS tool.

Copy link
Contributor

In order to maintain a backlog of relevant PRs, we automatically label them as stale after 500 days of inactivity.

If this PR is still important to you, then please comment on this PR and the stale label will be removed.

Otherwise this PR will be automatically closed in 28 days time.

@github-actions github-actions bot added the Stale A stale issue/pull-request label Jun 14, 2024
@trexfeathers
Copy link
Contributor

Testing this is still on my personal to-do list

@trexfeathers trexfeathers removed the Stale A stale issue/pull-request label Jun 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Status: 🆕 New
Development

Successfully merging this pull request may close these issues.

BUG: iris pp treats standard calendar as a proleptic gregorian and ignores real proleptic gregorian calendars
3 participants