Update to cell_method parsing (#6083)
* Provided error message for malformed `cell_method` attribute

* Made `cell_method` pattern matching more lenient w.r.t. space after colon separator

* Changed `cell_methods` regex slightly to fix failures in unit tests.
Also added new unit test to check cell_methods with:
  1. Name only (no colon and method)
  2. No space between colon separator and method.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added WhatsNew entry

* Update docs/src/whatsnew/latest.rst

Co-authored-by: Elias <[email protected]>

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Martin Yeo <[email protected]>
Co-authored-by: Elias <[email protected]>
4 people authored Oct 18, 2024
1 parent d3071ff commit 3b8b33c
Showing 3 changed files with 14 additions and 3 deletions.
3 changes: 3 additions & 0 deletions docs/src/whatsnew/latest.rst
@@ -52,6 +52,9 @@ This document explains the changes made to Iris for this release
#. `@rcomer`_ enabled partial collapse of multi-dimensional string coordinates,
fixing :issue:`3653`. (:pull:`5955`)

#. `@ukmo-ccbunney`_ improved error handling for malformed `cell_method`
attribute. Also made cell_method string parsing more lenient w.r.t.
whitespace. (:pull:`6083`)

💣 Incompatible Changes
=======================
12 changes: 9 additions & 3 deletions lib/iris/fileformats/_nc_load_rules/helpers.py
@@ -202,11 +202,11 @@
 _CM_INTERVAL = "interval"
 _CM_METHOD = "method"
 _CM_NAME = "name"
-_CM_PARSE_NAME = re.compile(r"([\w_]+\s*?:\s+)+")
+_CM_PARSE_NAME = re.compile(r"([\w_]+\s*?:\s*)+")
 _CM_PARSE = re.compile(
 r"""
-(?P<name>([\w_]+\s*?:\s+)+)
-(?P<method>[\w_\s]+(?![\w_]*\s*?:))\s*
+(?P<name>([\w_]+\s*?:\s*)+)
+(?P<method>[^\s][\w_\s]+(?![\w_]*\s*?:))\s*
 (?:
 \(\s*
 (?P<extra>.+)
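
The effect of relaxing `\s+` to `\s*` in `_CM_PARSE_NAME` can be checked in isolation. The following is a minimal sketch, not part of the commit: the two patterns are copied from the hunk above, while the names `OLD_PARSE_NAME`, `NEW_PARSE_NAME` and `name_starts` are hypothetical and exist only for this illustration.

```python
import re

# Patterns copied from the diff: the old one requires whitespace after the
# colon, the new one makes that whitespace optional.
OLD_PARSE_NAME = re.compile(r"([\w_]+\s*?:\s+)+")
NEW_PARSE_NAME = re.compile(r"([\w_]+\s*?:\s*)+")


def name_starts(pattern, text):
    """Return the start index of every coordinate-name match in `text`.

    Hypothetical helper for this sketch only.
    """
    return [m.start() for m in pattern.finditer(text)]


for cell_methods in ["time: mean", "time:mean", "time : mean", "time"]:
    old = name_starts(OLD_PARSE_NAME, cell_methods)
    new = name_starts(NEW_PARSE_NAME, cell_methods)
    print(f"{cell_methods!r}: old={old} new={new}")
```

Under these assumptions, only `"time:mean"` changes behaviour: the old pattern finds no name, the new one finds a name at index 0. A bare `"time"` matches neither pattern, which is the case the new warning in `_split_cell_methods` is meant to catch.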
@@ -296,6 +296,12 @@ def _split_cell_methods(nc_cell_methods: str) -> List[re.Match]:
for m in _CM_PARSE_NAME.finditer(nc_cell_methods):
name_start_inds.append(m.start())

# No matches? Must be malformed cell_method string; warn and return
if not name_start_inds:
msg = f"Failed to parse cell method string: {nc_cell_methods}"
warnings.warn(msg, category=iris.warnings.IrisCfLoadWarning, stacklevel=2)
return []

# Remove those that fall inside brackets
bracket_depth = 0
for ind, cha in enumerate(nc_cell_methods):
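
The early-return guard added in this hunk can be exercised on its own. This is a rough sketch under stated assumptions, not the Iris implementation: `CfLoadWarningSketch` is a hypothetical stand-in for `iris.warnings.IrisCfLoadWarning` so the example runs without Iris installed, and `find_name_starts` is a hypothetical helper that reproduces only the name search and the new guard, not the bracket handling or the `re.Match` results of the real `_split_cell_methods`.

```python
import re
import warnings
from typing import List

# New name pattern, copied from the diff.
NEW_PARSE_NAME = re.compile(r"([\w_]+\s*?:\s*)+")


class CfLoadWarningSketch(UserWarning):
    """Hypothetical stand-in for iris.warnings.IrisCfLoadWarning."""


def find_name_starts(nc_cell_methods: str) -> List[int]:
    """Collect name start indices; warn and return [] if none are found."""
    name_start_inds = [m.start() for m in NEW_PARSE_NAME.finditer(nc_cell_methods)]

    # No matches? Treat the string as malformed: warn and return early.
    if not name_start_inds:
        msg = f"Failed to parse cell method string: {nc_cell_methods}"
        warnings.warn(msg, category=CfLoadWarningSketch, stacklevel=2)
        return []

    return name_start_inds


# A name with no colon and no method triggers the warning path.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = find_name_starts("time")

print(result)
print([str(w.message) for w in caught])
```

Returning an empty list rather than raising lets the caller carry on with the load; the user is still told which string could not be parsed.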
@@ -19,6 +19,7 @@ class Test(tests.IrisTest):
def test_simple(self):
cell_method_strings = [
"time: mean",
"time:mean",
"time : mean",
]
expected = (CellMethod(method="mean", coords="time"),)
@@ -125,6 +126,7 @@ def test_badly_formatted_warning(self):
cell_method_strings = [
# "time: maximum (interval: 1 hr comment: first bit "
# "time: mean (interval: 1 day comment: second bit)",
"time",
"time: (interval: 1 hr comment: first bit) "
"time: mean (interval: 1 day comment: second bit)",
"time: maximum (interval: 1 hr comment: first bit) "