Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix array_equal behaviour for masked arrays #4457

Merged
merged 8 commits into from
Jun 19, 2024

Conversation

stephenworsley
Copy link
Contributor

Current behaviour will compare the underlying data of a masked array and will sometimes compare masked arrays as unequal when they are otherwise equal for all unmasked points. This would cause an error in an iris-esmf-regrid PR as described here:
SciTools-incubator/iris-esmf-regrid#138 (comment)

@bjlittle bjlittle self-assigned this Dec 12, 2021
@bjlittle bjlittle self-requested a review December 12, 2021 22:37
@rcomer
Copy link
Member

rcomer commented Dec 19, 2021

Seems that ignoring the masks was deliberate, but only because “the chosen approach fixes the problem [of all-masked data raising errors], is simpler, and reflects numpy behaviour.” #905

Perhaps using something like ma.filled would help?

@stephenworsley stephenworsley added this to In Progress in Iris v3.3.0 via automation Jun 29, 2022
@stephenworsley stephenworsley moved this from In Progress to Backlog in Iris v3.3.0 Jun 29, 2022
@bjlittle bjlittle removed their assignment Jul 22, 2022
@bjlittle
Copy link
Member

Related to numpy/numpy#16022 (comment), which was merged into numpy, but then reverted (as it broke scipy and astropy)

lib/iris/util.py Outdated Show resolved Hide resolved
@trexfeathers
Copy link
Contributor

If we're worried about upsetting the default Iris behaviour, perhaps the neatest solution would be an iris.util function that provides the alternative equality behaviour?

As far as I'm aware the only calls for this alternative behaviour are cases where the caller is explicitly checking equality (rather than wanting internal Iris behaviour to be changed).

@rcomer
Copy link
Member

rcomer commented Jul 22, 2022

There is also a relevant thread at numpy/numpy#14624

I think what Iris does in testing is sensible:

def _assertMaskedArray(self, assertion, a, b, strict, **kwargs):
# Define helper function to extract unmasked values as a 1d
# array.
def unmasked_data_as_1d_array(array):
array = ma.asarray(array)
if array.ndim == 0:
if array.mask:
data = np.array([])
else:
data = np.array([array.data])
else:
data = array.data[~ma.getmaskarray(array)]
return data
# Compare masks. This will also check that the array shapes
# match, which is not tested when comparing unmasked values if
# strict is False.
a_mask, b_mask = ma.getmaskarray(a), ma.getmaskarray(b)
np.testing.assert_array_equal(a_mask, b_mask)
if strict:
assertion(a.data, b.data, **kwargs)
else:
assertion(
unmasked_data_as_1d_array(a),
unmasked_data_as_1d_array(b),
**kwargs,
)
def assertMaskedArrayEqual(self, a, b, strict=False):
"""
Check that masked arrays are equal. This requires the
unmasked values and masks to be identical.
Args:
* a, b (array-like):
Two arrays to compare.
Kwargs:
* strict (bool):
If True, perform a complete mask and data array equality check.
If False (default), the data array equality considers only unmasked
elements.
"""
self._assertMaskedArray(np.testing.assert_array_equal, a, b, strict)
def assertArrayAlmostEqual(self, a, b, decimal=6):
np.testing.assert_array_almost_equal(a, b, decimal=decimal)
def assertMaskedArrayAlmostEqual(self, a, b, decimal=6, strict=False):
"""
Check that masked arrays are almost equal. This requires the
masks to be identical, and the unmasked values to be almost
equal.
Args:
* a, b (array-like):
Two arrays to compare.
Kwargs:
* strict (bool):
If True, perform a complete mask and data array equality check.
If False (default), the data array equality considers only unmasked
elements.
* decimal (int):
Equality tolerance level for
:meth:`numpy.testing.assert_array_almost_equal`, with the meaning
'abs(desired-actual) < 0.5 * 10**(-decimal)'
"""
self._assertMaskedArray(
np.testing.assert_array_almost_equal, a, b, strict, decimal=decimal
)

@stephenworsley
Copy link
Contributor Author

There is also a relevant thread at numpy/numpy#14624

@rcomer It's interesting that there is an inconsistency between different numpy functions at the moment, though given that currently iris.util.array_equal behaves in the same way as np.array_equal, that could be enough of a reason to keep this function behaving the same. There's still potentially a question of how we use iris.util.array_equal within iris. It might make sense in places (such ass coord equality) to use an approach of calling ma.filled as you suggest before calling iris.util.array_equal and also checking that the mask is equal.

@stephenworsley stephenworsley removed this from Backlog in Iris v3.3.0 Aug 15, 2022
Copy link
Contributor

github-actions bot commented Dec 9, 2023

In order to maintain a backlog of relevant PRs, we automatically label them as stale after 500 days of inactivity.

If this PR is still important to you, then please comment on this PR and the stale label will be removed.

Otherwise this PR will be automatically closed in 28 days time.

@github-actions github-actions bot added the Stale A stale issue/pull-request label Dec 9, 2023
Copy link
Contributor

github-actions bot commented Jan 7, 2024

This stale PR has been automatically closed due to a lack of community activity.

If you still care about this PR, then please either:

  • Re-open this PR, if you have sufficient permissions, or
  • Add a comment pinging @SciTools/iris-devs who will re-open on your behalf.

@github-actions github-actions bot closed this Jan 7, 2024
@pp-mo
Copy link
Member

pp-mo commented May 24, 2024

@pp-mo pp-mo reopened this May 24, 2024
@github-actions github-actions bot removed the Stale A stale issue/pull-request label May 25, 2024
* main: (759 commits)
  Bump scitools/workflows from 2024.05.1 to 2024.06.0 (SciTools#5986)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5980)
  Updated environment lockfiles (SciTools#5983)
  Bump scitools/workflows from 2024.05.0 to 2024.05.1 (SciTools#5984)
  Make `slices_over` tests go faster (SciTools#5973)
  Updated environment lockfiles (SciTools#5979)
  Update lock files with associated fixes (SciTools#5953)
  List 25 slowest tests (SciTools#5969)
  used a note to highlight some text (SciTools#5971)
  Lazy `iris.cube.Cube.rolling_window` (SciTools#5795)
  Add memory benchmarks (SciTools#5960)
  Whatsnew for several benchmark developments. (SciTools#5961)
  Remove "on-demand" from some benchmarks (SciTools#5959)
  Add bm_runner 'trialrun' subcommand. (SciTools#5957)
  Automatically install iris-test-data for benchmark data generation (SciTools#5958)
  Added benchmarks for collapse and aggregate (SciTools#5954)
  Use tracemalloc for memory measurements. (SciTools#5948)
  Provide a Nox `benchmarks` session as the recommended entry point (SciTools#5951)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5952)
  Remove unit benchmarks (SciTools#5949)
  ...
Copy link

codecov bot commented Jun 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.78%. Comparing base (167d149) to head (9045abc).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4457   +/-   ##
=======================================
  Coverage   89.78%   89.78%           
=======================================
  Files          90       90           
  Lines       22938    22945    +7     
  Branches     5023     5026    +3     
=======================================
+ Hits        20595    20602    +7     
  Misses       1612     1612           
  Partials      731      731           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@pp-mo pp-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much OK I think -- just a couple of comments.
New tests are good.
I'm a bit surprised this hasn't triggered test failures elsewhere, but I guess it's a bit niche.

lib/iris/util.py Outdated Show resolved Hide resolved
lib/iris/util.py Show resolved Hide resolved
lib/iris/tests/unit/util/test_array_equal.py Show resolved Hide resolved
Copy link
Member

@pp-mo pp-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All done. apart from the extra point I just added.

lib/iris/tests/unit/util/test_array_equal.py Show resolved Hide resolved
Copy link
Member

@pp-mo pp-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I agreed with your latest, but I shifted my ground so I still think there is something to fix here..
Not long now, I should think !

lib/iris/tests/unit/util/test_array_equal.py Outdated Show resolved Hide resolved
@pp-mo
Copy link
Member

pp-mo commented Jun 19, 2024

Great, thanks for your patience @stephenworsley !

@pp-mo pp-mo enabled auto-merge (squash) June 19, 2024 13:01
@pp-mo pp-mo merged commit d16628a into SciTools:main Jun 19, 2024
21 checks passed
tkknight added a commit to tkknight/iris that referenced this pull request Jun 20, 2024
* upstream/main: (42 commits)
  Mesh saveload fix (SciTools#6004)
  used tabs for the install info (SciTools#6013)
  Fix array_equal behaviour for masked arrays (SciTools#4457)
  Bump scitools/workflows from 2024.06.1 to 2024.06.2 (SciTools#6008)
  [pre-commit.ci] pre-commit autoupdate (SciTools#6007)
  Updated environment lockfiles (SciTools#5996)
  Added more descriptive errors within concatenate (SciTools#6005)
  Bump scitools/workflows from 2024.06.0 to 2024.06.1 (SciTools#5998)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5997)
  Bump scitools/workflows from 2024.05.1 to 2024.06.0 (SciTools#5986)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5980)
  Updated environment lockfiles (SciTools#5983)
  Bump scitools/workflows from 2024.05.0 to 2024.05.1 (SciTools#5984)
  Make `slices_over` tests go faster (SciTools#5973)
  Updated environment lockfiles (SciTools#5979)
  Update lock files with associated fixes (SciTools#5953)
  List 25 slowest tests (SciTools#5969)
  used a note to highlight some text (SciTools#5971)
  Lazy `iris.cube.Cube.rolling_window` (SciTools#5795)
  Add memory benchmarks (SciTools#5960)
  ...
tkknight added a commit to tkknight/iris that referenced this pull request Jun 23, 2024
* upstream/main: (143 commits)
  Updated environment lockfiles (SciTools#6020)
  Add `MeshCoord.collapsed` (SciTools#6003)
  Bump scitools/workflows from 2024.06.2 to 2024.06.3 (SciTools#6015)
  Mesh saveload fix (SciTools#6004)
  used tabs for the install info (SciTools#6013)
  Fix array_equal behaviour for masked arrays (SciTools#4457)
  Bump scitools/workflows from 2024.06.1 to 2024.06.2 (SciTools#6008)
  [pre-commit.ci] pre-commit autoupdate (SciTools#6007)
  Updated environment lockfiles (SciTools#5996)
  Added more descriptive errors within concatenate (SciTools#6005)
  Bump scitools/workflows from 2024.06.0 to 2024.06.1 (SciTools#5998)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5997)
  Bump scitools/workflows from 2024.05.1 to 2024.06.0 (SciTools#5986)
  [pre-commit.ci] pre-commit autoupdate (SciTools#5980)
  Updated environment lockfiles (SciTools#5983)
  Bump scitools/workflows from 2024.05.0 to 2024.05.1 (SciTools#5984)
  Make `slices_over` tests go faster (SciTools#5973)
  Updated environment lockfiles (SciTools#5979)
  Update lock files with associated fixes (SciTools#5953)
  List 25 slowest tests (SciTools#5969)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants