Fix array_equal behaviour for masked arrays #4457

stephenworsley · 2021-12-10T16:48:04Z

The current behaviour compares the underlying data of masked arrays, so two masked arrays can compare as unequal even when they are equal at all unmasked points. This would cause an error in an iris-esmf-regrid PR, as described here:
SciTools/iris-esmf-regrid#138 (comment)
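
A minimal sketch of the problem described above, using plain numpy rather than Iris itself. The arrays `a` and `b` are made-up illustrative values; the point is only that comparing the underlying data can disagree with comparing the unmasked points.

```python
import numpy as np
import numpy.ma as ma

# Two masked arrays that agree at every unmasked point; only the value
# hidden behind the mask differs (illustrative data, not from the PR).
a = ma.masked_array([1, 2, 3], mask=[False, True, False])
b = ma.masked_array([1, 99, 3], mask=[False, True, False])

# Comparing the underlying data sees the hidden 2 vs 99 and reports a mismatch.
raw_equal = np.array_equal(a.data, b.data)

# Comparing only the unmasked points reports a match.
unmasked_equal = np.array_equal(a.compressed(), b.compressed())

print(raw_equal, unmasked_equal)
```

Here `raw_equal` is false while `unmasked_equal` is true, which is the mismatch this PR sets out to address.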

rcomer · 2021-12-19T13:27:07Z

Seems that ignoring the masks was deliberate, but only because “the chosen approach fixes the problem [of all-masked data raising errors], is simpler, and reflects numpy behaviour.” #905

Perhaps using something like ma.filled would help?

bjlittle · 2022-07-22T10:01:00Z

Related to numpy/numpy#16022 (comment), which was merged into numpy, but then reverted (as it broke scipy and astropy)

lib/iris/util.py

trexfeathers · 2022-07-22T10:04:37Z

If we're worried about upsetting the default Iris behaviour, perhaps the neatest solution would be an iris.util function that provides the alternative equality behaviour?

As far as I'm aware the only calls for this alternative behaviour are cases where the caller is explicitly checking equality (rather than wanting internal Iris behaviour to be changed).

rcomer · 2022-07-22T10:59:08Z

There is also a relevant thread at numpy/numpy#14624

I think what Iris does in testing is sensible:

iris/lib/iris/tests/__init__.py

Lines 590 to 668 in 56a97d6

    
    def _assertMaskedArray(self, assertion, a, b, strict, **kwargs):
        # Define helper function to extract unmasked values as a 1d
        # array.
        def unmasked_data_as_1d_array(array):
            array = ma.asarray(array)
            if array.ndim == 0:
                if array.mask:
                    data = np.array([])
                else:
                    data = np.array([array.data])
            else:
                data = array.data[~ma.getmaskarray(array)]
            return data

        # Compare masks. This will also check that the array shapes
        # match, which is not tested when comparing unmasked values if
        # strict is False.
        a_mask, b_mask = ma.getmaskarray(a), ma.getmaskarray(b)
        np.testing.assert_array_equal(a_mask, b_mask)

        if strict:
            assertion(a.data, b.data, **kwargs)
        else:
            assertion(
                unmasked_data_as_1d_array(a),
                unmasked_data_as_1d_array(b),
                **kwargs,
            )

    def assertMaskedArrayEqual(self, a, b, strict=False):
        """
        Check that masked arrays are equal. This requires the
        unmasked values and masks to be identical.

        Args:

        * a, b (array-like):
            Two arrays to compare.

        Kwargs:

        * strict (bool):
            If True, perform a complete mask and data array equality check.
            If False (default), the data array equality considers only unmasked
            elements.

        """
        self._assertMaskedArray(np.testing.assert_array_equal, a, b, strict)

    def assertArrayAlmostEqual(self, a, b, decimal=6):
        np.testing.assert_array_almost_equal(a, b, decimal=decimal)

    def assertMaskedArrayAlmostEqual(self, a, b, decimal=6, strict=False):
        """
        Check that masked arrays are almost equal. This requires the
        masks to be identical, and the unmasked values to be almost
        equal.

        Args:

        * a, b (array-like):
            Two arrays to compare.

        Kwargs:

        * strict (bool):
            If True, perform a complete mask and data array equality check.
            If False (default), the data array equality considers only unmasked
            elements.

        * decimal (int):
            Equality tolerance level for
            :meth:`numpy.testing.assert_array_almost_equal`, with the meaning
            'abs(desired-actual) < 0.5 * 10**(-decimal)'

        """
        self._assertMaskedArray(
            np.testing.assert_array_almost_equal, a, b, strict, decimal=decimal
        )
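
The test helper quoted above raises on mismatch; the same idea can be sketched as a standalone function that returns a bool. The name `masked_arrays_match` is hypothetical and not part of Iris, and this is only an illustration of the "masks must match, then compare unmasked values" approach, not the change made in this PR.

```python
import numpy as np
import numpy.ma as ma


def masked_arrays_match(a, b, strict=False):
    """Hypothetical helper: True if masks match and the data agree.

    With strict=False only unmasked points are compared; with strict=True
    the full underlying data arrays are compared as well.
    """
    a, b = ma.asarray(a), ma.asarray(b)
    # atleast_1d lets 0-d inputs go through the same indexing path.
    a_mask = np.atleast_1d(ma.getmaskarray(a))
    b_mask = np.atleast_1d(ma.getmaskarray(b))
    a_data, b_data = np.atleast_1d(a.data), np.atleast_1d(b.data)

    # Masks must agree; array_equal also returns False on a shape mismatch.
    if not np.array_equal(a_mask, b_mask):
        return False
    if strict:
        return np.array_equal(a_data, b_data)
    # Compare only the points that are unmasked in both arrays.
    return np.array_equal(a_data[~a_mask], b_data[~b_mask])


x = ma.masked_array([1.0, 2.0], mask=[False, True])
y = ma.masked_array([1.0, 5.0], mask=[False, True])
z = ma.masked_array([1.0, 2.0], mask=[False, False])
print(masked_arrays_match(x, y), masked_arrays_match(x, y, strict=True))
```

In this sketch `x` and `y` match in the default mode but not in strict mode, and `x` never matches `z` because their masks differ.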

stephenworsley · 2022-07-26T13:31:46Z

There is also a relevant thread at numpy/numpy#14624

@rcomer It's interesting that there is currently an inconsistency between different numpy functions. Still, given that iris.util.array_equal currently behaves in the same way as np.array_equal, that could be enough of a reason to keep this function behaving the same. There's still potentially a question of how we use iris.util.array_equal within iris. It might make sense in places (such as coord equality) to call ma.filled, as you suggest, before calling iris.util.array_equal, and also to check that the masks are equal.
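
A rough sketch of the approach suggested in this comment: fill the masked points with a common value, compare the filled data, and separately check that the masks agree. The helper name `masked_equal_via_filled` and the `fill_value` default are assumptions for illustration, and `np.array_equal` stands in for `iris.util.array_equal` on the basis of the statement above that the two behave alike.

```python
import numpy as np
import numpy.ma as ma


def masked_equal_via_filled(a, b, fill_value=0):
    """Hypothetical helper: equal masks, and equal data once masked
    points are replaced by a common fill value."""
    # The masks themselves must match (getmaskarray gives all-False
    # for unmasked input).
    if not np.array_equal(ma.getmaskarray(a), ma.getmaskarray(b)):
        return False
    # Filling hides whatever values sit behind the masks.
    return np.array_equal(ma.filled(a, fill_value), ma.filled(b, fill_value))


p = ma.masked_array([1, 2, 3], mask=[False, True, False])
q = ma.masked_array([1, 99, 3], mask=[False, True, False])
r = ma.masked_array([1, 2, 3], mask=[False, False, False])
print(masked_equal_via_filled(p, q), masked_equal_via_filled(p, r))
```

With these inputs `p` and `q` compare as equal, while `p` and `r` do not because their masks differ.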

github-actions · 2023-12-09T00:13:59Z

In order to maintain a backlog of relevant PRs, we automatically label them as stale after 500 days of inactivity.

If this PR is still important to you, then please comment on this PR and the stale label will be removed.

Otherwise this PR will be automatically closed in 28 days time.

github-actions · 2024-01-07T00:16:02Z

This stale PR has been automatically closed due to a lack of community activity.

If you still care about this PR, then please either:

Re-open this PR, if you have sufficient permissions, or
Add a comment pinging @SciTools/iris-devs who will re-open on your behalf.

pp-mo · 2024-05-24T14:33:51Z

As discussed @trexfeathers @stephenworsley @bjlittle @pp-mo ...
This is probably still relevant.
It causes a problem here: https://github.com/SciTools-incubator/iris-esmf-regrid/blob/v0.9.0/esmf_regrid/schemes.py#L1377-L1396

codecov · 2024-06-10T14:43:21Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.78%. Comparing base (167d149) to head (9045abc).
Report is 64 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4457   +/-   ##
=======================================
  Coverage   89.78%   89.78%           
=======================================
  Files          90       90           
  Lines       22938    22945    +7     
  Branches     5023     5026    +3     
=======================================
+ Hits        20595    20602    +7     
  Misses       1612     1612           
  Partials      731      731

pp-mo

Pretty much OK I think -- just a couple of comments.
New tests are good.
I'm a bit surprised this hasn't triggered test failures elsewhere, but I guess it's a bit niche.

lib/iris/util.py

lib/iris/tests/unit/util/test_array_equal.py

pp-mo

All done, apart from the extra point I just added.

lib/iris/tests/unit/util/test_array_equal.py

pp-mo

Sorry, I agreed with your latest, but I have shifted my ground, so I still think there is something to fix here.
Not long now, I should think!

lib/iris/tests/unit/util/test_array_equal.py

Co-authored-by: Patrick Peglar <[email protected]>

pp-mo · 2024-06-19T13:01:08Z

Great, thanks for your patience @stephenworsley !

fix array_equal behaviour for masked arrays

474f083

bjlittle self-assigned this Dec 12, 2021

bjlittle self-requested a review December 12, 2021 22:37

bjlittle removed their assignment Jul 22, 2022

pp-mo reviewed Jul 22, 2022

lib/iris/util.py (outdated, resolved)

github-actions bot added the Stale A stale issue/pull-request label Dec 9, 2023

github-actions bot closed this Jan 7, 2024

pp-mo reopened this May 24, 2024

github-actions bot removed the Stale A stale issue/pull-request label May 25, 2024

trexfeathers assigned pp-mo and stephenworsley Jun 10, 2024

stephenworsley added 3 commits June 10, 2024 13:26

address review comment

2afdafb

fix tests

e0b6f9e

fix bug, add test and whatsnew

d15295c

HGWright added the Feature: UGRID label Jun 12, 2024

pp-mo requested changes Jun 13, 2024

lib/iris/util.py (outdated, resolved)

lib/iris/util.py (resolved)

lib/iris/tests/unit/util/test_array_equal.py (resolved)

address review comments

80cee2e

pp-mo requested changes Jun 17, 2024

lib/iris/tests/unit/util/test_array_equal.py (resolved)

pp-mo requested changes Jun 19, 2024

lib/iris/tests/unit/util/test_array_equal.py (outdated, resolved)

Update lib/iris/tests/unit/util/test_array_equal.py

e1b237e

Co-authored-by: Patrick Peglar <[email protected]>

pp-mo enabled auto-merge (squash) June 19, 2024 13:01

Merge branch 'main' into array_equal_fix

9045abc

pp-mo approved these changes Jun 19, 2024

pp-mo merged commit d16628a into SciTools:main Jun 19, 2024
21 checks passed

trexfeathers mentioned this pull request Oct 25, 2024

iris.util.array_equal ignores differences in masks with dask arrays #6188

Closed
