Parallel concatenate #5926

Open · wants to merge 11 commits into main

bouweandela (Member) commented Apr 25, 2024

🚀 Pull Request

Description

Parallelize the comparison of the values of auxiliary coordinates, cell measures, ancillary variables, and derived coordinates during cube concatenation.

Closes #5750
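
To make the idea of batching value comparisons concrete, here is a minimal sketch. It is not the implementation in this pull request: Iris holds lazy data as Dask arrays, whereas this sketch swaps in a standard-library thread pool and `hashlib` digests so that it runs without extra dependencies. The helper names `digest` and `equal_in_parallel` are hypothetical. Each array is reduced to a digest in a worker thread, and the digests are then compared pairwise.

```python
import hashlib
from array import array
from concurrent.futures import ThreadPoolExecutor


def digest(values):
    """Hash the raw float64 bytes of a sequence of numbers (hypothetical helper)."""
    return hashlib.sha256(array("d", values).tobytes()).hexdigest()


def equal_in_parallel(pairs, max_workers=4):
    """Return one bool per (left, right) pair.

    All digests are submitted to the pool in one go, rather than
    comparing one pair of arrays at a time.
    """
    # Flatten [(a, b), (c, d)] into [a, b, c, d] so every array is one task.
    flat = [values for pair in pairs for values in pair]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        digests = list(pool.map(digest, flat))
    # Digests at positions 2k and 2k + 1 belong to the same pair.
    return [digests[i] == digests[i + 1] for i in range(0, len(digests), 2)]


pairs = [
    ([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]),  # e.g. matching auxiliary coordinate points
    ([1.0, 2.0], [1.0, 2.5]),            # e.g. differing cell measure values
]
result = equal_in_parallel(pairs)
print(result)  # [True, False]
```

Comparing digests is a bytewise test, so it treats `0.0` and `-0.0` as different and identical NaN bit patterns as equal. A real implementation would need to decide how to handle those cases.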


Consult Iris pull request check list


Add any of the below labels to trigger actions on this PR:

  • benchmark_this Request that this pull request be benchmarked to check if it introduces performance shifts

codecov bot commented May 15, 2024

Codecov Report

Attention: Patch coverage is 93.04348% with 8 lines in your changes missing coverage. Please review.

Project coverage is 89.77%. Comparing base (6adac1b) to head (5c68a8e).

Current head 5c68a8e differs from pull request most recent head 3bfea80

Please upload reports for the commit 3bfea80 to get more accurate results.

| Files                    | Patch % | Lines                       |
|--------------------------|---------|-----------------------------|
| lib/iris/_concatenate.py | 93.04%  | 4 Missing and 4 partials ⚠️ |
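
As a sanity check on the patch coverage figure, the short calculation below recovers the implied number of patch lines. It assumes that the "8 lines ... missing coverage" are the 4 missing and 4 partial lines counted together. The variable names are illustrative only.

```python
missing = 4 + 4            # "4 Missing and 4 partials", counted together (assumption)
patch_coverage = 93.04348  # percent, as reported by Codecov

# Fraction of patch lines that are not fully covered.
uncovered_fraction = 1 - patch_coverage / 100

# Total patch lines implied by the reported percentage.
patch_lines = round(missing / uncovered_fraction)
covered = patch_lines - missing

# Recompute the percentage from the integer line counts.
recomputed = round(100 * covered / patch_lines, 5)
print(patch_lines, covered, recomputed)  # 115 107 93.04348
```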
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5926      +/-   ##
==========================================
+ Coverage   89.75%   89.77%   +0.02%     
==========================================
  Files          90       93       +3     
  Lines       22929    23072     +143     
  Branches     5020     5024       +4     
==========================================
+ Hits        20580    20714     +134     
- Misses       1618     1624       +6     
- Partials      731      734       +3     


bouweandela marked this pull request as ready for review May 15, 2024 08:54
bouweandela added the benchmark_this label May 16, 2024
Contributor

⏱️ Performance Benchmark Report: 11b4199

Performance shifts
| Change   | Before [95b7ffe5]    | After [11b41997]    |   Ratio | Benchmark (Parameter)                                        |
|----------|----------------------|---------------------|---------|--------------------------------------------------------------|
| +        | 4.89±0.09ms          | 26.3±0.7ms          |    5.38 | load.StructuredFF.time_structured_load((1280, 960, 5), True) |
| +        | 3.69±0.04ms          | 25.1±0.4ms          |    6.81 | load.StructuredFF.time_structured_load((2, 2, 2), True)      |
| +        | 192±3ms              | 234±3ms             |    1.21 | merge_concat.Concatenate.time_concatenate                    |
Full benchmark results

Benchmarks that have stayed the same:

| Change   | Before [95b7ffe5]    | After [11b41997]    |   Ratio | Benchmark (Parameter)                                                                                |
|----------|----------------------|---------------------|---------|------------------------------------------------------------------------------------------------------|
|          | 1.14±0.02ms          | 1.15±0.01ms         |    1.01 | cube.CubeCreation.time_create(False, 'construct')                                                    |
|          | 415±4μs              | 407±6μs             |    0.98 | cube.CubeCreation.time_create(False, 'instantiate')                                                  |
|          | 963±10μs             | 977±20μs            |    1.01 | cube.CubeCreation.time_create(True, 'construct')                                                     |
|          | 602±9μs              | 586±10μs            |    0.97 | cube.CubeCreation.time_create(True, 'instantiate')                                                   |
|          | 231±4ms              | 233±4ms             |    1.01 | cube.CubeEquality.time_equality(False, False, 'all_equal')                                           |
|          | 116±2ms              | 115±2ms             |    0.99 | cube.CubeEquality.time_equality(False, False, 'coord_inequality')                                    |
|          | 243±3ms              | 243±6ms             |    1    | cube.CubeEquality.time_equality(False, False, 'data_inequality')                                     |
|          | 16.8±0.3μs           | 16.9±0.4μs          |    1.01 | cube.CubeEquality.time_equality(False, False, 'metadata_inequality')                                 |
|          | 315±6ms              | 320±5ms             |    1.02 | cube.CubeEquality.time_equality(False, True, 'all_equal')                                            |
|          | 206±2ms              | 204±2ms             |    0.99 | cube.CubeEquality.time_equality(False, True, 'coord_inequality')                                     |
|          | 329±5ms              | 331±5ms             |    1.01 | cube.CubeEquality.time_equality(False, True, 'data_inequality')                                      |
|          | 16.9±0.2μs           | 17.1±0.4μs          |    1.01 | cube.CubeEquality.time_equality(False, True, 'metadata_inequality')                                  |
|          | 232±4ms              | 231±5ms             |    1    | cube.CubeEquality.time_equality(True, False, 'all_equal')                                            |
|          | 116±1ms              | 117±2ms             |    1.01 | cube.CubeEquality.time_equality(True, False, 'coord_inequality')                                     |
|          | 242±4ms              | 241±4ms             |    1    | cube.CubeEquality.time_equality(True, False, 'data_inequality')                                      |
|          | 54.0±0.8μs           | 53.9±0.9μs          |    1    | cube.CubeEquality.time_equality(True, False, 'metadata_inequality')                                  |
|          | 319±6ms              | 320±4ms             |    1    | cube.CubeEquality.time_equality(True, True, 'all_equal')                                             |
|          | 206±2ms              | 207±2ms             |    1    | cube.CubeEquality.time_equality(True, True, 'coord_inequality')                                      |
|          | 330±7ms              | 334±6ms             |    1.01 | cube.CubeEquality.time_equality(True, True, 'data_inequality')                                       |
|          | 54.8±0.5μs           | 55.2±0.7μs          |    1.01 | cube.CubeEquality.time_equality(True, True, 'metadata_inequality')                                   |
|          | 403±3ns              | 405±10ns            |    1    | experimental.ugrid.regions_combine.CombineRegionsComputeRealData.time_compute_data(50)               |
|          | 260±2ms              | 261±2ms             |    1    | experimental.ugrid.regions_combine.CombineRegionsComputeRealData.time_compute_data(500)              |
|          | 15.2±0.3ms           | 15.0±0.3ms          |    0.99 | experimental.ugrid.regions_combine.CombineRegionsCreateCube.time_create_combined_cube(50)            |
|          | 17.5±1ms             | 17.2±0.6ms          |    0.98 | experimental.ugrid.regions_combine.CombineRegionsCreateCube.time_create_combined_cube(500)           |
|          | 5.0                  | 5.0                 |    1    | experimental.ugrid.regions_combine.CombineRegionsCreateCube.track_addedmem_create_combined_cube(50)  |
|          | 5.0                  | 5.0                 |    1    | experimental.ugrid.regions_combine.CombineRegionsCreateCube.track_addedmem_create_combined_cube(500) |
|          | 110±0.9ms            | 111±3ms             |    1    | experimental.ugrid.regions_combine.CombineRegionsFileStreamedCalc.time_stream_file2file(50)          |
|          | 723±4ms              | 723±7ms             |    1    | experimental.ugrid.regions_combine.CombineRegionsFileStreamedCalc.time_stream_file2file(500)         |
|          | 70.8±2ms             | 69.0±2ms            |    0.97 | experimental.ugrid.regions_combine.CombineRegionsSaveData.time_save(50)                              |
|          | 677±6ms              | 679±7ms             |    1    | experimental.ugrid.regions_combine.CombineRegionsSaveData.time_save(500)                             |
|          | 2.1752849999999997   | 2.1752849999999997  |    1    | experimental.ugrid.regions_combine.CombineRegionsSaveData.track_filesize_saved(50)                   |
|          | 216.01528499999998   | 216.01528499999998  |    1    | experimental.ugrid.regions_combine.CombineRegionsSaveData.track_filesize_saved(500)                  |
|          | 682±10μs             | 791±10μs            |    1.16 | import_iris.Iris.time__concatenate                                                                   |
|          | 186±6μs              | 188±9μs             |    1.01 | import_iris.Iris.time__constraints                                                                   |
|          | 113±2μs              | 113±2μs             |    1    | import_iris.Iris.time__data_manager                                                                  |
|          | 95.1±2μs             | 96.0±0.9μs          |    1.01 | import_iris.Iris.time__deprecation                                                                   |
|          | 137±2μs              | 139±1μs             |    1.01 | import_iris.Iris.time__lazy_data                                                                     |
|          | 913±20μs             | 930±10μs            |    1.02 | import_iris.Iris.time__merge                                                                         |
|          | 79.5±2μs             | 79.7±0.5μs          |    1    | import_iris.Iris.time__representation                                                                |
|          | 511±20μs             | 506±10μs            |    0.99 | import_iris.Iris.time_analysis                                                                       |
|          | 150±2μs              | 144±2μs             |    0.96 | import_iris.Iris.time_analysis__area_weighted                                                        |
|          | 113±2μs              | 113±1μs             |    1    | import_iris.Iris.time_analysis__grid_angles                                                          |
|          | 250±4μs              | 254±4μs             |    1.01 | import_iris.Iris.time_analysis__interpolation                                                        |
|          | 188±5μs              | 194±4μs             |    1.03 | import_iris.Iris.time_analysis__regrid                                                               |
|          | 114±2μs              | 116±3μs             |    1.02 | import_iris.Iris.time_analysis__scipy_interpolate                                                    |
|          | 143±4μs              | 143±1μs             |    1    | import_iris.Iris.time_analysis_calculus                                                              |
|          | 337±7μs              | 336±3μs             |    1    | import_iris.Iris.time_analysis_cartography                                                           |
|          | 95.2±2μs             | 96.1±0.7μs          |    1.01 | import_iris.Iris.time_analysis_geomerty                                                              |
|          | 224±3μs              | 227±4μs             |    1.02 | import_iris.Iris.time_analysis_maths                                                                 |
|          | 98.7±0.9μs           | 99.6±1μs            |    1.01 | import_iris.Iris.time_analysis_stats                                                                 |
|          | 178±3μs              | 180±3μs             |    1.01 | import_iris.Iris.time_analysis_trajectory                                                            |
|          | 314±7μs              | 325±10μs            |    1.03 | import_iris.Iris.time_aux_factory                                                                    |
|          | 86.4±1μs             | 86.9±0.6μs          |    1.01 | import_iris.Iris.time_common                                                                         |
|          | 169±3μs              | 172±9μs             |    1.02 | import_iris.Iris.time_common_lenient                                                                 |
|          | 992±9μs              | 1.02±0.04ms         |    1.02 | import_iris.Iris.time_common_metadata                                                                |
|          | 140±3μs              | 142±5μs             |    1.02 | import_iris.Iris.time_common_mixin                                                                   |
|          | 1.22±0.01ms          | 1.22±0.02ms         |    1    | import_iris.Iris.time_common_resolve                                                                 |
|          | 207±5μs              | 209±2μs             |    1.01 | import_iris.Iris.time_config                                                                         |
|          | 116±1μs              | 118±1μs             |    1.02 | import_iris.Iris.time_coord_categorisation                                                           |
|          | 366±10μs             | 376±6μs             |    1.03 | import_iris.Iris.time_coord_systems                                                                  |
|          | 775±20μs             | 754±10μs            |    0.97 | import_iris.Iris.time_coords                                                                         |
|          | 699±30μs             | 680±30μs            |    0.97 | import_iris.Iris.time_cube                                                                           |
|          | 229±5μs              | 230±5μs             |    1    | import_iris.Iris.time_exceptions                                                                     |
|          | 80.4±1μs             | 79.9±0.9μs          |    0.99 | import_iris.Iris.time_experimental                                                                   |
|          | 190±2μs              | 192±2μs             |    1.01 | import_iris.Iris.time_fileformats                                                                    |
|          | 259±7μs              | 260±3μs             |    1    | import_iris.Iris.time_fileformats__ff                                                                |
|          | 2.83±0.2ms           | 2.91±0.1ms          |    1.03 | import_iris.Iris.time_fileformats__ff_cross_references                                               |
|          | 80.9±0.7μs           | 81.5±1μs            |    1.01 | import_iris.Iris.time_fileformats__pp_lbproc_pairs                                                   |
|          | 119±1μs              | 118±2μs             |    1    | import_iris.Iris.time_fileformats_abf                                                                |
|          | 367±10μs             | 381±20μs            |    1.04 | import_iris.Iris.time_fileformats_cf                                                                 |
|          | 5.54±0.2ms           | 5.54±0.1ms          |    1    | import_iris.Iris.time_fileformats_dot                                                                |
|          | 77.4±0.5μs           | 77.8±1μs            |    1.01 | import_iris.Iris.time_fileformats_name                                                               |
|          | 263±3μs              | 266±4μs             |    1.01 | import_iris.Iris.time_fileformats_name_loaders                                                       |
|          | 121±1μs              | 124±2μs             |    1.03 | import_iris.Iris.time_fileformats_netcdf                                                             |
|          | 125±4μs              | 127±4μs             |    1.01 | import_iris.Iris.time_fileformats_nimrod                                                             |
|          | 211±2μs              | 215±4μs             |    1.02 | import_iris.Iris.time_fileformats_nimrod_load_rules                                                  |
|          | 793±10μs             | 796±10μs            |    1    | import_iris.Iris.time_fileformats_pp                                                                 |
|          | 186±3μs              | 184±3μs             |    0.99 | import_iris.Iris.time_fileformats_pp_load_rules                                                      |
|          | 137±1μs              | 136±1μs             |    1    | import_iris.Iris.time_fileformats_pp_save_rules                                                      |
|          | 522±10μs             | 523±6μs             |    1    | import_iris.Iris.time_fileformats_rules                                                              |
|          | 223±4μs              | 228±5μs             |    1.02 | import_iris.Iris.time_fileformats_structured_array_identification                                    |
|          | 85.6±1μs             | 85.0±0.8μs          |    0.99 | import_iris.Iris.time_fileformats_um                                                                 |
|          | 166±0.8μs            | 165±2μs             |    0.99 | import_iris.Iris.time_fileformats_um__fast_load                                                      |
|          | 142±3μs              | 140±2μs             |    0.99 | import_iris.Iris.time_fileformats_um__fast_load_structured_fields                                    |
|          | 77.8±0.5μs           | 78.5±1μs            |    1.01 | import_iris.Iris.time_fileformats_um__ff_replacement                                                 |
|          | 84.5±0.8μs           | 84.2±0.6μs          |    1    | import_iris.Iris.time_fileformats_um__optimal_array_structuring                                      |
|          | 997±10μs             | 1.00±0.01ms         |    1.01 | import_iris.Iris.time_fileformats_um_cf_map                                                          |
|          | 140±2μs              | 142±2μs             |    1.01 | import_iris.Iris.time_io                                                                             |
|          | 182±4μs              | 180±4μs             |    0.99 | import_iris.Iris.time_io_format_picker                                                               |
|          | 209±2μs              | 209±3μs             |    1    | import_iris.Iris.time_iris                                                                           |
|          | 131±2μs              | 133±1μs             |    1.01 | import_iris.Iris.time_iterate                                                                        |
|          | 8.66±0.2ms           | 8.70±0.2ms          |    1    | import_iris.Iris.time_palette                                                                        |
|          | 342±3μs              | 347±4μs             |    1.01 | import_iris.Iris.time_plot                                                                           |
|          | 105±1μs              | 110±2μs             |    1.04 | import_iris.Iris.time_quickplot                                                                      |
|          | 2.15±0.06ms          | 2.17±0.09ms         |    1.01 | import_iris.Iris.time_std_names                                                                      |
|          | 1.77±0.01ms          | 1.77±0.02ms         |    1    | import_iris.Iris.time_symbols                                                                        |
|          | 41.2±1ms             | 42.3±1ms            |    1.03 | import_iris.Iris.time_tests                                                                          |
|          | 233±2μs              | 238±5μs             |    1.03 | import_iris.Iris.time_third_party_cartopy                                                            |
|          | 4.97±0.06ms          | 4.99±0.1ms          |    1    | import_iris.Iris.time_third_party_cf_units                                                           |
|          | 108±2μs              | 111±2μs             |    1.02 | import_iris.Iris.time_third_party_cftime                                                             |
|          | 2.84±0.03ms          | 2.83±0.05ms         |    1    | import_iris.Iris.time_third_party_matplotlib                                                         |
|          | 1.07±0.01ms          | 1.06±0.01ms         |    1    | import_iris.Iris.time_third_party_numpy                                                              |
|          | 161±2μs              | 165±4μs             |    1.03 | import_iris.Iris.time_third_party_scipy                                                              |
|          | 102±0.5μs            | 104±2μs             |    1.01 | import_iris.Iris.time_time                                                                           |
|          | 332±6μs              | 324±3μs             |    0.98 | import_iris.Iris.time_util                                                                           |
|          | 74.6±1μs             | 75.2±1μs            |    1.01 | iterate.IZip.time_izip                                                                               |
|          | 8.33±0.2ms           | 8.36±0.3ms          |    1    | load.LoadAndRealise.time_load((1280, 960, 5), False, 'FF')                                           |
|          | 25.3±0.5ms           | 25.6±0.8ms          |    1.01 | load.LoadAndRealise.time_load((1280, 960, 5), False, 'NetCDF')                                       |
|          | 9.05±0.1ms           | 9.38±0.3ms          |    1.04 | load.LoadAndRealise.time_load((1280, 960, 5), False, 'PP')                                           |
|          | 8.43±0.3ms           | 8.18±0.1ms          |    0.97 | load.LoadAndRealise.time_load((1280, 960, 5), True, 'FF')                                            |
|          | 22.2±0.4ms           | 22.3±0.3ms          |    1.01 | load.LoadAndRealise.time_load((1280, 960, 5), True, 'NetCDF')                                        |
|          | 9.07±0.09ms          | 9.06±0.1ms          |    1    | load.LoadAndRealise.time_load((1280, 960, 5), True, 'PP')                                            |
|          | 1.39±0.01s           | 1.39±0.01s          |    1    | load.LoadAndRealise.time_load((2, 2, 1000), False, 'FF')                                             |
|          | 22.2±0.3ms           | 21.9±0.4ms          |    0.99 | load.LoadAndRealise.time_load((2, 2, 1000), False, 'NetCDF')                                         |
|          | 1.56±0.01s           | 1.54±0.02s          |    0.99 | load.LoadAndRealise.time_load((2, 2, 1000), False, 'PP')                                             |
|          | 1.39±0.01s           | 1.36±0.02s          |    0.98 | load.LoadAndRealise.time_load((2, 2, 1000), True, 'FF')                                              |
|          | 21.9±0.4ms           | 21.8±0.5ms          |    1    | load.LoadAndRealise.time_load((2, 2, 1000), True, 'NetCDF')                                          |
|          | 1.55±0.01s           | 1.54±0.03s          |    1    | load.LoadAndRealise.time_load((2, 2, 1000), True, 'PP')                                              |
|          | 3.99±0.06ms          | 4.02±0.1ms          |    1.01 | load.LoadAndRealise.time_load((50, 50, 2), False, 'FF')                                              |
|          | 21.2±0.5ms           | 21.2±0.5ms          |    1    | load.LoadAndRealise.time_load((50, 50, 2), False, 'NetCDF')                                          |
|          | 4.30±0.09ms          | 4.27±0.2ms          |    0.99 | load.LoadAndRealise.time_load((50, 50, 2), False, 'PP')                                              |
|          | 4.00±0.06ms          | 3.98±0.08ms         |    1    | load.LoadAndRealise.time_load((50, 50, 2), True, 'FF')                                               |
|          | 20.5±0.6ms           | 21.2±0.4ms          |    1.03 | load.LoadAndRealise.time_load((50, 50, 2), True, 'NetCDF')                                           |
|          | 4.26±0.06ms          | 4.32±0.08ms         |    1.01 | load.LoadAndRealise.time_load((50, 50, 2), True, 'PP')                                               |
|          | 34.2±1ms             | 34.3±2ms            |    1    | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'FF')                                        |
|          | 20.3±0.7ms           | 20.2±0.9ms          |    1    | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'NetCDF')                                    |
|          | 14.0±3ms             | 14.3±1ms            |    1.02 | load.LoadAndRealise.time_realise((1280, 960, 5), False, 'PP')                                        |
|          | 26.4±2ms             | 26.8±3ms            |    1.02 | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'FF')                                         |
|          | 71.7±2ms             | 71.1±2ms            |    0.99 | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'NetCDF')                                     |
|          | 26.0±0.7ms           | 26.1±1ms            |    1    | load.LoadAndRealise.time_realise((1280, 960, 5), True, 'PP')                                         |
|          | 457±4ms              | 454±2ms             |    0.99 | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'FF')                                          |
|          | 3.11±0.1ms           | 3.13±0.1ms          |    1.01 | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'NetCDF')                                      |
|          | 461±2ms              | 462±2ms             |    1    | load.LoadAndRealise.time_realise((2, 2, 1000), False, 'PP')                                          |
|          | 458±7ms              | 462±4ms             |    1.01 | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'FF')                                           |
|          | 3.06±0.1ms           | 3.16±0.1ms          |    1.03 | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'NetCDF')                                       |
|          | 467±4ms              | 464±4ms             |    0.99 | load.LoadAndRealise.time_realise((2, 2, 1000), True, 'PP')                                           |
|          | 1.52±0.1ms           | 1.58±0.2ms          |    1.05 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'FF')                                           |
|          | 3.22±0.1ms           | 3.11±0.08ms         |    0.96 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'NetCDF')                                       |
|          | 1.61±0.07ms          | 1.67±0.1ms          |    1.03 | load.LoadAndRealise.time_realise((50, 50, 2), False, 'PP')                                           |
|          | 1.65±0.1ms           | 1.68±0.1ms          |    1.02 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'FF')                                            |
|          | 3.16±0.1ms           | 3.29±0.1ms          |    1.04 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'NetCDF')                                        |
|          | 1.64±0.1ms           | 1.70±0.1ms          |    1.04 | load.LoadAndRealise.time_realise((50, 50, 2), True, 'PP')                                            |
|          | 361±5ms              | 358±3ms             |    0.99 | load.ManyVars.time_many_var_load                                                                     |
|          | 8.42±0.2ms           | 8.51±0.2ms          |    1.01 | load.STASHConstraint.time_stash_constraint((1280, 960, 5), 'FF')                                     |
|          | 9.29±0.1ms           | 9.40±0.3ms          |    1.01 | load.STASHConstraint.time_stash_constraint((1280, 960, 5), 'PP')                                     |
|          | 1.40±0.01s           | 1.41±0.02s          |    1.01 | load.STASHConstraint.time_stash_constraint((2, 2, 1000), 'FF')                                       |
|          | 1.59±0.02s           | 1.58±0.02s          |    1    | load.STASHConstraint.time_stash_constraint((2, 2, 1000), 'PP')                                       |
|          | 4.12±0.09ms          | 4.03±0.1ms          |    0.98 | load.STASHConstraint.time_stash_constraint((2, 2, 2), 'FF')                                          |
|          | 4.43±0.1ms           | 4.30±0.08ms         |    0.97 | load.STASHConstraint.time_stash_constraint((2, 2, 2), 'PP')                                          |
|          | 8.38±0.1ms           | 8.40±0.2ms          |    1    | load.StructuredFF.time_structured_load((1280, 960, 5), False)                                        |
|          | 1.37±0.01s           | 1.36±0.02s          |    0.99 | load.StructuredFF.time_structured_load((2, 2, 1000), False)                                          |
|          | 382±5ms              | 406±5ms             |    1.06 | load.StructuredFF.time_structured_load((2, 2, 1000), True)                                           |
|          | 4.04±0.08ms          | 3.97±0.09ms         |    0.98 | load.StructuredFF.time_structured_load((2, 2, 2), False)                                             |
|          | 155±3ms              | 154±2ms             |    0.99 | load.TimeConstraint.time_time_constraint(20, 'FF')                                                   |
|          | 24.9±0.2ms           | 24.0±0.3ms          |    0.97 | load.TimeConstraint.time_time_constraint(20, 'NetCDF')                                               |
|          | 168±0.5ms            | 168±0.9ms           |    1    | load.TimeConstraint.time_time_constraint(20, 'PP')                                                   |
|          | 31.0±0.7ms           | 30.9±0.6ms          |    1    | load.TimeConstraint.time_time_constraint(3, 'FF')                                                    |
|          | 24.6±0.3ms           | 24.2±0.4ms          |    0.98 | load.TimeConstraint.time_time_constraint(3, 'NetCDF')                                                |
|          | 33.1±0.2ms           | 33.0±0.3ms          |    1    | load.TimeConstraint.time_time_constraint(3, 'PP')                                                    |
|          | 18.1±0.5ms           | 18.5±0.5ms          |    1.02 | load.ugrid.BasicLoading.time_load_file(1)                                                            |
|          | 43.3±0.5ms           | 43.5±0.6ms          |    1    | load.ugrid.BasicLoading.time_load_file(200000)                                                       |
|          | 15.1±0.4ms           | 15.0±0.3ms          |    0.99 | load.ugrid.BasicLoading.time_load_mesh(1)                                                            |
|          | 23.5±0.7ms           | 23.9±0.4ms          |    1.02 | load.ugrid.BasicLoading.time_load_mesh(200000)                                                       |
|          | 18.6±0.7ms           | 18.4±0.5ms          |    0.99 | load.ugrid.BasicLoadingTime.time_load_file(1)                                                        |
|          | 20.8±0.7ms           | 21.5±0.2ms          |    1.04 | load.ugrid.BasicLoadingTime.time_load_file(200000)                                                   |
|          | 15.0±0.4ms           | 14.6±0.5ms          |    0.98 | load.ugrid.BasicLoadingTime.time_load_mesh(1)                                                        |
|          | 17.8±0.8ms           | 18.2±0.4ms          |    1.02 | load.ugrid.BasicLoadingTime.time_load_mesh(200000)                                                   |
|          | 19.1±0.5ms           | 19.5±0.5ms          |    1.02 | load.ugrid.Callback.time_load_file_callback(1)                                                       |
|          | 52.3±0.6ms           | 52.2±0.5ms          |    1    | load.ugrid.Callback.time_load_file_callback(200000)                                                  |
|          | 19.5±0.3ms           | 19.5±0.3ms          |    1    | load.ugrid.CallbackTime.time_load_file_callback(1)                                                   |
|          | 23.6±0.8ms           | 23.4±0.4ms          |    0.99 | load.ugrid.CallbackTime.time_load_file_callback(200000)                                              |
|          | 3.16±0.1ms           | 3.09±0.2ms          |    0.98 | load.ugrid.DataRealisation.time_realise_data(10000)                                                  |
|          | 4.59±1ms             | 4.15±0.2ms          |    0.9  | load.ugrid.DataRealisation.time_realise_data(200000)                                                 |
|          | 42.6±2ms             | 43.4±2ms            |    1.02 | load.ugrid.DataRealisationTime.time_realise_data(10000)                                              |
|          | 845±9ms              | 842±9ms             |    1    | load.ugrid.DataRealisationTime.time_realise_data(200000)                                             |
|          | 49.5±2ms             | 50.0±2ms            |    1.01 | merge_concat.Merge.time_merge                                                                        |
|          | 6.78±0.1ms           | 6.87±0.1ms          |    1.01 | plot.AuxSort.time_aux_sort                                                                           |
|          | 82.0±3ms             | 80.9±2ms            |    0.99 | regridding.CurvilinearRegridding.time_regrid_pic                                                     |
|          | 99.8±1ms             | 99.9±0.7ms          |    1    | regridding.HorizontalChunkedRegridding.time_regrid_area_w                                            |
|          | 51.5±3ms             | 53.6±2ms            |    1.04 | regridding.HorizontalChunkedRegridding.time_regrid_area_w_new_grid                                   |
|          | 4.42±0.2ms           | 4.41±0.2ms          |    1    | save.NetcdfSave.time_netcdf_save_cube(50, False)                                                     |
|          | 77.2±1ms             | 76.6±0.9ms          |    0.99 | save.NetcdfSave.time_netcdf_save_cube(50, True)                                                      |
|          | 54.1±0.8ms           | 54.9±2ms            |    1.01 | save.NetcdfSave.time_netcdf_save_cube(600, False)                                                    |
|          | 581±3ms              | 585±6ms             |    1.01 | save.NetcdfSave.time_netcdf_save_cube(600, True)                                                     |
|          | 90.8±0.4ns           | 90.0±0.6ns          |    0.99 | save.NetcdfSave.time_netcdf_save_mesh(50, False)                                                     |
|          | 59.0±0.7ms           | 57.8±0.9ms          |    0.98 | save.NetcdfSave.time_netcdf_save_mesh(50, True)                                                      |
|          | 89.9±0.7ns           | 90.4±1ns            |    1.01 | save.NetcdfSave.time_netcdf_save_mesh(600, False)                                                    |
|          | 518±5ms              | 514±4ms             |    0.99 | save.NetcdfSave.time_netcdf_save_mesh(600, True)                                                     |
|          | 43.3±0.9ms           | 43.5±0.9ms          |    1.01 | stats.PearsonR.time_lazy                                                                             |
|          | 19.5±0.5ms           | 19.4±0.3ms          |    0.99 | stats.PearsonR.time_real                                                                             |
|          | 23.3±1ms             | 22.6±1ms            |    0.97 | trajectory.TrajectoryInterpolation.time_trajectory_linear                                            |
|          | 60.4±0.7ms           | 59.5±0.6ms          |    0.98 | trajectory.TrajectoryInterpolation.time_trajectory_nearest                                           |

Benchmarks that have got worse:

| Change   | Before [95b7ffe5]    | After [11b41997]    |   Ratio | Benchmark (Parameter)                                        |
|----------|----------------------|---------------------|---------|--------------------------------------------------------------|
| +        | 4.89±0.09ms          | 26.3±0.7ms          |    5.38 | load.StructuredFF.time_structured_load((1280, 960, 5), True) |
| +        | 3.69±0.04ms          | 25.1±0.4ms          |    6.81 | load.StructuredFF.time_structured_load((2, 2, 2), True)      |
| +        | 192±3ms              | 234±3ms             |    1.21 | merge_concat.Concatenate.time_concatenate                    |

Generated by GHA run 9112533915

Contributor

Would it be possible to fall back on 'normal equality' when hash equality is False?

Sorry to add more work to this, but I've been having some offline conversations with @bjlittle and @pp-mo about equality in general and we're concerned about Iris' strictness. The changes here would make Iris more strict than it is already.

We are therefore keen to use hashing as a way to confirm equality quickly and efficiently, while still retaining the chance for more lenient comparisons such as:

  • Allowing NaN (example).
  • Potentially allowing for floating point differences in future (thanks to @larsbarring for insights).

If this would harm the performance gains you are looking for then we would be open to configurable behaviour in concatenate().
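As a rough illustration of the "hash first, fall back to normal equality" idea, here is a minimal stdlib-only sketch. Plain lists of floats stand in for coordinate arrays, and the names `values_hash`, `lenient_equal` and `values_equal` are hypothetical; none of this is the code in the pull request.

```python
import hashlib
import math
import struct


def values_hash(values):
    """Hash a sequence of floats via its packed little-endian float64 bytes."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(packed).hexdigest()


def lenient_equal(a, b, rel_tol=0.0, abs_tol=0.0):
    """Element-wise comparison that treats NaN as equal to NaN.

    With the default tolerances of zero this is exact equality apart
    from the NaN handling; non-zero tolerances allow small differences.
    """
    if len(a) != len(b):
        return False
    return all(
        (math.isnan(x) and math.isnan(y))
        or math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


def values_equal(a, b, **tolerances):
    # Fast path: matching hashes are taken as confirmation of equality.
    if values_hash(a) == values_hash(b):
        return True
    # Slow path: only reached when the hashes differ.
    return lenient_equal(a, b, **tolerances)
```

In this sketch the slow path only runs when the hashes differ, so inputs that are bit-for-bit identical never pay for the element-wise comparison.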

Member Author

Thanks for providing me with feedback! ✨

> I've been having some offline conversations with @bjlittle and @pp-mo about equality in general and we're concerned about Iris' strictness.

I agree with this. In ESMValCore we have implemented many workarounds for this to make the life of our users easier.

> The changes here would make Iris more strict than it is already.

As far as I'm aware, this pull request does not make any changes to Iris behaviour.
Would you have an example so I can understand when this happens?

I even made the hash comparison work for arrays of different dtypes because I initially expected that to be allowed, but it turns out that even that is not allowed by the current implementation of concatenate, so I could take it out again. Alternatively, we can keep it in case you would be interested in being more lenient about these kinds of differences in the future.
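One way a hash comparison could be made insensitive to dtype is to promote the values to a common type before hashing. The sketch below uses the stdlib `array` module as a stand-in for real arrays; `dtype_agnostic_digest` is a hypothetical name, and this is an assumption about the general technique rather than what this pull request does.

```python
import hashlib
from array import array


def raw_digest(values):
    """Hash the raw bytes of an array, so the element type affects the result."""
    return hashlib.sha256(values.tobytes()).hexdigest()


def dtype_agnostic_digest(values):
    """Promote to float64 before hashing so the storage type does not matter."""
    promoted = array("d", list(values))
    return hashlib.sha256(promoted.tobytes()).hexdigest()


# The same values stored as 32-bit and as 64-bit floats. They are chosen
# to be exactly representable in both widths.
single = array("f", [0.5, 1.0, 2.25])
double = array("d", [0.5, 1.0, 2.25])

raw_match = raw_digest(single) == raw_digest(double)
promoted_match = dtype_agnostic_digest(single) == dtype_agnostic_digest(double)
```

Here `raw_match` is false because the byte layouts differ, while `promoted_match` is true. Values that are not exactly representable in the narrower type would still hash differently after promotion.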

> Allowing NaN

Arrays containing NaNs compare equal with the hashing implementation; I added a test to demonstrate this in 2540fea.
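To see why a bytes-based hash can treat NaNs as equal when `==` does not, consider this small stdlib-only sketch. It assumes the hash is computed over the raw float64 bytes; `digest` is a hypothetical helper, and the hashing in this pull request may work differently.

```python
import hashlib
import struct


def digest(values):
    """Hash a sequence of floats via its packed little-endian float64 bytes."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(packed).hexdigest()


a = [1.0, float("nan"), 3.0]
b = [1.0, float("nan"), 3.0]

# IEEE 754 defines NaN != NaN, so an element-wise comparison fails.
elementwise_equal = all(x == y for x, y in zip(a, b))

# Both NaNs have the same bit pattern here, so the byte hashes match.
hash_equal = digest(a) == digest(b)
```

One caveat with this approach: NaNs with a different sign bit or payload have different bytes, so they would not hash equal under this scheme.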

> Would it be possible to fall back on 'normal equality' when hash equality is False?

Yes, it would be easy to add the additional comparison here:

iris/lib/iris/_concatenate.py, lines 1077 to 1078 in 3bfea80:

if get_hashes(coord_a.coord) != get_hashes(coord_b.coord):
    return False

However, with the current strict implementation of coordinate comparison, there would be no point in doing so because the result would be the same. I'm not too concerned about the performance impact: in our use case, we expect the input to be compatible enough that the concatenation results in a single cube, so the extra comparison would only happen in exceptional cases where something is wrong with the input data.

@bouweandela
Member Author

The benchmark report is a bit puzzling. I will look into it. Is there already a benchmark for concatenate?

@trexfeathers
Contributor

> The benchmark report is a bit puzzling. I will look into it. Is there already a benchmark for concatenate?

class Concatenate:
    # TODO: Improve coverage.

    cube_list: CubeList

    def setup(self):
        source_cube = realistic_4d_w_everything()
        second_cube = source_cube.copy()
        first_dim_coord = second_cube.coord(dimensions=0, dim_coords=True)
        first_dim_coord.points = (
            first_dim_coord.points + np.ptp(first_dim_coord.points) + 1
        )
        self.cube_list = CubeList([source_cube, second_cube])

    def time_concatenate(self):
        _ = self.cube_list.concatenate_cube()

    @TrackAddedMemoryAllocation.decorator_repeating()
    def track_mem_merge(self):
        _ = self.cube_list.concatenate_cube()

Development

Successfully merging this pull request may close these issues.

Parallelising cubes concatenation