Lazy rectilinear interpolator, with `sparse` #6006

fnattino · 2024-06-14T15:18:44Z

🚀 Pull Request

Description

Enable the rectilinear interpolator to run lazily #6002.

The original implementation uses a CSR matrix from scipy.sparse to represent the weights. However, this does not seem to work well with Dask arrays, since scipy.sparse matrices do not (completely) adhere to the array interface. Switching to the arrays provided by the `sparse` library seems to work well, allowing the sparse matrix to be wrapped in a Dask array.

Would this be an acceptable change? In the long run, it looks like `sparse` will replace scipy.sparse in the PyData ecosystem.
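For readers unfamiliar with the weights-matrix idea mentioned above, here is a minimal sketch of 1-D linear interpolation expressed as a sparse matrix product. It is not the Iris implementation: the helper `linear_weights` and the sample arrays are hypothetical, and target points are assumed to lie within the source range.

```python
import numpy as np
from scipy.sparse import csr_matrix


def linear_weights(src, tgt):
    """Hypothetical helper: (n_tgt, n_src) CSR matrix of linear weights."""
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    # Index of the source point to the left of each target point,
    # clipped so that left + 1 is always a valid index.
    left = np.clip(np.searchsorted(src, tgt, side="right") - 1, 0, len(src) - 2)
    # Fractional position of each target point within its source cell.
    frac = (tgt - src[left]) / (src[left + 1] - src[left])
    # Each target row has two non-zero entries: left and right neighbour.
    rows = np.repeat(np.arange(len(tgt)), 2)
    cols = np.stack([left, left + 1], axis=1).ravel()
    vals = np.stack([1.0 - frac, frac], axis=1).ravel()
    return csr_matrix((vals, (rows, cols)), shape=(len(tgt), len(src)))


src = np.array([0.0, 1.0, 2.0, 4.0])
tgt = np.array([0.5, 3.0])
data = np.array([10.0, 20.0, 30.0, 50.0])

weights = linear_weights(src, tgt)
result = weights @ data  # interpolation is a single sparse mat-vec product
```

Because the weights depend only on the source and target coordinates, they can be built once and reused for any data defined on the same grid.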


fnattino · 2024-06-18T09:16:14Z

Additional note on the use of `sparse`: this is also the library used by xESMF. The drawback is that it would add numba as a dependency, which I realise might not be ideal.


fnattino · 2024-06-18T09:37:29Z

Marking it as ready for review to hear your thoughts on this. What is in here for now seems to work (see e.g. code snippet below), provided that sparse is installed.

Test code snippet

import iris

from iris.analysis import RectilinearInterpolator


LATITUDE = [16., 16.1]
LONGITUDE = 226.


def get_cube():
    filename = iris.sample_data_path('E1_north_america.nc')
    return iris.load_cube(filename, 'air_temperature')


def interpolate(method):
    assert method in ('linear', 'nearest')
    cube = get_cube()
    coords = ('latitude', 'longitude')
    points = (LATITUDE, LONGITUDE)
    interpolator = RectilinearInterpolator(cube, coords, method, "mask")
    print('Cube is lazy: ', cube.has_lazy_data())
    # Cube is lazy:  True
    result = interpolator(points, collapse_scalar=True)
    print('Result is lazy: ', result.has_lazy_data())
    # Result is lazy:  True


if __name__ == '__main__':
    interpolate('linear')
    interpolate('nearest')

trexfeathers · 2024-06-18T10:34:25Z

Hi @fnattino, thanks for your hard work. Just to warn you I expect several weeks' delay before we can get to this as we are accumulating a backlog of ESMValTool changes while we prepare GeoVista for SciPy 2024 and complete the mesh-focused Iris 3.10 release.

fnattino · 2024-07-25T09:31:25Z

I have worked on a different approach that makes the interpolator lazy in a similar way to the regridder, see #6084. It has the disadvantage that it requires merging the chunks along the interpolating dimensions, but it is less "disruptive" in that it does not add new dependencies. So it may be worth looking at #6084 first and keeping this PR for the longer run. In the meantime I can also try to run some benchmarks to compare the performance of sparse vs scipy.sparse.
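To illustrate what "merging the chunks along the interpolating dimensions" means in practice, here is a rough sketch using plain Dask and NumPy. It is not the code from #6084: the function `interp_block` and the sample data are made up for illustration, and interpolation is done along the last axis only.

```python
import numpy as np
import dask.array as da


def interp_block(block, src, tgt):
    """Hypothetical per-block function: linear interpolation on the last axis."""
    return np.apply_along_axis(lambda v: np.interp(tgt, src, v), -1, block)


src = np.linspace(0.0, 1.0, 5)   # source coordinate along the last axis
tgt = np.array([0.1, 0.6])       # target points

# Lazy data, chunked along both dimensions.
data = da.from_array(np.arange(20.0).reshape(4, 5), chunks=(2, 2))

# Merge the chunks along the interpolated axis so that every block
# sees the full source coordinate; other axes keep their chunking.
merged = data.rechunk({1: -1})

result = merged.map_blocks(
    interp_block,
    src=src,
    tgt=tgt,
    dtype=data.dtype,
    chunks=(merged.chunks[0], (len(tgt),)),  # output chunk sizes per axis
    meta=np.array((), dtype=data.dtype),
)
```

Nothing is computed until `result.compute()` is called. The cost is that each block must hold the whole interpolated dimension in memory, which is the trade-off described above.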

fnattino and others added 6 commits June 14, 2024 16:51

skip data realization when lazy

3ed3a8f

only pass data or mask to interpolator

46088b9

index array should have int dtype

fbf70a6

use objects from sparse library

ef15383

they seem to better follow the array interface

use vindex for fancy indexing of dask arrays

e191b0f

[pre-commit.ci] auto fixes from pre-commit.com hooks

fa97f7f

for more information, see https://pre-commit.ci

fnattino added 2 commits June 18, 2024 11:24

add sparse to dependencies

0df0ce0

Merge branch 'lazy-rectilinearinterpolator' of github.com:fnattino/ir…

c63f0c0

…is into lazy-rectilinearinterpolator

fnattino marked this pull request as ready for review June 18, 2024 09:37

fnattino changed the title ~~Lazy rectilinear interpolator~~ Lazy rectilinear interpolator, with sparse Jul 25, 2024

fnattino mentioned this pull request Jul 25, 2024

Lazy rectilinear interpolator #6084

Open

Merge branch 'main' into lazy-rectilinearinterpolator

2f6c3fc
