handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

davemfish · 2023-07-21T17:53:45Z

As noted in natcap/invest#992, GDAL does not consider numpy Float32/Float64 objects as valid doubles. And that can lead to exceptions like this:

  File "natcap\invest\coastal_vulnerability.py", line 2216, in warp_and_mask_bathymetry
  File "pygeoprocessing\geoprocessing.py", line 426, in raster_calculator
  File "osgeo\gdal.py", line 3847, in SetNoDataValue
TypeError: in method 'Band_SetNoDataValue', argument 2 of type 'double'

I think it would be nice for pygeoprocessing to handle this instead of forcing the user to remember the details of GDAL and its C implementation. pygeoprocessing encourages users to pair it with numpy in all sorts of ways, so I think we can do a bit better here.

There are a few places where we already handle this in pygeoprocessing. There are 5 instances of calling SetNoDataValue and a few different ways of handling things.

`raster_calculator`:

value is passed as-is from a function argument

`new_raster_from_base`: (I propose adopting this solution for all the rest)

uses item() to return a Python scalar representation https://numpy.org/doc/stable/reference/generated/numpy.ndarray.item.html

        try:
            target_band.SetNoDataValue(nodata_value.item())
        except AttributeError:
            target_band.SetNoDataValue(nodata_value)

`create_raster_from_bounding_box`:

always does band.SetNoDataValue(float(target_nodata)) regardless of the target datatype for the raster

`convolve_2d`:

value is passed as-is from a function argument unless it is None, then casts to Python float, regardless of target datatype.

`numpy_array_to_raster`:

value is passed as-is from a function argument

Are there any other cases of numpy Float types being rejected by GDAL, besides SetNoDataValue() that we should consider?

The text was updated successfully, but these errors were encountered:

phargogh · 2023-07-21T19:00:58Z

Are there any other cases of numpy Float types being rejected by GDAL, besides SetNoDataValue() that we should consider?

I just yesterday found that using feature.SetField(name, float_value) where name is a field of ogr.OFTReal expects that float_value is also a double.

In working on that bug (natcap/invest#1350), I also discovered:

If float_value is numpy.float32, it crashes
If float_value is numpy.float64, it does not.

Maybe this is consistent with SetNoDataValue? I don't know that I've ever actually tried setting a numpy.float64 object before.

davemfish added enhancement New feature or request proposal Proposals requiring team feedback labels Jul 21, 2023

dcdenu4 added this to the 2.5.0 milestone Sep 26, 2024

dcdenu4 removed the proposal Proposals requiring team feedback label Sep 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

davemfish commented Jul 21, 2023

phargogh commented Jul 21, 2023

handle numpy dtypes for scalar values such as in SetNoDataValue() #330

handle numpy dtypes for scalar values such as in SetNoDataValue() #330

Comments

davemfish commented Jul 21, 2023

raster_calculator:

new_raster_from_base: (I propose adopting this solution for all the rest)

create_raster_from_bounding_box:

convolve_2d:

numpy_array_to_raster:

phargogh commented Jul 21, 2023

handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

handle `numpy` dtypes for scalar values such as in `SetNoDataValue()` #330

`raster_calculator`:

`new_raster_from_base`: (I propose adopting this solution for all the rest)

`create_raster_from_bounding_box`:

`convolve_2d`:

`numpy_array_to_raster`: