Writing a df with a column having empty string categorical dtype raises FieldError with use_arrow=True #620

@alihamdan

Description

A minimal example:

import geopandas as gpd
import pyogrio

gdf = gpd.GeoDataFrame({"cat_col": [], "geometry": []}, geometry="geometry", crs="EPSG:4326")
gdf = gdf.astype({"cat_col": "str"}).astype({"cat_col": "category"})
pyogrio.write_dataframe(gdf, "test.gpkg", layer="my_layer", driver="GPKG", use_arrow=True)

Traceback

Traceback (most recent call last):
  File "/project/pyogrio_bug.py", line 6, in <module>
    pyogrio.write_dataframe(gdf, "test.gpkg", layer="my_layer", driver="GPKG", use_arrow=True)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/project/.venv/lib/python3.13/site-packages/pyogrio/geopandas.py", line 841, in write_dataframe
    write_arrow(
    ~~~~~~~~~~~^
        table,
        ^^^^^^
    ...<13 lines>...
        **kwargs,
        ^^^^^^^^^
    )
    ^
  File "/project/.venv/lib/python3.13/site-packages/pyogrio/raw.py", line 883, in write_arrow
    ogr_write_arrow(
    ~~~~~~~~~~~~~~~^
        path,
        ^^^^^
    ...<11 lines>...
        layer_kwargs=layer_kwargs,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "pyogrio/_io.pyx", line 2932, in pyogrio._io.ogr_write_arrow
  File "pyogrio/_io.pyx", line 3095, in pyogrio._io.create_fields_from_arrow_schema
pyogrio.errors.FieldError: Error while creating field from Arrow for field 0 with name 'cat_col' and type c (Type 'n' for field cat_col is not supported.).

The write succeeds if I set use_arrow=False or if the categorical column has at least one category.
I have PYOGRIO_USE_ARROW=1 set because it improves performance for some large databases I work with, but I have to pass use_arrow=False whenever a filtering operation could result in an empty dataframe.
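
For now, a possible stopgap is to detect the problematic case and turn Arrow off only for that write, instead of passing use_arrow=False everywhere by hand. This is only a sketch: has_empty_categorical and write_dataframe_safe are hypothetical helper names, not part of pyogrio, and it assumes the failure is triggered solely by categorical columns with zero categories, as in the example above.

```python
import pandas as pd


def has_empty_categorical(df: pd.DataFrame) -> bool:
    """Return True if any categorical column has zero categories."""
    return any(
        isinstance(dtype, pd.CategoricalDtype) and len(dtype.categories) == 0
        for dtype in df.dtypes
    )


def write_dataframe_safe(gdf, path, **kwargs):
    """Hypothetical wrapper: skip the Arrow write path only when needed."""
    import pyogrio  # imported lazily so the check above works without pyogrio

    if has_empty_categorical(gdf):
        # Assumption: only empty categoricals trigger the FieldError.
        kwargs["use_arrow"] = False
    pyogrio.write_dataframe(gdf, path, **kwargs)
```

With this, write_dataframe_safe(gdf, "test.gpkg", layer="my_layer", driver="GPKG") would leave use_arrow untouched for normal frames (so PYOGRIO_USE_ARROW=1 still applies) and only pass use_arrow=False when an empty categorical is present.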

Versions

$ python --version
Python 3.13.8

$ uv pip freeze | grep -E '(pyogrio|pandas|geopandas|pyarrow|shapely)'
geopandas==1.1.1
pandas==2.3.3
pyarrow==22.0.0
pyogrio==0.12.1
shapely==2.1.2

$ python -c 'import pyogrio; print(pyogrio.__gdal_version__, pyogrio.__gdal_geos_version__)'
(3, 11, 4) (3, 14, 0)
