Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add examples for how to use DuckDB to query geoparquet files #16

Open
rdhyee opened this issue Jun 7, 2024 · 0 comments
Open

Add examples for how to use DuckDB to query geoparquet files #16

rdhyee opened this issue Jun 7, 2024 · 0 comments

Comments

@rdhyee
Copy link
Contributor

rdhyee commented Jun 7, 2024

I managed to export the Smithonsian records with

isample export -j $ISAMPLES_JWT -f geoparquet -d /tmp -q 'source:SMITHSONIAN'

Next challenge is how to do simple spatial queries. I installed duckdb on my Mac:

brew install duckdb

I installed duckdb extensions as suggested by cholmes/duckdb-geoparquet-tutorials:

INSTALL spatial;
LOAD spatial;
INSTALL httpfs;
LOAD httpfs;

I then

CREATE TABLE geoparquet_data AS SELECT * FROM read_parquet('isamples_export_2024_06_07_07_40_00_geo.parquet');
DESCRIBE geoparquet_data
  ;

to get

┌──────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │ column_type │ null │ key │ default │ extra │
│ varchar │ varchar │ varchar │ varchar │ varchar │ varchar │
├──────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ sample_identifier │ VARCHAR │ YES │ │ │ │
│ label │ VARCHAR │ YES │ │ │ │
│ description │ VARCHAR │ YES │ │ │ │
│ source_collection │ VARCHAR │ YES │ │ │ │
│ has_specimen_categ… │ STRUCT("label" VARCHAR)[] │ YES │ │ │ │
│ has_material_categ… │ STRUCT("label" VARCHAR)[] │ YES │ │ │ │
│ has_context_category │ STRUCT("label" VARCHAR)[] │ YES │ │ │ │
│ keywords │ STRUCT(keyword VARCHAR)[] │ YES │ │ │ │
│ produced_by │ STRUCT(description VARCHAR, has_feature_of_interest VARCHAR, identifier VARCHAR, "label" VARCHAR, responsibility STRUCT("name" VARCHAR, "role" VARCHAR)[], result_time VARCHAR, sampling_site STRUCT(desc… │ YES │ │ │ │
│ registrant │ STRUCT("name" VARCHAR) │ YES │ │ │ │
│ sample_location_lo… │ DOUBLE │ YES │ │ │ │
│ sample_location_la… │ DOUBLE │ YES │ │ │ │
│ geometry │ BLOB │ YES │ │ │ │
├──────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────┴─────────┴─────────┴─────────┤
│ 13 rows 6 columns │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

but I'm trying to a spatial extent query:

D SELECT ST_Extent(geometry) AS extent FROM geoparquet_data;
Binder Error: No function matches the given name and argument types 'ST_Extent(BLOB)'. You might need to add explicit type casts.
	Candidate functions:
	ST_Extent(GEOMETRY) -> BOX_2D

LINE 1: SELECT ST_Extent(geometry) AS extent FROM geop...
               ^
D SELECT ST_Extent(CAST(geometry AS BOX_2D)) AS extent FROM geoparquet_data;
Conversion Error: Unimplemented type for cast (BLOB -> BOX_2D)
LINE 1: SELECT ST_Extent(CAST(geometry AS BOX_2D)) AS extent FRO...
                         ^
D SELECT ST_Extent(CAST(geometry AS GEOMETRY)) AS extent FROM geoparquet_data;
Conversion Error: Unimplemented type for cast (BLOB -> GEOMETRY)
LINE 1: SELECT ST_Extent(CAST(geometry AS GEOMETRY)) AS extent F...

How to fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant