Allow passing a metadata folder to IcebergTableProviderFactory #840

Open
@gruuya

Description

Currently, when using DataFusion's TableProviderFactory mechanism, one has to specify the full, exact path to the metadata file as the location, e.g.

create external table inventory
stored as iceberg
location 's3://iceberg/public/inventory/metadata/00001-97ea515a-2d2f-465d-8c74-8daec5ab0023.metadata.json';

I think it would be nice if IcebergTableProviderFactory also supported being pointed at a metadata directory, e.g.

create external table inventory
stored as iceberg
location 's3://iceberg/public/inventory/metadata';

This would then imply listing the directory, picking the latest metadata file (e.g. by the version V in filenames like <V>-<random-uuid>.metadata.json, and maybe the legacy v<V>.metadata.json), and using it to build the table, since reading the latest version is likely the overwhelmingly common use case. That would improve the flexibility and ergonomics of the integration (e.g. by making quick prototyping much simpler).
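
To make the selection step concrete, here is a minimal sketch of how the latest metadata file could be picked once the directory has been listed. It is only an illustration: the function names parse_metadata_version and latest_metadata_file are hypothetical and not part of iceberg-rust, the listing itself (e.g. via an object store client) is assumed to have already produced plain file names, and other naming variants are simply skipped.

```rust
/// Extract the version number V from a metadata file name.
/// Handles `<V>-<random-uuid>.metadata.json` and the legacy `v<V>.metadata.json`.
/// Hypothetical helper for illustration; not an existing iceberg-rust API.
fn parse_metadata_version(file_name: &str) -> Option<u64> {
    // Anything that is not a metadata JSON file is ignored.
    let stem = file_name.strip_suffix(".metadata.json")?;

    // Legacy form: v<V>
    if let Some(legacy) = stem.strip_prefix('v') {
        return legacy.parse().ok();
    }

    // Current form: <V>-<random-uuid>; only the part before the first '-' matters.
    let (version, _uuid) = stem.split_once('-')?;
    version.parse().ok()
}

/// Return the file name with the highest parsed version, if any.
/// Names that do not match either pattern are skipped.
fn latest_metadata_file<'a>(file_names: &[&'a str]) -> Option<&'a str> {
    file_names
        .iter()
        .copied()
        .filter_map(|name| parse_metadata_version(name).map(|version| (version, name)))
        .max_by_key(|(version, _)| *version)
        .map(|(_, name)| name)
}
```

Parsing the version as an integer rather than comparing names lexicographically keeps the ordering correct if the zero-padded prefix ever grows wider, and lets both naming schemes be ranked together.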
