Drop behavioral change for Spark with REST Catalogs #11754

@c-thiel

Description

Feature Request / Improvement

Currently, when purge-dropping tables with Spark and the REST Catalog, Spark deletes all files of the table before sending the drop request to the REST Catalog. In REST Catalog scenarios, we want the server to perform the drop. This enables scenarios like soft-deletion / undrop without relying on the S3-only s3.delete-enabled property. The client therefore should not delete the files itself; it should only communicate its purge intent to the server.
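To make the desired behavior concrete, here is a minimal sketch of what "communicating the intent" could look like on the client side: no file deletion, just a single drop request that carries the purge flag. It assumes the drop-table endpoint and the `purgeRequested` query parameter as I understand them from the REST Catalog OpenAPI spec, including the 0x1F separator for multi-level namespaces. The helper name `build_drop_request` and the host are hypothetical and only for illustration; nothing is sent over the network.

```python
from urllib.parse import quote, urlencode


def build_drop_request(base_url, prefix, namespace, table, purge):
    """Build (method, url) for a server-side drop; the client deletes no files.

    Hypothetical helper: it only assembles the request a client would send.
    """
    # Multi-level namespaces are joined with the unit separator (0x1F),
    # which percent-encodes to %1F.
    ns = quote("\x1f".join(namespace), safe="")
    path = f"{base_url.rstrip('/')}/v1/"
    if prefix:
        path += f"{quote(prefix, safe='')}/"
    path += f"namespaces/{ns}/tables/{quote(table, safe='')}"
    # The purge intent travels as a query parameter; the server decides
    # how and when the files are actually removed (e.g. soft-delete first).
    query = urlencode({"purgeRequested": "true" if purge else "false"})
    return "DELETE", f"{path}?{query}"


if __name__ == "__main__":
    method, url = build_drop_request(
        "https://catalog.example.com", "", ("db",), "events", purge=True
    )
    print(method, url)
```

With this shape, a `DROP TABLE ... PURGE` in Spark would map to `purge=True`, and a catalog that supports undrop is free to retain the files for a grace period.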

As a temporary workaround, a new flag is added in #11317. Long term, a behavior change of DROP is desirable, which should happen in Iceberg version 2.

This was discussed in the Iceberg Catalog Sync of Dec. 11th.

@RussellSpitzer could you add the 2.0 Milestone? Feel free to make it more precise as you see fit!

Query engine

Spark

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time

Metadata

Assignees

No one assigned

    Labels

    improvement (PR that improves existing functionality), stale

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests