feat: Expose Iceberg table statistics in DataFusion interface(s) #869

Open
gruuya opened this issue Jan 3, 2025 · 2 comments · May be fixed by #880

@gruuya
Contributor

gruuya commented Jan 3, 2025

At present the two key DataFusion interfaces for Iceberg lack statistics information, as they rely on default (i.e. missing/unknown) implementations for TableProvider::statistics and ExecutionPlan::statistics.

These can be quite important (particularly the latter) during join planning: DataFusion bases a number of its join-planning heuristics on these stats, so their absence can impact performance.
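To make the idea concrete, here is a rough, std-only sketch of how per-file metadata could be rolled up into table-level statistics and then used by a row-count-based join heuristic. Everything in it is an assumption for illustration: `Precision` and `TableStats` are local stand-ins that only mirror the general shape of DataFusion's statistics types, `DataFileSummary` is a hypothetical struct carrying the `record_count` and `file_size_in_bytes` values that Iceberg tracks per data file, and `should_swap_join_inputs` is a simplified stand-in for the kind of heuristic a planner might apply, not DataFusion's actual rule.

```rust
// Local stand-ins for illustration only; NOT the real DataFusion types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

impl Precision {
    /// Returns the underlying estimate, if there is one.
    fn value(self) -> Option<usize> {
        match self {
            Precision::Exact(v) | Precision::Inexact(v) => Some(v),
            Precision::Absent => None,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct TableStats {
    num_rows: Precision,
    total_byte_size: Precision,
}

/// Hypothetical per-file summary, as it might be read from manifest entries.
struct DataFileSummary {
    record_count: usize,
    file_size_in_bytes: usize,
}

/// Sums per-file counts into table-level stats.
/// Assumption: if delete files exist, the live row count may be lower than
/// the sum, so the row count is reported as inexact in that case.
fn stats_from_files(files: &[DataFileSummary], has_deletes: bool) -> TableStats {
    let rows: usize = files.iter().map(|f| f.record_count).sum();
    let bytes: usize = files.iter().map(|f| f.file_size_in_bytes).sum();
    let num_rows = if has_deletes {
        Precision::Inexact(rows)
    } else {
        Precision::Exact(rows)
    };
    TableStats {
        num_rows,
        // On-disk (compressed) size is only a proxy for in-memory size,
        // so it is always marked inexact here.
        total_byte_size: Precision::Inexact(bytes),
    }
}

/// Simplified heuristic: swap the join inputs when the left side is known to
/// have more rows than the right. With missing stats no decision can be made,
/// so the plan is left as is.
fn should_swap_join_inputs(left: &TableStats, right: &TableStats) -> bool {
    match (left.num_rows.value(), right.num_rows.value()) {
        (Some(l), Some(r)) => l > r,
        _ => false,
    }
}
```

The last function is the point of the issue: when both sides report `Absent` (which is what a default implementation amounts to), the heuristic has nothing to work with and the join order stays whatever the query happened to specify.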

@gruuya
Contributor Author

gruuya commented Jan 3, 2025

I'd be happy to work on developing this.

@Xuanwo
Member

Xuanwo commented Jan 3, 2025

Thank you a lot for working on this!
