Introducing PolarsExcelDataset: High-Performance Multi-Sheet Excel Integration for Kedro #1000

Stephanieewelu
This PR introduces PolarsExcelDataset, a new Kedro dataset that reads multi-sheet Excel files with Polars. With it, data engineers and scientists can load Excel workbooks into Polars DataFrames inside Kedro pipelines and benefit from Polars' performance.

Why this matters
Performance: Polars can process large Excel datasets faster than pandas, which shortens load times in data-heavy pipelines.
Multi-sheet support: A single catalog entry can parse every sheet in a workbook, which simplifies ingestion for complex machine learning pipelines.
Modern data stack: Expands Kedro's compatibility with Polars, one of the fastest-growing DataFrame libraries.
Future-ready: Lets Kedro projects adopt a high-performance DataFrame engine without leaving the Data Catalog.

This contribution makes Kedro a more efficient choice for teams whose source data arrives as Excel workbooks.
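To make the multi-sheet behaviour concrete for reviewers, here is a minimal sketch of the idea, not the code in this PR. The class name `MultiSheetExcelSketch`, the `reader` parameter, and the helper `_read_all_sheets_with_polars` are hypothetical names chosen for illustration. The sketch assumes `polars.read_excel` is called with `sheet_id=0`, which returns a dict mapping sheet names to DataFrames, and that a Polars Excel engine is installed. A real Kedro dataset would additionally subclass `kedro.io.AbstractDataset` and implement saving.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Optional

# Hypothetical alias: a reader takes a path plus keyword options and
# returns a mapping of sheet name -> DataFrame.
SheetReader = Callable[..., Dict[str, Any]]


def _read_all_sheets_with_polars(path: Path, **load_args: Any) -> Dict[str, Any]:
    """Load every sheet of a workbook with Polars (illustrative helper)."""
    import polars as pl  # imported lazily so this module imports without Polars

    # sheet_id=0 asks Polars for all sheets as a {sheet_name: DataFrame} dict.
    return pl.read_excel(path, sheet_id=0, **load_args)


class MultiSheetExcelSketch:
    """Illustrative stand-in for a multi-sheet Excel dataset (not the PR's class)."""

    def __init__(
        self,
        filepath: str,
        load_args: Optional[Dict[str, Any]] = None,
        reader: SheetReader = _read_all_sheets_with_polars,
    ) -> None:
        self._filepath = Path(filepath)
        self._load_args = dict(load_args or {})
        # The reader is injectable so the class can be exercised without Excel files.
        self._reader = reader

    def load(self) -> Dict[str, Any]:
        # Returns one DataFrame per sheet, keyed by sheet name.
        return self._reader(self._filepath, **self._load_args)

    def describe(self) -> Dict[str, Any]:
        # Mirrors the kind of metadata a Kedro dataset reports about itself.
        return {"filepath": str(self._filepath), "load_args": self._load_args}
```

With Polars and an Excel engine installed, `MultiSheetExcelSketch("workbook.xlsx").load()` would return one DataFrame per sheet; the file name here is a placeholder. Keeping the reader injectable is a design choice of this sketch that makes the loading logic easy to unit test.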

Looking forward to feedback and discussions!

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Updated jsonschema/kedro-catalog-X.XX.json if necessary
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes
  • Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)
