Introducing PolarsExcelDataset: High-Performance Multi-Sheet Excel Integration for Kedro #1000

Stephanieewelu · 2025-02-07T04:18:20Z

This PR introduces PolarsExcelDataset, a powerful new dataset for Kedro, enabling seamless multi-sheet Excel file processing using Polars. With this addition, data engineers and scientists can now leverage Polars' blazing-fast performance while working with Excel files in Kedro pipelines.

Why This Matters
Performance Boost: Polars can process large Excel datasets faster than pandas, which shortens load times for teams working with very large workbooks.
Multi-Sheet Support: Parses several sheets from one workbook in a single load, which simplifies data ingestion for complex machine learning pipelines.
Modern Data Stack: Expands Kedro’s compatibility with Polars, one of the fastest-growing DataFrame libraries.
Future-Ready: Prepares Kedro for next-gen data processing with scalable, high-performance tools.

This contribution empowers global data teams, making Kedro a go-to framework for efficient, scalable data engineering.
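
The diff itself is not reproduced in this description, so the following is only a minimal sketch of how multi-sheet loading with Polars could look, not the code in this PR. The class name `MultiSheetExcelSketch`, its `sheets` and `reader` parameters, and the `read_all_sheets` helper are all hypothetical. The sketch relies on `polars.read_excel` with `sheet_id=0`, which returns a dict of DataFrames keyed by sheet name; a real Kedro dataset would additionally subclass `kedro.io.AbstractDataset`, which is left out here to keep the example free of Kedro-specific details.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, Optional


def read_all_sheets(path: str) -> Dict[str, Any]:
    """Read every worksheet into a {sheet_name: polars.DataFrame} dict."""
    # Imported lazily so the class below can be defined without Polars installed.
    import polars as pl

    # sheet_id=0 asks Polars to load all sheets and return them keyed by name.
    return pl.read_excel(path, sheet_id=0)


class MultiSheetExcelSketch:
    """Hypothetical stand-in for PolarsExcelDataset (assumed design, not the PR's)."""

    def __init__(
        self,
        filepath: str,
        sheets: Optional[Iterable[str]] = None,
        reader: Callable[[str], Dict[str, Any]] = read_all_sheets,
    ) -> None:
        self._filepath = Path(filepath)
        # None means "return every sheet found in the workbook".
        self._sheets = list(sheets) if sheets is not None else None
        self._reader = reader

    def load(self) -> Dict[str, Any]:
        """Return the requested sheets as a dict keyed by sheet name."""
        frames = self._reader(str(self._filepath))
        if self._sheets is None:
            return frames
        missing = [name for name in self._sheets if name not in frames]
        if missing:
            raise KeyError(f"Sheets not found in {self._filepath.name}: {missing}")
        return {name: frames[name] for name in self._sheets}

    def describe(self) -> Dict[str, Any]:
        """Summarise the configuration, similar in spirit to a Kedro dataset description."""
        return {"filepath": str(self._filepath), "sheets": self._sheets}
```

Passing the reader as a parameter is a design choice made for this sketch: it lets the sheet-selection logic be exercised with a plain function that returns a dict, without needing an Excel file or an Excel engine on the test machine.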

Looking forward to feedback and discussions!

Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Updated jsonschema/kedro-catalog-X.XX.json if necessary
Added a description of this change in the relevant RELEASE.md file
Added tests to cover my changes
Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)

Added PolarsExcelDataset implementation

167f819

Stephanieewelu closed this Feb 7, 2025
