[DNR] Unify existing RLE dictionary decoders #24728

Draft: wants to merge 2 commits into base: master

imsayari404
Contributor

Description

Unifies the decoding logic of the existing RLE dictionary decoders in GenericRLEDictionaryValuesDecoder. This removes redundant code and avoids allocating a new buffer on every readNext call.

Motivation and Context

Keeps RLE/bit-packing handling consistent across all dictionary-based decoders, which reduces the chance of decoder-specific errors and improves Parquet reader efficiency.
Fixes #23612.

Impact

Unifies decoding for the RLE dictionary decoders, improving performance and reliability.

Test Plan

Ran ./mvnw clean install -pl presto-parquet to execute all unit and integration tests within the presto-parquet module.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... 
* ... 

Hive Connector Changes
* ... 
* ... 

If release note is NOT required, use:

== NO RELEASE NOTE ==

…iciency

- Reduces redundant code and ensures consistent RLE/Bit-packing decoding logic across all dictionary-based decoders.
- Eliminates unnecessary buffer allocations in each readNext call, improving performance and reducing memory overhead.
- Ensures that BooleanRLEValuesDecoder is tested with valid data, so its test now passes.
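
The first two points above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Run`, `DictionarySink`, and `UnifiedRleDictionaryDecoder` are hypothetical names, and a "literal" run stands in for a bit-packed group rather than parsing the real Parquet RLE/bit-packing hybrid encoding. The idea shown is that one shared loop fills a reusable id buffer, and only the dictionary lookup is type-specific.

```java
import java.util.Arrays;

// Hypothetical run of dictionary ids: one id repeated (RLE), or a literal
// list of ids standing in for a bit-packed group.
final class Run {
    final boolean rle;
    final int repeatedId;
    final int[] literalIds;
    final int count;

    private Run(boolean rle, int repeatedId, int[] literalIds, int count) {
        this.rle = rle;
        this.repeatedId = repeatedId;
        this.literalIds = literalIds;
        this.count = count;
    }

    static Run repeated(int id, int count) {
        return new Run(true, id, null, count);
    }

    static Run literal(int... ids) {
        return new Run(false, 0, ids, ids.length);
    }
}

// The only type-specific part: map decoded ids through a dictionary.
interface DictionarySink {
    void accept(int[] ids, int idCount, int outputOffset);
}

// Shared decoding loop used by every value type (assumed design).
final class UnifiedRleDictionaryDecoder {
    private final Run[] runs;
    private int runIndex;
    private int offsetInRun;
    // Reused across readNext calls; grown only when a larger batch arrives.
    private int[] idBuffer = new int[0];

    UnifiedRleDictionaryDecoder(Run... runs) {
        this.runs = runs;
    }

    void readNext(DictionarySink sink, int outputOffset, int length) {
        if (idBuffer.length < length) {
            idBuffer = new int[length];
        }
        int filled = 0;
        while (filled < length) {
            if (runIndex >= runs.length) {
                throw new IllegalStateException("not enough encoded values");
            }
            Run run = runs[runIndex];
            int chunk = Math.min(length - filled, run.count - offsetInRun);
            if (run.rle) {
                Arrays.fill(idBuffer, filled, filled + chunk, run.repeatedId);
            } else {
                System.arraycopy(run.literalIds, offsetInRun, idBuffer, filled, chunk);
            }
            filled += chunk;
            offsetInRun += chunk;
            if (offsetInRun == run.count) {
                runIndex++;
                offsetInRun = 0;
            }
        }
        sink.accept(idBuffer, length, outputOffset);
    }
}
```

Under these assumptions, a long-typed reader would pass a sink such as `(ids, n, off) -> { for (int i = 0; i < n; i++) out[off + i] = dictionary[ids[i]]; }`, while a boolean or int reader supplies a different sink and reuses the same loop and buffer.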
Labels
from:IBM PR from IBM
Development

Successfully merging this pull request may close these issues.

Unify the Parquet dictionary value decoders
2 participants