writerFooter is not supported for ORC file format #12418
Comments
@rui-mo We need to create/extend the FooterWrapper for the write path, similar to the read path.
Cc: @Yuhta
@majetideepak I will take a further look at how to generate the correct one. Thanks for your suggestion!
@majetideepak @rui-mo Modifying only how the ORC file footer is written is not enough; other information is affected as well, such as the statistics, the stripe footer, and the index. We implemented a complete ORC writer internally last year based on Velox's DWRF writer, and I could open-source this code if needed.
@wypb I noticed that as well when implementing the ORC footer writing...
Glad to hear about your plan to open-source it. That would be great!
I am in the process of organizing the code. There are many files involved (more than 40). I may be able to submit a PR as early as next week.
Description
The ORC reader needs to parse the footer as proto::orc::Footer.
velox/velox/dwio/dwrf/reader/ReaderBase.cpp, lines 200 to 204 in 4adec18
Velox currently only supports writing the footer as the DWRF proto::Footer.
velox/velox/dwio/dwrf/writer/WriterBase.cpp, line 23 in 4adec18
When testing ORC decimal reads and writes, we found that the ORC reader parses the precision and scale from the footer, while the DWRF footer does not contain them. In addition, the long decimal type needs to be written as the hugeint kind in the ORC footer, while the DWRF format has no hugeint type.
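To illustrate the mismatch, here is a minimal sketch with simplified stand-in structs. None of these names come from Velox: DwrfLikeType, OrcLikeType, DecimalInfo, writeDwrfLike, writeOrcLike and readDecimal are all hypothetical, and the structs are not the generated protobuf classes. It only shows why a reader that expects precision and scale in the footer cannot recover them from a footer entry that has no such fields.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-ins for footer type entries.
// These are NOT the real proto::Footer / proto::orc::Footer classes.
enum class Kind { kLong, kDecimal };

// DWRF-like entry: carries only a kind, no precision/scale fields.
struct DwrfLikeType {
  Kind kind;
};

// ORC-like entry: carries the kind plus optional precision and scale.
struct OrcLikeType {
  Kind kind;
  std::optional<uint32_t> precision;
  std::optional<uint32_t> scale;
};

// Decimal metadata the writer knows about and the reader wants back.
struct DecimalInfo {
  uint32_t precision;
  uint32_t scale;
};

// Writing to the ORC-like entry keeps precision and scale.
OrcLikeType writeOrcLike(const DecimalInfo& d) {
  return OrcLikeType{Kind::kDecimal, d.precision, d.scale};
}

// Writing to the DWRF-like entry has nowhere to store them,
// so the decimal metadata is dropped (assumed behavior for this sketch).
DwrfLikeType writeDwrfLike(const DecimalInfo& /*d*/) {
  return DwrfLikeType{Kind::kDecimal};
}

// Reader side: recover decimal metadata if the entry carries it.
std::optional<DecimalInfo> readDecimal(const OrcLikeType& t) {
  if (t.kind != Kind::kDecimal || !t.precision || !t.scale) {
    return std::nullopt;
  }
  return DecimalInfo{*t.precision, *t.scale};
}

// The DWRF-like entry never has precision/scale, so nothing can be recovered.
std::optional<DecimalInfo> readDecimal(const DwrfLikeType& /*t*/) {
  return std::nullopt;
}
```

With a DecimalInfo of precision 20 and scale 4, readDecimal(writeOrcLike(d)) returns the original values, while readDecimal(writeDwrfLike(d)) returns an empty optional. That is the gap an ORC-specific write path for the footer would need to close.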