Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-serializable dwio::common::WriterOptions in query plan #12437

Open
zhztheplayer opened this issue Feb 24, 2025 · 4 comments
Open

Non-serializable dwio::common::WriterOptions in query plan #12437

zhztheplayer opened this issue Feb 24, 2025 · 4 comments
Labels
bug Something isn't working triage Newly created issue that needs attention.

Comments

@zhztheplayer
Copy link
Contributor

zhztheplayer commented Feb 24, 2025

Description

When building a TableWriteNode from JSON, field TableWriteNode->insertTableHandle->writerOptions is always null since there is no deserialization code written for it. See code.

This blocks user from executing a JSON Velox plan with customized write options (page size, etc.). A fix might be needed.

@zhztheplayer zhztheplayer added bug Something isn't working triage Newly created issue that needs attention. labels Feb 24, 2025
@zhztheplayer zhztheplayer changed the title Non-serializable dwio::common::WriterOptions in query plan Non-deserializable dwio::common::WriterOptions in query plan Feb 24, 2025
@zhztheplayer zhztheplayer changed the title Non-deserializable dwio::common::WriterOptions in query plan Non-serializable dwio::common::WriterOptions in query plan Feb 25, 2025
@zhztheplayer
Copy link
Contributor Author

Hi @pedroerp @majetideepak, do you have any insights on this? Can we remove the argument https://github.com/facebookincubator/velox/pull/10380/files#diff-a1fcb0b1688f565292b01dfa744db481ec2928d5dc3dc4e889dec70968a85e03R200-R201, and instead provide an API to let user specify the default writer options (reader as well perhaps) when registering the writer factory?

@majetideepak
Copy link
Collaborator

@zhztheplayer The writer factory has a createWriter API where you can pass the writerOptions and it also has the createWriterOptions() API. For Prestissimo at least, the plan fragment does not contain the writerOptions. So this current setup worked so far.

@zhztheplayer
Copy link
Contributor Author

zhztheplayer commented Feb 26, 2025

@zhztheplayer The writer factory has a createWriter API where you can pass the writerOptions and it also has the createWriterOptions() API. For Prestissimo at least, the plan fragment does not contain the writerOptions. So this current setup worked so far.

I was thinking to add the options as an argument directly to facebook::velox::parquet::registerParquetWriterFactory, etc. This is just a convenience change which is probably not that important here.

For Prestissimo at least, the plan fragment does not contain the writerOptions. So this current setup worked so far.

If no usages yet, should we remove the options from query plan fragment to maintain the immutability and consistency in serialization / deserialization of the WriteFilesNode? Do we have some design manners or not, to keep the plan fragments immutable?

@majetideepak
Copy link
Collaborator

I agree that we should remove the options. But I will let @pedroerp answer this since he is more familiar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage Newly created issue that needs attention.
Projects
None yet
Development

No branches or pull requests

2 participants