[Connector] Pojo To RowData Utility #726

Open
wants to merge 1 commit into
base: main

Conversation

Contributor

@MehulBatra MehulBatra commented Apr 8, 2025

Purpose
Linked issue: #723

A built-in PojoToRowData converter utility that end users can use out of the box for a quick start.

Brief change log:

  • Added utility to convert PojoToRowData
  • Added tests for primitive types (boolean, numeric, floating point)
  • Added tests for complex types (decimal, temporal, binary, char)
  • Added edge case tests for null values and nullable fields

Tests:

  • Added unit tests for all Fluss data types including boolean, numeric, decimal, date/time, and binary types with appropriate value verification.

API and Format
No API or storage format changes.

Documentation
No documentation changes are required.

@MehulBatra
Contributor Author

I have also covered the complex types in the conversion.
@polyzos @wuchong Please help me with the review.

@MehulBatra MehulBatra changed the title Pojo To RowData Utility [Connector] Pojo To RowData Utility Apr 8, 2025
@wuchong wuchong linked an issue Apr 9, 2025 that may be closed by this pull request
2 tasks
Member

@wuchong wuchong left a comment


Thanks for the contribution, @MehulBatra. I left some comments about the implementation; please let me know if you have any questions.

initializeFieldMap();
}

private void initializeFieldMap() {
Member


Instead of analyzing the POJO class with our own reflection code, I would suggest directly using Flink's Types.POJO(class) to get the PojoTypeInfo, from which we can get the Java Field of each POJO field. If you look into the source code of TypeExtractor.createTypeInfo, you will find that the devil is in the details...

Besides, we can't flatten all the nested fields, because fields in different nested POJOs may share the same field name. Such nested POJO fields should be converted into a nested InternalRow. We don't need to support this case for now and can just throw an exception for nested POJO fields, as we currently don't support the nested row type.
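
To illustrate the collision concern, here is a minimal sketch in plain Java. It is not the suggested implementation: the review recommends obtaining the fields from Flink's PojoTypeInfo, whereas this example uses bare reflection only to stay self-contained. All class names (Customer, Product, Order, FlatOrder, NestedPojoCheck) are hypothetical, and the "looks like a nested POJO" test is a deliberately crude assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical POJOs: both nested types declare a field named "id",
// so flattening Order would yield two columns called "id".
class Customer { long id; String name; }
class Product { long id; }
class Order { int quantity; Customer customer; Product product; }
class FlatOrder { int quantity; String note; }

final class NestedPojoCheck {

    // Crude stand-in for POJO detection (an assumption of this sketch):
    // anything that is not a primitive, array, enum, or JDK class.
    static boolean looksLikeNestedPojo(Class<?> type) {
        return !type.isPrimitive()
                && !type.isArray()
                && !type.isEnum()
                && !type.getName().startsWith("java.");
    }

    // Collects top-level field names and rejects nested POJO fields
    // instead of flattening them.
    static List<String> topLevelFieldNames(Class<?> pojoClass) {
        List<String> names = new ArrayList<>();
        for (Field field : pojoClass.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            if (looksLikeNestedPojo(field.getType())) {
                throw new UnsupportedOperationException(
                        "Nested POJO field '" + field.getName()
                                + "' of type " + field.getType().getName()
                                + " is not supported");
            }
            names.add(field.getName());
        }
        return names;
    }
}
```

With these definitions, NestedPojoCheck.topLevelFieldNames(FlatOrder.class) returns the two field names, while passing Order.class throws UnsupportedOperationException rather than silently producing two "id" columns.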

*
* @param <T> The POJO type to convert
*/
public class PojoToRowDataConverter<T> {
Member


I think we can learn from the implementation of org.apache.flink.table.data.util.DataFormatConverters.PojoConverter, which:
1. first checks the data type in the schema and the Java class of the Field, and determines the concrete converter implementation (e.g., there are 3 different converters for the TIMESTAMP type);
2. then uses the converter to convert the object into a GenericRow.
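
The two steps above can be sketched roughly as follows. This is a simplified, dependency-free illustration under stated assumptions, not the Flink or Fluss code: ColumnType is a hypothetical stand-in for the schema's data types, Object[] stands in for GenericRow, the three Java classes accepted for TIMESTAMP are chosen for illustration only, and timestamps are normalized to epoch milliseconds in UTC.

```java
import java.lang.reflect.Field;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the schema's data types (not the Fluss API).
enum ColumnType { INT, STRING, TIMESTAMP }

// Hypothetical example POJO.
class Event { int id; String name; LocalDateTime ts; }

final class PojoRowSketch<T> {
    private final List<Field> fields = new ArrayList<>();
    private final List<Function<Object, Object>> converters = new ArrayList<>();

    // Step 1: resolve each field and pick its converter once, up front.
    PojoRowSketch(Class<T> pojoClass, String[] names, ColumnType[] types) {
        for (int i = 0; i < names.length; i++) {
            Field field;
            try {
                field = pojoClass.getDeclaredField(names[i]);
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException("No field named " + names[i], e);
            }
            field.setAccessible(true);
            fields.add(field);
            converters.add(converterFor(types[i], field.getType()));
        }
    }

    // The converter depends on both the schema type and the Java class.
    static Function<Object, Object> converterFor(ColumnType type, Class<?> javaClass) {
        switch (type) {
            case INT:
                if (javaClass == int.class || javaClass == Integer.class) return v -> v;
                break;
            case STRING:
                if (javaClass == String.class) return v -> v;
                break;
            case TIMESTAMP:
                if (javaClass == LocalDateTime.class) {
                    return v -> ((LocalDateTime) v).toInstant(ZoneOffset.UTC).toEpochMilli();
                }
                if (javaClass == Instant.class) return v -> ((Instant) v).toEpochMilli();
                if (javaClass == long.class || javaClass == Long.class) return v -> v;
                break;
        }
        throw new IllegalArgumentException(
                "Cannot map Java class " + javaClass.getName() + " to " + type);
    }

    // Step 2: apply the pre-selected converters; nulls pass through unchanged.
    Object[] convert(T pojo) {
        Object[] row = new Object[fields.size()];
        for (int i = 0; i < row.length; i++) {
            try {
                Object value = fields.get(i).get(pojo);
                row[i] = value == null ? null : converters.get(i).apply(value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return row;
    }
}
```

Resolving the converters in the constructor means a mismatch between the schema and the POJO fails fast at setup time, and the per-record path only does a field read plus one function call.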

@MehulBatra
Contributor Author

> Thanks for the contribution @MehulBatra , I left some comments about the implementation, please let me know if you have any questions.

Thanks for the feedback. I will address these comments over the coming weekend and get back to you if I get stuck!

Development

Successfully merging this pull request may close these issues.

[Connector] Inbuilt PojoToRowDataConverter utility
2 participants