feat: Add JSON to WriterTarget #26471
base: master
Reviewer's Guide
Enable JSON serialization support for TableWriterNode.WriterTarget and its subclasses by adding Jackson annotations, adjust planner classes for serialization, update test dependencies, and add comprehensive serialization tests.
Class diagram for updated TableWriterNode.WriterTarget JSON serialization
classDiagram
TableWriterNode <|-- WriterTarget
WriterTarget <|-- CreateName
WriterTarget <|-- InsertReference
WriterTarget <|-- DeleteHandle
WriterTarget <|-- RefreshMaterializedViewReference
WriterTarget <|-- UpdateTarget
WriterTarget <|-- CanonicalWriterTarget
class WriterTarget {
<<abstract>>
+ConnectorId getConnectorId()
+SchemaTableName getSchemaTableName()
+Optional<List<OutputColumnMetadata>> getOutputColumns()
}
class CreateName {
+ConnectorId connectorId
+ConnectorTableMetadata tableMetadata
+Optional<NewTableLayout> layout
+Optional<List<OutputColumnMetadata>> columns
+CreateName(connectorId, tableMetadata, layout, columns)
+getConnectorId()
+getTableMetadata()
+getLayout()
+getSchemaTableName()
+getOutputColumns()
+toString()
}
class InsertReference {
+TableHandle handle
+SchemaTableName schemaTableName
+Optional<List<OutputColumnMetadata>> columns
+InsertReference(handle, schemaTableName, columns)
+getHandle()
+getSchemaTableName()
+getConnectorId()
+getOutputColumns()
}
class DeleteHandle {
+TableHandle handle
+SchemaTableName schemaTableName
+DeleteHandle(handle, schemaTableName)
+getHandle()
+getSchemaTableName()
+getConnectorId()
+getOutputColumns()
}
class RefreshMaterializedViewReference {
+TableHandle handle
+SchemaTableName schemaTableName
+RefreshMaterializedViewReference(handle, schemaTableName)
+getHandle()
+getSchemaTableName()
+getConnectorId()
+getOutputColumns()
}
class UpdateTarget {
+TableHandle handle
+SchemaTableName schemaTableName
+List<String> updatedColumns
+List<ColumnHandle> updatedColumnHandles
+UpdateTarget(handle, schemaTableName, updatedColumns, updatedColumnHandles)
+getHandle()
+getSchemaTableName()
+getConnectorId()
+getOutputColumns()
+getUpdatedColumns()
+getUpdatedColumnHandles()
}
class CanonicalWriterTarget {
+ConnectorId connectorId
+String writerTargetType
+CanonicalWriterTarget(connectorId, writerTargetType)
+getConnectorId()
+getSchemaTableName()
+getOutputColumns()
}
Class diagram for CanonicalPlan and CanonicalWriterTarget serialization changes
classDiagram
CanonicalPlan o-- PlanNode
CanonicalPlan o-- PlanCanonicalizationStrategy
CanonicalPlan ..> CanonicalWriterTarget : uses
class CanonicalPlan {
-PlanNode plan
-PlanCanonicalizationStrategy strategy
+toString(ObjectMapper)
}
class CanonicalWriterTarget {
+ConnectorId connectorId
+String writerTargetType
+getConnectorId()
+getSchemaTableName()
+getOutputColumns()
}
File-Level Changes
3c0ca5e to
0303808
Compare
Hey there - I've reviewed your changes - here's some feedback:
- The catch block in CanonicalPlan.toString logs serialization errors at INFO level, which could be too verbose; consider lowering it to DEBUG or WARN to avoid log noise.
- The WriterTarget subclasses include a lot of repetitive @JsonCreator, @JsonProperty, and @JsonIgnore annotations; consider using Jackson mixins or a shared configuration to reduce boilerplate.
- The test-scope 'javax.inject' dependency added to pom.xml doesn’t appear to be used; please remove it or clarify why it is needed.
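For reference, the mixin idea mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `PlainTarget`, `PlainTargetMixin`, and `getDescription` are hypothetical names, not classes from this PR, and the sketch uses a plain Jackson `ObjectMapper` (jackson-databind on the classpath) rather than whatever codec setup the project uses. The Jackson annotations live on a separate abstract class, and `addMixIn` attaches them to the unannotated target class.

```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.UncheckedIOException;

public class MixinSketch
{
    // Hypothetical target class: carries no Jackson annotations at all.
    public static final class PlainTarget
    {
        private final String connectorId;
        private final String tableName;

        public PlainTarget(String connectorId, String tableName)
        {
            this.connectorId = connectorId;
            this.tableName = tableName;
        }

        public String getConnectorId()
        {
            return connectorId;
        }

        public String getTableName()
        {
            return tableName;
        }

        // Derived value that should not appear in the JSON.
        public String getDescription()
        {
            return connectorId + ":" + tableName;
        }
    }

    // Hypothetical mixin: Jackson copies these annotations onto PlainTarget,
    // matching the constructor and the getter by signature.
    abstract static class PlainTargetMixin
    {
        @JsonCreator
        PlainTargetMixin(
                @JsonProperty("connectorId") String connectorId,
                @JsonProperty("tableName") String tableName)
        {
        }

        @JsonIgnore
        abstract String getDescription();
    }

    private static final ObjectMapper MAPPER =
            new ObjectMapper().addMixIn(PlainTarget.class, PlainTargetMixin.class);

    public static String toJson(PlainTarget target)
    {
        try {
            return MAPPER.writeValueAsString(target);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static PlainTarget fromJson(String json)
    {
        try {
            return MAPPER.readValue(json, PlainTarget.class);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The trade-off is that the serialization contract moves away from the class it describes, so a mixin is mostly attractive when the target class cannot or should not depend on Jackson.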
## Individual Comments
### Comment 1
<location> `presto-spi/src/main/java/com/facebook/presto/spi/plan/TableWriterNode.java:309-313` </location>
<code_context>
private final Optional<List<OutputColumnMetadata>> columns;
- public CreateName(ConnectorId connectorId, ConnectorTableMetadata tableMetadata, Optional<NewTableLayout> layout, Optional<List<OutputColumnMetadata>> columns)
+ @JsonCreator
+ public CreateName(ConnectorId connectorId,
+ ConnectorTableMetadata tableMetadata,
+ Optional<NewTableLayout> layout,
</code_context>
<issue_to_address>
**suggestion:** Inconsistent use of @JsonProperty on CreateName constructor parameters.
Adding @JsonProperty to each constructor parameter will help prevent deserialization errors and maintain consistency with other WriterTarget subclasses.
```suggestion
@JsonCreator
public CreateName(
@JsonProperty("connectorId") ConnectorId connectorId,
@JsonProperty("tableMetadata") ConnectorTableMetadata tableMetadata,
@JsonProperty("layout") Optional<NewTableLayout> layout,
@JsonProperty("columns") Optional<List<OutputColumnMetadata>> columns)
```
</issue_to_address>
### Comment 2
<location> `presto-spi/src/test/java/com/facebook/presto/spi/plan/TestTableWriterNode.java:77-78` </location>
<code_context>
+ writerTargetCodec = codecFactory.jsonCodec(TableWriterNode.WriterTarget.class);
+ }
+
+ @Test
+ public void testCreateNameTargetSerialization()
+ {
+ ConnectorId connectorId = new ConnectorId("test_catalog");
</code_context>
<issue_to_address>
**suggestion (testing):** Missing deserialization round-trip assertion for CreateName target.
Please add an assertion to deserialize the JSON and verify the resulting object equals the original CreateName instance, as done in similar tests.
</issue_to_address>
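To make the two individual comments concrete, here is a minimal round-trip sketch. It assumes a plain Jackson `ObjectMapper` (jackson-databind on the classpath) instead of the codec factory used in the PR's test, and `SimpleTarget` is a hypothetical, much simplified stand-in for a `WriterTarget` subclass. Without a parameter-names module registered, a multi-argument `@JsonCreator` constructor generally needs `@JsonProperty` on each parameter so Jackson can map JSON fields to arguments.

```java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

public class RoundTripSketch
{
    // Hypothetical stand-in for a WriterTarget subclass such as CreateName.
    public static final class SimpleTarget
    {
        private final String connectorId;
        private final String tableName;

        @JsonCreator
        public SimpleTarget(
                @JsonProperty("connectorId") String connectorId,
                @JsonProperty("tableName") String tableName)
        {
            this.connectorId = Objects.requireNonNull(connectorId, "connectorId is null");
            this.tableName = Objects.requireNonNull(tableName, "tableName is null");
        }

        @JsonProperty
        public String getConnectorId()
        {
            return connectorId;
        }

        @JsonProperty
        public String getTableName()
        {
            return tableName;
        }

        // equals/hashCode let a test compare the deserialized copy to the original.
        @Override
        public boolean equals(Object o)
        {
            if (this == o) {
                return true;
            }
            if (!(o instanceof SimpleTarget)) {
                return false;
            }
            SimpleTarget other = (SimpleTarget) o;
            return connectorId.equals(other.connectorId) && tableName.equals(other.tableName);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(connectorId, tableName);
        }
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static String toJson(SimpleTarget target)
    {
        try {
            return MAPPER.writeValueAsString(target);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static SimpleTarget fromJson(String json)
    {
        try {
            return MAPPER.readValue(json, SimpleTarget.class);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A round-trip assertion then reduces to checking that `fromJson(toJson(original))` equals `original`, which is the kind of check Comment 2 asks for.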
hantangwangd
left a comment
A quick question: are there any scenarios where we need to serialize the subclasses of WriterTarget? From what I understand, they seem to be used only during planning and wouldn't need to be sent to workers.
Hi @hantangwangd, we have a use case in presto-on-spark that uses the serialized WriterTarget to extract table information (schema, table name, path) in the context of a different classloader (across the Spark driver-executor boundary). We cannot rely on casting the object to get this information because of the classloader issue: the JVM rejects the cast because it treats the object's class and the cast target class as two different classes. We already rely on the same JSON approach for table scans, but the table writer node does not yet expose the needed information in a JSON-serializable form, so we are adding it here.
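The classloader problem described above can be reproduced with the JDK alone. The sketch below is an illustration, not the presto-on-spark setup: `Target` and `IsolatingLoader` are hypothetical names, and the hand-written JSON string in `toString()` stands in for what a JSON mapper would emit. It assumes Java 9 or later (for `InputStream.readAllBytes`) and that the class bytes are reachable as a resource through the parent loader. The same class is defined by a second loader, the cast fails, and the string form still crosses the boundary because `String` is loaded by the bootstrap loader and shared.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderBoundaryDemo
{
    // Hypothetical stand-in for a planner class such as a WriterTarget subclass.
    public static class Target
    {
        @Override
        public String toString()
        {
            // Hand-written JSON standing in for real serializer output.
            return "{\"schema\":\"tpch\",\"table\":\"orders\"}";
        }
    }

    // Defines its own copy of Target instead of delegating to the parent loader.
    static final class IsolatingLoader
            extends ClassLoader
    {
        IsolatingLoader(ClassLoader parent)
        {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException
        {
            if (!name.equals(Target.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded != null) {
                    return loaded;
                }
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    return defineClass(name, bytes, 0, bytes.length);
                }
                catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
        }
    }

    // Creates a Target instance whose class was defined by a second loader.
    public static Object newForeignTarget()
    {
        try {
            ClassLoader loader = new IsolatingLoader(ClassLoaderBoundaryDemo.class.getClassLoader());
            return loader.loadClass(Target.class.getName()).getDeclaredConstructor().newInstance();
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true when casting to the locally loaded Target class is rejected.
    public static boolean castFails(Object foreign)
    {
        try {
            Target ignored = (Target) foreign;
            return false;
        }
        catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args)
    {
        Object foreign = newForeignTarget();
        System.out.println("same class name: " + foreign.getClass().getName().equals(Target.class.getName()));
        System.out.println("cast fails: " + castFails(foreign));
        System.out.println("string form still readable: " + foreign);
    }
}
```

Both classes share a fully qualified name, but the JVM identifies a class by its name together with its defining loader, which is why exchanging a serialized form avoids the cast entirely.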
@tanjialiang got it! Thanks for your explanation.
f62048e to
2351780
Compare
aditi-pandit
left a comment
Thanks @tanjialiang for this code.
Please can you give more details about the motivation for this change.
Can you remove this diff?
I added the empty line because I noticed it was missing when I came across this file. If we don't want it in this PR, I can remove it.
Does this require presto_protocol changes on the native side? Could you please regenerate the protocol code: https://github.com/prestodb/presto/tree/master/presto-native-execution/presto_cpp/presto_protocol#presto-native-worker-protocol-code-generation
Hi @aditi-pandit, thanks for your comments. I explained the motivation above; pasting it here for easier reference: we have a use case in presto-on-spark that uses the serialized WriterTarget to extract table information (schema, table name, path) in the context of a different classloader (across the Spark driver-executor boundary). We cannot rely on casting the object to get this information because of the classloader issue: the JVM rejects the cast because it treats the object's class and the cast target class as two different classes. We already rely on the same JSON approach for table scans, but the table writer node does not yet expose the needed information in a JSON-serializable form, so we are adding it here. Please let me know what further details are needed. Thanks!
public class CanonicalPlan
{
    private static final Logger log = Logger.get(CanonicalPlan.class);
This logger is not used?
Will remove
Description
Allow JSON serialization of TableWriterNode.WriterTarget.