-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-54403][SQL][Metric View] Add YAML serde infrastructure for metric views #53146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… for metric views This commit adds the complete serialization/deserialization infrastructure for parsing metric view YAML definitions: - Add Jackson YAML dependencies to pom.xml - Implement canonical model for metric views: - Column, Expression (Dimension/Measure), MetricView, Source - YAMLVersion validation and exception types - Implement version-specific serde (v0.1): - YAML deserializer/serializer - Base classes for extensibility - Add JSON utilities for metadata serialization
| <dependency> | ||
| <groupId>com.fasterxml.jackson.dataformat</groupId> | ||
| <artifactId>jackson-dataformat-yaml</artifactId> | ||
| <version>${fasterxml.jackson.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all jackson deps are managed by jackson-bom now, you don't need to declare it again here
|
|
||
| object JsonUtils { | ||
| // Singleton ObjectMapper that can be used across the project | ||
| private lazy val mapper: ObjectMapper = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the comments, should we try to use this singleton as much as possible in the subsequent development?
| // Only parse the "version" field and ignore all others | ||
| @JsonIgnoreProperties(ignoreUnknown = true) | ||
| case class YAMLVersion(version: String) extends Validatable { | ||
| def validYAMLVersions: Set[String] = Set("0.1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def validYAMLVersions: Set[String] = Set("0.1") | |
| private def validYAMLVersions: Set[String] = Set("0.1") |
|
|
||
| val mapper = new ObjectMapper(yamlFactory) | ||
| // Exclude null values from serialized output | ||
| .setSerializationInclusion(JsonInclude.Include.NON_NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .setSerializationInclusion(JsonInclude.Include.NON_NULL) | |
| . setDefaultPropertyInclusion(JsonInclude.Include.NON_NULL) |
| // Exclude null values from serialized output | ||
| .setSerializationInclusion(JsonInclude.Include.NON_NULL) | ||
| // Exclude empty collections/strings from serialized output | ||
| .setSerializationInclusion(JsonInclude.Include.NON_EMPTY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .setSerializationInclusion(JsonInclude.Include.NON_EMPTY) | |
| . setDefaultPropertyInclusion(JsonInclude.Include.NON_EMPTY) |
| ) extends ColumnBase | ||
|
|
||
| object ColumnV01 { | ||
| def fromCanonical(canonical: Column[_ <: Expression], ordinal: Int): ColumnV01 = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ordinal mandatory?
| Try(CatalystSqlParser.parseQuery(sourceText)) match { | ||
| case Success(_) => SQLSource(sourceText) | ||
| case Failure(queryEx) => | ||
| throw queryEx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should queryEx be wrapped into MetricViewValidationException?
What changes were proposed in this pull request?
This PR adds the complete serialization/deserialization infrastructure for parsing metric view YAML definitions:
Why are the changes needed?
SPIP: Metrics & semantic modeling in Spark
Does this PR introduce any user-facing change?
No
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code
Co-Authored-By: Claude [email protected]