Skip to content

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

We skip type checking for these ExtractValue expressions because we assume they are always created safely by ExtractValue.apply. However, plan transformations may change the attribute data type and break these expressions. This PR adds type checking for these expressions, so that the plan change vailidation can capture the problemaitc rule earlier, instead of triggering Java cast exception at a late point.

Why are the changes needed?

better error reporting

Does this PR introduce any user-facing change?

no

How was this patch tested?

new test

Was this patch authored or co-authored using generative AI tooling?

no

@github-actions github-actions bot added the SQL label Sep 3, 2025
@cloud-fan
Copy link
Contributor Author

cc @viirya

case ArrayType(_: StructType, _) => TypeCheckResult.TypeCheckSuccess
// This should never happen, unless we hit a bug.
case other =>
TypeCheckResult.TypeCheckFailure("GetArrayStructFields.child must be array of struct type")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: append with , but got ${child.dataType}?

}
// This should never happen, unless we hit a bug.
case other =>
TypeCheckResult.TypeCheckFailure("GetMapValue.child must be map type")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

this(left, right, GetArrayItem(left, right, failOnError = false))
override def inputTypes: Seq[AbstractDataType] = left.dataType match {
case _: ArrayType => Seq(ArrayType, IntegerType)
// Do not apply implicit cast if the first arguement is not array type.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to keep the null type error behavior. #50590 improved the error message for it, but also made GetArrayItem fail for null type input. This usually doesn't matter as GetArrayItem should never get null type input except from Get, but this PR restores it for consistency with other ExtractValue expressions.

@@ -285,8 +293,7 @@ case class GetArrayItem(
with ExtractValue
with SupportQueryContext {

// We have done type checking for child in `ExtractValue`, so only need to check the `ordinal`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment also didn't look correct. Actually we did checking the datatype of the child in checkInputDataTypes.

}

// We have done type checking for child in `ExtractValue`, so only need to check the `key`.
override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType, keyType)
override def inputTypes: Seq[AbstractDataType] = Seq(MapType, keyType)
Copy link
Member

@viirya viirya Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, if it is possible that the child data type could be other than MapType, the keyType cannot directly call asInstanceOf to cast to MapType, because when we call inputTypes, the child data type is not done checking type.

Copy link
Contributor Author

@cloud-fan cloud-fan Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is why I override def checkInputDataTypes, so that we only access inputTypes when the first child is map type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh okay.

@cloud-fan
Copy link
Contributor Author

thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in c7d8c2b Sep 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants