[SPARK-48035][SQL] Fix try_add/try_multiply being semantic equal to add/multiply #46307

db-scnakandala · 2024-04-30T18:19:31Z

What changes were proposed in this pull request?

This PR fixes a correctness bug in commutative operator canonicalization: operand reordering currently ignores the evaluation mode of the operators involved.
As a result, the following condition incorrectly evaluates to true:

val l1 = Literal(1)
val l2 = Literal(2)
val l3 = Literal(3)
val expr1 = Add(Add(l1, l2), l3)
val expr2 = Add(Add(l2, l1, EvalMode.TRY), l3)
expr1.semanticEquals(expr2)

To fix the issue, we now reorder commutative operands only if all operators have the same evaluation mode.
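The rule above can be sketched with a toy expression model. This is only an illustration under stated assumptions: `Mode`, `Lit`, `Add`, `collectModes`, `collectLeaves` and `canonicalize` here are hypothetical stand-ins defined in the snippet, not the Catalyst classes or the code in this patch.

```scala
object EvalModeSketch {
  // Hypothetical stand-ins for Catalyst's EvalMode and expression classes.
  sealed trait Mode
  case object LegacyMode extends Mode
  case object TryMode extends Mode

  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr, mode: Mode = LegacyMode) extends Expr

  // Eval modes of every Add in the tree rooted at `e`.
  def collectModes(e: Expr): Seq[Mode] = e match {
    case Add(l, r, m) => m +: (collectModes(l) ++ collectModes(r))
    case _: Lit       => Seq.empty
  }

  // Leaf operands of the tree rooted at `e`.
  def collectLeaves(e: Expr): Seq[Lit] = e match {
    case Add(l, r, _) => collectLeaves(l) ++ collectLeaves(r)
    case lit: Lit     => Seq(lit)
  }

  def canonicalize(e: Expr): Expr = e match {
    case add @ Add(l, r, m) =>
      if (collectModes(add).distinct.size == 1) {
        // All Adds share one mode: flatten, sort, rebuild a left-deep tree.
        collectLeaves(add).sortBy(_.value).reduceLeft[Expr](Add(_, _, m))
      } else {
        // Mixed modes: keep the operand order at this level.
        Add(canonicalize(l), canonicalize(r), m)
      }
    case lit: Lit => lit
  }

  def semanticEquals(a: Expr, b: Expr): Boolean =
    canonicalize(a) == canonicalize(b)
}
```

In this toy model, `Add(Add(l1, l2), l3)` and `Add(l3, Add(l2, l1))` canonicalize to the same tree, while a tree containing a `TryMode` node never compares equal to an all-`LegacyMode` one.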

Why are the changes needed?

To fix a correctness bug.
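To see why treating the two as semantically equal is unsafe, consider how the modes differ on integer overflow. The sketch below uses plain Scala and hypothetical helper names (`legacyAdd`, `ansiAdd`, `tryAdd`); it mimics the documented behaviors (wrap around, raise an error, return NULL) rather than calling any Spark code.

```scala
object OverflowSketch {
  // Wraps around silently on overflow, like non-ANSI (legacy) addition.
  def legacyAdd(a: Int, b: Int): Int = a + b

  // Throws ArithmeticException on overflow, like ANSI-mode addition.
  def ansiAdd(a: Int, b: Int): Int = Math.addExact(a, b)

  // Returns None on overflow, mirroring try_add returning NULL.
  def tryAdd(a: Int, b: Int): Option[Int] =
    try Some(Math.addExact(a, b))
    catch { case _: ArithmeticException => None }
}
```

If an optimizer replaced one of these with another because their canonical forms matched, a query could return a wrapped value or fail where NULL was expected.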

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit tests

Was this patch authored or co-authored using generative AI tooling?

No

db-scnakandala · 2024-05-01T03:45:42Z

cc: @cloud-fan


HyukjinKwon · 2024-05-07T01:02:15Z

Merged to master.

cloud-fan · 2024-05-07T01:45:07Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala

- // TODO: do not reorder consecutive `Add`s with different `evalMode`
- val reorderResult = buildCanonicalizedPlan(
+ val evalModes = collectEvalModes(this, {case Add(_, _, evalMode) => Seq(evalMode)})
+ lazy val reorderResult = buildCanonicalizedPlan(
 { case Add(l, r, _) => Seq(l, r) },


Shall we simply add a check here? `case Add(l, r, em) if em == evalMode`

That is neat! I will create a follow-up PR.
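A rough sketch of the guard idea, again on a toy model rather than the Catalyst code: while collecting operands, only nested `Add` nodes whose mode equals the root's mode are flattened, and any other node is kept as a single opaque operand. All names in the snippet (`Mode`, `Lit`, `Add`, this `collectOperands`, `canonicalize`) are hypothetical and defined locally.

```scala
object GuardedCollectSketch {
  // Hypothetical stand-ins for Catalyst's EvalMode and expression classes.
  sealed trait Mode
  case object LegacyMode extends Mode
  case object TryMode extends Mode

  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr, mode: Mode = LegacyMode) extends Expr

  // Flatten only nested Adds whose mode matches `rootMode`;
  // an Add with a different mode is returned as one opaque operand.
  def collectOperands(e: Expr, rootMode: Mode): Seq[Expr] = e match {
    case Add(l, r, em) if em == rootMode =>
      collectOperands(l, rootMode) ++ collectOperands(r, rootMode)
    case other => Seq(other)
  }

  def canonicalize(e: Expr): Expr = e match {
    case add @ Add(_, _, mode) =>
      collectOperands(add, mode)
        .map(canonicalize)        // canonicalize opaque operands on their own
        .sortBy(_.toString)       // any deterministic order works for the sketch
        .reduceLeft[Expr](Add(_, _, mode))
    case lit: Lit => lit
  }
}
```

With the guard, no separate pass over the tree is needed to compare modes, and in this toy model a mixed-mode tree still has its same-mode sub-chains reordered.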

…equal to add/multiply

### What changes were proposed in this pull request?
- This is a follow-up to the previous PR: #46307.
- With the new changes we do the evalMode check in the `collectOperands` function instead of introducing a new function.

### Why are the changes needed?
- Better code quality and readability.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Existing unit tests.

### Was this patch authored or co-authored using generative AI tooling?
- No

Closes #46414 from db-scnakandala/db-scnakandala/master.

Authored-by: Supun Nakandala
Signed-off-by: Dongjoon Hyun

…dd/multiply

### What changes were proposed in this pull request?
- This PR fixes a correctness bug in commutative operator canonicalization where we currently do not take into account the evaluation mode during operand reordering.
- As a result, the following condition will be incorrectly true:
```
val l1 = Literal(1)
val l2 = Literal(2)
val l3 = Literal(3)
val expr1 = Add(Add(l1, l2), l3)
val expr2 = Add(Add(l2, l1, EvalMode.TRY), l3)
expr1.semanticEquals(expr2)
```
- To fix the issue, we now reorder commutative operands only if all operators have the same evaluation mode.

### Why are the changes needed?
- To fix a correctness bug.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Added unit tests

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#46307 from db-scnakandala/db-scnakandala/master.

Authored-by: Supun Nakandala
Signed-off-by: Hyukjin Kwon


github-actions bot added the SQL label Apr 30, 2024

db-scnakandala marked this pull request as draft April 30, 2024 18:19

db-scnakandala force-pushed the db-scnakandala/master branch 2 times, most recently from 87e0dfa to c2a2edd Compare April 30, 2024 18:47

take into account the eval mode before reordering commutative operands

5a64cec

db-scnakandala force-pushed the db-scnakandala/master branch from c2a2edd to 5a64cec Compare April 30, 2024 20:05

db-scnakandala marked this pull request as ready for review April 30, 2024 22:13

db-scnakandala changed the title ~~[SPARK-48035] Fix try_add/try_multiply being semantic equal to add/multiply~~ [SPARK-48035][SQL] Fix try_add/try_multiply being semantic equal to add/multiply May 1, 2024

HyukjinKwon reviewed May 1, 2024


sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CanonicalizeSuite.scala (outdated, resolved)

Update sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expr…

9f247ab

…essions/CanonicalizeSuite.scala
Co-authored-by: Hyukjin Kwon

db-scnakandala requested a review from HyukjinKwon May 6, 2024 20:57

HyukjinKwon approved these changes May 7, 2024


HyukjinKwon closed this in 7290000 May 7, 2024

cloud-fan reviewed May 7, 2024


This was referenced May 7, 2024

[SPARK-48035][SQL][FOLLOWUP] Fix try_add/try_multiply being semantic equal to add/multiply #46413

Closed

[SPARK-48035][SQL][FOLLOWUP] Fix try_add/try_multiply being semantic equal to add/multiply #46414

Closed
