
[SPARK-48037][CORE] Fix SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data #46273

Closed
wants to merge 4 commits

Conversation

cxzl25
Contributor

@cxzl25 cxzl25 commented Apr 29, 2024

What changes were proposed in this pull request?

This PR fixes SortShuffleWriter's missing shuffle write metrics, which can result in inaccurate data.

Why are the changes needed?

When the shuffle writer is SortShuffleWriter, it does not use SQLShuffleWriteMetricsReporter to update metrics, so the runtime statistics that AQE obtains report a rowCount of 0.

Some optimization rules rely on the rowCount statistic, such as EliminateLimits. Because rowCount is 0, the rule removes the limit operator, and we get results without the limit applied.

override def runtimeStatistics: Statistics = {
  val dataSize = metrics("dataSize").value
  val rowCount = metrics(SQLShuffleWriteMetricsReporter.SHUFFLE_RECORDS_WRITTEN).value
  Statistics(dataSize, Some(rowCount))
}

object EliminateLimits extends Rule[LogicalPlan] {
  private def canEliminate(limitExpr: Expression, child: LogicalPlan): Boolean = {
    limitExpr.foldable && child.maxRows.exists { _ <= limitExpr.eval().asInstanceOf[Int] }
  }
  // ... (rest of the rule elided)
}
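
To make the failure mode concrete, here is a minimal, self-contained sketch (illustrative names, not Spark's actual classes) of why a zero rowCount lets EliminateLimits fire: AQE surfaces the runtime rowCount as the stage's maxRows, and 0 <= limit always holds.

object EliminateLimitsSketch {
  // Mirrors the maxRows check in canEliminate above.
  def canEliminate(limit: Int, childMaxRows: Option[Long]): Boolean =
    childMaxRows.exists(_ <= limit)

  def main(args: Array[String]): Unit = {
    // Without the fix, SortShuffleWriter reports 0 records written,
    // so the runtime statistics carry rowCount = 0.
    println(canEliminate(limit = 1, childMaxRows = Some(0L))) // true  -> LIMIT dropped
    // With the fix, the real row count is reported.
    println(canEliminate(limit = 1, childMaxRows = Some(2L))) // false -> LIMIT kept
  }
}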

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

Production environment verification.

master metrics: (screenshot)

PR metrics: (screenshot)

Was this patch authored or co-authored using generative AI tooling?

No

Member

@dongjoon-hyun dongjoon-hyun left a comment

Could you add a test case for this, @cxzl25 ?

@@ -85,8 +86,10 @@ class AdaptiveQueryExecSuite
    assert(planBefore.toString.startsWith("AdaptiveSparkPlan isFinalPlan=false"))
    val result = dfAdaptive.collect()
    withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "false") {
      val df = sql(query)
      checkAnswer(df, result.toImmutableArraySeq)
      if (!skipCheckAnswer) {
Contributor Author

The SQL results are not always in a deterministic order, so a parameter has been added to support skipping the answer check.
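
For reference, an illustrative reconstruction of the amended helper based on the diff above (simplified: the real runAdaptiveAndVerifyResult also returns the before/after plans, and sql, checkAnswer, and withSQLConf come from the surrounding suite):

private def runAdaptiveAndVerifyResult(
    query: String,
    skipCheckAnswer: Boolean = false): Unit = {
  val dfAdaptive = sql(query)
  val result = dfAdaptive.collect()
  withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "false") {
    val df = sql(query)
    if (!skipCheckAnswer) {
      // Only compare the AQE and non-AQE results when the query output
      // is deterministic.
      checkAnswer(df, result.toImmutableArraySeq)
    }
  }
}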

Member

Could you make a separate PR for this, @cxzl25 ?

Contributor Author

Sure.

Contributor

@mridulm mridulm left a comment

The changes to core look fine to me, but I am not very sure why we are skipping the tests, or whether the test is actually testing what we expect.
I will let @dongjoon-hyun review that (along with the rest of the PR).

@dongjoon-hyun
Member

Thank you for the review, @mridulm.

@cxzl25
Contributor Author

cxzl25 commented May 1, 2024

why we are skipping the tests

The rows produced by a limit after a group by are not guaranteed to come back in a deterministic order.

[info]   == Results ==
[info]   !== Correct Answer - 1 ==            == Spark Answer - 1 ==
[info]    struct<id:bigint,count(1):bigint>   struct<id:bigint,count(1):bigint>
[info]   ![1,1]                               [0,1] (QueryTest.scala:267)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.apache.spark.sql.QueryTest$.newAssertionFailedException(QueryTest.scala:257)
[info]   at org.scalatest.Assertions.fail(Assertions.scala:933)
[info]   at org.scalatest.Assertions.fail$(Assertions.scala:929)
[info]   at org.apache.spark.sql.QueryTest$.fail(QueryTest.scala:257)
[info]   at org.apache.spark.sql.QueryTest$.checkAnswer(QueryTest.scala:267)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:153)
[info]   at org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite.$anonfun$runAdaptiveAndVerifyResult$1(AdaptiveQueryExecSuite.scala:91)

whether the test is actually testing what we expect

The added UT checks whether the final AQE execution plan contains the limit operator.

./bin/spark-sql --conf spark.driver.memory=6g
set spark.sql.shuffle.partitions=16777217;
create table foo as select id from range(2);
select id, count(*) from foo group by id limit 1;
== Physical Plan ==
AdaptiveSparkPlan (7)
+- == Final Plan ==
   LocalTableScan (1)
+- == Initial Plan ==
   CollectLimit (6)
   +- HashAggregate (5)
      +- Exchange (4)
         +- HashAggregate (3)
            +- Scan hive spark_catalog.default.foo (2)

dongjoon-hyun pushed a commit that referenced this pull request May 1, 2024
…AndVerifyResult` to skip check results

### What changes were proposed in this pull request?
This PR aims to support skipping result checks in AdaptiveQueryExecSuite.

### Why are the changes needed?
#46273 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
GA

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #46316 from cxzl25/SPARK-48070.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Member

I merged #46316. Could you rebase this PR onto the master branch, @cxzl25?

@dongjoon-hyun
Member

Also, cc @viirya.

|LIMIT 1
|""".stripMargin, skipCheckAnswer = true)
assert(findTopLevelLimit(plan).size == 1)
assert(findTopLevelLimit(adaptivePlan).size == 1)
Member

Hmm, this only verifies that a specific operator (i.e., Limit) is present in the query plan; how is that related to the metrics you want to fix?

Member

Oh, I see. It is due to AQE's use of runtime metrics.

@@ -710,7 +711,7 @@ private[spark] class ExternalSorter[K, V, C](
      serializerManager,
      serInstance,
      blockId,
      context.taskMetrics().shuffleWriteMetrics,
Member

Hmm, the metrics passed at getWriter also look like they come from context.taskMetrics().shuffleWriteMetrics; isn't that the same thing?

Contributor Author

The reporter passed to getWriter may be a SQLShuffleWriteMetricsReporter, which is not necessarily the same as context.taskMetrics().shuffleWriteMetrics.

writer = manager.getWriter[Any, Any](
  dep.shuffleHandle,
  mapId,
  context,
  createMetricsReporter(context))
writer.write(inputs.asInstanceOf[Iterator[_ <: Product2[Any, Any]]])

def createShuffleWriteProcessor(metrics: Map[String, SQLMetric]): ShuffleWriteProcessor = {
  new ShuffleWriteProcessor {
    override protected def createMetricsReporter(
        context: TaskContext): ShuffleWriteMetricsReporter = {
      new SQLShuffleWriteMetricsReporter(context.taskMetrics().shuffleWriteMetrics, metrics)
    }
  }
}
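
To illustrate the distinction, here is a minimal sketch (illustrative names, not Spark's actual classes) of the decorator relationship: the SQL reporter wraps the task-level metrics, forwarding every update and additionally recording the SQL metric that AQE's runtime statistics read.

import java.util.concurrent.atomic.AtomicLong

trait WriteMetricsReporter {
  def incRecordsWritten(v: Long): Unit
}

// Stand-in for context.taskMetrics().shuffleWriteMetrics.
class TaskShuffleWriteMetrics extends WriteMetricsReporter {
  var recordsWritten = 0L
  override def incRecordsWritten(v: Long): Unit = recordsWritten += v
}

// Stand-in for SQLShuffleWriteMetricsReporter: forwards to the task-level
// metrics AND accumulates the SQL-side counter, so a metric like
// SHUFFLE_RECORDS_WRITTEN reflects the real row count.
class SqlWriteMetricsReporter(
    delegate: WriteMetricsReporter,
    sqlRecordsWritten: AtomicLong) extends WriteMetricsReporter {
  override def incRecordsWritten(v: Long): Unit = {
    delegate.incRecordsWritten(v)
    sqlRecordsWritten.addAndGet(v)
  }
}

object ReporterDemo {
  def main(args: Array[String]): Unit = {
    val task = new TaskShuffleWriteMetrics
    val sqlMetric = new AtomicLong
    val reporter = new SqlWriteMetricsReporter(task, sqlMetric)
    reporter.incRecordsWritten(2L)
    assert(task.recordsWritten == 2L && sqlMetric.get == 2L) // both sinks updated
  }
}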

@mridulm
Contributor

mridulm commented May 7, 2024

Looks like this missed the 3.4 release by a month, @dongjoon-hyun ... it might have been a nice addition to it!

@dongjoon-hyun
Member

Looks like this missed the 3.4 release by a month, @dongjoon-hyun ... it might have been a nice addition to it!

Ya, I agree and this is still a good addition to 4.0.0-preview.

Cc @cloud-fan as the release manager of 4.0.0-preview

|FROM t3
|GROUP BY id
|LIMIT 1
|""".stripMargin, skipCheckAnswer = true)
Contributor

Do we have to skip checking the answer? I think the query could use LIMIT 10 so that the result is deterministic.

Contributor Author

When the results are checked, AQE is disabled, so the query runs with many partitions and the output order is not guaranteed. When AQE is enabled, the partitions are coalesced into one.

set spark.sql.adaptive.enabled=false;
set spark.sql.shuffle.partitions=1000;
create table foo as select id from range(2);
select id, count(*) from foo group by id limit 1;

Output:

1	1

Contributor

AFAIK the checkAnswer util will sort the data before comparison.

Contributor Author

@cxzl25 cxzl25 May 7, 2024

It does sort, but the sort is applied to the final (already limited) result. Because the set of rows returned by the limit is itself nondeterministic, sorting afterwards has no effect.

def getErrorMessageInCheckAnswer(
    df: DataFrame,
    expectedAnswer: Seq[Row],
    checkToRDD: Boolean = true): Option[String] = {
  val isSorted = df.logicalPlan.collect { case s: logical.Sort => s }.nonEmpty
  if (checkToRDD) {
    SQLExecution.withSQLConfPropagated(df.sparkSession) {
      df.rdd.count() // Also attempt to deserialize as an RDD [SPARK-15791]
    }
  }
  val sparkAnswer = try df.collect().toSeq catch {
    // ...
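
A tiny standalone illustration of this point (plain Scala collections, no Spark): sorting after a nondeterministic limit cannot stabilize the comparison, because the set of surviving rows itself differs between runs.

object NondeterministicLimit {
  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(0L), Seq(1L)) // rows spread across two partitions

    // take(1) models LIMIT 1: which row survives depends on partition order.
    def limitOne(parts: Seq[Seq[Long]]): Seq[Long] = parts.flatten.take(1)

    val runA = limitOne(partitions).sorted         // Seq(0)
    val runB = limitOne(partitions.reverse).sorted // Seq(1)
    assert(runA != runB) // sorted, yet still different
  }
}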

Contributor

We can make it deterministic if the limit is larger than the number of result rows?

Contributor Author

In this test case, because spark.sql.adaptive.enabled=false, partitions are not coalesced.
The query then runs a large number of partitions and tasks, so it requires a large amount of driver memory to execute successfully.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

@dongjoon-hyun
Member

Merged to master for Apache Spark 4.0.0-preview.

Could you make backport PRs to the release branches, @cxzl25?

cxzl25 added a commit to cxzl25/spark that referenced this pull request May 8, 2024
… metrics resulting in potentially inaccurate data

Closes apache#46273 from cxzl25/SPARK-48037.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

(cherry picked from commit e24f896)
cxzl25 added a commit to cxzl25/spark that referenced this pull request May 8, 2024
…AndVerifyResult` to skip check results

Closes apache#46316 from cxzl25/SPARK-48070.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

(cherry picked from commit 35767bb)
cxzl25 added a commit to cxzl25/spark that referenced this pull request May 8, 2024
… metrics resulting in potentially inaccurate data

Closes apache#46273 from cxzl25/SPARK-48037.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

(cherry picked from commit e24f896)
cxzl25 added a commit to cxzl25/spark that referenced this pull request May 8, 2024
…AndVerifyResult` to skip check results

Closes apache#46316 from cxzl25/SPARK-48070.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

(cherry picked from commit 35767bb)
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
…AndVerifyResult` to skip check results

Closes apache#46316 from cxzl25/SPARK-48070.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
… metrics resulting in potentially inaccurate data

Closes apache#46273 from cxzl25/SPARK-48037.

Authored-by: sychen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>