[GLUTEN-11088][VL] Fix the Spark4.0 storage partition join by jinchengchenghh · Pull Request #11184 · apache/gluten

jinchengchenghh · 2025-11-25T12:35:15Z

Related issue: #11088

github-actions · 2025-11-25T12:35:45Z

Run Gluten Clickhouse CI on x86

jinchengchenghh · 2025-11-27T11:18:25Z

shims/spark40/src/main/scala/org/apache/gluten/sql/shims/spark40/Spark40Shims.scala

      applyPartialClustering: Boolean,
      replicatePartitions: Boolean,
      joinKeyPositions: Option[Seq[Int]] = None): Seq[Seq[InputPartition]] = {
+    val original = batchScan.asInstanceOf[BatchScanExecShim]


@beliefer Could you help refactor the orderPartitions function? It is mostly copied from BatchScanExec::inputRDD, and inputPartitionsShim can probably be replaced by inputPartitions. Thanks!
https://github.com/apache/incubator-gluten/blob/49389cd05ea07356f71bfdfe660410604c1461ea/shims/spark40/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AbstractBatchScanExec.scala#L60
https://github.com/apache/incubator-gluten/blob/d636fa77c49e991eb02159a0c25431eb499c6da2/shims/spark40/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AbstractBatchScanExec.scala#L142

@jinchengchenghh would you please create an issue to track this?

Created #11207

jinchengchenghh · 2025-11-27T11:22:51Z

...park40/src/test/scala/org/apache/spark/sql/connector/GlutenKeyGroupedPartitioningSuite.scala

    }.flatMap(smj => collect(smj) { case s: ColumnarShuffleExchangeExec => s })
  }
+
+  private def collectShuffles(plan: SparkPlan): Seq[ShuffleExchangeLike] = {


I will create a PR in apache/Spark to change this function from private to protected, so that we only need to override it to check the plan.

github-actions bot added the CORE works for Gluten Core label Nov 25, 2025

jinchengchenghh force-pushed the batchScan branch from fcf56d4 to 8594ab9 Compare November 25, 2025 16:05

jinchengchenghh added 3 commits November 26, 2025 10:04

fix SPJ 8af4783

fix test e3fef9f

fix 381591e

jinchengchenghh force-pushed the batchScan branch from b28e232 to 381591e Compare November 26, 2025 10:04

fallback keygroup partitioning be55764

fix e82201d

fix code style 93a3bb2

fix d28749f

fix code style 1cccaa6

jinchengchenghh requested a review from zhouyuan November 27, 2025 10:54

zhouyuan approved these changes Nov 27, 2025

jinchengchenghh merged commit d636fa7 into apache:main Nov 27, 2025
105 of 107 checks passed

This was referenced Nov 27, 2025

[VL] Track on Spark-4.0 failed unit tests #11088

Open

[CORE] Refactor SparkShims.orderPartitions and AbstractBatchScanExec.inputRDD #11207

Open
