[RAS] GroupLeafExec does not preserve outputPartitioning #11468
Description
GroupLeafExec in RAS (relational algebra selector) always returns UnknownPartitioning for outputPartitioning, which causes incorrect behavior when spark.sql.unionOutputPartitioning=true (the default in Spark 4.1).
Background
Spark 4.1 introduced apache/spark#51623, which allows UnionExec to preserve child partitioning. When all children have identical partitioning, Spark's optimizer trusts this information and may omit downstream Exchange operators.
However, in RAS, GroupLeafExec does not preserve the outputPartitioning from its wrapped plan, always returning UnknownPartitioning. This breaks the partitioning contract and can lead to incorrect query results.
Workaround
Disable the feature by setting spark.sql.unionOutputPartitioning=false when using RAS.
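For reference, one way to apply the workaround is to set the flag when the session is built. This is a minimal sketch: the object name and application name are placeholders, and any RAS-related settings are assumed to be configured elsewhere.

```scala
import org.apache.spark.sql.SparkSession

object UnionPartitioningWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ras-union-workaround") // placeholder name
      // Turn off partitioning propagation through UnionExec, as suggested above.
      .config("spark.sql.unionOutputPartitioning", "false")
      .getOrCreate()

    // Confirm the value the session actually sees.
    println(spark.conf.get("spark.sql.unionOutputPartitioning"))
    spark.stop()
  }
}
```

The same key can also be passed on the command line with `--conf spark.sql.unionOutputPartitioning=false`.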
Proposed Solution
GroupLeafExec should override outputPartitioning to return the correct partitioning from the underlying plan.
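To illustrate the shape of the change, here is a self-contained sketch that uses stand-in types instead of the real Spark and RAS classes, so it compiles without Spark on the classpath. Everything in it is an assumption made for illustration: `PlanNode` mimics the `SparkPlan` default of `UnknownPartitioning(0)`, `HashPartitioning` takes plain strings rather than expressions, and the `wrappedPartitioning` field is hypothetical. How the real GroupLeafExec obtains the partitioning of its underlying plan is not specified in this issue.

```scala
// Stand-ins for org.apache.spark.sql.catalyst.plans.physical.Partitioning
// and two of its implementations (simplified for illustration).
sealed trait Partitioning { def numPartitions: Int }
final case class UnknownPartitioning(numPartitions: Int) extends Partitioning
final case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning

// Stand-in for SparkPlan: the default reports no usable partitioning.
trait PlanNode {
  def outputPartitioning: Partitioning = UnknownPartitioning(0)
}

// Behavior described in the issue: the leaf inherits the default,
// so the partitioning of the wrapped plan is lost.
final case class GroupLeafBefore(wrappedPartitioning: Partitioning) extends PlanNode

// Proposed behavior: forward the partitioning of the underlying plan.
// `wrappedPartitioning` is a hypothetical field, not the real constructor.
final case class GroupLeafAfter(wrappedPartitioning: Partitioning) extends PlanNode {
  override def outputPartitioning: Partitioning = wrappedPartitioning
}

object GroupLeafSketch {
  def main(args: Array[String]): Unit = {
    val childPartitioning = HashPartitioning(Seq("k"), 8)
    println(GroupLeafBefore(childPartitioning).outputPartitioning)
    println(GroupLeafAfter(childPartitioning).outputPartitioning)
  }
}
```

With the override in place, a parent such as UnionExec sees the same partitioning on the group leaf that the underlying plan reports, instead of the default.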