[data] Add more shuffle fusion rules by iamjustinhsu · Pull Request #59985 · ray-project/ray

iamjustinhsu · 2026-01-09T00:03:21Z

Description

Add redundant shuffle fusion rules by dropping the 1st shuffle

Repartition -> Aggregate
StreamingRepartition -> Repartition
Repartition -> StreamingRepartition
Sort -> Sort
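
A minimal sketch of what "dropping the 1st shuffle" means, using toy stand-in classes rather than Ray Data's actual logical operators. `Op`, `FUSABLE_PAIRS`, `drop_redundant_shuffles`, and `names` are hypothetical names used only for illustration. The Repartition -> StreamingRepartition pair is left out because it was removed later in review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Op:
    """Toy logical operator: a name plus at most one upstream input."""

    name: str
    input: Optional["Op"] = None


# (upstream, downstream) pairs where the upstream shuffle is treated as redundant.
FUSABLE_PAIRS = {
    ("Repartition", "Aggregate"),
    ("StreamingRepartition", "Repartition"),
    ("Sort", "Sort"),
}


def drop_redundant_shuffles(op: Op) -> Op:
    """Rewrite the chain bottom-up, skipping any upstream op that forms a fusable pair."""
    if op.input is None:
        return op
    upstream = drop_redundant_shuffles(op.input)
    # Skip the upstream op (never a source) while it pairs up with the current op.
    while upstream.input is not None and (upstream.name, op.name) in FUSABLE_PAIRS:
        upstream = upstream.input
    return Op(op.name, upstream)


def names(op: Optional[Op]) -> List[str]:
    """Return operator names from source to sink."""
    out = []
    while op is not None:
        out.append(op.name)
        op = op.input
    return out[::-1]


plan = Op("Aggregate", Op("Repartition", Op("Read")))
optimized = drop_redundant_shuffles(plan)
print(names(plan), "->", names(optimized))
```

In this toy model the downstream operator is always rebuilt on top of the upstream operator's input, so its own parameters are the ones that survive.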

Related issues

None

Additional information

None

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

gemini-code-assist

Code Review

This pull request introduces new fusion rules to optimize redundant shuffles, specifically for Repartition -> Aggregate, StreamingRepartition -> Repartition, and Sort -> Sort operator pairs. The changes involve renaming CombineRepartitions to CombineShuffles and adding the logic for the new fusion rules.

My review found a critical issue in the implementation of the new fusion rules. The use of cp.copy(op) does not correctly modify the operator graph to remove the redundant shuffle. I've provided a detailed comment with a suggested fix to correctly implement the operator fusion by creating new operators with updated input dependencies. Addressing this is crucial for the feature to work as intended.
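
To illustrate the concern, here is a small sketch with a toy `Op` class (hypothetical, not Ray Data's `LogicalOperator`): a shallow copy of the downstream operator still points at the redundant upstream operator, whereas rebuilding it on the upstream operator's input removes that operator from the chain.

```python
import copy


class Op:
    """Toy operator that records its upstream inputs, loosely mirroring input_dependencies."""

    def __init__(self, name, input_op=None):
        self.name = name
        self.input_dependencies = [input_op] if input_op is not None else []


read = Op("Read")
repartition = Op("Repartition", read)
aggregate = Op("Aggregate", repartition)

# A shallow copy shares the same input_dependencies list, so the
# Repartition is still upstream of the copied Aggregate.
copied = copy.copy(aggregate)

# Rebuilding the operator on top of the Repartition's own input drops it.
rebuilt = Op("Aggregate", repartition.input_dependencies[0])

print(copied.input_dependencies[0].name)   # upstream of the copy
print(rebuilt.input_dependencies[0].name)  # upstream of the rebuilt op
```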

python/ray/data/_internal/logical/rules/combine_repartitions.py


iamjustinhsu · 2026-01-09T00:26:53Z

python/ray/data/_internal/logical/rules/combine_shuffles.py

+            elif isinstance(input_op, Repartition) and isinstance(op, Aggregate):
+                return Aggregate(
+                    input_op=input_op.input_dependencies[0],
+                    key=op._key,
+                    aggs=op._aggs,
+                    num_partitions=op._num_partitions,
+                    batch_format=op._batch_format,
+                )
+            elif isinstance(input_op, StreamingRepartition) and isinstance(
+                op, Repartition
+            ):
+                return Repartition(
+                    input_op.input_dependencies[0],
+                    num_outputs=op._num_outputs,
+                    shuffle=op._shuffle,
+                    keys=op._keys,
+                    sort=op._sort,
+                )
+            elif isinstance(input_op, Sort) and isinstance(op, Sort):
+                return Sort(
+                    input_op.input_dependencies[0],
+                    sort_key=op._sort_key,
+                    batch_format=op._batch_format,
+                )
+
+            return op


These were the 2 new rules that got added. I just renamed the file from combine_repartitions to combine_shuffles.

alexeykudinkin · 2026-01-13T00:15:28Z

python/ray/data/_internal/logical/rules/combine_shuffles.py

+                    target_num_rows_per_block=op.target_num_rows_per_block,
+                )
+            elif isinstance(input_op, Repartition) and isinstance(op, Aggregate):
+                return Aggregate(


We can just return op, right?

Actually, no: we'd inherit num_partitions from Repartition.

> We can just return op, right?

Returning op as-is would keep the original input dependency, and we want to get rid of the Repartition.

alexeykudinkin · 2026-01-13T00:16:47Z

python/ray/data/_internal/logical/rules/combine_shuffles.py

+            if isinstance(input_op, Repartition) and isinstance(op, Repartition):
+                shuffle = input_op._shuffle or op._shuffle
+                return Repartition(
+                    input_op.input_dependencies[0],
+                    num_outputs=op._num_outputs,
+                    shuffle=shuffle,
+                    keys=op._keys,
+                    sort=op._sort,
+                )
+            elif isinstance(input_op, StreamingRepartition) and isinstance(
+                op, StreamingRepartition
+            ):
+                return StreamingRepartition(
+                    input_op.input_dependencies[0],
+                    target_num_rows_per_block=op.target_num_rows_per_block,
+                )


We should really unify the Repartition and StreamingRepartition logical ops.

Hmm, yes, I somewhat agree, although I don't know the original intention behind splitting them up (different physical operators?). Anyway, I'd prefer to do that refactor later since it's unrelated to this PR.

alexeykudinkin · 2026-01-13T00:17:13Z

python/ray/data/_internal/logical/rules/combine_shuffles.py

+            elif isinstance(input_op, StreamingRepartition) and isinstance(
+                op, Repartition
+            ):


We should also fuse the other way around (Repartition -> StreamingRepartition).

alexeykudinkin · 2026-01-13T00:20:45Z

python/ray/data/tests/test_operator_fusion.py

+    assert "Repartition[Repartition]" in logical_plan
+    assert "Aggregate[Aggregate]" in logical_plan
+    # Check that in the Logical Plan (Optimized), Repartition is removed (combined)
+    optimized_logical = captured.split("-------- Logical Plan (Optimized) --------")[
+        1
+    ].split("-------- Physical Plan --------")[0]
+    assert "Repartition[Repartition]" not in optimized_logical
+    assert "Aggregate[Aggregate]" in optimized_logical


These kinds of asserts are not robust:

- We want to assert not only the presence of ops but also the relationship between them.
- Ideally, assertions should look like logical == Aggregate(Repartition(Any), ...), but that would require quite a bit of work on our operators to make possible.

For now, just assert that a (textual) sub-plan is present in the DAG.
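
A rough sketch of the textual sub-plan assertion suggested above. The section headers come from the test in this PR; the tree layout inside `captured` (the `+-` prefixes and `Read[ReadRange]`) and the `section` helper are assumptions made for illustration, so the expected strings would need to match whatever the real plan printer emits.

```python
# Hypothetical captured output; only the section headers are taken from the test.
captured = """\
-------- Logical Plan --------
Aggregate[Aggregate]
+- Repartition[Repartition]
   +- Read[ReadRange]
-------- Logical Plan (Optimized) --------
Aggregate[Aggregate]
+- Read[ReadRange]
-------- Physical Plan --------
...
"""


def section(text: str, start: str, end: str) -> str:
    """Return the text between two section headers."""
    return text.split(start, 1)[1].split(end, 1)[0]


logical = section(
    captured,
    "-------- Logical Plan --------",
    "-------- Logical Plan (Optimized) --------",
)
optimized = section(
    captured,
    "-------- Logical Plan (Optimized) --------",
    "-------- Physical Plan --------",
)

# Assert on adjacent lines so the parent/child relationship is checked,
# not just the presence of each operator somewhere in the plan.
assert "Aggregate[Aggregate]\n+- Repartition[Repartition]" in logical
assert "Aggregate[Aggregate]\n+- Read[ReadRange]" in optimized
```

Matching a multi-line substring is still brittle with respect to formatting changes, but it fails if the operators appear in the wrong order or with another operator between them.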

alexeykudinkin · 2026-01-14T21:08:50Z

python/ray/data/_internal/logical/rules/combine_shuffles.py

+        elif isinstance(input_op, Repartition) and isinstance(op, StreamingRepartition):
+            return StreamingRepartition(
+                input_op.input_dependencies[0],
+                target_num_rows_per_block=op._target_num_rows_per_block,
+            )


Now that I'm thinking more about it, I don't think we can do that, as we'd get substantially different output in that case.

With Repartition -> SR, the Repartition will shuffle by key, so the data ends up nicely clustered; we can't really replace that pair with just SR.
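
A toy model of why dropping the Repartition changes the result. `repartition_by_key` and `rebatch` are hypothetical stand-ins written for this example, not Ray Data's implementations: the first groups rows by key into output blocks, the second only re-chunks rows to a target block size in arrival order.

```python
def repartition_by_key(rows, key, num_outputs):
    """Toy key-based shuffle: rows with the same key land in the same block."""
    parts = [[] for _ in range(num_outputs)]
    for row in rows:
        parts[row[key] % num_outputs].append(row)
    return parts


def rebatch(blocks, target_num_rows_per_block):
    """Toy row-count rebatching: re-chunk rows in order, ignoring keys."""
    flat = [row for block in blocks for row in block]
    n = target_num_rows_per_block
    return [flat[i : i + n] for i in range(0, len(flat), n)]


def keys_per_block(blocks):
    """List the distinct keys present in each block."""
    return [sorted({row["k"] for row in block}) for block in blocks]


rows = [{"k": i % 2, "v": i} for i in range(6)]

# Repartition by key, then rebatch: each block holds a single key.
with_shuffle = rebatch(repartition_by_key(rows, "k", 2), 3)

# Rebatch only (the Repartition was "fused away"): keys are mixed in every block.
without_shuffle = rebatch([rows], 3)

print(keys_per_block(with_shuffle))
print(keys_per_block(without_shuffle))
```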

[data] Add more shuffle fusion rules

098b606

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

gemini-code-assist bot reviewed Jan 9, 2026

python/ray/data/_internal/logical/rules/combine_repartitions.py (outdated)

add tests

d81b9c1

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

iamjustinhsu commented Jan 9, 2026

iamjustinhsu added 2 commits January 8, 2026 16:27

remove rando test

c4204e0

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

this one too

ff60f1e

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

iamjustinhsu marked this pull request as ready for review January 9, 2026 00:28

iamjustinhsu requested a review from a team as a code owner January 9, 2026 00:28

cursor bot reviewed Jan 9, 2026

python/ray/data/_internal/logical/rules/combine_shuffles.py (outdated)

ray-gardener bot added the "performance" and "data" (Ray Data-related issues) labels Jan 9, 2026

alexeykudinkin reviewed Jan 13, 2026

iamjustinhsu added 3 commits January 12, 2026 16:58

refactor + another streaming repartition rule

893e492

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

Merge branch 'master' of https://github.com/ray-project/ray into jhsu…

1654c6d

…/more-shuffle-fusion-rules

make it more robust

18aed4f

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

cursor bot reviewed Jan 13, 2026

python/ray/data/_internal/logical/rules/combine_shuffles.py (outdated)

add one more test

3891bd6

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

alexeykudinkin approved these changes Jan 14, 2026

remove repartition -> SR

94c8585

Signed-off-by: iamjustinhsu <jhsu@anyscale.com>

iamjustinhsu added the "go" label (add ONLY when ready to merge, run all tests) Jan 15, 2026

alexeykudinkin merged commit bfa6f0b into ray-project:master Jan 15, 2026
7 checks passed

alexeykudinkin merged 9 commits into ray-project:master from iamjustinhsu:jhsu/more-shuffle-fusion-rules
