Skip to content

[Data] Demote Sort from requiring preserve_order#60555

Merged
alexeykudinkin merged 2 commits intomasterfrom
ak/srt-prsv-ord-fix
Jan 28, 2026
Merged

[Data] Demote Sort from requiring preserve_order#60555
alexeykudinkin merged 2 commits intomasterfrom
ak/srt-prsv-ord-fix

Conversation

@alexeykudinkin
Copy link
Copy Markdown
Contributor

@alexeykudinkin alexeykudinkin commented Jan 28, 2026

Description

This removes the requirement for pipelines having Sort operations to actually require preserve_order=True.

This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established.

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@alexeykudinkin alexeykudinkin requested a review from a team as a code owner January 28, 2026 01:11
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request successfully removes Sort from the list of logical operators that implicitly require order preservation. This change aligns with the stated goal of demoting Sort from always enforcing preserve_order=True in the execution plan. The modifications are clear and directly implement the described intent.

Copy link
Copy Markdown
Contributor

@iamjustinhsu iamjustinhsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do u know why Sort was included originally?

@alexeykudinkin alexeykudinkin enabled auto-merge (squash) January 28, 2026 01:22
@github-actions github-actions bot added the go add ONLY when ready to merge, run all tests label Jan 28, 2026
@ray-gardener ray-gardener bot added the data Ray Data-related issues label Jan 28, 2026
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
@github-actions github-actions bot disabled auto-merge January 28, 2026 02:41
@alexeykudinkin alexeykudinkin enabled auto-merge (squash) January 28, 2026 02:41
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.


for op in self._logical_plan.dag.post_order_iter():
if isinstance(op, (Zip, Sort)):
if isinstance(op, Zip):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort output may return blocks out of order

Medium Severity

Removing Sort from require_preserve_order() can cause sorted data to be returned out of order when downstream MapOperators process blocks in parallel. The Sort operation establishes global ordering (block 0 has smallest values, block 1 has next range, etc.), but without preserve_order=True, downstream operators use FIFOBundleQueue which returns blocks in completion order rather than input order. If a later block completes before an earlier one, the sorted order is broken.

Fix in Cursor Fix in Web

@alexeykudinkin alexeykudinkin merged commit 6a90e08 into master Jan 28, 2026
7 checks passed
@alexeykudinkin alexeykudinkin deleted the ak/srt-prsv-ord-fix branch January 28, 2026 03:49
jinbum-kim pushed a commit to jinbum-kim/ray that referenced this pull request Jan 29, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: jinbum-kim <jinbum9958@gmail.com>
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Jan 29, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
400Ping pushed a commit to 400Ping/ray that referenced this pull request Feb 1, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: 400Ping <jiekaichang@apache.org>
ans9868 pushed a commit to ans9868/ray that referenced this pull request Feb 18, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Adel Nour <ans9868@nyu.edu>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
## Description

This removes the requirement for pipelines having `Sort` operations to
actually require `preserve_order=True`.

This is an unnecessary strict requirement that has adversarial
side-effects, and is strictly not required as there's no global ordering
between the blocks established.

## Related issues
> Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to
ray-project#1234".

## Additional information
> Optional: Add implementation details, API changes, usage examples,
screenshots, etc.

---------

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data Ray Data-related issues go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ray fails to serialize self-reference objects

2 participants