Skip to content

[Data] Prevent Limit from getting pushed past map_groups#60881

Merged
bveeramani merged 2 commits intomasterfrom
fix-udf-modifying-row-count-map-groups
Feb 9, 2026
Merged

[Data] Prevent Limit from getting pushed past map_groups#60881
bveeramani merged 2 commits intomasterfrom
fix-udf-modifying-row-count-map-groups

Conversation

@bveeramani
Copy link
Copy Markdown
Member

This PR updates map_groups to assume that the UDF might change the row count. This change is necessary to fix a bug where Limit gets incorrectly pushed past the map_groups (fixes #60872).

For more context, see:

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
@bveeramani bveeramani requested a review from a team as a code owner February 9, 2026 20:43
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly fixes a bug in map_groups where an incorrect optimization could be applied. By setting udf_modifying_row_count=True, it acknowledges that the user-defined function in map_groups can change the number of rows. This prevents the Limit operator from being incorrectly pushed down, ensuring correct behavior. The change is simple, targeted, and correct.

@gvspraveen
Copy link
Copy Markdown
Contributor

do we have existing test for this?

@bveeramani
Copy link
Copy Markdown
Member Author

do we have existing test for this?

Ah, good catch. Forgot to push my change with the test

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
@bveeramani bveeramani added the go add ONLY when ready to merge, run all tests label Feb 9, 2026
@bveeramani bveeramani enabled auto-merge (squash) February 9, 2026 21:52
@bveeramani bveeramani merged commit 7faa24f into master Feb 9, 2026
6 of 8 checks passed
@bveeramani bveeramani deleted the fix-udf-modifying-row-count-map-groups branch February 9, 2026 22:35
bveeramani added a commit that referenced this pull request Feb 9, 2026
This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
#60872).

For more context, see:
* #60448
* #60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
aslonnie added a commit that referenced this pull request Feb 10, 2026
…ups` (#60881) (#60893)

Cherry-pick of #60881

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Co-authored-by: Lonnie Liu <95255098+aslonnie@users.noreply.github.com>
preneond pushed a commit to preneond/ray that referenced this pull request Feb 15, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: Ondrej Prenek <ondra.prenek@gmail.com>
limarkdcunha pushed a commit to limarkdcunha/ray that referenced this pull request Feb 17, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
MuhammadSaif700 pushed a commit to MuhammadSaif700/ray that referenced this pull request Feb 17, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: Muhammad Saif <2024BBIT200@student.Uet.edu.pk>
Kunchd pushed a commit to Kunchd/ray that referenced this pull request Feb 17, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
ans9868 pushed a commit to ans9868/ray that referenced this pull request Feb 18, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: Adel Nour <ans9868@nyu.edu>
Aydin-ab pushed a commit to kunling-anyscale/ray that referenced this pull request Feb 20, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: peterxcli <peterxcli@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
…ject#60881)

This PR updates `map_groups` to assume that the UDF might change the row
count. This change is necessary to fix a bug where `Limit` gets
incorrectly pushed past the `map_groups` (fixes
ray-project#60872).

For more context, see:
* ray-project#60448
* ray-project#60756

---------

Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Data] Limit operator is applied before map_groups()

3 participants