[Data] Revising resource allocator task scheduling decision to factor in pending task outputs by alexeykudinkin · Pull Request #60639 · ray-project/ray

alexeykudinkin · 2026-01-31T19:31:58Z

Description

This change reverts to the earlier behavior, in which:

1. `can_submit_new_task` still decides whether to schedule a task based on the estimated size of task outputs pending in the buffer.
2. The estimate of pending task outputs falls back to `target_max_block_size` until an actual estimate becomes available.
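
The two points above can be illustrated with a minimal sketch. It is not the code from this PR: `OpStats`, `estimate_pending_output_bytes`, and the `object_store_budget_bytes` parameter are hypothetical names chosen for illustration, and the real `can_submit_new_task` in Ray Data likely takes different arguments. The sketch only assumes that an observed per-task output estimate may be missing at startup and that `target_max_block_size` stands in for it until then.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpStats:
    """Hypothetical per-operator stats; not Ray's actual metrics class."""

    # Observed average output size per task, in bytes.
    # None until at least one task has produced output.
    avg_task_output_bytes: Optional[float] = None


def estimate_pending_output_bytes(stats: OpStats, target_max_block_size: int) -> float:
    """Return the observed estimate if present, else fall back to the block-size target."""
    if stats.avg_task_output_bytes is not None:
        return stats.avg_task_output_bytes
    return float(target_max_block_size)


def can_submit_new_task(
    stats: OpStats,
    target_max_block_size: int,
    object_store_budget_bytes: float,
) -> bool:
    """Allow submission only if the budget can hold one more task's estimated output."""
    needed = estimate_pending_output_bytes(stats, target_max_block_size)
    return object_store_budget_bytes >= needed
```

With no history, a 128-byte block-size target and a 100-byte budget would block submission under this sketch; once an observed estimate of 50 bytes is available, the same budget would allow it.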

gemini-code-assist

Code Review

This pull request revises the resource allocator's task scheduling decision to be more robust, especially concerning object store memory. It introduces a fallback mechanism to estimate the size of pending task outputs when no historical data is available, using target_max_block_size. This estimate is then used to check if there is sufficient object store memory budget before submitting a new task.

The overall logic is sound and improves scheduling decisions. However, I found a critical issue in resource_manager.py where self._metrics is used instead of op.metrics, which will lead to an AttributeError.
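
The kind of bug described in the review can be reproduced in isolation. The following is a hedged sketch, not the contents of `resource_manager.py`: `FakeMetrics`, `FakeOp`, `FakeResourceManager`, and `avg_output_bytes` are hypothetical stand-ins. It only assumes what the review states, namely that metrics live on the operator (`op.metrics`) and that the manager has no `_metrics` attribute of its own.

```python
class FakeMetrics:
    """Hypothetical metrics holder attached to an operator."""

    def __init__(self, avg_output_bytes: int) -> None:
        self.avg_output_bytes = avg_output_bytes


class FakeOp:
    """Hypothetical operator that owns its metrics."""

    def __init__(self, metrics: FakeMetrics) -> None:
        self.metrics = metrics


class FakeResourceManager:
    """Hypothetical manager; note it never defines `self._metrics`."""

    def buggy_estimate(self, op: FakeOp) -> int:
        # Raises AttributeError: the manager has no `_metrics` attribute.
        return self._metrics.avg_output_bytes

    def fixed_estimate(self, op: FakeOp) -> int:
        # Reads the metrics from the operator that was passed in.
        return op.metrics.avg_output_bytes
```

Calling `buggy_estimate` fails at attribute lookup regardless of the operator passed in, which is why such a mistake surfaces only when that code path actually runs.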

python/ray/data/_internal/execution/resource_manager.py

.buildkite/_images.rayci.yml

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

python/ray/data/_internal/execution/resource_manager.py

alexeykudinkin requested a review from a team as a code owner January 31, 2026 19:31

gemini-code-assist bot reviewed Jan 31, 2026

python/ray/data/_internal/execution/resource_manager.py (outdated; resolved)

alexeykudinkin added the "go" label (add ONLY when ready to merge, run all tests) Jan 31, 2026

cursor bot reviewed Jan 31, 2026

python/ray/data/_internal/execution/resource_manager.py (outdated; resolved)

.buildkite/_images.rayci.yml (outdated; resolved)

alexeykudinkin force-pushed the ak/res-allc-fup-1 branch from 1cf767c to 33cf32d Compare January 31, 2026 19:42

cursor bot reviewed Jan 31, 2026

python/ray/data/_internal/execution/resource_manager.py (outdated; resolved)

ray-gardener bot added the "data" label (Ray Data-related issues) Feb 1, 2026

alexeykudinkin changed the title ~~[WIP][Data] Revising resource allocator task scheduling decision~~ [Data] Revising resource allocator task scheduling decision to factor in pending task outputs Feb 2, 2026

bveeramani approved these changes Feb 3, 2026

alexeykudinkin enabled auto-merge (squash) February 3, 2026 00:56

alexeykudinkin added 5 commits February 2, 2026 17:57

Reverting obj_store_mem_max_pending_output_per_task to use `target_max_block_size` until estimate becomes available

5af63ea

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Updated can_submit_task to check whether there's enough object store budget to hold pending task outputs

a5c270b

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Fixing invalid ref

2714458

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

Fixed tests

2e0e16b

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

lint

4b89bcf

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

alexeykudinkin force-pushed the ak/res-allc-fup-1 branch from 076d055 to 4b89bcf Compare February 3, 2026 01:57

github-actions bot disabled auto-merge February 3, 2026 01:57

alexeykudinkin added 2 commits February 2, 2026 23:47

Updated test

0de4df8

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

lint

38b2393

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

alexeykudinkin enabled auto-merge (squash) February 3, 2026 07:55

Fixed conditional

dd26b34

Signed-off-by: Alexey Kudinkin <ak@anyscale.com>

github-actions bot disabled auto-merge February 3, 2026 07:59

alexeykudinkin merged commit 1315d95 into master Feb 3, 2026
6 checks passed

alexeykudinkin deleted the ak/res-allc-fup-1 branch February 3, 2026 17:24

alexeykudinkin merged 8 commits into master from ak/res-allc-fup-1

2 participants
