[Autoscaler] Add bundle_label_selector to request_resources sdk#54843

Merged
jjyao merged 15 commits into ray-project:master from ryanaoleary:request-resources-labels
Oct 15, 2025

Conversation

@ryanaoleary
Contributor

@ryanaoleary ryanaoleary commented Jul 22, 2025

Why are these changes needed?

This change adds a bundle_label_selector argument to the request_resources SDK command for the v2 Ray autoscaler. This command is used by several Ray libraries. The bundle_label_selector argument is a list parallel to the user-specified bundles of resource shapes: the selector at each index is applied to the bundle at the same index. These label selectors are passed to the repeated LabelSelector label_selectors field in the ResourceRequest message that gets built by RequestClusterResourceConstraint.
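
As a rough illustration of the parallel-list contract described above, the sketch below pairs each bundle with the selector at the same index and rejects mismatched lengths. It does not call Ray: pair_bundles_with_selectors is a hypothetical helper, and the "region" label key is made up for the example. The output shape mirrors the {"resources": ..., "label_selector": ...} format mentioned elsewhere in this PR.

```python
from typing import Dict, List, Optional

Bundle = Dict[str, float]
Selector = Dict[str, str]


def pair_bundles_with_selectors(
    bundles: List[Bundle],
    bundle_label_selector: Optional[List[Selector]] = None,
) -> List[dict]:
    """Hypothetical helper: zip each bundle with the selector at its index."""
    if bundle_label_selector is None:
        # No selectors supplied: every bundle gets an empty selector.
        bundle_label_selector = [{} for _ in bundles]
    if len(bundle_label_selector) != len(bundles):
        raise ValueError(
            "bundle_label_selector must have one entry per bundle: got "
            f"{len(bundle_label_selector)} selectors for {len(bundles)} bundles"
        )
    return [
        {"resources": dict(bundle), "label_selector": dict(selector)}
        for bundle, selector in zip(bundles, bundle_label_selector)
    ]


# The "region" label key is illustrative only.
requests = pair_bundles_with_selectors(
    bundles=[{"CPU": 1}, {"GPU": 1}],
    bundle_label_selector=[{}, {"region": "us-west"}],
)
```

Here the first bundle carries no label constraint, while the second would only be satisfied by a node whose labels match the selector.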

This change depends on #53578.

Related issue number

Contributes to #51564

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Note

Adds per-bundle label selectors to resource requests and wires them through Python SDK, autoscaler v2, Cython GCS client, and GCS RPC, with tests.

  • Autoscaler SDK/API:
    • python/ray/autoscaler/sdk/sdk.py: Extend request_resources with bundle_label_selectors; validate types/length and label syntax; update docs/examples.
    • python/ray/autoscaler/_private/commands.py: Emit requests as {"resources": ..., "label_selector": ...}; support selectors per bundle; forward to v2 SDK.
    • python/ray/autoscaler/v2/sdk.py: Accept per-bundle label_selector, normalize legacy format, aggregate by (resources, selector), and forward selectors to GCS.
  • GCS client & RPC plumbing:
    • python/ray/includes/common.pxd, python/ray/includes/gcs_client.pxi: Add label_selectors param to request_cluster_resource_constraint.
    • src/ray/gcs_rpc_client/accessor.{h,cc}: Method signature updated to take selectors; build proto LabelSelector per bundle.
  • Label selector utils:
    • src/ray/common/scheduling/label_selector.{h,cc}: Add generic ctor from map and keep ToProto; hashing/equality unchanged.
  • Tests:
    • python/ray/autoscaler/v2/tests/test_sdk.py: New test verifying per-bundle label selectors and operators; imports LabelSelectorOperator.
    • python/ray/tests/test_multi_node_2.py: Adjust expected resource_requests format to include resources and empty label_selector.

Written by Cursor Bugbot for commit 4409f1d. This will update automatically on new commits. Configure here.
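
The summary above says the v2 SDK aggregates requests by (resources, selector). A minimal sketch of that idea follows, assuming each request is a dict with "resources" and "label_selector" keys; aggregate_requests is a hypothetical name, not the function in python/ray/autoscaler/v2/sdk.py.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A hashable key: sorted resource items plus sorted selector items.
Key = Tuple[Tuple[Tuple[str, float], ...], Tuple[Tuple[str, str], ...]]


def aggregate_requests(to_request: List[dict]) -> Dict[Key, int]:
    """Hypothetical helper: count identical (resources, selector) pairs."""
    counts: Counter = Counter()
    for request in to_request:
        key = (
            tuple(sorted(request["resources"].items())),
            tuple(sorted(request.get("label_selector", {}).items())),
        )
        counts[key] += 1
    return dict(counts)


counts = aggregate_requests(
    [
        {"resources": {"CPU": 1}, "label_selector": {"region": "us-west"}},
        {"resources": {"CPU": 1}, "label_selector": {"region": "us-west"}},
        {"resources": {"CPU": 1}, "label_selector": {}},
    ]
)
```

Including the selector in the key matters: two bundles with the same resource shape but different selectors are different demands and should not be merged into one count.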

@ryanaoleary ryanaoleary requested a review from a team as a code owner July 22, 2025 22:30
@ryanaoleary
Contributor Author

cc: @MengjinYan

@ray-gardener ray-gardener bot added the community-contribution (Contributed by the community), docs (An issue or change related to documentation), and core (Issues that should be addressed in Ray Core) labels Jul 23, 2025
@github-actions

github-actions bot commented Aug 7, 2025

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale label (The issue is stale. It will be closed within 7 days unless there are further conversation) Aug 7, 2025
@edoakes
Collaborator

edoakes commented Aug 7, 2025

Tons of merge conflicts @ryanaoleary

@github-actions github-actions bot added the unstale label (A PR that has been marked unstale. It will not get marked stale again if this label is on it.) and removed the stale label (The issue is stale. It will be closed within 7 days unless there are further conversation) Aug 8, 2025
@ryanaoleary ryanaoleary force-pushed the request-resources-labels branch 3 times, most recently from 83d8bcc to a0199b0 Compare August 11, 2025 14:36
@ryanaoleary
Contributor Author

Rebased and re-tested, should all be fixed now cc: @edoakes @MengjinYan

@edoakes
Collaborator

edoakes commented Aug 15, 2025

ping @MengjinYan

@MengjinYan MengjinYan self-assigned this Sep 9, 2025
@jjyao jjyao requested a review from MengjinYan September 16, 2025 23:48
@jjyao
Contributor

jjyao commented Sep 16, 2025

You need to rebase now.

@ryanaoleary ryanaoleary force-pushed the request-resources-labels branch from 912f181 to a88a6a2 Compare September 18, 2025 20:21

@ryanaoleary
Contributor Author

cc: @MengjinYan fixed merge conflicts and other comments

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
@ryanaoleary ryanaoleary requested a review from jjyao September 29, 2025 22:41
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
ryanaoleary and others added 2 commits October 3, 2025 04:29
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Contributor

@jjyao jjyao left a comment

LG

Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
@cursor cursor bot left a comment

Bug: Autoscaler v2 Missing Label Selectors

When autoscaler v2 is enabled, the request_resources function calls request_cluster_resources without passing the bundle_label_selectors. This means the v2 autoscaler won't receive label selector information, potentially causing a TypeError or incorrect resource provisioning.

python/ray/autoscaler/_private/commands.py#L234-L236

gcs_address = internal_kv_get_gcs_client().address
request_cluster_resources(gcs_address, to_request)


@ryanaoleary
Contributor Author

Bug: Autoscaler v2 Missing Label Selectors

When autoscaler v2 is enabled, the request_resources function calls request_cluster_resources without passing the bundle_label_selectors. This means the v2 autoscaler won't receive label selector information, potentially causing a TypeError or incorrect resource provisioning.

python/ray/autoscaler/_private/commands.py#L234-L236

gcs_address = internal_kv_get_gcs_client().address
request_cluster_resources(gcs_address, to_request)

This comment is invalid: the label selectors are part of to_request, and the tests run with the v2 autoscaler.
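
To illustrate the point in this reply, the sketch below shows how selectors can travel inside to_request itself, so the callee needs no separate selectors argument. It is a stand-in under stated assumptions, not the code in commands.py: normalize_requests and selectors_from are hypothetical names, and the legacy entry format (a bare resource dict) is inferred from the test change noted in the summary.

```python
from typing import List


def normalize_requests(to_request: List[dict]) -> List[dict]:
    """Hypothetical helper: bring every entry to the two-key request shape."""
    normalized = []
    for entry in to_request:
        if "resources" in entry:
            # New format: {"resources": {...}, "label_selector": {...}}.
            resources = entry["resources"]
            selector = entry.get("label_selector") or {}
        else:
            # Assumed legacy format: the entry is the resource dict itself.
            # (A resource literally named "resources" would be ambiguous.)
            resources, selector = entry, {}
        normalized.append(
            {"resources": dict(resources), "label_selector": dict(selector)}
        )
    return normalized


def selectors_from(to_request: List[dict]) -> List[dict]:
    """Recover the per-bundle selectors from the request list alone."""
    return [request["label_selector"] for request in normalize_requests(to_request)]


to_request = [
    {"resources": {"CPU": 1}, "label_selector": {"region": "us-west"}},
    {"GPU": 1},  # assumed legacy entry with no selector
]
selectors = selectors_from(to_request)
```

Because each entry already carries its selector, a call that forwards only to_request still delivers the label information.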

@ryanaoleary ryanaoleary requested a review from jjyao October 7, 2025 10:47
@MengjinYan
Contributor

@jjyao Ping to merge

Each bundle is a dict of resource name to resource quantity, e.g:
[{"CPU": 1}, {"GPU": 1}].
to_request: A list of resource requests to request the cluster to have.
Each resource request is a tuple of resources and a label_selector
Contributor

Each resource request is a tuple

It's a dict not tuple?

Contributor Author

I think it was previously a dict, but we changed it to be a tuple of two dicts (resources and labels) here:

class ResourceRequest(NamedTuple):
.
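
For readers following the dict-versus-tuple question, here is a small sketch of what "a tuple of two dicts" can look like with typing.NamedTuple. Only the class name and its NamedTuple base come from the thread; the field names resources and label_selector are assumptions and may differ from the actual definition in Ray.

```python
from typing import Dict, NamedTuple


class ResourceRequest(NamedTuple):
    # Field names are assumed for illustration.
    resources: Dict[str, float]
    label_selector: Dict[str, str]


request = ResourceRequest(
    resources={"GPU": 1},
    label_selector={"region": "us-west"},  # illustrative label key
)

# A NamedTuple is a real tuple, so positional unpacking works...
resources, label_selector = request

# ...and it can still be viewed as a mapping of field name to value.
as_mapping = request._asdict()
```

This is why both descriptions can feel right: the object is a tuple, but it converts to a dict-like shape with "resources" and "label_selector" keys.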

@jjyao jjyao merged commit 4036252 into ray-project:master Oct 15, 2025
6 checks passed
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…project#54843)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 22, 2025
…project#54843)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: xgui <xgui@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Oct 23, 2025
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…project#54843)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…project#54843)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…project#54843)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>

Labels

  • core: Issues that should be addressed in Ray Core
  • docs: An issue or change related to documentation
  • go: add ONLY when ready to merge, run all tests
  • unstale: A PR that has been marked unstale. It will not get marked stale again if this label is on it.

4 participants