[Core][Autoscaler] Add labels to KubeRay autoscaling config #56532

Merged
jjyao merged 8 commits into ray-project:master from ryanaoleary:autoscaling-config-labels
Sep 16, 2025

@ryanaoleary
Contributor

@ryanaoleary ryanaoleary commented Sep 15, 2025

Why are these changes needed?

This PR adds a new labels field to the KubeRay autoscaler's autoscaling config, along with logic to detect labels from the worker group spec of a Ray CR. In KubeRay, Ray node labels are specified per worker group by passing them in rayStartParams, as follows:

workerGroupSpecs:
  - replicas: 0
    minReplicas: 0
    maxReplicas: 10
    groupName: a100-group
    rayStartParams:
      labels: "ray.io/availability-region=us-central2, ray.io/accelerator-type=A100"

This change is required for the Ray autoscaler to scale available node types using labels.
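
To make the format concrete, here is a minimal sketch of how a labels string like the one above could be turned into a key/value mapping. The function name parse_label_string is hypothetical and is not taken from this PR; trimming whitespace around each pair and raising ValueError on a malformed pair are assumptions.

```python
from typing import Dict


def parse_label_string(labels_str: str) -> Dict[str, str]:
    """Hypothetical helper: parse "k1=v1, k2=v2" into {"k1": "v1", "k2": "v2"}."""
    labels: Dict[str, str] = {}
    if not labels_str.strip():
        # No labels configured for this worker group.
        return labels
    for pair in labels_str.split(","):
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.strip().partition("=")
        if not sep or not key.strip():
            # Assumption: a pair without "=" or without a key is an error.
            raise ValueError(f"Malformed label {pair!r}; expected key=value")
        labels[key.strip()] = value.strip()
    return labels


if __name__ == "__main__":
    raw = "ray.io/availability-region=us-central2, ray.io/accelerator-type=A100"
    print(parse_label_string(raw))
```

Under these assumptions, the a100-group example would yield one entry per label key, which is the shape an autoscaler would need to match label requirements against node types.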

Related issue number

#51564

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
@ryanaoleary ryanaoleary requested a review from a team as a code owner September 15, 2025 11:08
@ryanaoleary
Contributor Author

cc: @MengjinYan

@ryanaoleary ryanaoleary changed the title from "[Autoscaler] Add labels to autoscaling config" to "[Core][Autoscaler] Add labels to autoscaling config" Sep 15, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for labels in the KubeRay autoscaler configuration by introducing a new labels field. This field is parsed from the rayStartParams of a worker group spec in the Ray CR. The implementation includes a new function for parsing these labels with error handling, and the changes are well-supported by new unit tests. My feedback focuses on making the exception handling more specific to improve code robustness.
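
As an illustration of the narrower exception handling suggested in this review, the sketch below reads labels from a worker group spec and catches only ValueError rather than a broad Exception. The helper name node_type_sketch, the output keys min_workers, max_workers, and labels, and the fallback to an empty mapping on malformed input are all assumptions for illustration, not the code merged in this PR.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def node_type_sketch(group_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical: build a node-type entry, including labels, from a group spec."""
    raw = (group_spec.get("rayStartParams") or {}).get("labels", "")
    labels: Dict[str, str] = {}
    try:
        # Skip empty segments so an unset or empty string yields no labels.
        for pair in filter(None, (p.strip() for p in raw.split(","))):
            key, value = pair.split("=", 1)  # ValueError if "=" is missing
            labels[key.strip()] = value.strip()
    except ValueError:
        # Catch only the failure expected from malformed input; anything else
        # (for example a non-string value) still propagates to the caller.
        logger.warning("Ignoring malformed labels %r", raw)
        labels = {}
    return {
        "min_workers": group_spec.get("minReplicas", 0),
        "max_workers": group_spec.get("maxReplicas", 0),
        "labels": labels,
    }
```

Catching a single exception type keeps unexpected bugs visible instead of silently turning them into an empty label set.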

@ryanaoleary ryanaoleary changed the title [Core][Autoscaler] Add labels to autoscaling config [Core][Autoscaler] Add labels to KubeRay autoscaling config Sep 15, 2025
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
@ray-gardener ray-gardener bot added the docs (An issue or change related to documentation), core (Issues that should be addressed in Ray Core), community-contribution (Contributed by the community), and kubernetes labels Sep 15, 2025
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Contributor

@MengjinYan MengjinYan left a comment

Nice catch! All are nit comments.

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
@MengjinYan MengjinYan added the go label (add ONLY when ready to merge, run all tests) Sep 16, 2025
ryanaoleary and others added 2 commits September 16, 2025 08:55
Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
@jjyao jjyao enabled auto-merge (squash) September 16, 2025 18:24
@github-actions github-actions bot disabled auto-merge September 16, 2025 20:09
@MengjinYan
Contributor

The Java test failure should be unrelated. cc: @jjyao

@jjyao jjyao merged commit f95c202 into ray-project:master Sep 16, 2025
3 of 5 checks passed
jmajety-dev pushed a commit to jmajety-dev/ray that referenced this pull request Sep 16, 2025
…oject#56532)

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Co-authored-by: Jiajun Yao <jeromeyjj@gmail.com>
ZacAttack pushed a commit to ZacAttack/ray that referenced this pull request Sep 24, 2025
…oject#56532)

Signed-off-by: zac <zac@anyscale.com>
marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…oject#56532)

Signed-off-by: Marco Stephan <marco@magic.dev>
dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
…oject#56532)

Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…oject#56532)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…oject#56532)

Labels

  • community-contribution: Contributed by the community
  • core: Issues that should be addressed in Ray Core
  • docs: An issue or change related to documentation
  • go: add ONLY when ready to merge, run all tests
  • kubernetes

3 participants