[Core][Autoscaler] API for getting max cluster resources #49501

@matthewdeng

Description

Add a new API method to Ray that provides the max possible resources in the cluster, taking into account autoscaling.

This would complement the existing ray.cluster_resources (and ray.available_resources) functionality by considering not only the current cluster shape, but that of the fully upscaled cluster.

Example

Create a cluster with the following config:

head_node: {"CPU": 16}
worker_node[min=0, max=4]: {"CPU": 4, "GPU": 1}

The cluster will start with the head node and 0 worker nodes.

Desired behavior:

>>> ray.cluster_resources()
{"CPU": 16}
>>> ray.max_cluster_resources()
{"CPU": 32, "GPU": 4}

Use case

This would make it possible to programmatically check whether a resource request is feasible, without having to submit a scheduling request directly.

See #49372 as an example.
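
A minimal sketch of such a feasibility check, assuming the maximum resources are available as a plain dict shaped like the desired output above. `is_feasible` is a hypothetical helper, not an existing Ray function, and the dict is hard-coded here rather than fetched from a cluster.

```python
def is_feasible(request, max_resources):
    """Return True if every requested amount fits within the cluster maximum.

    A resource name missing from max_resources is treated as 0 available.
    """
    return all(
        max_resources.get(name, 0) >= amount
        for name, amount in request.items()
    )


# Assumed maximum resources for the example cluster in this issue.
max_resources = {"CPU": 32, "GPU": 4}

print(is_feasible({"GPU": 2}, max_resources))            # True
print(is_feasible({"GPU": 8}, max_resources))            # False
print(is_feasible({"CPU": 1, "TPU": 1}, max_resources))  # False
```

Note that an aggregate check like this is necessary but not sufficient: a request for `{"CPU": 20}` passes it even though no single node in the example cluster has 20 CPUs, so a complete check would likely also need per-node shapes.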

Metadata

    Labels

    P1 (issue that should be fixed within a few weeks), community-backlog, core (issues that should be addressed in Ray Core), enhancement (request for new feature and/or capability)
