[Core][Autoscaler] API for getting max cluster resources #49501
Closed
Labels
P1: Issue that should be fixed within a few weeks
community-backlog
core: Issues that should be addressed in Ray Core
enhancement: Request for new feature and/or capability
Description
Add a new API method to Ray that provides the max possible resources in the cluster, taking into account autoscaling.
This would complement the existing ray.cluster_resources (and ray.available_resources) functionality by considering not only the current cluster shape, but also that of the fully upscaled cluster.
Example
Create a cluster with the following config:
head_node: {"CPU": 16}
worker_node[min=0, max=4]: {"CPU": 4, "GPU": 1}
The cluster will start with the head node and 0 worker nodes.
Desired behavior:
>>> ray.cluster_resources()
{"CPU": 16}
>>> ray.max_cluster_resources()
{"CPU": 32, "GPU": 4}
Use case
This would make it possible to programmatically check whether resource requests are feasible, without having to submit a scheduling request directly.
See #49372 as an example.
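To illustrate the intended semantics, here is a minimal sketch of how the totals in the example could be derived from the cluster config and then used for a feasibility check. It does not call Ray. `NodeType`, `max_cluster_resources`, and `fits_within` are hypothetical names made up for this sketch, and the config shape is an assumption, not the autoscaler's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeType:
    """Hypothetical description of one node type in an autoscaling config."""
    resources: Dict[str, int]  # resources provided by a single node of this type
    max_count: int             # maximum number of nodes of this type


def max_cluster_resources(node_types: List[NodeType]) -> Dict[str, int]:
    """Sum each resource over all node types at their maximum node count."""
    totals: Dict[str, int] = {}
    for node_type in node_types:
        for name, amount in node_type.resources.items():
            totals[name] = totals.get(name, 0) + amount * node_type.max_count
    return totals


def fits_within(request: Dict[str, int], totals: Dict[str, int]) -> bool:
    """Return True if every requested amount is within the cluster-wide totals.

    This is only a necessary condition: a single task or actor must also
    fit on one node, which this aggregate check does not verify.
    """
    return all(totals.get(name, 0) >= amount for name, amount in request.items())


# The config from the example: one head node plus up to 4 worker nodes.
NODE_TYPES = [
    NodeType(resources={"CPU": 16}, max_count=1),            # head_node
    NodeType(resources={"CPU": 4, "GPU": 1}, max_count=4),   # worker_node
]

totals = max_cluster_resources(NODE_TYPES)
print(totals)
print(fits_within({"CPU": 8, "GPU": 2}, totals))
print(fits_within({"GPU": 8}, totals))
```

With the example config, the totals come out to 32 CPUs and 4 GPUs, so a request for 8 GPUs can be rejected up front without submitting anything to the scheduler.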