[Autoscaler][v1] Autoscaler launches extra nodes despite fulfilled resource demand #52864
What happened + What you expected to happen
Summary:
We have set up Ray with a custom external node provider and are running the autoscaler as a separate process. The goal is for the autoscaler to detect resource demands from a Ray Serve deployment and request a specific node type from the external node provider accordingly.
Configuration:
- The cluster config defines multiple node types, including ray.worker.4090.standard, ray.worker.4090.highmem, and ray.worker.4090.ultra.
- A Ray Serve deployment requests the following resources:
autoscaler-config.yaml:

```yaml
available_node_types:
  ray.worker.4090.standard:
    min_workers: 0
    max_workers: 5
    resources: {"CPU": 16, "GPU": 1, "memory": 30107260928, "gram": 24}
    node_config: {}
  ray.worker.4090.highmem:
    min_workers: 0
    max_workers: 5
    resources: {"CPU": 16, "GPU": 1, "memory": 62277025792, "gram": 24}
    node_config: {}
  ray.worker.4090.ultra:
    min_workers: 0
    max_workers: 5
    resources: {"CPU": 32, "GPU": 1, "memory": 130997290496, "gram": 24}
    node_config: {}
```

Serve code:
```python
from ray import serve

# for ray.worker.4090.standard
@serve.deployment(ray_actor_options={
    "num_cpus": 16, "num_gpus": 1,
    "memory": 30107260928, "resources": {"gram": 24},
})
def CustomResourceTask(*args):
    return "ray.worker.4090.standard"

serve.run(CustomResourceTask.bind())
print("Requested additional resources...")
```

Expected Behavior:
- When the Python code is executed, the autoscaler detects the demand and requests a single node of type ray.worker.4090.standard from the node provider, since the requested bundle ({"num_cpus": 16, "num_gpus": 1, "memory": 30107260928, "resources": {"gram": 24}}) exactly matches that node type.
- Only that node type should be launched, as it satisfies the resource requirements.
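The matching we expect the autoscaler to perform can be sketched as a per-bundle feasibility check: pick the first (smallest) node type whose resources cover the requested bundle. This is only an illustration of the expectation, not Ray's actual scheduling code; `fits` and `NODE_TYPES` are hypothetical names:

```python
# Hypothetical sketch of the node-type matching described above;
# not the Ray autoscaler's real implementation.

NODE_TYPES = {
    "ray.worker.4090.standard": {"CPU": 16, "GPU": 1, "memory": 30107260928, "gram": 24},
    "ray.worker.4090.highmem": {"CPU": 16, "GPU": 1, "memory": 62277025792, "gram": 24},
    "ray.worker.4090.ultra": {"CPU": 32, "GPU": 1, "memory": 130997290496, "gram": 24},
}

def fits(node_resources: dict, bundle: dict) -> bool:
    """True if the node type has enough of every requested resource."""
    return all(node_resources.get(k, 0) >= v for k, v in bundle.items())

demand = {"CPU": 16, "GPU": 1, "memory": 30107260928, "gram": 24}

# All three node types are feasible, but only one node should be launched:
# the first (smallest) feasible type.
feasible = [name for name, res in NODE_TYPES.items() if fits(res, demand)]
print(feasible[0])  # ray.worker.4090.standard
```

Under this logic a single ray.worker.4090.standard launch fully covers the demand, which is why the second launch request is surprising.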
Actual Behavior:
- The autoscaler does initially request a ray.worker.4090.standard node as expected.
- However, immediately after, it also sends a launch request for a higher-spec node type (ray.worker.4090.highmem), even though the demand was already satisfied by the standard node.
- When the deployment directly requests the highest-spec node type (e.g., ultra), two nodes are requested by the autoscaler.
Observation:
- This issue does not occur if the appropriate node (ray.worker.4090.standard) already exists in the cluster before the actor is scheduled.
- In that case, the actor gets scheduled correctly, and no additional autoscaling occurs.
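The observation above is consistent with a pending-node accounting gap: if the autoscaler re-evaluates demand before the first launch is reflected in cluster state, the same bundle gets bin-packed a second time. A toy illustration of that hypothesis (assumed behavior, not Ray's code; `nodes_to_launch` is a made-up helper):

```python
def nodes_to_launch(demand_bundles, pending_nodes, count_pending: bool) -> int:
    """Toy bin-packing: launch one node per bundle not covered by pending capacity."""
    uncovered = list(demand_bundles)
    if count_pending:
        # Each pending node absorbs one outstanding bundle.
        uncovered = uncovered[len(pending_nodes):]
    return len(uncovered)

demand = [{"CPU": 16, "GPU": 1}]

# Round 1: nothing pending yet -> one launch, correct either way.
print(nodes_to_launch(demand, pending_nodes=[], count_pending=True))   # 1

# Round 2: one standard node already pending.
print(nodes_to_launch(demand, ["standard"], count_pending=True))   # 0: demand satisfied
print(nodes_to_launch(demand, ["standard"], count_pending=False))  # 1: spurious second launch
```

If pending nodes are counted, the second evaluation round launches nothing; if they are not (or the state has not propagated yet), the same demand triggers a second launch, matching the behavior reported here.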
Could this be a race condition between the autoscaler and the node provider, where the updated resource availability has not yet propagated before the scheduler makes a decision?
Or is it a scheduling policy issue, where the autoscaler aggressively launches additional nodes before confirming that the demand has been fulfilled?
Any insights on how to avoid this type of over-provisioning (e.g., through autoscaler delay settings, demand evaluation thresholds, or conservative scheduling options) would be greatly appreciated.
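If the over-provisioning is driven by aggressive upscaling, the v1 cluster config exposes a couple of knobs that may soften it: `upscaling_speed` limits how many launches may be pending relative to the current cluster size (Ray's docs note a minimum pending-launch floor still applies regardless), and `idle_timeout_minutes` controls how quickly an unneeded node is reclaimed. Whether these prevent this specific double-launch is untested; the values below are illustrative:

```yaml
# Fragment for the same autoscaler-config.yaml; values are illustrative.
upscaling_speed: 0.1       # grow conservatively instead of the 1.0 default
idle_timeout_minutes: 1    # reclaim the surplus highmem node quickly once idle
```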
Versions / Dependencies
ray 2.45.0
Reproduction script
Actor list (After Serve Deployment)
```
Stats:
------------------------------
Total: 4
Table:
------------------------------
    ACTOR_ID                          CLASS_NAME                               STATE  JOB_ID    NAME                                                      NODE_ID                                                   PID   RAY_NAMESPACE
0   1cb2b2b79fbc11225b59313d01000000  ServeController                          ALIVE  01000000  SERVE_CONTROLLER_ACTOR                                    8e07ab05c55f81ea3e84d30f12da275848e10d958c8100787eb9e317  1148  serve
1   5130f9e62717ca880d64c62701000000  ProxyActor                               ALIVE  01000000  SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-7fd36d41eb420d38e6bf122401df4fe88d4c29f2a4fd52d6d8783f0b  7fd36d41eb420d38e6bf122401df4fe88d4c29f2a4fd52d6d8783f0b  430   serve
2   816b5cd4454e58298016f3b501000000  ServeReplica:default:CustomResourceTask  ALIVE  01000000  SERVE_REPLICA::default#CustomResourceTask#nBaiFX          7fd36d41eb420d38e6bf122401df4fe88d4c29f2a4fd52d6d8783f0b  253   serve
3   dd8062a92fd4d7eeca1757c701000000  ProxyActor                               ALIVE  01000000  SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-8e07ab05c55f81ea3e84d30f12da275848e10d958c8100787eb9e317  8e07ab05c55f81ea3e84d30f12da275848e10d958c8100787eb9e317  1199  serve
```
Node list (After Serve Deployment)
```
Stats:
------------------------------
Total: 2
Table:
------------------------------
    NODE_ID                                                   NODE_IP      IS_HEAD_NODE  STATE  NODE_NAME    RESOURCES_TOTAL                 LABELS
0   7fd36d41eb420d38e6bf122401df4fe88d4c29f2a4fd52d6d8783f0b  172.28.0.16  False         ALIVE  172.28.0.16  CPU: 16.0                       ray.io/node_id: 7fd36d41eb420d38e6bf122401df4fe88d4c29f2a4fd52d6d8783f0b
                                                                                                            GPU: 1.0
                                                                                                            gram: 24.0
                                                                                                            memory: 28.040 GiB
                                                                                                            node:172.28.0.16: 1.0
                                                                                                            object_store_memory: 4.595 GiB
1   8e07ab05c55f81ea3e84d30f12da275848e10d958c8100787eb9e317  172.28.0.10  True          ALIVE  172.28.0.10  CPU: 16.0                       ray.io/node_id: 8e07ab05c55f81ea3e84d30f12da275848e10d958c8100787eb9e317
                                                                                                            memory: 9.094 GiB
                                                                                                            node:172.28.0.10: 1.0
                                                                                                            node:__internal_head__: 1.0
                                                                                                            object_store_memory: 4.547 GiB
```
Issue Severity
High: It blocks me from completing my task.