[Serve] Application-Level Autoscaling Context State Management #59008

@vaishdho1

What happened + What you expected to happen

Problem Description

In application-level autoscaling, when a policy returns state, that state is stored globally in ApplicationAutoscalingState._policy_state. However, the returned state is not distributed back to the individual deployment contexts. Each deployment's AutoscalingContext continues to use the same shared state reference that was passed in.

As a result, application-level policies cannot maintain per-deployment internal state, such as decision counters for delay logic, because state changes are never propagated back to the deployment contexts that need them.

What is expected?
As with deployment-level policies, where the returned state updates the deployment's context, application-level policies should have their returned state distributed back to the individual deployment contexts. Each deployment should receive its own updated state, enabling per-deployment features such as delay counters while preserving coordinated decision making across deployments.

Impact

Because the returned state isn't distributed back to each deployment's context, application-level policies cannot:

  • Track per-deployment decision counters for delay logic
  • Maintain deployment-specific internal state across iterations
  • Use features like upscale_delay_s and downscale_delay_s that require persistent counters
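
To make the impact concrete, here is a minimal sketch of the kind of application-level policy the list above describes. It is not Ray code: FakeContext, wanted_num_replicas, upscale_ticks, and the run helper are hypothetical stand-ins, and the only assumption taken from this issue is that each context exposes a policy_state dict. The policy delays an upscale until it has been requested on several consecutive iterations, which only works if the caller hands each deployment's returned state back on the next call.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class FakeContext:
    """Hypothetical stand-in for a deployment's AutoscalingContext."""

    current_num_replicas: int
    wanted_num_replicas: int  # hypothetical: what the metrics suggest
    policy_state: Dict[str, Any] = field(default_factory=dict)


def delayed_app_policy(
    contexts: Dict[str, FakeContext], upscale_ticks: int = 3
) -> Tuple[Dict[str, int], Dict[str, Dict[str, Any]]]:
    """Upscale a deployment only after `upscale_ticks` consecutive requests."""
    decisions: Dict[str, int] = {}
    new_state: Dict[str, Dict[str, Any]] = {}
    for dep_id, ctx in contexts.items():
        # Read the counter from the state the caller passed in.
        counter = ctx.policy_state.get("upscale_counter", 0)
        if ctx.wanted_num_replicas > ctx.current_num_replicas:
            counter += 1
        else:
            counter = 0
        if counter >= upscale_ticks:
            decisions[dep_id] = ctx.wanted_num_replicas
            counter = 0
        else:
            decisions[dep_id] = ctx.current_num_replicas
        # Return one state entry per deployment.
        new_state[dep_id] = {"upscale_counter": counter}
    return decisions, new_state


def run(contexts: Dict[str, FakeContext], ticks: int, feed_back: bool) -> Dict[str, int]:
    """Call the policy `ticks` times, optionally feeding state back per deployment."""
    decisions: Dict[str, int] = {}
    for _ in range(ticks):
        decisions, state = delayed_app_policy(contexts)
        if feed_back:
            for dep_id, ctx in contexts.items():
                ctx.policy_state = state[dep_id]
    return decisions


print(run({"a": FakeContext(1, 4)}, ticks=3, feed_back=True))   # {'a': 4}
print(run({"a": FakeContext(1, 4)}, ticks=3, feed_back=False))  # {'a': 1}
```

Without the feedback step the counter restarts from zero on every call, so the delayed upscale never fires.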

Related issues:
#58622
#58857

Related files:
https://github.com/ray-project/ray/blob/master/python/ray/serve/_private/autoscaling_state.py

Versions / Dependencies

version = "3.0.0.dev0"

Reproduction script

Current implementation

In ApplicationAutoscalingState.get_decision_num_replicas(), lines 752-760:

autoscaling_contexts = {
    deployment_id: state.get_autoscaling_context(
        deployment_to_target_num_replicas[deployment_id]
    )
    for deployment_id, state in self._deployment_autoscaling_states.items()
}

# Policy returns {deployment_name -> decision}
decisions, self._policy_state = self._policy(autoscaling_contexts)
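
The behavior can be reproduced without a cluster using stand-in classes. The sketch below is an approximation, not the Ray implementation: StubDeploymentState, StubAppState, and counting_policy are hypothetical, and they only mirror the attribute and method names visible in the snippet above. It assumes, as described in this issue, that each context exposes the deployment's own policy_state.

```python
from types import SimpleNamespace


class StubDeploymentState:
    """Hypothetical stand-in for a per-deployment autoscaling state."""

    def __init__(self):
        self._policy_state = {}

    def get_autoscaling_context(self, target_num_replicas):
        # The context carries this deployment's own (never updated) state.
        return SimpleNamespace(
            target_num_replicas=target_num_replicas,
            policy_state=self._policy_state,
        )


class StubAppState:
    """Hypothetical stand-in mirroring the snippet above."""

    def __init__(self, policy, deployment_ids):
        self._policy = policy
        self._policy_state = None
        self._deployment_autoscaling_states = {
            d: StubDeploymentState() for d in deployment_ids
        }

    def get_decision_num_replicas(self, deployment_to_target_num_replicas):
        autoscaling_contexts = {
            deployment_id: state.get_autoscaling_context(
                deployment_to_target_num_replicas[deployment_id]
            )
            for deployment_id, state in self._deployment_autoscaling_states.items()
        }
        # Same assignment as in the snippet: state is kept only at app level.
        decisions, self._policy_state = self._policy(autoscaling_contexts)
        return decisions


def counting_policy(contexts):
    """Toy policy: counts how many times it has seen each deployment."""
    decisions, new_state = {}, {}
    for deployment_id, ctx in contexts.items():
        seen = ctx.policy_state.get("seen", 0) + 1
        decisions[deployment_id] = ctx.target_num_replicas
        new_state[deployment_id] = {"seen": seen}
    return decisions, new_state


app = StubAppState(counting_policy, ["A", "B"])
for _ in range(3):
    app.get_decision_num_replicas({"A": 1, "B": 2})

# After three iterations the counters are still at 1.
print(app._policy_state)  # {'A': {'seen': 1}, 'B': {'seen': 1}}
```

The policy returns an incremented counter on every call, but because nothing writes it back to the deployment states, each call starts from an empty policy_state again.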

Possible fix

# Call policy
decisions, policy_state = self._policy(autoscaling_contexts)

# NEW: Update individual deployment states with returned state
if policy_state:
    for deployment_id, state in self._deployment_autoscaling_states.items():
        if deployment_id in policy_state:
            state._policy_state.update(policy_state[deployment_id])
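
The distribution step can also be written as a standalone helper that is easy to unit test. This is a sketch under assumptions, not a patch: distribute_policy_state is a hypothetical name, the returned state is assumed to be keyed by deployment ID (as the fix above implies), and the guard for a _policy_state of None is a defensive assumption rather than something confirmed from the Ray source.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def distribute_policy_state(
    deployment_states: Dict[str, Any],
    policy_state: Optional[Dict[str, Dict[str, Any]]],
) -> None:
    """Copy each deployment's slice of the returned state onto its state object."""
    if not policy_state:
        return  # policy returned no state; leave everything untouched
    for deployment_id, state in deployment_states.items():
        if deployment_id not in policy_state:
            continue  # policy returned nothing for this deployment
        if state._policy_state is None:
            # Assumption: the per-deployment state may start out unset.
            state._policy_state = {}
        state._policy_state.update(policy_state[deployment_id])


# Usage with hypothetical stand-ins for the per-deployment state objects.
states = {
    "A": SimpleNamespace(_policy_state=None),
    "B": SimpleNamespace(_policy_state={"upscale_counter": 1}),
}
distribute_policy_state(states, {"A": {"upscale_counter": 2}})
print(states["A"]._policy_state)  # {'upscale_counter': 2}
print(states["B"]._policy_state)  # {'upscale_counter': 1}
```

Using update() merges the returned keys into the existing state. Whether a policy should instead fully replace a deployment's state is a design choice this sketch does not settle.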

Issue Severity

None

Labels

  • bug: Something that is supposed to be working; but isn't
  • community-backlog
  • serve: Ray Serve Related Issue
  • stability
  • triage: Needs triage (eg: priority, bug/not-bug, and owning component)
