[Serve] Application Level Autoscaling Context State Management #59008
Description
What happened + What you expected to happen
Problem Description
In application level autoscaling, when a policy returns state, it is stored globally in ApplicationAutoscalingState._policy_state. However, this returned state is not distributed back to individual deployment contexts. Each deployment's AutoscalingContext continues to use the same shared state reference that was passed in.
This means application level policies cannot maintain per deployment internal state like decision counters for delay logic, because the state changes are never propagated back to the individual deployment contexts that would need them.
What is expected?
Like deployment level policies, where the returned state updates the deployment's context, application level policies should have their returned state distributed back to individual deployment contexts. Each deployment should receive its own updated state, enabling per deployment features like delay counters while preserving coordinated decision making across deployments.
Impact
Because the returned state isn't distributed back to each deployment's context, application level policies cannot:
- Track per deployment decision counters for delay logic
- Maintain deployment specific internal state across iterations
- Use features like upscale_delay_s and downscale_delay_s that require persistent counters
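To illustrate the kind of logic this blocks, here is a minimal sketch of an application level policy that keeps a per deployment counter for upscale delay. `FakeContext`, its fields (`current_num_replicas`, `desired_num_replicas`, `policy_state`), and `UPSCALE_DELAY_TICKS` are hypothetical stand-ins, not Ray Serve's actual AutoscalingContext. The sketch assumes each context exposes the state the policy returned for that deployment on the previous iteration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Hypothetical: consecutive upscale decisions required before acting.
UPSCALE_DELAY_TICKS = 3


@dataclass
class FakeContext:
    """Hypothetical stand-in for a deployment's autoscaling context."""

    current_num_replicas: int
    desired_num_replicas: int
    policy_state: Dict[str, Any] = field(default_factory=dict)


def app_policy(
    contexts: Dict[str, FakeContext],
) -> Tuple[Dict[str, int], Dict[str, Dict[str, Any]]]:
    """Return ({deployment -> replicas}, {deployment -> state})."""
    decisions: Dict[str, int] = {}
    new_state: Dict[str, Dict[str, Any]] = {}
    for name, ctx in contexts.items():
        # Read the counter this deployment accumulated on earlier iterations.
        counter = ctx.policy_state.get("upscale_counter", 0)
        if ctx.desired_num_replicas > ctx.current_num_replicas:
            counter += 1
        else:
            counter = 0
        if counter >= UPSCALE_DELAY_TICKS:
            # The delay has elapsed: apply the upscale and reset the counter.
            decisions[name] = ctx.desired_num_replicas
            counter = 0
        else:
            # Still waiting: hold the current replica count.
            decisions[name] = ctx.current_num_replicas
        new_state[name] = {"upscale_counter": counter}
    return decisions, new_state
```

If the returned state never reaches `ctx.policy_state`, the counter is read as 0 on every iteration, so it never reaches the threshold and the upscale is never applied.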
Related files:
https://github.com/ray-project/ray/blob/master/python/ray/serve/_private/autoscaling_state.py
Versions / Dependencies
version = "3.0.0.dev0"
Reproduction script
Current implementation
In ApplicationAutoscalingState.get_decision_num_replicas()
Lines 752-760
autoscaling_contexts = {
    deployment_id: state.get_autoscaling_context(
        deployment_to_target_num_replicas[deployment_id]
    )
    for deployment_id, state in self._deployment_autoscaling_states.items()
}
# Policy returns {deployment_name -> decision}
decisions, self._policy_state = self._policy(autoscaling_contexts)
Possible fix
# Call policy
decisions, policy_state = self._policy(autoscaling_contexts)
# NEW: Update individual deployment states with returned state
if policy_state:
    for deployment_id, state in self._deployment_autoscaling_states.items():
        if deployment_id in policy_state:
            state._policy_state.update(policy_state[deployment_id])
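The effect of this distribution step can be sketched without Ray. `FakeDeploymentState` and `distribute_policy_state` below are hypothetical helpers, not part of autoscaling_state.py. The sketch assumes the policy returns a mapping keyed by deployment ID and that each deployment state holds a dict in `_policy_state`.

```python
from typing import Any, Dict, Optional


class FakeDeploymentState:
    """Hypothetical stand-in for a per deployment autoscaling state."""

    def __init__(self) -> None:
        self._policy_state: Dict[str, Any] = {}


def distribute_policy_state(
    deployment_states: Dict[str, FakeDeploymentState],
    policy_state: Optional[Dict[str, Dict[str, Any]]],
) -> None:
    """Copy each deployment's slice of the returned state into its own state."""
    if not policy_state:
        return
    for deployment_id, state in deployment_states.items():
        if deployment_id in policy_state:
            # Merge rather than replace, mirroring the update() call above.
            state._policy_state.update(policy_state[deployment_id])


states = {"frontend": FakeDeploymentState(), "backend": FakeDeploymentState()}
distribute_policy_state(states, {"frontend": {"upscale_counter": 2}})
```

After the call, only `frontend` carries a counter. `backend` is untouched because the policy returned no entry for it, so deployments the policy does not mention keep whatever state they already had.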
Issue Severity
None