[serve] Added policy state persistence for application level autoscaling #59118
…utoscaling policy Signed-off-by: Vaishnavi Panchavati <vaishdho10@gmail.com>
Code Review
This pull request introduces a valuable feature for stateful application-level autoscaling in Ray Serve. By adding a mechanism to persist policy state between control loop iterations, it enables more sophisticated, stateful autoscaling behaviors. The implementation thoughtfully separates Ray Serve's internal state from user-defined state using a private key, ensuring that internal logic doesn't interfere with user policies. The changes are well-documented and include a solid test case that verifies the state persistence. My review includes a few minor suggestions to improve comment clarity and simplify a piece of test code.
# Separate internal state from custom policy state (internal state is used when default autoscaling is enabled over custom policies)
current_state = (self._policy_state or {}).copy()
internal_state = current_state.get(
    SERVE_AUTOSCALING_DECISION_COUNTERS_KEY, {}
)
What is the purpose of SERVE_AUTOSCALING_DECISION_COUNTERS_KEY?
Since _policy_state is associated with an AutoscalingContext, should we enforce that an application-level policy returns Dict[DeploymentID, Dict] for policy state? Does that solve the problem?
That can solve this problem: since we just need to store each deployment's state back into its deployment context, we can do this directly. But the custom policy must then return state per deployment; no overall (application-wide) state can be persisted.
SERVE_AUTOSCALING_DECISION_COUNTERS_KEY is just an alias for the decision counters.
SERVE_AUTOSCALING_DECISION_COUNTERS_KEY = "__decision_counters"
Since this will be a private key, I created a name for it.
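For illustration, here is a minimal sketch of the private-key separation described in this reply, assuming the policy state is a plain dict. Only the constant's name and value come from the thread; `split_policy_state` and `merge_policy_state` are hypothetical helpers, not Ray Serve APIs.

```python
from typing import Any, Dict, Optional, Tuple

# Name and value as quoted in the thread above.
SERVE_AUTOSCALING_DECISION_COUNTERS_KEY = "__decision_counters"


def split_policy_state(
    policy_state: Optional[Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: return (user_state, internal_state).

    The internal counters live under the private key; everything else is
    treated as state owned by the user's custom policy.
    """
    user_state = dict(policy_state or {})  # shallow copy, input is not mutated
    internal_state = user_state.pop(SERVE_AUTOSCALING_DECISION_COUNTERS_KEY, {})
    return user_state, internal_state


def merge_policy_state(
    user_state: Optional[Dict[str, Any]], internal_state: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical helper: recombine both parts before persisting."""
    merged = dict(user_state or {})
    merged[SERVE_AUTOSCALING_DECISION_COUNTERS_KEY] = internal_state
    return merged
```

Keeping the internal counters under a single reserved key means a user policy never sees them, at the cost of reserving that key name.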
> That can solve this problem since we need to just store the state of each deployment back to the deployment context we can directly do this. But the custom policy should return state per deployment there cannot be any overall state persisted.
Let's go with this. We should run a light validation on the policy state returned by the user's app-level policy to verify that it contains valid keys.
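A rough sketch of what such a light validation could look like, assuming the returned state must be a dict keyed by the application's deployment IDs with dict values. `validate_policy_state` and the `DeploymentID` stand-in below are hypothetical and only illustrate the idea; the actual check in the PR may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Set


@dataclass(frozen=True)
class DeploymentID:
    """Stand-in for Ray Serve's DeploymentID (assumed fields)."""

    name: str
    app_name: str = "default"


def validate_policy_state(
    policy_state: Optional[Dict[Any, Any]],
    known_ids: Set[DeploymentID],
) -> Dict[DeploymentID, Dict[str, Any]]:
    """Hypothetical check: keys are known deployments, values are dicts."""
    if policy_state is None:
        return {}  # a policy that keeps no state is allowed
    if not isinstance(policy_state, dict):
        raise TypeError("policy state must be a Dict[DeploymentID, Dict]")
    unknown = [key for key in policy_state if key not in known_ids]
    if unknown:
        raise ValueError(f"policy state has unknown deployment keys: {unknown}")
    for dep_id, state in policy_state.items():
        if not isinstance(state, dict):
            raise TypeError(f"state for {dep_id} must be a dict")
    return policy_state
```

Rejecting unknown keys early surfaces typos in a user policy at the point where the state is returned, rather than as silently dropped state later.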
abrarsheikh left a comment
lg2m, left some nits
…ing (ray-project#59118)

## Description
Application level autoscaling policies in Ray Serve have no mechanism to persist policy state between control-loop iterations, preventing stateful autoscaling behavior. Additionally, per-deployment internal state needed for applying standard autoscaling config parameters over custom policies cannot be maintained. (ray-project#58622)

This PR adds the following:
- The user policy state should return a `Dict[DeploymentID,Dict]`.
- Each deployment's `AutoscalingContext` now receives the user state returned by the custom policy per deployment ID enabling policies to maintain their state across iterations. Each deployment gets its own state.

## Related issues
Fixes: ray-project#59008
Related to: ray-project#58622 ray-project#58857

## Additional information
The implementation modifies the `ApplicationAutoscalingState.get_decision_num_replicas()` method to implement:
- User returned policy state validation
- Returning the policy state back into each deployment

---------

Signed-off-by: Vaishnavi Panchavati <vaishdho10@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
Application-level autoscaling policies in Ray Serve have no mechanism to persist policy state between control-loop iterations, which prevents stateful autoscaling behavior. Additionally, the per-deployment internal state needed to apply standard autoscaling config parameters on top of custom policies cannot be maintained (#58622).
This PR adds the following:
- The user policy should return its state as a `Dict[DeploymentID, Dict]`.
- Each deployment's `AutoscalingContext` now receives the user state returned by the custom policy for its deployment ID, enabling policies to maintain their state across iterations. Each deployment gets its own state.
Related issues
Fixes: #59008
Related to: #58622 #58857
Additional information
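The implementation modifies the `ApplicationAutoscalingState.get_decision_num_replicas()` method to implement:
- Validation of the policy state returned by the user
- Returning the policy state back into each deployment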
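To make the per-deployment round trip concrete, the following is a simplified sketch and not the PR's actual code. `DeploymentID` and `AutoscalingContext` are minimal stand-ins with assumed fields, and `counting_policy` and `run_iteration` are hypothetical names; the sketch also assumes a deployment missing from the returned state is reset to an empty dict.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class DeploymentID:
    """Stand-in for Ray Serve's DeploymentID (assumed fields)."""

    name: str
    app_name: str = "default"


@dataclass
class AutoscalingContext:
    """Stand-in carrying only the two fields this sketch needs."""

    current_num_replicas: int
    policy_state: Dict[str, Any] = field(default_factory=dict)


Contexts = Dict[DeploymentID, AutoscalingContext]
PolicyResult = Tuple[Dict[DeploymentID, int], Dict[DeploymentID, Dict[str, Any]]]


def counting_policy(ctxs: Contexts) -> PolicyResult:
    """Toy stateful policy: add one replica on every third call."""
    decisions: Dict[DeploymentID, int] = {}
    new_state: Dict[DeploymentID, Dict[str, Any]] = {}
    for dep_id, ctx in ctxs.items():
        calls = ctx.policy_state.get("calls", 0) + 1
        bump = 1 if calls % 3 == 0 else 0
        decisions[dep_id] = ctx.current_num_replicas + bump
        new_state[dep_id] = {"calls": calls}  # state is keyed per deployment
    return decisions, new_state


def run_iteration(
    ctxs: Contexts, policy: Callable[[Contexts], PolicyResult]
) -> Dict[DeploymentID, int]:
    """One control-loop step: call the policy, then write state back."""
    decisions, policy_state = policy(ctxs)
    for dep_id, ctx in ctxs.items():
        # Each deployment only ever sees its own slice of the state.
        ctx.policy_state = dict(policy_state.get(dep_id, {}))
        ctx.current_num_replicas = decisions[dep_id]
    return decisions
```

Calling `run_iteration` repeatedly with the same contexts lets the policy observe its own previous output through `ctx.policy_state`, which is the behavior the description refers to as state persisting across iterations.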