[serve] Added policy state persistence for application level autoscaling #59118

Merged
abrarsheikh merged 10 commits into ray-project:master from vaishdho1:app-level-context-management
Dec 10, 2025

@vaishdho1 vaishdho1 commented Dec 2, 2025

Description

Application-level autoscaling policies in Ray Serve have no mechanism to persist policy state between control-loop iterations, which prevents stateful autoscaling behavior. Additionally, the per-deployment internal state needed to apply standard autoscaling config parameters on top of custom policies cannot be maintained (#58622).

This PR adds the following:

  • The policy state returned by a user's application-level policy must be a Dict[DeploymentID, Dict].
  • Each deployment's AutoscalingContext now receives the user state that the custom policy returned for that deployment ID, so policies can maintain state across iterations. Each deployment gets its own state.
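
A minimal sketch of what a stateful application-level policy could look like under this contract. `DeploymentID` and `Context` below are simplified stand-ins defined only for illustration, not Ray Serve's real classes. The field names (`total_num_requests`, `policy_state`), the `(decisions, state)` return shape, and the "10 requests per replica" rule are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class DeploymentID:
    """Hypothetical stand-in for Ray Serve's deployment identifier."""
    name: str
    app_name: str = "default"


@dataclass
class Context:
    """Hypothetical, simplified stand-in for AutoscalingContext."""
    total_num_requests: float
    policy_state: Dict[str, Any] = field(default_factory=dict)


def smoothed_policy(
    contexts: Dict[DeploymentID, Context],
) -> Tuple[Dict[DeploymentID, int], Dict[DeploymentID, Dict[str, Any]]]:
    """Scale on a moving average that lives in per-deployment state."""
    decisions, new_state = {}, {}
    for dep_id, ctx in contexts.items():
        # Read only the state this deployment returned on the last iteration.
        prev = ctx.policy_state.get("ema", ctx.total_num_requests)
        ema = 0.5 * prev + 0.5 * ctx.total_num_requests
        # Assumed sizing rule: roughly 10 requests per replica, at least 1.
        decisions[dep_id] = max(1, round(ema / 10))
        new_state[dep_id] = {"ema": ema}
    return decisions, new_state


# Two simulated control-loop iterations: the state returned by the first
# call is handed back through the context on the second call.
dep = DeploymentID("model")
first, state = smoothed_policy({dep: Context(40)})
second, state = smoothed_policy({dep: Context(80, state[dep])})
# first[dep] is 4; second[dep] is 6 (a stateless policy would jump to 8)
```

Because the state is keyed by deployment ID, nothing in this sketch can be shared across deployments; any cross-deployment quantity would have to be recomputed from the contexts on each call.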

Related issues

Fixes: #59008
Related to: #58622 #58857

Additional information

The implementation modifies the ApplicationAutoscalingState.get_decision_num_replicas() method to add:

  • Validation of the policy state returned by the user's policy
  • Passing the returned policy state back to each deployment
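
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `get_decision_num_replicas()`: the helper names `validate_policy_state` and `distribute_state` are hypothetical, deployment IDs are plain strings here, and the choice to raise `TypeError`/`ValueError` on bad input is an assumption (the PR does not say how invalid state is reported).

```python
from typing import Any, Dict, Hashable, Iterable


def validate_policy_state(
    state: Any, known_ids: Iterable[Hashable]
) -> Dict[Hashable, Dict[str, Any]]:
    """Light check: a dict of per-deployment dicts, keyed by known IDs."""
    if state is None:
        return {}  # the policy chose not to persist anything
    if not isinstance(state, dict):
        raise TypeError(f"policy state must be a dict, got {type(state).__name__}")
    known = set(known_ids)
    for dep_id, dep_state in state.items():
        if dep_id not in known:
            raise ValueError(f"unknown deployment key in policy state: {dep_id!r}")
        if not isinstance(dep_state, dict):
            raise TypeError(f"state for {dep_id!r} must be a dict")
    return state


def distribute_state(
    state: Dict[Hashable, Dict[str, Any]],
    per_deployment_store: Dict[Hashable, Dict[str, Any]],
) -> None:
    """Hand each deployment its own slice for the next iteration."""
    for dep_id, dep_state in state.items():
        per_deployment_store[dep_id] = dep_state


# Usage: validate what the policy returned, then store it per deployment.
store: Dict[str, Dict[str, Any]] = {}
returned = {"frontend": {"ema": 12.0}}
distribute_state(validate_policy_state(returned, ["frontend", "backend"]), store)
```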

…utoscaling policy

Signed-off-by: Vaishnavi Panchavati <vaishdho10@gmail.com>
@vaishdho1 vaishdho1 requested review from a team as code owners December 2, 2025 21:20

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable feature for stateful application-level autoscaling in Ray Serve. By adding a mechanism to persist policy state between control loop iterations, it enables more sophisticated, stateful autoscaling behaviors. The implementation thoughtfully separates Ray Serve's internal state from user-defined state using a private key, ensuring that internal logic doesn't interfere with user policies. The changes are well-documented and include a solid test case that verifies the state persistence. My review includes a few minor suggestions to improve comment clarity and simplify a piece of test code.

@ray-gardener ray-gardener bot added the serve and community-contribution labels Dec 3, 2025

# Separate internal state from custom policy state (internal state is used when default autoscaling is enabled over custom policies)
current_state = (self._policy_state or {}).copy()
internal_state = current_state.get(
    SERVE_AUTOSCALING_DECISION_COUNTERS_KEY, {}
)

Contributor

What is the purpose of SERVE_AUTOSCALING_DECISION_COUNTERS_KEY?

Contributor

Since _policy_state lives with an AutoscalingContext, should we enforce that an application-level policy returns Dict[DeploymentID, Dict] for policy state? Does that solve the problem?

Contributor Author

That can solve this problem. Since we just need to store each deployment's state back into its deployment context, we can do this directly. But the custom policy must then return state per deployment; no overall state can be persisted.
SERVE_AUTOSCALING_DECISION_COUNTERS_KEY is just an alias for the decision counters:
SERVE_AUTOSCALING_DECISION_COUNTERS_KEY = "__decision_counters"
Since this will be a private key, I created a name for it.
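
To make the role of the private key concrete, here is a small sketch of keeping internal decision counters and user state in one dict without collisions. The constant's value is the one quoted above; `split_state` and `merge_state` are hypothetical helper names for this illustration, not functions from the PR.

```python
from typing import Any, Dict, Optional, Tuple

SERVE_AUTOSCALING_DECISION_COUNTERS_KEY = "__decision_counters"


def split_state(
    policy_state: Optional[Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (internal_counters, user_state) without mutating the input."""
    current = dict(policy_state or {})
    internal = current.pop(SERVE_AUTOSCALING_DECISION_COUNTERS_KEY, {})
    return internal, current


def merge_state(
    internal: Dict[str, Any], user_state: Dict[str, Any]
) -> Dict[str, Any]:
    """Recombine both parts; the private key always holds the internal part."""
    merged = dict(user_state)
    merged[SERVE_AUTOSCALING_DECISION_COUNTERS_KEY] = internal
    return merged


# Usage: the user's policy only ever sees `user`, never the counters.
internal, user = split_state({"__decision_counters": {"up": 2}, "ema": 1.5})
```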

Contributor

> That can solve this problem. Since we just need to store each deployment's state back into its deployment context, we can do this directly. But the custom policy must then return state per deployment; no overall state can be persisted.

Let's go with this. We should run a light validation on the policy state returned by the user's app-level policy to verify that it contains valid keys.

@abrarsheikh abrarsheikh added the go label Dec 9, 2025
@abrarsheikh abrarsheikh left a comment

lg2m, left some nits

@abrarsheikh abrarsheikh merged commit 9c37058 into ray-project:master Dec 10, 2025
6 checks passed
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
…ing (ray-project#59118)

Signed-off-by: Vaishnavi Panchavati <vaishdho10@gmail.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>

Development

Successfully merging this pull request may close these issues.

[Serve] Application Level Autoscaling Context State Management