[serve][llm] Isolate prefix trees among deployments by eicherseiji · Pull Request #58835 · ray-project/ray

eicherseiji · 2025-11-19T23:32:53Z

Description

This PR fixes PrefixCacheAffinityRouter to use deployment-specific prefix tree actors instead of a single shared global actor. This resolves replica ID conflicts that occur when multiple deployments use the router (e.g., in prefill-decode disaggregation with data parallelism).

Problem

Previously, all PrefixCacheAffinityRouter instances shared a single detached actor named LlmPrefixTreeActor. In multi-deployment scenarios like PD disaggregation with DP, this caused:

KeyError: Replica(id='bzw6m3yr', deployment='Decode:deepseek', app='deepseek-pd-nccl')

This happened because Prefill and Decode deployments (each with 16 DP replicas) were all tracked in the same prefix tree, causing replica ID collisions when the router tried to route requests.
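The failure mode can be illustrated with a small pure-Python sketch. It does not use the real Ray classes: `SharedPrefixTree` and `ToyRouter` are hypothetical stand-ins, and the lookup that raises is only one plausible way a shared tree could surface a `KeyError`. The point is that a tree shared across deployments can hand a router a replica ID that the router does not own.

```python
from dataclasses import dataclass, field


@dataclass
class SharedPrefixTree:
    """Hypothetical stand-in for the prefix tree actor: prefix -> replica IDs."""

    prefix_to_replicas: dict[str, set[str]] = field(default_factory=dict)

    def insert(self, prefix: str, replica_id: str) -> None:
        self.prefix_to_replicas.setdefault(prefix, set()).add(replica_id)

    def match(self, prompt: str) -> set[str]:
        # Return every replica that has seen a prefix of this prompt.
        matched: set[str] = set()
        for prefix, replicas in self.prefix_to_replicas.items():
            if prompt.startswith(prefix):
                matched |= replicas
        return matched


@dataclass
class ToyRouter:
    """Hypothetical router that only knows the replicas of its own deployment."""

    deployment: str
    tree: SharedPrefixTree
    replicas: dict[str, str] = field(default_factory=dict)

    def add_replica(self, replica_id: str) -> None:
        self.replicas[replica_id] = f"{self.deployment}/{replica_id}"

    def record(self, prompt: str, replica_id: str) -> None:
        self.tree.insert(prompt, replica_id)

    def route(self, prompt: str) -> list[str]:
        candidates = sorted(self.tree.match(prompt))
        # Raises KeyError when the tree returns a replica of another deployment.
        return [self.replicas[replica_id] for replica_id in candidates]


if __name__ == "__main__":
    shared = SharedPrefixTree()
    prefill = ToyRouter("Prefill", shared)
    decode = ToyRouter("Decode", shared)
    prefill.add_replica("p1")
    decode.add_replica("d1")
    prefill.record("hello", "p1")
    decode.record("hello", "d1")
    try:
        prefill.route("hello world")
    except KeyError as exc:
        print(f"KeyError: {exc}")  # the Prefill router was handed a Decode replica
```

Giving each router its own `SharedPrefixTree` instance removes the cross-deployment entries, which is the isolation this PR introduces.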

Solution

Modified PrefixCacheAffinityRouter.initialize_state() to create deployment-specific prefix tree actors using namespaces derived from SERVE_NAMESPACE, app name, and deployment name:

- Single deployment: serve::LlmPrefixTreeActor
- PD scenario: serve::deepseek-pd-nccl::Prefill:deepseek::LlmPrefixTreeActor and serve::deepseek-pd-nccl::Decode:deepseek::LlmPrefixTreeActor

Each deployment now maintains its own isolated prefix tree state, preventing replica ID conflicts.
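The naming scheme above can be sketched as a small helper. This is an illustration, not the code from the PR: `build_tree_namespace` and `qualified_actor_name` are hypothetical names, and `SERVE_NAMESPACE` is hard-coded to `"serve"` here (inferred from the examples) rather than imported from `ray.serve._private.constants`.

```python
from typing import Optional

# Assumed value, taken from the "serve::..." examples in the description.
SERVE_NAMESPACE = "serve"
ACTOR_NAME = "LlmPrefixTreeActor"


def build_tree_namespace(
    app_name: Optional[str] = None, deployment_name: Optional[str] = None
) -> str:
    """Join the non-empty parts with '::' (hypothetical helper)."""
    parts = [SERVE_NAMESPACE, app_name, deployment_name]
    return "::".join(part for part in parts if part)


def qualified_actor_name(
    app_name: Optional[str] = None, deployment_name: Optional[str] = None
) -> str:
    """Namespace plus actor name, in the form shown in the description."""
    return f"{build_tree_namespace(app_name, deployment_name)}::{ACTOR_NAME}"
```

With no app or deployment name the helper falls back to the bare `serve` prefix, which matches the single-deployment example; the PD case yields one distinct name per deployment.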

Changes

python/ray/llm/_internal/serve/routing_policies/prefix_aware/prefix_aware_router.py
- Imports SERVE_NAMESPACE from ray.serve._private.constants
- Builds a namespace from SERVE_NAMESPACE, app_name, and deployment_name (e.g., serve::app::deployment)
- Creates the actor with this deployment-specific namespace
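Why a per-deployment namespace gives isolation can be shown with a toy get-or-create registry keyed by `(namespace, name)`. `ToyActorRegistry` is a hypothetical stand-in, not a Ray API. In Ray itself, named actors are scoped by namespace, and a call along the lines of `Actor.options(name=..., namespace=..., lifetime="detached", get_if_exists=True).remote()` gives comparable get-or-create behaviour; whether the PR uses exactly those options is an assumption.

```python
from typing import Any, Callable, Dict, Tuple


class ToyActorRegistry:
    """Hypothetical stand-in for named-actor lookup scoped by namespace."""

    def __init__(self) -> None:
        self._actors: Dict[Tuple[str, str], Any] = {}

    def get_or_create(
        self, name: str, namespace: str, factory: Callable[[], Any]
    ) -> Any:
        # The same name in a different namespace is a different actor.
        key = (namespace, name)
        if key not in self._actors:
            self._actors[key] = factory()
        return self._actors[key]


if __name__ == "__main__":
    registry = ToyActorRegistry()
    name = "LlmPrefixTreeActor"
    prefill_tree = registry.get_or_create(name, "serve::app::Prefill", dict)
    decode_tree = registry.get_or_create(name, "serve::app::Decode", dict)
    prefill_again = registry.get_or_create(name, "serve::app::Prefill", dict)
    print(prefill_tree is decode_tree)    # False: separate state per deployment
    print(prefill_tree is prefill_again)  # True: routers of one deployment share it
```

Routers that belong to the same deployment resolve to the same tree, while routers of different deployments no longer see each other's replicas.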

Testing

Validated manually with PD + DP deployments using DeepSeek-V2-Lite

Impact

- Enables PrefixCacheAffinityRouter to work correctly with PD disaggregation + DP
- No breaking changes for single deployment scenarios (backward compatible)
- Users can now use prefix-aware routing in complex multi-deployment scenarios

Signed-off-by: Seiji Eicher <seiji@anyscale.com>


nrghosh · 2025-12-03T00:08:54Z

/gemini review

gemini-code-assist

Code Review

This pull request correctly addresses an issue with PrefixCacheAffinityRouter using a shared global prefix tree actor, which caused replica ID conflicts in multi-deployment scenarios. By introducing deployment-specific namespaces for the LlmPrefixTreeActor, each deployment now gets an isolated prefix tree, resolving the conflicts. The implementation is sound, and the new tests in TestMultiDeploymentIsolation thoroughly validate the fix by ensuring that prefix trees for different deployments are indeed isolated. I have one minor suggestion to simplify the namespace construction logic for improved conciseness.


Isolate prefix trees among deployments

c257ce6

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

eicherseiji added the "go" label (add ONLY when ready to merge, run all tests) Nov 19, 2025

eicherseiji mentioned this pull request Nov 19, 2025

Disaggregated Wide-EP example: Use NIXL and minimal builder anyscale/ray-serve-llm-perf-examples#4

Merged

eicherseiji marked this pull request as ready for review November 24, 2025 20:36

eicherseiji requested a review from a team as a code owner November 24, 2025 20:36

cursor bot reviewed Nov 24, 2025

python/ray/llm/_internal/serve/routing_policies/prefix_aware/prefix_aware_router.py (2 review threads, outdated and resolved)

Use namespace

299da94

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

ray-gardener bot added the "serve" (Ray Serve Related Issue) and "llm" labels Nov 25, 2025

Merge branch 'master' into fix-prefix-router-pd-dp

38c3d77

eicherseiji requested a review from akyang-anyscale December 2, 2025 01:13

akyang-anyscale approved these changes Dec 2, 2025


eicherseiji added 2 commits December 2, 2025 14:52

Fixes

ebb6b1e

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Merge branch 'fix-prefix-router-pd-dp' of https://github.com/eicherse…

e0165d2

…iji/ray into fix-prefix-router-pd-dp Signed-off-by: Seiji Eicher <seiji@anyscale.com>

eicherseiji requested a review from nrghosh December 2, 2025 22:53

gemini-code-assist bot reviewed Dec 3, 2025

python/ray/llm/_internal/serve/routing_policies/prefix_aware/prefix_aware_router.py (1 review thread, resolved)

nrghosh approved these changes Dec 3, 2025


ruisearch42 merged commit 5a0ce23 into ray-project:master Dec 3, 2025
6 checks passed

eicherseiji mentioned this pull request Jan 12, 2026

[Serve] Add generic actor registration API for shutdown cleanup #60067

Merged

7 tasks

eicherseiji mentioned this pull request Jan 21, 2026

[Serve] Add controller-managed deployment-scoped actors #60359

Open

ruisearch42 merged 5 commits into ray-project:master from eicherseiji:fix-prefix-router-pd-dp

4 participants
