[model-gateway] Replace PolicyRegistry RwLock with DashMap for lock-free policy lookups #15361

Merged
slin1237 merged 1 commit into main from metric-n/15
Dec 18, 2025
@slin1237
Collaborator

The PolicyRegistry stored its maps as Arc<RwLock<HashMap<...>>>, so every get_policy() call had to acquire a std::sync::RwLock. That call runs on every request during worker selection.
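For illustration, the pre-change shape described above might look roughly like the sketch below. The `Policy` trait, the `RoundRobin` type, and the `LockedRegistry` name are hypothetical stand-ins, not the gateway's actual definitions; only the `Arc<RwLock<HashMap<...>>>` layout and the `get_policy()` name come from the description.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the gateway's load-balancing policy trait.
pub trait Policy: Send + Sync {
    fn name(&self) -> &'static str;
}

// Hypothetical policy used only to exercise the registry.
pub struct RoundRobin;

impl Policy for RoundRobin {
    fn name(&self) -> &'static str {
        "round_robin"
    }
}

// Assumed pre-change layout: a single RwLock guards the whole map.
pub struct LockedRegistry {
    model_policies: Arc<RwLock<HashMap<String, Arc<dyn Policy>>>>,
}

impl LockedRegistry {
    pub fn new() -> Self {
        Self {
            model_policies: Arc::new(RwLock::new(HashMap::new())),
        }
    }

    pub fn set_policy(&self, model: &str, policy: Arc<dyn Policy>) {
        self.model_policies
            .write()
            .unwrap()
            .insert(model.to_string(), policy);
    }

    // Every lookup acquires the same registry-wide read lock,
    // even when concurrent requests target unrelated models.
    pub fn get_policy(&self, model: &str) -> Option<Arc<dyn Policy>> {
        self.model_policies.read().unwrap().get(model).cloned()
    }
}
```

Because all readers and writers go through one lock, a writer registering a worker for one model briefly blocks lookups for every other model.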

Changed model_policies and model_worker_counts to use DashMap:

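  • get_policy() no longer takes a registry-wide lock: DashMap is a sharded concurrent HashMap, so a lookup only touches the shard that holds its key
  • on_worker_added/removed use DashMap's entry API to update a key in place
  • Removes the single point of lock contention on the hot path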
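The sharding idea behind the bullets above can be sketched with the standard library alone. This is not DashMap's implementation, only a minimal illustration of why per-shard locking reduces contention; `ShardedCounts`, its shard count parameter, and the choice to drop an entry when its count reaches zero are assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

// Std-only illustration of a sharded map; DashMap's real internals differ.
pub struct ShardedCounts {
    shards: Vec<RwLock<HashMap<String, usize>>>,
}

impl ShardedCounts {
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0);
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    // Hash the key to pick one shard; other shards stay untouched.
    fn shard_for(&self, key: &str) -> &RwLock<HashMap<String, usize>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    // Read path: locks only the shard that owns `model`.
    pub fn get(&self, model: &str) -> usize {
        let shard = self.shard_for(model).read().unwrap();
        shard.get(model).copied().unwrap_or(0)
    }

    // Entry-style update: insert 0 if absent, then increment in place.
    pub fn on_worker_added(&self, model: &str) {
        let mut shard = self.shard_for(model).write().unwrap();
        *shard.entry(model.to_string()).or_insert(0) += 1;
    }

    // Returns true when the last worker was removed and the entry dropped
    // (dropping the entry at zero is an assumption of this sketch).
    pub fn on_worker_removed(&self, model: &str) -> bool {
        let mut shard = self.shard_for(model).write().unwrap();
        let remaining = match shard.get_mut(model) {
            Some(count) => {
                *count = count.saturating_sub(1);
                *count
            }
            None => return false,
        };
        if remaining == 0 {
            shard.remove(model);
            true
        } else {
            false
        }
    }
}
```

With this layout, a write for one model blocks only lookups that hash to the same shard, which is the contention reduction the change is after.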

Kept RwLock for prefill/decode policies since they're set once at startup and only used in PD mode (less frequent access pattern).
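A set-once slot behind a plain RwLock, as kept for the prefill/decode policies, could look like the sketch below. The `PdPolicies` type, its method names, and the use of `Arc<str>` policy names in place of real policy objects are hypothetical simplifications.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical holder for the two PD-mode policies; a policy is
// represented by its name here to keep the sketch self-contained.
pub struct PdPolicies {
    prefill: RwLock<Option<Arc<str>>>,
    decode: RwLock<Option<Arc<str>>>,
}

impl PdPolicies {
    pub fn new() -> Self {
        Self {
            prefill: RwLock::new(None),
            decode: RwLock::new(None),
        }
    }

    // Written once at startup, so write-lock cost is negligible.
    pub fn set_prefill(&self, policy: &str) {
        *self.prefill.write().unwrap() = Some(Arc::from(policy));
    }

    pub fn set_decode(&self, policy: &str) {
        *self.decode.write().unwrap() = Some(Arc::from(policy));
    }

    // Reads clone the Arc under a short-lived read lock.
    pub fn prefill(&self) -> Option<Arc<str>> {
        self.prefill.read().unwrap().clone()
    }

    pub fn decode(&self) -> Option<Arc<str>> {
        self.decode.read().unwrap().clone()
    }
}
```

Since these slots hold a single value each and see no writes after startup, readers rarely contend, so a concurrent map would add little here.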

@slin1237 slin1237 requested a review from ByronHsu as a code owner December 18, 2025 02:56
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@slin1237
Collaborator Author

/gemini review

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@slin1237
Collaborator Author

/gemini review

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@slin1237 slin1237 merged commit 70607e5 into main Dec 18, 2025
71 of 74 checks passed
@slin1237 slin1237 deleted the metric-n/15 branch December 18, 2025 08:11
Liwansi added a commit to iforgetmyname/sglang that referenced this pull request Dec 19, 2025
Prozac614 pushed a commit to Prozac614/sglang that referenced this pull request Dec 23, 2025
jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025
YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026