[model-gateway] Optimize router selection with lock-free snapshots #15672

Merged: slin1237 merged 13 commits into sgl-project:main from ppraneth:per-req on Dec 23, 2025

@ppraneth
Contributor

Motivation

The RouterManager::select_router_for_request function is on the critical hot path of the gateway, executed for every incoming inference request. The previous implementation suffered from two performance bottlenecks:

  1. Per-Request Allocations: The use of collect::<Vec<_>>() and vec![router] triggered heap allocations on every request, leading to system memory allocator overhead and potential lock contention under high concurrency.
  2. Map Iteration Locks: Iterating over DashMap entries requires acquiring internal shard-level locks. While efficient for point lookups, this added microsecond-scale latency and jitter during high-throughput routing.
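
To make the first point concrete, here is a minimal sketch rather than the gateway's actual code. `RouterTrait` is reduced to a single hypothetical `name` method, `NamedRouter` and both function names are invented for illustration, and a std `HashMap` stands in for the `DashMap` registry. It contrasts building a fresh `Vec` on every call with scanning a list that already exists.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for the gateway's router trait.
pub trait RouterTrait: Send + Sync {
    fn name(&self) -> &str;
}

// Hypothetical router used only for this illustration.
pub struct NamedRouter(pub String);

impl RouterTrait for NamedRouter {
    fn name(&self) -> &str {
        &self.0
    }
}

// Allocating pattern: every call builds a new Vec on the heap and bumps
// one Arc reference count per router before any selection happens.
pub fn candidates_allocating(
    registry: &HashMap<String, Arc<dyn RouterTrait>>,
) -> Vec<Arc<dyn RouterTrait>> {
    registry.values().cloned().collect::<Vec<_>>()
}

// Borrowing pattern: scans a list that was built earlier and hands back a
// reference into it, so the call itself allocates nothing.
pub fn find_in_snapshot<'a>(
    snapshot: &'a [Arc<dyn RouterTrait>],
    wanted: &str,
) -> Option<&'a Arc<dyn RouterTrait>> {
    snapshot.iter().find(|r| r.name() == wanted)
}
```

Under these assumptions, the per-request cost of the second function is a linear scan over a contiguous slice, whereas the first pays for an allocation and a deallocation on every call regardless of how many routers match.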

This PR introduces a lock-free snapshot mechanism using the arc_swap crate. By caching the router list in a contiguous, read-optimized Vec that is updated only during registration, we achieve nanosecond-scale routing performance.

Modifications

  • Snapshot Architecture: Integrated routers_snapshot: ArcSwap<Vec<Arc<dyn RouterTrait>>> into the RouterManager struct to provide zero-allocation access to candidate routers.

  • Atomic Updates: Modified register_router to pre-calculate and store a fresh snapshot whenever a new router joins the registry, shifting overhead from the "hot" request path to the "cold" registration path.

  • Zero-Allocation Selection: Refactored select_router_for_request to use lock-free .load() calls.

  • Logic Refinement:

    • Simplified model-specific lookups to return matching candidates immediately, bypassing redundant scoring logic when only one valid router exists.
  • Dependencies: Added arc-swap = "1.7.1" to Cargo.toml (the crate is published as arc-swap and imported in code as arc_swap).
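
The snapshot flow described in the list above can be sketched as follows. This is a std-only illustration under stated assumptions, not the merged implementation: `RwLock<Arc<Vec<...>>>` stands in for `ArcSwap<Vec<...>>` so the block compiles without external crates (which means reads here take a brief lock and are not lock-free), a locked `HashMap` stands in for the `DashMap` registry, and the `serves`/`id` methods, `StaticRouter`, and the function signatures are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the gateway's router trait.
pub trait RouterTrait: Send + Sync {
    fn id(&self) -> &str;
    fn serves(&self, model_id: &str) -> bool;
}

// Hypothetical router that serves exactly one model.
pub struct StaticRouter {
    pub id: String,
    pub model: String,
}

impl RouterTrait for StaticRouter {
    fn id(&self) -> &str {
        &self.id
    }
    fn serves(&self, model_id: &str) -> bool {
        self.model == model_id
    }
}

type Snapshot = Arc<Vec<Arc<dyn RouterTrait>>>;

pub struct RouterManager {
    // Source of truth, keyed by router id (assumption: a locked HashMap
    // in place of the DashMap mentioned in the PR).
    routers: RwLock<HashMap<String, Arc<dyn RouterTrait>>>,
    // Read-optimized copy. With the arc-swap crate this field would be an
    // ArcSwap<Vec<Arc<dyn RouterTrait>>>, read with load() and replaced
    // with store(); RwLock is a substitute used only in this sketch.
    routers_snapshot: RwLock<Snapshot>,
}

impl RouterManager {
    pub fn new() -> Self {
        Self {
            routers: RwLock::new(HashMap::new()),
            routers_snapshot: RwLock::new(Arc::new(Vec::new())),
        }
    }

    // Cold path: rebuild the whole list and publish it in a single swap.
    pub fn register_router(&self, id: &str, router: Arc<dyn RouterTrait>) {
        let mut map = self.routers.write().unwrap();
        map.insert(id.to_string(), router);
        let fresh: Vec<Arc<dyn RouterTrait>> = map.values().cloned().collect();
        *self.routers_snapshot.write().unwrap() = Arc::new(fresh);
    }

    // Hot path: take a handle to the current snapshot, then scan it.
    // No Vec is built per request.
    pub fn select_router_for_request(&self, model_id: &str) -> Option<Arc<dyn RouterTrait>> {
        let snapshot: Snapshot = {
            let guard = self.routers_snapshot.read().unwrap();
            Arc::clone(&*guard)
        };
        snapshot.iter().find(|r| r.serves(model_id)).cloned()
    }
}
```

One consequence of this design, in the sketch as in any snapshot scheme: a request that grabbed the snapshot just before a registration finishes keeps working against the older list, and only later requests see the new router.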

Accuracy Tests

Benchmarking and Profiling

Benchmarks were run with 2, 10, and 100 routers ($1\mu s = 1,000ns$). Routing overhead dropped from roughly $19$ to $23\mu s$ before the change to between $32ns$ and $257ns$ after it.

| Configuration | Before (Allocating) | After (Snapshot) | Performance Gain |
|---|---|---|---|
| 2 Routers | $19,340ns$ ($19.3\mu s$) | $32ns$ | ~604x faster |
| 10 Routers | $18,648ns$ ($18.6\mu s$) | $36ns$ | ~518x faster |
| 100 Routers | $22,799ns$ ($22.8\mu s$) | $257ns$ | ~88x faster |
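
The Performance Gain figures appear to be the ratio of the two timings. As a quick arithmetic check on the numbers above:

$$\text{speedup} = \frac{t_\text{before}}{t_\text{after}}: \qquad \frac{19340}{32} \approx 604.4, \qquad \frac{18648}{36} = 518, \qquad \frac{22799}{257} \approx 88.7$$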

Checklist

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions Bot added the dependencies and model-gateway labels Dec 23, 2025

@slin1237
Collaborator

/gemini review

@slin1237
Collaborator

Please remove the benchmark

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a significant and well-executed performance optimization for the router selection logic. By leveraging arc_swap to implement a lock-free snapshot mechanism, you've successfully eliminated per-request heap allocations and lock contention on the hot path, leading to impressive performance gains as demonstrated by the new benchmarks. The code is clear, and the changes are logically sound. I have one minor suggestion to refactor a small piece of duplicated logic to enhance maintainability. Overall, this is an excellent improvement.

Comment thread sgl-model-gateway/src/routers/router_manager.rs
@ppraneth
Contributor Author

> Please remove the benchmark

I have removed the file

@slin1237 slin1237 merged commit 80ae222 into sgl-project:main Dec 23, 2025
62 checks passed

Labels

dependencies, model-gateway, router-benchmark, run-ci

2 participants