[model-gateway] Fix IGW routing and optimize RouterManager by slin1237 · Pull Request #15741 · sgl-project/sglang

slin1237 · 2025-12-24T09:14:48Z

Add static RouterId constants to avoid heap allocations in hot paths
Replace RwLock .unwrap() with .unwrap_or_else() for poison safety
Add model ID validation in IGW mode for chat/completion/generate
Add OpenAI router creation in IGW mode
Simplify non-IGW mode routing (router handles validation)

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.
Work with maintainers to merge your PR. See the PR Merge Process


gemini-code-assist · 2025-12-24T09:15:07Z

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the model-gateway's routing mechanism, focusing on performance, concurrency safety, and specialized handling for Intelligent Gateway (IGW) operations. It introduces static router identifiers to reduce memory allocations, improves RwLock error handling, and implements distinct model ID validation and router creation logic tailored for IGW and non-IGW environments.

Highlights

Performance Optimization: Introduced static RouterId constants using &'static str to eliminate heap allocations in hot paths within the RouterManager, improving overall performance.
Concurrency Safety: Enhanced robustness by replacing RwLock.unwrap() calls with unwrap_or_else(|e| e.into_inner()) to gracefully handle poisoned locks.
Intelligent Gateway (IGW) Routing Enhancements: Implemented explicit model ID validation and early failure for chat, completion, and generate requests when operating in IGW mode. Also, integrated the creation and registration of an OpenAI-compatible router within the RouterManager for IGW mode.
Simplified Non-IGW Routing: Streamlined model ID handling for non-IGW modes, delegating validation responsibilities to individual routers, thus simplifying the RouterManager's logic in these cases.
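The static-identifier highlight can be illustrated with a minimal sketch. Only the constant names `router_ids::HTTP_OPENAI`, `GRPC_REGULAR`, and `GRPC_PD` appear in this PR's diff; the `RouterId` layout, the string values, and the `lookup` helper below are assumptions made for illustration. The idea is that an ID wrapping `&'static str` can be built in a `const` and is `Copy`, so passing it around or using it as a map key never allocates.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the gateway's router identifier.
/// Wrapping `&'static str` makes the type `Copy` and allocation-free.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct RouterId(&'static str);

impl RouterId {
    /// `const fn` so identifiers can be declared as constants.
    pub const fn new(id: &'static str) -> Self {
        RouterId(id)
    }

    pub fn as_str(&self) -> &'static str {
        self.0
    }
}

pub mod router_ids {
    use super::RouterId;

    // The constant names come from the diff; the string values are assumed.
    pub const HTTP_OPENAI: RouterId = RouterId::new("http-openai");
    pub const GRPC_REGULAR: RouterId = RouterId::new("grpc-regular");
    pub const GRPC_PD: RouterId = RouterId::new("grpc-pd");
}

/// Hypothetical lookup on the hot path: the key is copied, never cloned
/// into a fresh `String`.
pub fn lookup(routers: &HashMap<RouterId, String>, id: RouterId) -> Option<&str> {
    routers.get(&id).map(String::as_str)
}
```

With a `String`-backed ID, every lookup that builds its key with `to_string()` or `format!` would allocate; here the key is a pointer and a length.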

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist · 2025-12-24T09:46:22Z

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the model gateway's routing mechanism, focusing on performance, stability, and intelligent gateway (IGW) functionality. It introduces static router identifiers to minimize memory overhead, improves concurrency safety by handling RwLock poisoning, and enhances IGW mode with explicit model ID validation and OpenAI router support. These changes collectively aim to make the gateway more efficient and robust.

Highlights

Performance Optimization: Introduced static RouterId constants to reduce heap allocations in critical routing paths, improving overall performance.
Concurrency Safety: Replaced RwLock.unwrap() calls with unwrap_or_else() for poison safety, enhancing the robustness of the RouterManager against panics.
Intelligent Gateway (IGW) Routing Logic: Implemented model ID validation for chat, completion, and generate requests when operating in IGW mode, ensuring early failure for unresolvable models.
OpenAI Router Integration: Added the capability to create and register an OpenAI-compatible router within the RouterManager, specifically for IGW mode.
Simplified Non-IGW Routing: Streamlined the routing process for non-IGW modes by delegating model validation responsibilities to individual routers, reducing complexity in the RouterManager.
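The poison-safety change can be sketched with the standard library alone. The map type and the function names `read_entry` and `poison` are hypothetical; the relevant part is `unwrap_or_else(|e| e.into_inner())`, which takes the guard out of a `PoisonError` instead of panicking.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Read an entry even if a writer panicked while holding the lock.
pub fn read_entry(routers: &RwLock<HashMap<String, u32>>, key: &str) -> Option<u32> {
    // A poisoned lock yields Err(PoisonError); into_inner() returns the guard
    // so the reader can continue instead of panicking as `.unwrap()` would.
    let guard = routers.read().unwrap_or_else(|e| e.into_inner());
    guard.get(key).copied()
}

/// Poison the lock on purpose by panicking in a thread that holds the write guard.
pub fn poison(routers: &Arc<RwLock<HashMap<String, u32>>>) {
    let shared = Arc::clone(routers);
    // join() returns Err because the thread panicked; that result is ignored here.
    let _ = thread::spawn(move || {
        let _guard = shared.write().unwrap();
        panic!("simulated failure while holding the write lock");
    })
    .join();
}
```

One design caveat: recovering the guard means the data is read in whatever state the panicking writer left it, so this pattern fits best when writers cannot leave the map half-updated.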


gemini-code-assist · 2025-12-24T10:17:24Z

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the model-gateway's routing capabilities and robustness. It introduces performance optimizations by leveraging static router identifiers to minimize memory allocations, improves concurrency safety by gracefully handling RwLock poisoning, and refines model ID validation, particularly for Intelligent Gateway (IGW) mode. Additionally, it integrates OpenAI router creation and simplifies non-IGW routing by shifting validation responsibilities to the specific routers.

Highlights

Performance Optimization: Introduced static RouterId constants using &'static str to eliminate heap allocations in hot paths, improving performance and reducing runtime overhead.
Concurrency Safety: Replaced RwLock.unwrap() calls with unwrap_or_else(|e| e.into_inner()) to enhance poison safety when dealing with RwLocks, ensuring more robust error handling in concurrent operations.
IGW Model ID Validation: Implemented explicit model ID validation for chat, completion, and generate requests when operating in IGW (Intelligent Gateway) mode, ensuring requests are routed to valid models and failing fast if not.
OpenAI Router Integration: Enabled the creation and registration of an OpenAI-compatible router within the RouterManager for IGW mode, expanding routing capabilities.
Simplified Non-IGW Routing: Streamlined the routing logic for non-IGW mode by delegating model validation to the individual routers, reducing complexity in the RouterManager itself.
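A rough sketch of what fail-fast model ID validation could look like follows. It is not the gateway's code: the `ModelIdError` type, the `validate_model_id` name, and the idea of checking against a set of known models are all assumptions; the PR only states that unresolvable models are rejected early in IGW mode.

```rust
use std::collections::HashSet;

/// Hypothetical error type; the real gateway returns an HTTP error response.
#[derive(Debug, PartialEq, Eq)]
pub enum ModelIdError {
    /// No model ID was supplied, or it was blank.
    Missing,
    /// A model ID was supplied but nothing is registered under it.
    Unknown(String),
}

/// Validate a model ID before any router is selected (assumed behavior).
pub fn validate_model_id(
    known_models: &HashSet<String>,
    model_id: Option<&str>,
) -> Result<String, ModelIdError> {
    // Treat a blank ID the same as an absent one.
    let id = model_id
        .map(str::trim)
        .filter(|id| !id.is_empty())
        .ok_or(ModelIdError::Missing)?;

    if known_models.contains(id) {
        Ok(id.to_owned())
    } else {
        Err(ModelIdError::Unknown(id.to_owned()))
    }
}
```

Returning the error before routing means the caller gets a clear rejection instead of a failure surfacing later inside a router.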


gemini-code-assist

Code Review

This pull request introduces several valuable optimizations and fixes to the RouterManager. The move to &'static str for RouterId and using constants is a great performance improvement, reducing heap allocations in hot paths. The adoption of unwrap_or_else for RwLocks enhances robustness by handling potential lock poisoning. The routing logic is also improved, with clearer separation between IGW and non-IGW modes, and more efficient router selection.

I've added a few suggestions to further improve code clarity and maintainability by addressing some minor code duplication and an unreachable code path. Overall, this is a solid set of changes.

gemini-code-assist · 2025-12-24T10:19:25Z

+            (ConnectionMode::Http, RoutingMode::OpenAI { .. }) => router_ids::HTTP_OPENAI,
+            (ConnectionMode::Grpc { .. }, RoutingMode::Regular { .. }) => router_ids::GRPC_REGULAR,
+            (ConnectionMode::Grpc { .. }, RoutingMode::PrefillDecode { .. }) => router_ids::GRPC_PD,
+            (ConnectionMode::Grpc { .. }, RoutingMode::OpenAI { .. }) => router_ids::GRPC_REGULAR,


The combination of ConnectionMode::Grpc and RoutingMode::OpenAI represents an invalid configuration that should be caught during startup validation in RouterFactory. Therefore, this match arm is unreachable. Returning GRPC_REGULAR is misleading. Using unreachable! makes this assumption explicit and will cause a panic if this logic is ever reached, which is desirable for what should be an impossible state.

Suggested change

-            (ConnectionMode::Grpc { .. }, RoutingMode::OpenAI { .. }) => router_ids::GRPC_REGULAR,
+            (ConnectionMode::Grpc { .. }, RoutingMode::OpenAI { .. }) => unreachable!("Invalid config: OpenAI mode requires HTTP"),
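The reasoning in this comment can be shown as a small self-contained sketch. The enums here are simplified to unit variants (the real ones carry fields, as the `{ .. }` patterns show), and `validate_config`, `router_id`, and the returned strings are hypothetical stand-ins for the startup validation and the ID selection.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ConnectionMode {
    Http,
    Grpc,
}

#[derive(Debug, Clone, Copy)]
pub enum RoutingMode {
    Regular,
    PrefillDecode,
    OpenAI,
}

/// Assumed startup check: reject the invalid combination before any routing happens.
pub fn validate_config(conn: ConnectionMode, routing: RoutingMode) -> Result<(), String> {
    match (conn, routing) {
        (ConnectionMode::Grpc, RoutingMode::OpenAI) => {
            Err("Invalid config: OpenAI mode requires HTTP".to_string())
        }
        _ => Ok(()),
    }
}

/// ID selection that relies on `validate_config` having run first.
pub fn router_id(conn: ConnectionMode, routing: RoutingMode) -> &'static str {
    match (conn, routing) {
        (ConnectionMode::Http, RoutingMode::Regular) => "http-regular",
        (ConnectionMode::Http, RoutingMode::PrefillDecode) => "http-pd",
        (ConnectionMode::Http, RoutingMode::OpenAI) => "http-openai",
        (ConnectionMode::Grpc, RoutingMode::Regular) => "grpc-regular",
        (ConnectionMode::Grpc, RoutingMode::PrefillDecode) => "grpc-pd",
        // Documents the invariant and panics loudly if it is ever violated,
        // rather than silently returning a plausible but wrong ID.
        (ConnectionMode::Grpc, RoutingMode::OpenAI) => {
            unreachable!("Invalid config: OpenAI mode requires HTTP")
        }
    }
}
```

The trade-off is that `unreachable!` turns a missed validation into a panic at request time, so it is only appropriate when the startup check is guaranteed to run.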

gemini-code-assist · 2025-12-24T10:19:26Z

+        let router =
+            self.select_router_for_request(headers, effective_model_id.as_deref().or(model_id));

        if let Some(router) = router {
            router
-                .route_generate(headers, body, Some(&resolved_model_id))
+                .route_generate(headers, body, effective_model_id.as_deref().or(model_id))
                .await


The expression effective_model_id.as_deref().or(model_id) is used twice here. To improve clarity and avoid re-computation, it's better to evaluate it once and store the result in a variable. This same pattern is repeated in route_chat and route_completion and could be improved there as well.

Suggested change

-        let router =
-            self.select_router_for_request(headers, effective_model_id.as_deref().or(model_id));
+        let final_model_id = effective_model_id.as_deref().or(model_id);
+        let router = self.select_router_for_request(headers, final_model_id);

         if let Some(router) = router {
             router
-                .route_generate(headers, body, effective_model_id.as_deref().or(model_id))
+                .route_generate(headers, body, final_model_id)
                 .await
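To make the suggestion concrete, here is a minimal sketch of binding the fallback once and reusing it. `select_router`, `route_generate`, `dispatch`, and the `gpt-` prefix rule are hypothetical simplifications, not the gateway's API. Since `Option<&str>` is `Copy`, the bound value can be passed to both calls without cloning.

```rust
/// Hypothetical router selection keyed on the model ID.
pub fn select_router(model_id: Option<&str>) -> &'static str {
    match model_id {
        // The prefix rule is made up purely for illustration.
        Some(id) if id.starts_with("gpt-") => "http-openai",
        _ => "http-regular",
    }
}

/// Hypothetical request forwarding that just reports what it received.
pub fn route_generate(router: &str, model_id: Option<&str>) -> String {
    format!("{} -> {}", router, model_id.unwrap_or("<router default>"))
}

pub fn dispatch(effective_model_id: Option<String>, model_id: Option<&str>) -> String {
    // Evaluate the fallback once: prefer the resolved ID, else the caller's ID.
    let final_model_id = effective_model_id.as_deref().or(model_id);

    // Option<&str> is Copy, so the same value feeds both calls.
    let router = select_router(final_model_id);
    route_generate(router, final_model_id)
}
```

The expression is cheap either way, so the gain is mainly readability and a single place to change the fallback rule.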

gemini-code-assist · 2025-12-24T10:19:26Z

+        let effective_model_id = if self.enable_igw {
+            // Use provided model_id or fall back to body.model
+            let model = model_id.or(Some(&body.model));
+            match self.resolve_model_id(model) {
+                Ok(id) => Some(id),
+                Err(err_response) => return *err_response,
+            }
+        } else {
+            None
+        };


This block of logic for resolving effective_model_id is duplicated in route_completion (lines 564-573). Consider extracting this into a private helper function to reduce code duplication and improve maintainability. The helper could take enable_igw, model_id, and body.model as arguments.
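One possible shape for such a helper is sketched below, written as a method rather than a free function. `Manager`, `ErrorResponse`, and the body of `resolve_model_id` are hypothetical stand-ins; only the `enable_igw` flag, the `model_id.or(Some(&body.model))` fallback, and the boxed error come from the diff above.

```rust
/// Hypothetical stand-in for the boxed HTTP error response.
#[derive(Debug, PartialEq, Eq)]
pub struct ErrorResponse(pub String);

/// Hypothetical stand-in for the router manager.
pub struct Manager {
    pub enable_igw: bool,
}

impl Manager {
    /// Placeholder resolution: accepts any non-empty model ID.
    fn resolve_model_id(&self, model: Option<&str>) -> Result<String, Box<ErrorResponse>> {
        match model {
            Some(m) if !m.is_empty() => Ok(m.to_owned()),
            _ => Err(Box::new(ErrorResponse(
                "model is required in IGW mode".to_string(),
            ))),
        }
    }

    /// Shared helper: in IGW mode resolve the explicit ID or fall back to the
    /// model named in the request body; outside IGW mode return None so the
    /// individual router performs its own validation.
    pub fn effective_model_id(
        &self,
        model_id: Option<&str>,
        body_model: &str,
    ) -> Result<Option<String>, Box<ErrorResponse>> {
        if !self.enable_igw {
            return Ok(None);
        }
        let model = model_id.or(Some(body_model));
        self.resolve_model_id(model).map(Some)
    }
}
```

Each route handler could then call the helper and return early on `Err`, instead of repeating the `if`/`match` block.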


slin1237 requested review from CatherineSue and key4ng as code owners December 24, 2025 09:14

github-actions Bot added the model-gateway label Dec 24, 2025

slin1237 added the run-ci label Dec 24, 2025

gemini-code-assist Bot reviewed Dec 24, 2025


slin1237 merged commit 2f7c629 into main Dec 24, 2025
72 checks passed

slin1237 deleted the fixup-n/1 branch December 24, 2025 16:28

jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025

[model-gateway] Fix IGW routing and optimize RouterManager (sgl-proje…

d51671d

…ct#15741)

Leoyzen pushed a commit to Leoyzen/sglang that referenced this pull request Dec 25, 2025

[model-gateway] Fix IGW routing and optimize RouterManager (sgl-proje…

a9c32b5

…ct#15741)

Leoyzen pushed a commit to Leoyzen/sglang that referenced this pull request Dec 25, 2025

[model-gateway] Fix IGW routing and optimize RouterManager (sgl-proje…

2e4dab8

…ct#15741)

YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026

[model-gateway] Fix IGW routing and optimize RouterManager (sgl-proje…

896ce64

…ct#15741)

slin1237 merged 1 commit into main from fixup-n/1
