[model-gateway]Enable IGW mode with gRPC router and auto enable IGW when service discovery is turned on by YouNeedCryDear · Pull Request #15459 · sgl-project/sglang

YouNeedCryDear · 2025-12-19T07:32:46Z

Summary

Enable IGW mode to spin up gRPC routers (regular and PD) and select them preferentially when matching workers are present.
Auto-enable IGW when service discovery is requested, aligning router initialization with discovery-driven worker registration.
Improve readiness reporting by returning 200 when any registered worker is healthy.
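As a rough illustration of the readiness rule in the last point, here is a minimal sketch. `WorkerStatus` and `readiness_status` are hypothetical names, not taken from the gateway source, and the 503 fallback is an assumption; the summary only states that 200 is returned when any registered worker is healthy.

```rust
/// Hypothetical, simplified view of a registered worker (not the gateway's real type).
#[derive(Debug, Clone)]
pub struct WorkerStatus {
    pub url: String,
    pub healthy: bool,
}

/// Returns the status code a readiness probe could report:
/// 200 as soon as at least one registered worker is healthy.
/// The 503 fallback is an assumption made for this sketch.
pub fn readiness_status(workers: &[WorkerStatus]) -> u16 {
    if workers.iter().any(|w| w.healthy) {
        200
    } else {
        503
    }
}
```

Under this rule a single healthy worker is enough to report ready, even if other registered workers are down.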

Motivation

IGW mode internally uses all types of routers. Previously, the parser factories and router creation logic were tied to an explicit single gRPC instance, leaving IGW without the necessary gRPC router coverage.
Service discovery implies IGW behavior; requiring a separate flag led to misconfiguration risk and silent misalignment between discovery and router mode.
Operators need the router manager to pick the best transport (gRPC vs HTTP, PD vs regular) based on available workers and to surface health when capacity exists.

Modifications

sgl-model-gateway/src/app_context.rs: Initialize reasoning/tool parser factories when either gRPC or IGW is enabled, reflecting IGW’s gRPC router usage.
sgl-model-gateway/src/main.rs: Decouple routing-mode selection from IGW, streamline PD routing config, and automatically flip enable_igw on when service discovery is requested (with an info log).
sgl-model-gateway/src/routers/router_manager.rs: Always create a gRPC regular router in IGW mode; create HTTP and gRPC PD routers only when PD disaggregation is enabled; choose routers based on worker connection mode/role priority (grpc-pd > http-pd > grpc-regular > http-regular); update health endpoint to report ready if any worker is healthy.
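The selection order and the auto-enable rule described above can be sketched as follows. This is an illustrative model only: all type and function names (`ConnectionMode`, `WorkerRole`, `RouterKind`, `WorkerInfo`, `select_router`, `effective_enable_igw`) are hypothetical, and treating any non-regular role as a PD worker is an assumption rather than the actual logic in `router_manager.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionMode {
    Http,
    Grpc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerRole {
    Regular,
    Prefill,
    Decode,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RouterKind {
    GrpcPd,
    HttpPd,
    GrpcRegular,
    HttpRegular,
}

/// Hypothetical minimal description of a registered worker.
pub struct WorkerInfo {
    pub mode: ConnectionMode,
    pub role: WorkerRole,
}

/// Picks a router using the priority stated in the PR:
/// grpc-pd > http-pd > grpc-regular > http-regular.
/// PD routers are only considered when PD disaggregation is enabled.
pub fn select_router(workers: &[WorkerInfo], pd_enabled: bool) -> Option<RouterKind> {
    // True if some worker uses `mode` and is (or is not) a PD worker.
    let has = |mode: ConnectionMode, pd: bool| {
        workers
            .iter()
            .any(|w| w.mode == mode && (w.role != WorkerRole::Regular) == pd)
    };
    // Candidates listed from highest to lowest priority.
    let candidates = [
        (RouterKind::GrpcPd, pd_enabled && has(ConnectionMode::Grpc, true)),
        (RouterKind::HttpPd, pd_enabled && has(ConnectionMode::Http, true)),
        (RouterKind::GrpcRegular, has(ConnectionMode::Grpc, false)),
        (RouterKind::HttpRegular, has(ConnectionMode::Http, false)),
    ];
    candidates.iter().find(|(_, ok)| *ok).map(|(kind, _)| *kind)
}

/// Requesting service discovery switches IGW on even if the flag was not set.
pub fn effective_enable_igw(enable_igw: bool, service_discovery: bool) -> bool {
    enable_igw || service_discovery
}
```

With one HTTP prefill worker and one gRPC regular worker, this sketch selects the HTTP PD router when PD is enabled and the gRPC regular router otherwise.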


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-12-19T07:41:24Z

+            // Load tokenizer with thread safe lock
+            if let Err(e) = app_context
+                .tokenizer_registry
+                .load(&model_id, || async move {
+                    factory::create_tokenizer_async(&tokenizer_path.to_string())
+                        .await
+                        .map_err(|e| e.to_string())


Preserve chat_template when loading worker tokenizers

Dynamic tokenizer loading ignores the configured chat template: RegisterTokenizerStep loads tokenizers with create_tokenizer_async (lines 50-56) without passing RouterConfig.chat_template or the worker’s model card template. In service-discovery/IGW mode this path supplies the only tokenizer for gRPC routers, so any --chat-template override is silently dropped and prompts are formatted with the default template, which can break models that rely on the custom template. Consider passing the configured chat template when registering tokenizers so gRPC routing remains consistent with the router’s settings.
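One way to picture the suggested fix is a small helper that decides which chat template to hand to the tokenizer loader. This is a hedged sketch: `resolve_chat_template` is a hypothetical function, and the precedence shown (router-level override first, then the worker's model card template) is an assumption; the review only asks that the configured template not be dropped.

```rust
/// Chooses the chat template to pass along when registering a worker tokenizer.
/// Assumed precedence for this sketch: an explicit router-level override wins,
/// then the worker's model card template, otherwise None (tokenizer default).
pub fn resolve_chat_template<'a>(
    router_override: Option<&'a str>,
    model_card_template: Option<&'a str>,
) -> Option<&'a str> {
    router_override.or(model_card_template)
}
```

The resolved value would then be threaded into whatever tokenizer constructor accepts a template, so that gRPC routing formats prompts the same way as the router's own settings.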

Thanks @slin1237 for fixing this bug

YouNeedCryDear requested review from ByronHsu, CatherineSue, key4ng and slin1237 as code owners December 19, 2025 07:32

github-actions Bot added the model-gateway label Dec 19, 2025

chatgpt-codex-connector Bot reviewed Dec 19, 2025

YouNeedCryDear force-pushed the enable-igw-grpc branch from 62e1268 to 0747af1 Compare December 24, 2025 05:59

slin1237 added the run-ci label Dec 24, 2025

YouNeedCryDear force-pushed the enable-igw-grpc branch from 0747af1 to e642fe9 Compare December 24, 2025 07:23

support grpc in IGW mode and auto enable IGW when service discovery i…

748b556

…s set

YouNeedCryDear force-pushed the enable-igw-grpc branch from e642fe9 to 748b556 Compare December 24, 2025 07:25

slin1237 approved these changes Dec 24, 2025

slin1237 merged commit f65fa04 into sgl-project:main Dec 24, 2025
62 checks passed

YouNeedCryDear deleted the enable-igw-grpc branch December 25, 2025 08:57

jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025

[model-gateway]Enable IGW mode with gRPC router and auto enable IGW w…

233e02a

…hen service discovery is turned on (sgl-project#15459)

YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026

[model-gateway]Enable IGW mode with gRPC router and auto enable IGW w…

a4f8a13

…hen service discovery is turned on (sgl-project#15459)
