[model-gateway] Add model scope support and LRU eviction for GPU-constrained environments by slin1237 · Pull Request #16525 · sgl-project/sglang

slin1237 · 2026-01-05T23:13:18Z

Adds session/class scope support for models to enable efficient GPU resource management when running tests with more models than available GPUs.

Key changes:

- Add `@pytest.mark.model(name, scope="session|class")` marker support
- Session-scoped models are pre-launched at startup (default)
- Class-scoped models are launched on-demand when needed
- Add LRU eviction: when GPUs are full, least recently used models are evicted
- Evicted models are queued for re-launch when needed again
- Add Gateway class for unified gateway lifecycle management
- Add worker management API tests (IGW mode)
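
A minimal sketch of how a test module might use the marker, assuming only the `@pytest.mark.model(name, scope=...)` signature stated in the list above. The model names, the empty test bodies, and the `model_request` helper are hypothetical; the helper only shows how the marker's arguments can be read back, not how this PR's fixtures consume them.

```python
import pytest


# Hypothetical model names; the real test suite defines its own.
@pytest.mark.model("model-a")  # scope omitted: "session" is the stated default
class TestSessionScoped:
    def test_placeholder(self):
        pass


@pytest.mark.model("model-e", scope="class")  # launched on demand
class TestClassScoped:
    def test_placeholder(self):
        pass


def model_request(cls):
    """Return (name, scope) from the first `model` marker on a test class.

    Illustrative helper only; it is not part of the PR.
    """
    for mark in getattr(cls, "pytestmark", []):
        if mark.name == "model":
            return mark.args[0], mark.kwargs.get("scope", "session")
    return None
```

Under pytest, a custom marker such as `model` normally needs to be registered (for example through the `markers` ini option) to avoid an unknown-marker warning.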

This enables CI environments with limited GPUs (e.g., 4 GPUs, 6 models):

1. Pre-launch what fits (a, b, c, d)
2. Queue overflow (e, f)
3. On-demand: when test needs 'e', evict LRU and launch 'e'
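
The three steps above can be simulated with a toy pool. This is a sketch under stated assumptions, not the PR's implementation: the `GpuModelPool` class and its method names are hypothetical, each model is assumed to occupy exactly one GPU, and "launching" is reduced to bookkeeping.

```python
from collections import OrderedDict


class GpuModelPool:
    """Toy LRU pool: at most `num_gpus` models are resident at once.

    Hypothetical class for illustration; assumes one GPU per model.
    """

    def __init__(self, num_gpus):
        self.num_gpus = num_gpus
        self.running = OrderedDict()  # model name -> None, least recently used first
        self.pending = []             # models waiting for a GPU slot

    def prelaunch(self, names):
        """Launch what fits and queue the overflow."""
        for name in names:
            if len(self.running) < self.num_gpus:
                self.running[name] = None
            else:
                self.pending.append(name)

    def acquire(self, name):
        """Ensure `name` is running; return the evicted model name, if any."""
        if name in self.running:
            self.running.move_to_end(name)  # mark as most recently used
            return None
        evicted = None
        if len(self.running) >= self.num_gpus:
            evicted, _ = self.running.popitem(last=False)  # least recently used
            self.pending.append(evicted)  # queued for re-launch when needed again
        if name in self.pending:
            self.pending.remove(name)
        self.running[name] = None
        return evicted


pool = GpuModelPool(num_gpus=4)
pool.prelaunch(["a", "b", "c", "d", "e", "f"])  # a-d run, e and f are queued
pool.acquire("a")            # using 'a' leaves 'b' as the least recently used
evicted = pool.acquire("e")  # no free GPU, so the LRU model makes room for 'e'
```

In this run `evicted` is `'b'`, and `'b'` joins `'f'` in the pending queue until a test asks for it again.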

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments (/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-01-05T23:13:21Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!


slin1237 requested review from CatherineSue and key4ng as code owners January 5, 2026 23:13

github-actions Bot added the model-gateway label Jan 5, 2026

slin1237 added the run-ci label Jan 5, 2026

slin1237 force-pushed the smg-ci-n/10 branch from 7aea08f to 745598c Compare January 6, 2026 00:00

slin1237 force-pushed the smg-ci-n/10 branch 5 times, most recently from df325d1 to 48b8349 Compare January 6, 2026 01:55

trigger build

11b3158

slin1237 force-pushed the smg-ci-n/10 branch from 48b8349 to 11b3158 Compare January 6, 2026 02:15

slin1237 merged commit 402a0bd into main Jan 6, 2026
58 of 61 checks passed

slin1237 deleted the smg-ci-n/10 branch January 6, 2026 02:28

jamesjxliu pushed a commit to jamesjxliu/sglang that referenced this pull request Jan 6, 2026

[model-gateway] Add model scope support and LRU eviction for GPU-cons…

b8e8da6

…trained environments (sgl-project#16525)

slin1237 merged 2 commits into main from smg-ci-n/10
