[DLLM] Remove cuda graph batch size limitation #17458

Merged
Kangyan-Zhou merged 1 commit into sgl-project:main from btw616:dllm-remove-bs-limitation on Jan 23, 2026

btw616 (Contributor) commented on Jan 21, 2026

Motivation

Dynamic batching for DLLM was added in PR #14883. The CUDA graph batch size limitation should have been removed at that point, but it was overlooked. This PR removes it.

Modifications

Remove the hard-coded override of cuda_graph_bs for DLLM.
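For readers unfamiliar with this kind of change, here is a minimal, hypothetical sketch of what removing a hard-coded override looks like. It is not the SGLang code touched by this PR: `ToyServerArgs`, `is_dllm`, `resolve_cuda_graph_bs`, the default batch-size list, and the pinned value `[1]` are all assumptions made for illustration. Only the name `cuda_graph_bs` comes from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative default capture sizes; NOT the values SGLang uses.
DEFAULT_CUDA_GRAPH_BS = [1, 2, 4, 8, 16, 32]


@dataclass
class ToyServerArgs:
    """Hypothetical stand-in for a server config object."""
    is_dllm: bool = False                      # assumed flag, for illustration
    cuda_graph_bs: Optional[List[int]] = None  # user-supplied capture sizes


def resolve_cuda_graph_bs(args: ToyServerArgs, keep_dllm_override: bool) -> List[int]:
    """Return the batch sizes for which CUDA graphs would be captured.

    keep_dllm_override=True mimics a hard-coded limitation (assumed here to
    pin the list to [1]); False mimics the behavior once it is removed.
    """
    if keep_dllm_override and args.is_dllm:
        return [1]  # assumed pinned value; the real override may differ
    if args.cuda_graph_bs is not None:
        return list(args.cuda_graph_bs)
    return list(DEFAULT_CUDA_GRAPH_BS)


if __name__ == "__main__":
    args = ToyServerArgs(is_dllm=True)
    print("with override:   ", resolve_cuda_graph_bs(args, keep_dllm_override=True))
    print("without override:", resolve_cuda_graph_bs(args, keep_dllm_override=False))
```

In this toy model, dropping the override lets a DLLM configuration fall through to the same batch-size resolution as any other model, which is the effect the description above attributes to the change.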

Accuracy Tests

Tested on H20-3e using test/registered/dllm/test_dllm_batching.py

Before this PR:

Accuracy: 0.915
Invalid: 0.000
Latency: 240.785 s
Output throughput: 102.216 token/s
metrics={'accuracy': 0.915, 'invalid': 0.0, 'latency': 240.7848315560259, 'output_throughput': 102.21574108696822}

After this PR:

Accuracy: 0.915
Invalid: 0.000
Latency: 130.954 s
Output throughput: 187.485 token/s
metrics={'accuracy': 0.915, 'invalid': 0.0, 'latency': 130.9542715421412, 'output_throughput': 187.4852932315319}

Benchmarking and Profiling

Please refer to the output-throughput results in the "Accuracy Tests" above.
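To make the comparison explicit, the relative change can be computed directly from the two `metrics` dicts reported in "Accuracy Tests". This is a small sketch; the helper name `relative_change` and the summary keys are illustrative, while the numbers are copied verbatim from the results above.

```python
# Metrics copied from the "Before this PR" and "After this PR" runs above.
before = {'accuracy': 0.915, 'invalid': 0.0,
          'latency': 240.7848315560259,
          'output_throughput': 102.21574108696822}
after = {'accuracy': 0.915, 'invalid': 0.0,
         'latency': 130.9542715421412,
         'output_throughput': 187.4852932315319}


def relative_change(before: dict, after: dict) -> dict:
    """Summarize how the second run compares to the first."""
    return {
        # How many times faster the full test finished.
        "latency_speedup": before["latency"] / after["latency"],
        # Percentage of wall-clock time saved.
        "latency_reduction_pct": 100 * (1 - after["latency"] / before["latency"]),
        # Percentage increase in generated tokens per second.
        "throughput_gain_pct": 100 * (after["output_throughput"] / before["output_throughput"] - 1),
        # Accuracy should be unchanged if the change only affects execution.
        "accuracy_delta": after["accuracy"] - before["accuracy"],
    }


summary = relative_change(before, after)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

On these numbers the run is roughly 1.84x faster end to end (about 46% less latency), output throughput is about 83% higher, and accuracy is unchanged at 0.915.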

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

Dynamic batching in DLLM is already supported, so the cuda graph
batch size limitation is no longer needed and should be removed.
@ClawSeven (Collaborator) commented:

/tag-and-rerun-ci

@ClawSeven (Collaborator) commented:

/rerun-failed-ci

Kangyan-Zhou merged commit 5438cd2 into sgl-project:main on Jan 23, 2026
240 of 273 checks passed
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026