Add support for more batch sizes in cpu_graph_runner by CaoE · Pull Request #13881 · sgl-project/sglang

CaoE · 2025-11-25T03:33:58Z

Motivation

Add support for more batch sizes in cpu_graph_runner to reduce Python overhead and improve performance.

Modifications

Add replay_prepare, which pads the input so that it can use the compiled graph.
Change the capture_bs strategy so that more batch sizes are supported.
Reuse --cuda-graph-bs to let users customize the graph batch sizes on CPU.
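
The padding idea behind the first two items can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `pick_capture_bs` and `pad_batch` are hypothetical helper names, the batch is modeled as a plain list with one token per request, and the doubling function stands in for a compiled-graph replay.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence, Tuple


def pick_capture_bs(raw_bs: int, capture_bs: Sequence[int]) -> Optional[int]:
    """Return the smallest captured batch size >= raw_bs, or None if none fits."""
    sizes = sorted(capture_bs)
    idx = bisect_left(sizes, raw_bs)
    return sizes[idx] if idx < len(sizes) else None


def pad_batch(
    token_ids: List[int], capture_bs: Sequence[int], pad_id: int = 0
) -> Tuple[List[int], int]:
    """Pad a batch up to the nearest captured size (hypothetical helper).

    Returns the padded list and the original length, so the caller can
    slice the padding rows off the output afterwards.
    """
    raw_bs = len(token_ids)
    target = pick_capture_bs(raw_bs, capture_bs)
    if target is None:
        # No captured size is large enough: return the batch unpadded and
        # leave the choice of fallback path to the caller.
        return list(token_ids), raw_bs
    return list(token_ids) + [pad_id] * (target - raw_bs), raw_bs


# A batch of 3 requests is padded to 4, the next captured size.
padded, raw_bs = pad_batch([11, 12, 13], capture_bs=[1, 2, 4, 8])

# Stand-in for replaying a compiled graph; only the first raw_bs rows are real.
outputs = [t * 2 for t in padded][:raw_bs]
```

With a fixed set of captured sizes, any batch size up to the largest captured one maps onto an existing graph, at the cost of computing a few padding rows that are discarded.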

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.
Work with maintainers to merge your PR. See the PR Merge Process.

CaoE · 2025-11-25T05:42:23Z

Hi @Alcanderian @FlamingoPg @zhyncs. Currently, sglang uses parameters with a "cuda" prefix for graph capture, such as --cuda-graph-bs. Using parameters like --cuda-graph-bs on CPU is confusing, and the same issue may arise in the future if other device types, such as XPU, also begin to support graph mode. Would sglang consider using parameters that apply to multiple device types, e.g. device-graph-bs? Or, if necessary, adding device-specific parameters for other devices, such as cpu-graph-bs? In this draft PR, I added cpu-graph-bs as a reference point for this question. I would appreciate any suggestions. cc @mingfeima
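
One way to picture the device-agnostic option raised in this comment is a single argument that accepts several spellings. The sketch below uses a standalone `argparse` parser and is not sglang's actual server-argument code: `--device-graph-bs` and `--cpu-graph-bs` are the names floated in the comment, and the `graph_bs` destination and the `--device` choices are assumptions made for illustration.

```python
import argparse

# Hypothetical standalone parser; not sglang's real argument definitions.
parser = argparse.ArgumentParser()
parser.add_argument("--device", default="cuda", choices=["cuda", "cpu", "xpu"])
parser.add_argument(
    "--device-graph-bs",  # device-agnostic spelling proposed in the comment
    "--cuda-graph-bs",    # existing spelling, kept as an alias
    "--cpu-graph-bs",     # device-specific spelling from the draft PR
    dest="graph_bs",
    type=int,
    nargs="+",
    default=None,
    help="Batch sizes to capture in graph mode.",
)

# All three spellings write to the same destination.
cpu_args = parser.parse_args(["--device", "cpu", "--cpu-graph-bs", "1", "2", "4"])
legacy_args = parser.parse_args(["--cuda-graph-bs", "8", "16"])
```

Aliasing keeps existing command lines working while offering a name that does not imply a particular device. The description of the PR indicates that it settled on reusing --cuda-graph-bs instead.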

CaoE · 2025-12-26T07:17:01Z

/tag-run-ci-label

CaoE · 2025-12-26T09:29:24Z

@Alcanderian @zhyncs Could you please help review this PR? Thanks.

CaoE · 2026-01-04T07:14:20Z

@FlamingoPg @Alcanderian Could you please review this PR? Thank you.

CaoE · 2026-01-07T02:10:51Z

/rerun-failed-ci

CaoE · 2026-01-07T05:30:47Z

/rerun-failed-ci

mingfeima · 2026-02-27T05:11:01Z

/rerun-failed-ci

ZailiWang · 2026-03-16T02:48:07Z

/rerun-failed-ci

CaoE · 2026-03-19T14:35:23Z

/rerun-failed-ci

github-actions Bot added the documentation and deepseek labels Nov 25, 2025

CaoE force-pushed the compile_padding2 branch 2 times, most recently from c1692a4 to 64ad61e Compare November 25, 2025 03:35

CaoE changed the title ~~Add support for more batch sizes in torch.compile on the CPU~~ Add support for more batch sizes in torch.compile on cpu_graph_runner Dec 4, 2025

CaoE changed the title ~~Add support for more batch sizes in torch.compile on cpu_graph_runner~~ Add support for more batch sizes in cpu_graph_runner Dec 4, 2025

CaoE force-pushed the compile_padding2 branch from 16ab343 to 36e8557 Compare December 26, 2025 07:01

CaoE marked this pull request as ready for review December 26, 2025 07:12

CaoE requested review from Fridge003, Ying1123, hnyls2002, ispobock and merrymercy as code owners December 26, 2025 07:12

github-actions Bot added the run-ci label Dec 26, 2025

CaoE requested review from BBuf, FlamingoPg, HaiShaw, yizhang2077 and zhyncs as code owners December 26, 2025 07:22

github-actions Bot added the sgl-kernel label Dec 26, 2025

CaoE added 4 commits December 26, 2025 15:00

add more batch size support for torch.compile on CPU

36e8557

Merge branch 'main' into compile_padding2

c0880c9

fix registration

a548c9f

Merge branch 'main' into compile_padding2

331ce5e

jianan-gu added a commit to jianan-gu/sglang that referenced this pull request Feb 27, 2026

port sgl-project#13881

03ff2dc

CaoE added 10 commits February 27, 2026 16:28

merge main

ba19424

update

a3ba473

Merge branch 'main' into compile_padding2

c513a48

update

ab2da1e

Merge branch 'main' into compile_padding2

e6140b7

Merge branch 'main' into compile_padding2

390e8a6

fix disable_piecewise_cuda_graph on XPU

d5d34a4

Merge branch 'main' into compile_padding2

1b7cf94

Merge branch 'main' into compile_padding2

a69a584

Merge branch 'main' into compile_padding2

a7d51af

yeahdongcn mentioned this pull request Mar 7, 2026

[diffusion][llm] macOS support #19549

Merged

5 tasks

Merge branch 'main' into compile_padding2

153f894

CaoE requested review from Kangyan-Zhou and bingxche as code owners March 13, 2026 01:35

fix xpu ci

92c30f6

CaoE added 4 commits March 16, 2026 09:20

merge main and fix conflicts

f6a9a34

fix xpu ci

f3fa7db

reslove conflicts

4f50912

merge main and reslove conflicts

b6346f4

Kangyan-Zhou merged commit 274581f into sgl-project:main Mar 19, 2026
109 of 146 checks passed

Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026

Add support for more batch sizes in cpu_graph_runner (sgl-project#13881)

6d5bcb8

0-693 pushed a commit to 0-693/sglang that referenced this pull request Mar 25, 2026

Add support for more batch sizes in cpu_graph_runner (sgl-project#13881)

7b2226f

dutsc pushed a commit to dutsc/sglang that referenced this pull request Mar 30, 2026

Add support for more batch sizes in cpu_graph_runner (sgl-project#13881)

a8d7cb6

JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026

Add support for more batch sizes in cpu_graph_runner (sgl-project#13881)

d0216ed

yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026

Add support for more batch sizes in cpu_graph_runner (sgl-project#13881)

b45d83d

ZailiWang mentioned this pull request Apr 23, 2026

[Intel CPU/XPU] SGL doc updates #23547

Merged

5 tasks

Kangyan-Zhou merged 38 commits into sgl-project:main from CaoE:compile_padding2

4 participants