[Diffusion][CPU] Init CPU platform support for SGLang Diffusion by jianan-gu · Pull Request #20816 · sgl-project/sglang

jianan-gu · 2026-03-18T05:40:48Z

Motivation

This PR adds native support for running SGLang Diffusion on CPU-only platforms (e.g., Intel Xeon).

Key changes

CPU source installation for SGLang Diffusion
General CPU-only code path logic (e.g., no offloading)
CPU OMP core binding and automatic NUMA node binding
CPU functionality with key ops on the torch native path (SDPA attention, apply_rotary_embedding, and more)
TP functionality and communication ops with shared-memory optimizations (allreduce/allgather)
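
To illustrate the core-binding item above, here is a minimal sketch of how TP ranks could be mapped onto NUMA-local core sets. It is not the code from this PR: `plan_core_binding`, `bind_current_process`, and the round-robin placement policy are hypothetical names and choices made for illustration only, and the NUMA topology is passed in explicitly rather than detected.

```python
import os


def plan_core_binding(numa_cores, tp_size):
    """Return one disjoint list of core ids per TP rank.

    numa_cores: list of lists, the core ids of each NUMA node (assumed input).
    Hypothetical policy: ranks are placed round-robin over NUMA nodes, and
    ranks that share a node split its cores evenly.
    """
    n_nodes = len(numa_cores)
    if tp_size < 1 or n_nodes < 1:
        raise ValueError("need at least one rank and one NUMA node")
    plan = []
    for rank in range(tp_size):
        node = rank % n_nodes
        # Number of ranks that land on this node under round-robin placement.
        sharers = len(range(node, tp_size, n_nodes))
        slot = rank // n_nodes
        cores = numa_cores[node]
        per_rank = len(cores) // sharers
        if per_rank == 0:
            raise ValueError("not enough cores on NUMA node %d" % node)
        plan.append(cores[slot * per_rank:(slot + 1) * per_rank])
    return plan


def bind_current_process(cores):
    """Pin the calling process to `cores` (Linux only).

    OMP_NUM_THREADS only takes effect if it is set before the OpenMP
    runtime starts, i.e. before the first import of torch in this process.
    """
    os.environ["OMP_NUM_THREADS"] = str(len(cores))
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cores))
```

With two 4-core NUMA nodes and `tp_size=2`, each rank gets one full node; with `tp_size=4`, each node is split between two ranks. Keeping a rank's threads on a single node avoids cross-node memory traffic, which is the usual reason for binding by NUMA node.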

Tested models

Tongyi-MAI/Z-Image-Turbo

sglang generate --model-path Tongyi-MAI/Z-Image-Turbo --prompt "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest" --generator-device cpu

A_curious_raccoon_peers_through_a_vibrant_field_of_yellow_sunflowers_its_eyes_wide_with_interest_20260318-045352_9f34c904

black-forest-labs/FLUX.1-dev

sglang generate --model-path black-forest-labs/FLUX.1-dev --prompt "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest" --generator-device cpu

A_curious_raccoon_peers_through_a_vibrant_field_of_yellow_sunflowers_its_eyes_wide_with_interest_20260318-050227_67fd7637

black-forest-labs/FLUX.2-klein-4B

sglang generate --model-path black-forest-labs/FLUX.2-klein-4B --prompt "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest" --generator-device cpu

A_curious_raccoon_peers_through_a_vibrant_field_of_yellow_sunflowers_its_eyes_wide_with_interest_20260318-051215_ba6c5bf9

Wan-AI/Wan2.2-TI2V-5B-Diffusers

sglang generate --prompt 'A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest' --save-output --model-path Wan-AI/Wan2.2-TI2V-5B-Diffusers --height 480 --width 832 --generator-device cpu

A_curious_raccoon_peers_through_a_vibrant_field_of_yellow_sunflowers_its_eyes_wide_with_interest_20260318-065816_3a3a7f5c.mp4

Qwen/Qwen-Image-Edit

sglang generate --model-path Qwen/Qwen-Image-Edit --prompt="Convert 2D style to 3D style" --image-path="https://github.com/lm-sys/lm-sys.github.io/releases/download/test/TI2I_Qwen_Image_Edit_Input.jpg" --width=1536 --height=1024 --save-output --generator-device cpu

Convert_2D_style_to_3D_style_20260318-083736_ee9a7bf1

FastVideo/FastWan2.1-T2V-1.3B-Diffusers

sglang generate --model-path FastVideo/FastWan2.1-T2V-1.3B-Diffusers --prompt "A curious raccoon" --save-output --generator-device cpu

A_curious_raccoon_20260318-091646_d9d96fab.mp4

More plans after this PR:

Enable CPU kernel optimizations in sgl-kernels and integrate them (replacing native ops such as apply_rotary_embedding)
Design and integrate a CPU AMX attention backend (replacing SDPA attention; also consider variants like SEGA)
Evaluate and support more parallelism strategies
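
For readers unfamiliar with the rotary embedding op mentioned in the plans above, the following dependency-free sketch shows the arithmetic that such an op performs on a single vector. It is only an illustration: `apply_rotary_pairs` is a hypothetical helper, the interleaved pairing and the base of 10000 are assumptions, and the real op in the codebase works on tensors with precomputed cos/sin tables and may use a different layout.

```python
import math


def apply_rotary_pairs(x, position, base=10000.0):
    """Rotate consecutive pairs (x[2i], x[2i+1]) of one head vector.

    Pair i is rotated by the angle position * base ** (-2i / d), where d is
    the vector length. The pairing and base are illustrative assumptions.
    """
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        # Standard 2D rotation of the pair (a, b) by theta.
        out.extend([a * c - b * s, a * s + b * c])
    return out
```

Because each pair undergoes a pure rotation, the vector norm is unchanged and position 0 leaves the input as it is. The op is elementwise and memory-bound, which is presumably why it is a candidate for a fused CPU kernel.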

mingfeima

How do we handle the attention backend?
Which attention backend from python/sglang/multimodal_gen/runtime/layers/attention/backends is used for the CPU device?

jianan-gu · 2026-04-16T07:07:49Z

Hi @mickqian, could you please help review this PR? Thanks.

jianan-gu · 2026-04-20T06:58:13Z

/tag-and-rerun-ci

init pure cpu platform for sglang diffusion

8687e27

github-actions Bot added the dependencies, diffusion, and jit-kernel labels Mar 18, 2026

jianan-gu changed the title ~~[Diffusion][CPU] Init pure cpu platform support for SGLang Diffusion~~ [Diffusion][CPU] Init pure CPU platform support for SGLang Diffusion Mar 18, 2026

jianan-gu changed the title ~~[Diffusion][CPU] Init pure CPU platform support for SGLang Diffusion~~ [Diffusion][CPU] Init CPU platform support for SGLang Diffusion Mar 18, 2026

jianan-gu added 4 commits March 18, 2026 01:42

format

248a3b3

minor refine offload

0f2ce5b

add cpu core binding

8bd90b5

use shm comm ops

1206671

jianan-gu marked this pull request as ready for review March 19, 2026 06:27

jianan-gu requested review from BBuf, mickqian, ping1jing2, yhyang201 and yingluosanqian as code owners March 19, 2026 06:27

jianan-gu added 4 commits March 22, 2026 23:02

Merge remote-tracking branch 'origin' into sglang_diffusion_cpu

bc880a9

minor fix after rebase

38a2268

format

e5663fa

Merge branch 'main' into sglang_diffusion_cpu

d05f6d7

mingfeima reviewed Mar 29, 2026

Comment thread python/sglang/jit_kernel/diffusion/triton/rotary.py

mingfeima requested changes Mar 30, 2026

jianan-gu added 3 commits April 16, 2026 02:01

Merge remote-tracking branch 'origin' into sglang_diffusion_cpu

6ad1925

refinements

25559cd

refine cpu-worker

7b6b6f3

jianan-gu requested a review from mingfeima April 16, 2026 07:06

Merge branch 'main' into sglang_diffusion_cpu

d94cc5c

mickqian reviewed Apr 17, 2026

Comment thread python/sglang/multimodal_gen/runtime/managers/cpu_worker.py

refine cpu worker

45ee675

jianan-gu requested a review from mickqian April 17, 2026 08:26

mickqian reviewed Apr 18, 2026

Comment thread python/sglang/multimodal_gen/runtime/managers/cpu_worker.py Outdated

jianan-gu added 2 commits April 20, 2026 02:56

refactor

ded813f

Merge branch 'main' into sglang_diffusion_cpu

4ef0d92

jianan-gu requested a review from mickqian April 20, 2026 06:58

github-actions Bot added the run-ci label Apr 20, 2026

mickqian approved these changes Apr 20, 2026

mingfeima approved these changes Apr 21, 2026

mingfeima merged commit 2cf3ac5 into sgl-project:main Apr 21, 2026
130 of 164 checks passed

zhangying098 pushed a commit to zhangying098/sglang that referenced this pull request Apr 23, 2026

[Diffusion][CPU] Init CPU platform support for SGLang Diffusion (sgl-…

8c8222f

…project#20816)

mingfeima merged 16 commits into sgl-project:main from jianan-gu:sglang_diffusion_cpu

3 participants
