[bugfix] Fix prefill tbo disabled when --deepep-mode=auto#14333

Merged
ch-wan merged 2 commits into sgl-project:main from yuhyao:bugfix/tbo-deepep-auto
Dec 3, 2025

Conversation

@yuhyao
Contributor

@yuhyao yuhyao commented Dec 3, 2025

Motivation

When two-batch overlap (TBO) is enabled and --deepep-mode is set to auto, TBO is not actually activated for prefill batches.

Modifications

In prepare_mlp_sync_batch_raw, tbo_preparer.prepare_all_gather does not return the correct local_can_run_tbo. The cause is that local_batch.is_extend_in_batch is not set properly, so deepep_mode.resolve(local_batch.is_extend_in_batch) produces an incorrect result.
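
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the sglang implementation: the DeepEPMode class below is a simplified stand-in, and the helper local_can_run_tbo_sketch is hypothetical. It assumes that auto resolves to normal for extend (prefill) batches and to low_latency otherwise, and that prefill TBO is rejected when the resolved mode is low_latency.

```python
from enum import Enum


class DeepEPMode(Enum):
    """Simplified stand-in for the mode selected by --deepep-mode (assumption)."""

    NORMAL = "normal"
    LOW_LATENCY = "low_latency"
    AUTO = "auto"

    def resolve(self, is_extend_in_batch: bool) -> "DeepEPMode":
        # Explicit modes resolve to themselves.
        if self is not DeepEPMode.AUTO:
            return self
        # Assumed auto rule: normal for extend (prefill), low_latency otherwise.
        return DeepEPMode.NORMAL if is_extend_in_batch else DeepEPMode.LOW_LATENCY


def local_can_run_tbo_sketch(
    mode: DeepEPMode, is_prefill_batch: bool, is_extend_in_batch: bool
) -> bool:
    """Hypothetical helper: decide whether TBO may run for the local batch."""
    resolved = mode.resolve(is_extend_in_batch)
    # Assumed gating rule: a prefill batch cannot use TBO in low_latency mode.
    if is_prefill_batch and resolved is DeepEPMode.LOW_LATENCY:
        return False
    return True


# Flag left unset (False) on a prefill batch: auto resolves to low_latency,
# so TBO is silently disabled.
buggy = local_can_run_tbo_sketch(DeepEPMode.AUTO, True, False)

# Flag set correctly: auto resolves to normal and TBO can run.
fixed = local_can_run_tbo_sketch(DeepEPMode.AUTO, True, True)

print(buggy, fixed)
```

Under these assumptions, a stale is_extend_in_batch flag is enough to turn TBO off for every prefill batch, which matches the symptom reported in the Motivation section.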

Accuracy Tests

Tested Qwen3-235B-A22B-FP8 (dp_attn_size = 8, ep_size = 8) on 8 × H800.
Launch command:

python -m sglang.launch_server --model-path path/to/Qwen3-235B-A22B-FP8 --tp-size 8 --context-length 40960 --host 0.0.0.0 --port 8000 --random-seed 42 --max-prefill-tokens 20480 --mem-fraction-static 0.7 --max-running-requests 128 --disable-radix-cache --chunked-prefill-size -1 --dp-size 8 --enable-dp-attention --ep-size 8 --moe-a2a-backend deepep --deepep-mode auto --attention-backend flashinfer --enable-two-batch-overlap

Test command:

python -m sglang.test.few_shot_gsm8k --port 8000 --num-questions 200

Result:

100%|███████████████████████████████████████████████████████████████████| 200/200 [00:31<00:00,  6.43it/s]
Accuracy: 0.960
Invalid: 0.000
Latency: 31.960 s
Output throughput: 869.569 token/s

Benchmarking and Profiling

Checklist

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@ch-wan ch-wan added the run-ci label Dec 3, 2025
@ch-wan ch-wan self-assigned this Dec 3, 2025
@ch-wan ch-wan merged commit 77512ae into sgl-project:main Dec 3, 2025
25 of 71 checks passed
tom-jerr pushed a commit to tom-jerr/sglang that referenced this pull request Dec 4, 2025
yingluosanqian pushed a commit to yingluosanqian/sglang that referenced this pull request Dec 4, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
yuchengz816-bot pushed a commit to yuchengz816-bot/sglang that referenced this pull request Dec 8, 2025
Kevin-XiongC pushed a commit to novitalabs/sglang that referenced this pull request Dec 9, 2025

2 participants