[PD] Support PD disaggregation with Prefill PP#8846

Merged: zhyncs merged 35 commits into main from fixing_pd_pp on Aug 17, 2025

@ShangmingCai ShangmingCai commented Aug 6, 2025

Motivation

This PR adds support for pipeline parallelism (PP) on the prefill side of PD disaggregation, with the goal of reducing time to first token (TTFT).

Decode PP will be supported in a follow-up PR. @ssssnow

Accuracy Test

Qwen3-8B (Prefill TP 2, PP 2; Decode TP 1):
python sglang/benchmark/gsm8k/bench_sglang.py --port 8000 --parallel 300 --num-questions 300
100%|█| 300/300 [02:29<00:00,  2.0
Accuracy: 0.943
Invalid: 0.000
Latency: 149.493 s
Output throughput: 249.315 token/s

Qwen3-8B (Prefill TP 2, PP 2; Decode TP 2):
python sglang/benchmark/gsm8k/bench_sglang.py --port 8000 --parallel 300 --num-questions 300
100%|█████| 300/300 [00:24<00:00, 12.01it/s]
Accuracy: 0.947
Invalid: 0.000
Latency: 25.186 s
Output throughput: 1469.696 token/s

Qwen3-8B (Prefill TP 1, PP 2; Decode TP 2):
python sglang/benchmark/gsm8k/bench_sglang.py --port 8000 --parallel 300 --num-questions 300
100%|█| 300/300 [00:26<00:00, 11.15i
Accuracy: 0.950
Invalid: 0.000
Latency: 27.047 s
Output throughput: 1378.700 token/s

Qwen3-8B (Prefill DP 2, TP 1, PP 2; Decode TP 4):
python sglang/benchmark/gsm8k/bench_sglang.py --port 8000 --parallel 300 --num-questions 300
100%|█| 300/300 [00:20<00:00, 14.44i
Accuracy: 0.950
Invalid: 0.000
Latency: 21.158 s
Output throughput: 1755.561 token/s

DeepSeek (Prefill TP 8, PP 2; Decode TP 16), verified by @ssssnow:
python benchmark/gsm8k/bench_sglang.py --port 9091 --parallel 300 --num-questions 300
Accuracy: 0.947
Invalid: 0.000
Latency: 32.184 s
Output throughput: 927.104 token/s
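
For a quick side-by-side of the runs above, here is a minimal Python sketch that tabulates the reported numbers. The figures are copied from the logs in this description; the short config labels, the function names, and the assumption that "Output throughput" is total output tokens divided by "Latency" are mine and are not taken from the benchmark script.

```python
# Reported GSM8K results: (label, accuracy, latency in s, output throughput in token/s).
# Labels are shorthand introduced here: P(...) = prefill side, D(...) = decode side.
RUNS = [
    ("Qwen3-8B P(TP2,PP2) D(TP1)", 0.943, 149.493, 249.315),
    ("Qwen3-8B P(TP2,PP2) D(TP2)", 0.947, 25.186, 1469.696),
    ("Qwen3-8B P(TP1,PP2) D(TP2)", 0.950, 27.047, 1378.700),
    ("Qwen3-8B P(DP2,TP1,PP2) D(TP4)", 0.950, 21.158, 1755.561),
    ("DeepSeek P(TP8,PP2) D(TP16)", 0.947, 32.184, 927.104),
]


def summarize(runs):
    """Return one dict per run, adding an estimated output-token count.

    Assumes throughput == output_tokens / latency, so the estimate is
    simply latency * throughput, rounded to the nearest token.
    """
    rows = []
    for label, accuracy, latency_s, throughput in runs:
        rows.append(
            {
                "config": label,
                "accuracy": accuracy,
                "latency_s": latency_s,
                "throughput_tok_s": throughput,
                "approx_output_tokens": round(latency_s * throughput),
            }
        )
    return rows


def accuracy_spread(runs):
    """Difference between the best and worst reported accuracy."""
    accuracies = [accuracy for _, accuracy, _, _ in runs]
    return max(accuracies) - min(accuracies)


if __name__ == "__main__":
    for row in summarize(RUNS):
        print(
            f'{row["config"]:<32} acc={row["accuracy"]:.3f} '
            f'latency={row["latency_s"]:.1f}s '
            f'tokens~{row["approx_output_tokens"]}'
        )
    print(f"accuracy spread: {accuracy_spread(RUNS):.3f}")
```

Under that assumption, the four Qwen3-8B runs each work out to roughly 37k output tokens, so the throughput differences between them track end-to-end latency rather than output length. Accuracy differs by at most 0.007 across all five configurations.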

Benchmark & Profiling

Checklist

ShangmingCai and others added 17 commits July 30, 2025 19:32
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Signed-off-by: Shangming Cai <csmthu@gmail.com>

ShangmingCai and others added 3 commits August 6, 2025 13:07
Signed-off-by: Shangming Cai <csmthu@gmail.com>

ssssnow commented Aug 6, 2025

Note: the PP-related modifications in deepseek_v2.py are borrowed from #6434; the authors of that PR should be added as co-authors.

@ShangmingCai ShangmingCai changed the title from "[WIP][PD] Support PD disaggregation with Prefill PP" to "[PD] Support PD disaggregation with Prefill PP" on Aug 7, 2025
@zhyncs zhyncs merged commit 384f8ab into main Aug 17, 2025
126 of 130 checks passed
@zhyncs zhyncs deleted the fixing_pd_pp branch August 17, 2025 01:31
narutolhy pushed a commit to narutolhy/sglang that referenced this pull request Aug 17, 2025
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
Signed-off-by: Shangming Cai <csmthu@gmail.com>
Co-authored-by: root <huzhiyuan@xiaohongshu.com>
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: Francis <38564764+ssssnow@users.noreply.github.com>
Co-authored-by: zitto <zhjc1124@gmail.com>
MahmoudAshraf97 pushed a commit to MahmoudAshraf97/sglang that referenced this pull request Sep 8, 2025