Add pipeline parallelism for DeepSeekV2 #6434
zhjc1124 wants to merge 29 commits into sgl-project:main from
|
run test_pp_consistency |
|
Launch DeepSeek-R1 on three nodes (tp_size=8, pp_size=3) and test with bench_serving |
Yes. |
Sorry about that. I forgot to import Union in deepseek_v2.py when resolving conflicts, and I found other bugs after merging main. |
|
Did you test with tp=2, pp=8 on 8 nodes? |
I only have 3 nodes. I also successfully launched DeepSeek-Coder-V2-Lite-Instruct with tp=2, pp=12 on 3 nodes. |
DeepSeek-Coder-V2-Lite-Chat with tp=2, pp=8 on 8 nodes works, but DeepSeekV3 fails with an error. |
|
I encountered an error while testing DeepSeek-V3 on MI300X with PP=8. The issue can be reproduced as follows: python3 -m sglang.bench_offline_throughput |
|
new test case |
|
I found a bug: the PP partition is unbalanced, which may cause OOM. #6666 |
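For illustration only, here is a minimal sketch of what a balanced layer split could look like. It is not the code from this PR or from #6666. The helper name `partition_layers` is hypothetical, and the layer count 61 is just an example value. It assumes every layer costs roughly the same memory, which ignores the embedding and output head on the first and last stages.

```python
def partition_layers(num_layers: int, pp_size: int) -> list[range]:
    """Split layers into pp_size contiguous stages whose sizes differ by at most one.

    Hypothetical helper for illustration; not part of sglang.
    """
    if pp_size <= 0 or num_layers < pp_size:
        raise ValueError("need at least one layer per pipeline stage")
    # Every stage gets `base` layers; the first `extra` stages get one more.
    base, extra = divmod(num_layers, pp_size)
    stages = []
    start = 0
    for rank in range(pp_size):
        size = base + (1 if rank < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


# Example: 61 layers over 8 pipeline stages.
sizes = [len(stage) for stage in partition_layers(61, 8)]
print(sizes)  # [8, 8, 8, 8, 8, 7, 7, 7]
```

With a split like this no stage holds more than one layer above the average, so a single rank is less likely to run out of memory before the others.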
|
@zhjc1124 tp=2 fails with the fused MoE Triton kernel, so I used enable-ep-moe and it then runs successfully. However, the current pipeline parallelism implementation sends hidden states synchronously (no async send), so it is much slower than vLLM's pipeline parallelism using Ray. |
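To make the async point concrete, here is a toy sketch using only the Python standard library. It is not sglang or vLLM code. `run_stage`, `compute`, and `send` are hypothetical names. A real implementation would more likely use a non-blocking communication primitive such as `torch.distributed.isend`. The idea is that a stage hands each micro-batch's hidden states to a background sender and starts the next micro-batch right away, instead of waiting for the send to finish.

```python
import queue
import threading


def run_stage(micro_batches, compute, send):
    """Compute micro-batches while a background thread sends earlier results.

    Hypothetical illustration: `compute` stands in for a stage's forward pass,
    `send` for the transfer of hidden states to the next stage.
    """
    outbox = queue.Queue()

    def sender():
        # Drain the queue in FIFO order until the sentinel arrives.
        while True:
            item = outbox.get()
            if item is None:
                break
            send(item)

    worker = threading.Thread(target=sender)
    worker.start()
    for mb in micro_batches:
        # put() returns immediately, so the next compute can overlap the send.
        outbox.put(compute(mb))
    outbox.put(None)  # sentinel: no more micro-batches
    worker.join()


sent = []
run_stage([1, 2, 3], lambda x: x * 2, sent.append)
print(sent)  # [2, 4, 6]
```

Because a single sender thread drains a FIFO queue, results still arrive at the next stage in micro-batch order.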
|
Hi, could you please rebase the code? |
This PR has been included in #8846 |




Motivation
#5724 #5925
Modifications
Checklist