
[DP Attention] Optimize dp_padding_mode selection for dp_size=1 in extend mode #20406

Merged
ShangmingCai merged 4 commits into sgl-project:main from wangfakang:opt_DpPaddingMode
Mar 16, 2026
Conversation

@wangfakang
Contributor

@wangfakang wangfakang commented Mar 12, 2026

CC @yizhang2077 @ShangmingCai @nvcastet @ch-wan @merrymercy @Fridge003 PTAL, thx.

Motivation

When dp_size=1, the MAX_LEN and SUM_LEN modes have identical communication overhead, since max_len equals sum_len. Previously, extend mode (get_dp_padding_mode) unconditionally used SUM_LEN, which prevented symmetric memory from being used (via disabled=True).

Now, with dp_size=1, we prefer MAX_LEN mode to enable the symmetric memory optimizations needed for NSA CP and other features.

```python
def get_dp_padding_mode(
    cls, is_extend_in_batch, global_num_tokens: List[int]
) -> DpPaddingMode:
    # Previous behavior: extend batches always used SUM_LEN,
    # even when dp_size == 1 (non-extend path elided).
    if is_extend_in_batch:
        return DpPaddingMode.SUM_LEN

def get_global_dp_buffer(cls) -> torch.Tensor:
    # Symmetric memory is disabled whenever max padding is not in use.
    with use_symmetric_memory(get_tp_group(), disabled=not cls._dp_max_padding):
        buffer = torch.empty(
            (cls._global_dp_buffer_len, cls._hidden_size),
            dtype=cls._dtype,
            device=cls._device,
        )
    return buffer

def get_local_dp_buffer(cls) -> torch.Tensor:
    with use_symmetric_memory(get_tp_group(), disabled=not cls._dp_max_padding):
        buffer = torch.empty(
            (cls._local_dp_buffer_len, cls._hidden_size),
            dtype=cls._dtype,
            device=cls._device,
        )
    return buffer
```
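The buffer-size arithmetic behind the motivation can be checked with a toy calculation (illustrative only; the helper and the token counts are made up for this sketch, not taken from sglang):

```python
def buffer_tokens(mode: str, global_num_tokens: list) -> int:
    """Total tokens a DP gather buffer must hold under each padding mode."""
    dp_size = len(global_num_tokens)
    if mode == "MAX_LEN":
        # Every rank is padded up to the global max length.
        return dp_size * max(global_num_tokens)
    # SUM_LEN: actual lengths are concatenated without padding.
    return sum(global_num_tokens)

# dp_size == 1: the two modes cost exactly the same, so MAX_LEN is strictly
# preferable because it also keeps symmetric memory enabled.
assert buffer_tokens("MAX_LEN", [128]) == buffer_tokens("SUM_LEN", [128])

# dp_size > 1 with skewed lengths: SUM_LEN is cheaper, hence kept for extend.
assert buffer_tokens("SUM_LEN", [16, 512]) < buffer_tokens("MAX_LEN", [16, 512])
```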

Modifications

Update the logic of get_dp_padding_mode:

  • Only use SUM_LEN for extend mode when dp_size > 1.
  • Prefer MAX_LEN when communication cost is equal (>= instead of >).
  • This allows symmetric memory optimization for NSA CP and other use cases.
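The updated selection described above might look roughly like the following sketch. The function signature, the explicit dp_size parameter, and the cost comparison are assumptions reconstructed from the PR description, not the exact sglang implementation:

```python
from enum import Enum
from typing import List

class DpPaddingMode(Enum):
    MAX_LEN = "max_len"   # pad every rank to the global max token count
    SUM_LEN = "sum_len"   # concatenate actual token counts without padding

def get_dp_padding_mode(
    dp_size: int, is_extend_in_batch: bool, global_num_tokens: List[int]
) -> DpPaddingMode:
    # Only force SUM_LEN for extend batches when dp_size > 1; with dp_size == 1
    # the costs are identical and MAX_LEN unlocks symmetric memory.
    if is_extend_in_batch and dp_size > 1:
        return DpPaddingMode.SUM_LEN
    max_len = max(global_num_tokens)
    sum_len = sum(global_num_tokens)
    # Prefer MAX_LEN when communication cost is equal (>= instead of >).
    if sum_len >= dp_size * max_len:
        return DpPaddingMode.MAX_LEN
    return DpPaddingMode.SUM_LEN
```

With dp_size=1 every call now lands in the MAX_LEN branch, since sum_len always equals dp_size * max_len for a single rank.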

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

…tend mode

Signed-off-by: wangfakang <fakangwang@gmail.com>
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@ShangmingCai
Collaborator

/tag-and-rerun-ci

@wangfakang
Contributor Author

/rerun-failed-ci

@wangfakang
Contributor Author

/rerun_failed_ci

@ShangmingCai ShangmingCai merged commit 3d58cd1 into sgl-project:main Mar 16, 2026
256 of 277 checks passed
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
…tend mode (sgl-project#20406)

Signed-off-by: wangfakang <fakangwang@gmail.com>
0-693 pushed a commit to 0-693/sglang that referenced this pull request Mar 25, 2026
…tend mode (sgl-project#20406)

Signed-off-by: wangfakang <fakangwang@gmail.com>
JustinTong0323 pushed a commit to JustinTong0323/sglang that referenced this pull request Apr 7, 2026
…tend mode (sgl-project#20406)

Signed-off-by: wangfakang <fakangwang@gmail.com>
yhyang201 pushed a commit to yhyang201/sglang that referenced this pull request Apr 22, 2026
…tend mode (sgl-project#20406)

Signed-off-by: wangfakang <fakangwang@gmail.com>
