[bugfix] correct local_chunk_len for DCP in reorg_kvcache with long context by pisceskkk · Pull Request #28526 · vllm-project/vllm

pisceskkk · 2025-11-12T07:43:42Z

Fixes issues #28476 and #28411.
The previous DCP implementation did not account for long contexts that are split into multiple chunks. This PR addresses that by computing local_chunk_len per chunk, based on chunk_size and local_context_len.

CC @LucasWilkinson @cjackal @Nemo-G

gemini-code-assist

Code Review

This pull request fixes a bug in Decode Context Parallel (DCP) when chunked prefill processes long sequences. The core of the change is in the reorg_kvcache function, where the calculation of local_chunk_len is corrected. Previously, it used the total local_context_len, which failed when a context was split across multiple chunks. The new implementation computes the length of the current rank's context within the current chunk, using the newly introduced chunk_size and chunk_idx parameters. The changes are well contained and logical, and they include appropriate assertions. The fix appears correct and addresses the described issue.
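To make the per-chunk length calculation described above concrete, here is a minimal sketch. It is not the code from vllm/v1/attention/backends/mla/common.py: the helper names `local_chunk_len_for` and `chunk_lens_per_rank` are hypothetical, and the sketch assumes each chunk covers at most chunk_size tokens of a rank's local context. Only the parameter names local_context_len, chunk_size, and chunk_idx come from the PR discussion.

```python
def local_chunk_len_for(local_context_len: int, chunk_size: int, chunk_idx: int) -> int:
    """Number of a rank's local context tokens that fall inside chunk `chunk_idx`.

    Hypothetical helper for illustration; assumes chunks are consecutive,
    fixed-size windows of `chunk_size` tokens over the rank's local context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if chunk_idx < 0:
        raise ValueError("chunk_idx must be non-negative")
    # Tokens left on this rank once all earlier chunks have been consumed.
    remaining = local_context_len - chunk_idx * chunk_size
    # Clamp to [0, chunk_size]: a rank may have run out of tokens already,
    # or may still have more than one chunk's worth left.
    return min(max(remaining, 0), chunk_size)


def chunk_lens_per_rank(local_context_lens: list[int], chunk_size: int, chunk_idx: int) -> list[int]:
    """Apply the per-chunk clamp to every DCP rank's local context length."""
    return [local_chunk_len_for(n, chunk_size, chunk_idx) for n in local_context_lens]


if __name__ == "__main__":
    # Two ranks holding 17 and 14 context tokens, processed in chunks of 6.
    lens = [17, 14]
    for idx in range(3):
        print(idx, chunk_lens_per_rank(lens, chunk_size=6, chunk_idx=idx))
```

Under these assumptions, using the full local_context_len for every chunk (the behavior the review describes as the bug) would count 17 and 14 tokens in each of the three chunks, whereas the clamped values sum back to exactly 17 and 14 across chunks.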

cjackal · 2025-11-12T14:41:20Z

Thanks for the quick fix, it looks promising. I'll rerun the test and come back with the result.

LucasWilkinson

LGTM, good catch; thanks for the ASCII art, it helps a lot!

cjackal · 2025-11-13T13:18:35Z

> Thanks for the quick fix, it looks promising. I'll rerun the test and come back with the result.

DSV3 now runs smoothly on >120k inputs 🙌

ehfd · 2025-11-13T15:31:03Z

Need to merge this soon, to land in v0.11.1.

…ontext (vllm-project#28526) Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>

…ontext (#28526) Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> (cherry picked from commit 968060c)

pisceskkk and others added 2 commits November 12, 2025 12:54

[bugfix] AssertionError on long context when --decode-context-paralle…

05671bb

…l-size is enabled Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

Update vllm/v1/attention/backends/mla/common.py

91ca1c8

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

pisceskkk requested a review from pavanimajety as a code owner November 12, 2025 07:43

mergify bot added the v1 label Nov 12, 2025

mergify bot mentioned this pull request Nov 12, 2025

[bugfix] correct local_chunk_len for DCP in reorg_kvcache with long context #28514

Closed

gemini-code-assist bot reviewed Nov 12, 2025

This was referenced Nov 12, 2025

[Bug]: [DCP] [DSV3] AssertionError on long context when --decode-context-parallel-size is enabled #28476

Closed

[DCP] Support Decode Context Parallel (DCP) for GQA with Flashinfer #25438

Merged

LucasWilkinson approved these changes Nov 13, 2025

LucasWilkinson enabled auto-merge (squash) November 13, 2025 01:43

github-actions bot added the ready label Nov 13, 2025

pisceskkk added 2 commits November 13, 2025 15:56

Merge branch 'main' into dcp-bugfix

8b60c96

Merge branch 'main' into dcp-bugfix

fba7ec6

mgoin approved these changes Nov 13, 2025

mgoin added this to the v0.11.1 milestone Nov 13, 2025

vllm-bot merged commit 968060c into vllm-project:main Nov 13, 2025
48 of 50 checks passed

pisceskkk mentioned this pull request Nov 14, 2025

[Bug]: assert reorganized_kv_c_normed.shape[0] == sum_seq_len #28411

Closed

pisceskkk deleted the dcp-bugfix branch November 20, 2025 00:39

vllm-bot merged 4 commits into vllm-project:main from pisceskkk:dcp-bugfix
