[PD]: Support incremental transfer for mooncake transfer engine #24257

Merged: hzh0425 merged 3 commits into sgl-project:main from hzh0425:pd/mooncake-incremental-transfer on May 3, 2026

Conversation

hzh0425 (Collaborator) commented May 2, 2026

Motivation

Implemented Mooncake decode-radix transfer support:

TODO: A follow-up PR will add hybrid tree (Mamba/SWA) support on the decode side.

Collaborated with @ShangmingCai

Modifications

Accuracy Tests

GSM8K Accuracy

  model: qwen32b
  decode_tree_run1  0.942
  decode_tree_run2  0.942
  no_decode_tree    0.942

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

Co-authored-by: Shangming Cai <csmthu@gmail.com>

Comment thread python/sglang/srt/disaggregation/mooncake/conn.py
Comment thread python/sglang/srt/disaggregation/mooncake/conn.py

ShangmingCai (Collaborator) left a comment

LGTM

Comment thread test/registered/unit/test_mooncake_decode_radix_transfer.py Outdated

hzh0425 (Collaborator, Author) commented May 2, 2026

/rerun-stage stage-c-test-8-gpu-h20

github-actions (Bot, Contributor) commented May 2, 2026

✅ Triggered stage-c-test-8-gpu-h20 to run independently (skipping dependencies).

…al-transfer

# Conflicts:
#	test/registered/distributed/test_disaggregation_decode_radix_cache.py

hzh0425 (Collaborator, Author) commented May 3, 2026

/rerun-stage stage-c-test-8-gpu-h20

github-actions (Bot, Contributor) commented May 3, 2026

✅ Triggered stage-c-test-8-gpu-h20 to run independently (skipping dependencies).

hzh0425 (Collaborator, Author) commented May 3, 2026

hzh0425 merged commit 9a5450a into sgl-project:main on May 3, 2026
287 of 300 checks passed

nvpohanh (Collaborator) commented May 4, 2026

cc @YAMY1234
