
[PD] feat: support mooncake intra-node nvlink kv transfer #17866

Merged
ShangmingCai merged 18 commits into sgl-project:main from TTThanos:Feature/new_intra_nvlink
Feb 3, 2026


@TTThanos
Contributor

@TTThanos TTThanos commented Jan 28, 2026

Motivation

Add a new feature: enable intra-node NVLink in SGLang, for compatibility with the Mooncake INTRA_NODE NVLINK isolation PR:
kvcache-ai/Mooncake#1341 (comment)

Modifications

The main changes are in /Mooncake/utils.py.

Accuracy Tests

To enable intra-node NVLink, launch the server with the following command:

export NCCL_IB_GID_INDEX=1
export NCCL_SOCKET_IFNAME=eth0,eth1
export NCCL_IB_DISABLE=0
export GLOO_SOCKET_IFNAME=eth0
model_path=/mnt/models/Qwen3-235B-A22B-FP8
FILE_NAME_PREFIX=Prefill_Mooncake_INRTANVLINK_kv_transfer_Hicache_test_qwen3_235b_tp4_1210_0
SGLANG_MOONCAKE_CUSTOM_MEM_POOL=true \
SGLANG_MOONCAKE_CUSTOM_MEM_POOL=INTRA_NVLINK \
MC_LOG_LEVEL=INFO \
MC_TE_METRIC=true \
MC_INTRANODE_NVLINK=true \
SGLANG_TORCH_PROFILER_DIR=/root/Yaozhong_hiecache/profile/ \
python3 -m sglang.launch_server \
    --model-path ${model_path} \
    --tp 4 \
    --mem-fraction-static 0.85 \
    --disaggregation-mode prefill \
    --port 7001 \
    --watchdog-timeout 1000000 --decode-log-interval 1 >/root/LYZ_hicache/log/${FILE_NAME_PREFIX}.log 2>&1

SGLANG_MOONCAKE_CUSTOM_MEM_POOL=true is set to align with the design used for MNNVL ("MC_FORCE_MNNVL=true"). Please be aware that "MC_FORCE_MNNVL=true" and "MC_INTRANODE_NVLINK=true" are mutually exclusive: set only one of them when launching the SGLang server.
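
The exclusivity rule for the two flags could be checked before launching the server. The following is a minimal sketch, not code from this PR: `check_nvlink_env` and `_is_true` are hypothetical helpers, and treating `true`/`1` as "enabled" is an assumption about how the flags are parsed.

```python
from typing import Mapping, Optional


def _is_true(value: Optional[str]) -> bool:
    # Assumption: "true" or "1" (case-insensitive) means the flag is enabled.
    return value is not None and value.strip().lower() in ("true", "1")


def check_nvlink_env(env: Mapping[str, str]) -> None:
    """Raise ValueError if both NVLink modes are requested at the same time."""
    mnnvl = _is_true(env.get("MC_FORCE_MNNVL"))
    intra = _is_true(env.get("MC_INTRANODE_NVLINK"))
    if mnnvl and intra:
        raise ValueError(
            "MC_FORCE_MNNVL and MC_INTRANODE_NVLINK are mutually exclusive; "
            "set only one of them."
        )
```

Calling `check_nvlink_env(os.environ)` (after `import os`) in a launch wrapper would fail fast instead of starting the server with a conflicting configuration.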

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@stmatengss
Collaborator

/tag-and-rerun-ci

@stmatengss stmatengss changed the title from "Feature/new intra nvlink" to "[PD] feat: support intra nvlink kv transfer" Jan 28, 2026
Collaborator

@ShangmingCai ShangmingCai left a comment

LGTM.

@ShangmingCai ShangmingCai changed the title from "[PD] feat: support intra nvlink kv transfer" to "[PD] feat: support mooncake intra nvlink kv transfer" Jan 29, 2026
@ShangmingCai ShangmingCai changed the title from "[PD] feat: support mooncake intra nvlink kv transfer" to "[PD] feat: support mooncake intra-node nvlink kv transfer" Jan 29, 2026
- if (
-     self.enable_custom_mem_pool and self.custom_mem_pool_type == "NVLINK"
- ) or envs.SGLANG_MOONCAKE_SEND_AUX_TCP.get():
+ if self.enable_custom_mem_pool and self.custom_mem_pool_type in ("NVLINK", "INTRA_NVLINK"):
Collaborator

Is this required? Why does intra-node NVLink require sending aux data over TCP?

Contributor Author

According to the log, this is required for intra-node NVLink. Otherwise, it shows "Prefill transfer failed for request rank=xxx" (image), and I don't know the reason.
With sending aux enabled, the problem disappears.

Collaborator

Was there a precision issue with NVL72 previously? If so, TCP is a workaround for aux data. @ShangmingCai

Collaborator

No modifications are needed. "Fix me when Mooncake's nvlink_transport is bug-free" applies to mnnvl.

Collaborator

Yeah, mnnvl has a sync issue when transferring tiny data, so the same granularity issue may affect intra-node nvlink as well. I am shepherding this PR: #17430, maybe it will help.

Collaborator

This logic might be wrong? I think intra-node nvlink should not set up SGLANG_MOONCAKE_CUSTOM_MEM_POOL, so self.enable_custom_mem_pool is False.

Contributor Author

@TTThanos TTThanos Jan 29, 2026

> This logic might be wrong? I think intra-node nvlink should not set up SGLANG_MOONCAKE_CUSTOM_MEM_POOL, so self.enable_custom_mem_pool is False.

Problem solved by adding the condition:
'elif envs.SGLANG_MOONCAKE_CUSTOM_MEM_POOL.get() == "INTRA_NVLINK":'
In the previous version, device was 'cpu', so register_memory failed in Mooncake when registering aux_data_ptrs. Now device is set to 'cuda' when using INTRA_NVLINK, so aux_data_ptrs are registered successfully and there is no need to send aux data over TCP.
image
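
The device selection described in this reply can be sketched as follows. This is an illustrative sketch only: `pick_aux_device` is a hypothetical name, not the function changed in this PR, and it models only the two cases mentioned in this thread (the pool type unset, and `INTRA_NVLINK`). How the other pool types are handled is not covered here.

```python
from typing import Optional


def pick_aux_device(custom_mem_pool: Optional[str]) -> str:
    """Hypothetical helper: choose the device on which aux buffers are allocated."""
    if custom_mem_pool is None:
        # Assumption: with no custom memory pool, aux buffers stay on the CPU.
        return "cpu"
    if custom_mem_pool == "INTRA_NVLINK":
        # Per the reply above: with 'cuda', aux_data_ptrs can be registered
        # with Mooncake, so sending aux data over TCP is not needed.
        return "cuda"
    # Other pool types are deliberately left out of this sketch.
    raise NotImplementedError(f"pool type not modeled here: {custom_mem_pool!r}")
```

For example, `pick_aux_device("INTRA_NVLINK")` selects `"cuda"`, which is the change that made the registration of `aux_data_ptrs` succeed according to the author.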

@ShangmingCai
Collaborator

Let me fix the conflicts

@ShangmingCai
Collaborator

please fix lint

@TTThanos
Contributor Author

> please fix lint

Solved

@ShangmingCai
Collaborator

/tag-and-rerun-ci


# Global constants for custom memory pool types
- SUPPORTED_MOONCAKE_CUSTOM_MEM_POOL_TYPES = ["NVLINK", "BAREX"]
+ SUPPORTED_MOONCAKE_CUSTOM_MEM_POOL_TYPES = ["NVLINK", "BAREX", "INTRA_NVLINK"]
Collaborator

INTRA_NVLINK appears unprofessional. How about INTRA_NODE_NVLINK?

@stmatengss
Collaborator

Merge main due to #18044

@stmatengss
Collaborator

/rerun-failed-ci

@stmatengss
Collaborator

/rerun-failed-ci

@stmatengss
Collaborator

/rerun-failed-ci

@ShangmingCai ShangmingCai merged commit a45647b into sgl-project:main Feb 3, 2026
336 of 360 checks passed
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 5, 2026
…t#17866)

Co-authored-by: 百麒 <yaozhong.lyz@alibaba-inc.com>
Co-authored-by: Teng Ma <teng-ma@linux.alibaba.com>
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
…t#17866)

Co-authored-by: 百麒 <yaozhong.lyz@alibaba-inc.com>
Co-authored-by: Teng Ma <teng-ma@linux.alibaba.com>
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
…t#17866)

Co-authored-by: 百麒 <yaozhong.lyz@alibaba-inc.com>
Co-authored-by: Teng Ma <teng-ma@linux.alibaba.com>
Wangzheee pushed a commit to Wangzheee/sglang that referenced this pull request Mar 21, 2026
…t#17866)

Co-authored-by: 百麒 <yaozhong.lyz@alibaba-inc.com>
Co-authored-by: Teng Ma <teng-ma@linux.alibaba.com>