Commit cfead25

[Qwen3.5] mamba slice fix (Prefill TP != Decode TP & decode TP size>1) (sgl-project#20655)
Co-authored-by: Shangming Cai <csmthu@gmail.com>
1 parent 966ae87 commit cfead25

1 file changed

Lines changed: 3 additions & 1 deletion


python/sglang/srt/disaggregation/mooncake/conn.py

@@ -726,7 +726,9 @@ def _send_mamba_state_slice(
     # Each prefill sends all its dims to the appropriate offset in decode
     src_dim_start = 0
     num_dims_to_send = src_dim
-    dst_dim_start = local_tp_rank_in_group * src_dim
+    writers_per_decode = self.attn_tp_size // dst_attn_tp_size
+    local_writer_idx = local_tp_rank_in_group % writers_per_decode
+    dst_dim_start = local_writer_idx * src_dim
 else:
     # 1 prefill rank sends to multiple decode ranks
     # Prefill sends a slice of its dims to each decode rank
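
The effect of the changed offset can be illustrated with a small standalone sketch. It is not the code in conn.py: the two helper functions, the sizes (prefill TP 4, decode TP 2, src_dim 8), and the assumption that a decode rank's buffer spans exactly the dims of the prefill ranks writing to it are all hypothetical and chosen only for illustration. Only the three-line arithmetic is taken from the diff.

```python
def dst_dim_start_before(local_tp_rank_in_group: int, src_dim: int) -> int:
    """Destination offset as computed before this commit."""
    return local_tp_rank_in_group * src_dim


def dst_dim_start_after(
    local_tp_rank_in_group: int,
    src_dim: int,
    attn_tp_size: int,
    dst_attn_tp_size: int,
) -> int:
    """Destination offset as computed after this commit (arithmetic from the diff)."""
    writers_per_decode = attn_tp_size // dst_attn_tp_size
    local_writer_idx = local_tp_rank_in_group % writers_per_decode
    return local_writer_idx * src_dim


# Hypothetical sizes, for illustration only.
attn_tp_size, dst_attn_tp_size, src_dim = 4, 2, 8

# Assumption: each decode rank's buffer holds the dims of all its writers.
dst_dim = src_dim * (attn_tp_size // dst_attn_tp_size)

before = [dst_dim_start_before(r, src_dim) for r in range(attn_tp_size)]
after = [
    dst_dim_start_after(r, src_dim, attn_tp_size, dst_attn_tp_size)
    for r in range(attn_tp_size)
]

print("decode dims per rank:", dst_dim)  # 16
print("before:", before)                 # [0, 8, 16, 24]
print("after: ", after)                  # [0, 8, 0, 8]
```

Under these assumptions, the old formula places prefill ranks 2 and 3 at offsets 16 and 24, beyond a 16-dim decode buffer, while the new formula keeps every write inside it. With a decode TP size of 1, `writers_per_decode` equals the prefill TP size, so both formulas give the same offsets, which would be consistent with the issue appearing only when decode TP size > 1.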
