Add --speculative-moe-tp-size=1 flag to run Deepseek MTP layer with no parallelism. #16590

Closed
trevor-m wants to merge 6 commits into sgl-project:main from trevor-m:spec-tp-1-rebase

@trevor-m
Collaborator

@trevor-m trevor-m commented Jan 6, 2026

Motivation

For wideep+MTP.

For Deepseek FP4, this flag allows the TP/EP size to be set to 1 for the MTP layer only. This may avoid communication overhead, as well as problems that arise when the number of experts isn't divisible by the number of workers.
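
The divisibility concern can be illustrated with a small sketch. This is not SGLang code: the helper name `experts_per_worker` and the expert counts are hypothetical, and it assumes experts are partitioned evenly across workers. Under that assumption, a worker count that does not divide the expert count fails, while a size of 1 always works because a single worker holds every expert.

```python
def experts_per_worker(num_experts: int, num_workers: int) -> int:
    """Return how many experts each worker holds under an even split.

    Hypothetical helper for illustration only; not part of SGLang.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if num_experts % num_workers != 0:
        raise ValueError(
            f"{num_experts} experts cannot be split evenly across "
            f"{num_workers} workers"
        )
    return num_experts // num_workers


# Illustrative numbers only, not taken from any real model config.
print(experts_per_worker(256, 8))  # even split: 32 experts per worker
print(experts_per_worker(256, 1))  # size 1: one worker holds all 256

try:
    experts_per_worker(256, 6)  # 256 % 6 != 0, so this raises
except ValueError as err:
    print(err)
```

With a size of 1 there is also no cross-worker exchange for that layer in this simplified picture, which is the communication saving the description refers to.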

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments (/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.
  4. After green CI and required approvals, ask Merge Oncalls to merge.

Comment thread python/sglang/srt/server_args.py Outdated
Comment thread python/sglang/srt/layers/communicator.py Outdated
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jan 7, 2026
@trevor-m trevor-m changed the title from "Add --speculative-moe-tp-ep-size=1 flag to run Deepseek MTP layer with no parallelism." to "Add --speculative-moe-tp-size=1 flag to run Deepseek MTP layer with no parallelism." Jan 7, 2026
@trevor-m trevor-m force-pushed the spec-tp-1-rebase branch 2 times, most recently from c5870e1 to a0ad018 on January 7, 2026 21:32
@rainj-me
Collaborator

rainj-me commented Jan 8, 2026

enable_nextn_moe_sparse_fully_dp import error fix: bytedance-iaas@7111c5f

@trevor-m
Collaborator Author

trevor-m commented Jan 8, 2026

enable_nextn_moe_sparse_fully_dp import error fix: bytedance-iaas@7111c5f

Thanks @rainj-me!

@Fridge003
Collaborator

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label Jan 8, 2026
@trevor-m
Collaborator Author

trevor-m commented Apr 7, 2026

Closing since the FP8 MTP layer provided better perf.

@trevor-m trevor-m closed this Apr 7, 2026

Labels

deepseek, documentation (Improvements or additions to documentation), run-ci

3 participants