[MoE] Move various experts classes to fused_moe/experts/ #41979

Merged
DarkLight1337 merged 31 commits into vllm-project:main from neuralmagic:move-marlin-moe-to-experts
May 10, 2026
Conversation

@bnellnm bnellnm commented May 7, 2026

Collaborator

Purpose

  • Rename fused_marlin_moe.py to experts/marlin_moe.py and update all references across the codebase (source, tests, docs).
  • Rename rocm_aiter_fused_moe.py to experts/rocm_aiter_moe.py and update all references. Also fix the stale docs pointer for grouped_topk after its earlier move to router/.
  • Rename flashinfer_cutlass_moe.py to experts/flashinfer_cutlass_moe.py and update all references (source, tests, docs).
  • Extract TritonExperts and TritonWNA16Experts from fused_moe.py into a new experts/triton_moe.py module. Update all references across the codebase (source, tests, C++ comment, docs).
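
For downstream code that still imports from the old locations, the three file renames above could be applied mechanically. The following is a minimal sketch, not part of this PR: it assumes the moved files live directly under the vllm.model_executor.layers.fused_moe package (inferred from the conflict paths listed in the commits), and the helper name rewrite_import_line and the imported symbol in the usage example are hypothetical.

```python
import re

# Assumed package root, inferred from the conflict paths in this PR's commits.
PKG = "vllm.model_executor.layers.fused_moe"

# Old module name -> new module name, relative to PKG (taken from the PR description).
MOVED_MODULES = {
    "fused_marlin_moe": "experts.marlin_moe",
    "rocm_aiter_fused_moe": "experts.rocm_aiter_moe",
    "flashinfer_cutlass_moe": "experts.flashinfer_cutlass_moe",
}


def rewrite_import_line(line: str) -> str:
    """Rewrite one line of source so old module paths point at experts/.

    Hypothetical helper: it only performs a textual substitution on fully
    qualified module paths and does not parse Python.
    """
    for old, new in MOVED_MODULES.items():
        # \b keeps the match from firing on longer names that share a prefix.
        pattern = rf"\b{re.escape(PKG)}\.{re.escape(old)}\b"
        line = re.sub(pattern, f"{PKG}.{new}", line)
    return line


if __name__ == "__main__":
    # "some_symbol" is a placeholder, not a real vLLM name.
    before = f"from {PKG}.fused_marlin_moe import some_symbol"
    print(rewrite_import_line(before))
```

The sketch deliberately leaves out TritonExperts and TritonWNA16Experts: those were extracted from fused_moe.py rather than moved with their file, so a path-only substitution would also redirect imports of names that stayed behind.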

Forked from #40572, #40573, #40571, #40570

cc @Jackmin801 , @robertgshaw2-redhat

Test Plan

Test Result



Jackmin801 and others added 26 commits April 22, 2026 00:34
Rename rocm_aiter_fused_moe.py to experts/rocm_aiter_moe.py and update
all references. Also fix stale docs pointer for grouped_topk after its
earlier move to router/.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Rename fused_marlin_moe.py to experts/marlin_moe.py and update all
references across the codebase (source, tests, docs).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Extract TritonExperts and TritonWNA16Experts from fused_moe.py into a
new experts/triton_moe.py module. Update all references across the
codebase (source, tests, C++ comment, docs).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Rename flashinfer_cutlass_moe.py to experts/flashinfer_cutlass_moe.py
and update all references (source, tests, docs).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
…experts

Signed-off-by: Jackmin801 <ongjackm@gmail.com>

# Conflicts:
#	vllm/model_executor/layers/fused_moe/__init__.py
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
…xperts

Signed-off-by: Jackmin801 <ongjackm@gmail.com>

# Conflicts:
#	vllm/model_executor/layers/fused_moe/__init__.py
…experts

Signed-off-by: Jackmin801 <ongjackm@gmail.com>

# Conflicts:
#	vllm/lora/layers/fused_moe.py
…experts

Signed-off-by: Jackmin801 <ongjackm@gmail.com>

# Conflicts:
#	vllm/lora/layers/fused_moe.py
#	vllm/model_executor/layers/fused_moe/fused_moe.py
…erts

Signed-off-by: Bill Nell <bnell@redhat.com>
…perts

Signed-off-by: Bill Nell <bnell@redhat.com>
bnellnm commented May 7, 2026

Collaborator Author

> Thanks for the work!
>
> Could we combine your several PRs into a single one? #41981, #41977, #41976

Sure, I'll combine them all.

@bnellnm bnellnm requested a review from yewentao256 May 7, 2026 21:39
@mergify mergify Bot added the rocm Related to AMD ROCm label May 7, 2026
@github-project-automation github-project-automation Bot moved this to Todo in AMD May 7, 2026
@bnellnm bnellnm changed the title [MoE] Move marlin moe to fused_moe/experts/ [MoE] Move various experts classes to fused_moe/experts/ May 7, 2026

@yewentao256 yewentao256 left a comment

Member

LGTM, thanks for the work!

@github-project-automation github-project-automation Bot moved this to Ready in NVIDIA May 8, 2026
@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label May 8, 2026
@DarkLight1337 DarkLight1337 merged commit 1b57eb4 into vllm-project:main May 10, 2026
160 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to Done in NVIDIA May 10, 2026
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD May 10, 2026
yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request May 11, 2026
…t#41979)

Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Jackmin801 <ongjackm@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
TomerBN-Nvidia added a commit to TomerBN-Nvidia/vllm that referenced this pull request May 11, 2026
Resolves rename conflict from vllm-project#41979 (fused_marlin_moe.py → experts/marlin_moe.py).
Git rename detection carries all PR edits through to the new path; no manual
conflict resolution needed.

Signed-off-by: Tomer Barnatan <tbarnatan@nvidia.com>
weifang231 pushed a commit to weifang231/eb-vllm that referenced this pull request May 13, 2026
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…t#41979)

Signed-off-by: Jackmin801 <ongjackm@gmail.com>
Signed-off-by: Robert Shaw <robertgshaw2@gmail.com>
Signed-off-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Jackmin801 <ongjackm@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Robert Shaw <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Jackmin801 <56836461+Jackmin801@users.noreply.github.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
Labels

documentation: Improvements or additions to documentation
nvidia
ready: ONLY add when PR is ready to merge/full CI is needed
rocm: Related to AMD ROCm

Projects

Status: Done
Status: Done

5 participants