[PEFT, ckpts] feat: modelopt for LoRA & deepseek arch by HollowMan6 · Pull Request #3612 · NVIDIA-NeMo/Megatron-Bridge

HollowMan6 · 2026-04-30T21:13:43Z

What does this PR do ?

Because the ModelOpt MoE layer spec disables `moe_grouped_gemm` (see https://github.com/NVIDIA/Megatron-LM/blob/12f18dafbf9ea1a947f06c7aecde0208c0ada161/megatron/core/post_training/modelopt/gpt/model_specs.py#L146), additional parameter mappings are needed here. ModelOpt linear layers must also be recognized correctly for LoRA.

Changelog

- Add additional mapping support for the DeepSeek architecture when `moe_grouped_gemm` is disabled.
- Support wrapping ModelOpt linear layers for LoRA.
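
To illustrate why the extra mappings matter: when grouped GEMM is off, each expert is its own submodule, so per-expert parameter names take a different shape. The sketch below assumes the grouped layout looks like `...mlp.experts.linear_fc1.weight<N>` and the non-grouped layout looks like `...mlp.experts.local_experts.<N>.linear_fc1.weight`. Both layouts as written here and the helper `grouped_to_local` are illustrative assumptions, not code from this PR.

```python
import re

# Assumed grouped-GEMM name layout: the expert index is a suffix on "weight".
_GROUPED = re.compile(
    r"^(?P<prefix>.*\.mlp\.experts)\.(?P<proj>linear_fc[12])\.weight(?P<idx>\d+)$"
)


def grouped_to_local(name: str) -> str:
    """Rewrite a grouped-style expert parameter name into local_experts style.

    Names that do not match the assumed grouped layout are returned unchanged.
    """
    m = _GROUPED.match(name)
    if m is None:
        return name
    return f"{m['prefix']}.local_experts.{m['idx']}.{m['proj']}.weight"


if __name__ == "__main__":
    print(grouped_to_local("decoder.layers.3.mlp.experts.linear_fc1.weight7"))
```

A checkpoint converter that only knows one of the two layouts would silently skip expert weights stored under the other, which is presumably the gap the new mappings close.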

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you have read and followed the Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (e.g. Numba, Pynini, Apex)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

Related to # (issue)

copy-pr-bot · 2026-04-30T21:13:46Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

HollowMan6 · 2026-04-30T21:15:21Z

/claude review

Copilot

Pull request overview

This PR extends Megatron-Bridge’s PEFT (LoRA) and DeepSeek checkpoint-conversion support to better handle ModelOpt and DeepSeek MoE naming differences (notably when moe_grouped_gemm is disabled).

Changes:

- Add detection + adapter-attribute handling for ModelOpt’s local Megatron Linear so it can be LoRA-wrapped correctly.
- Route ModelOpt Linear modules away from the nn.Linear fast-path in both LoRA and CanonicalLoRA transforms.
- Extend DeepSeek parameter mappings to cover local_experts naming and fix MTP mapping wildcard replacement to only target the intended wildcard groups.
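
The routing change in the second item can be sketched as follows. This is a torch-free illustration under one stated assumption: that ModelOpt's local `Linear` subclasses `nn.Linear`, so a plain `isinstance(m, nn.Linear)` check would also match it. The classes `TorchLinear` and `ModelOptLinear` and the function `pick_adapter_path` are hypothetical stand-ins, not names from the PR.

```python
class TorchLinear:
    """Stand-in for torch.nn.Linear, so the sketch runs without torch."""


class ModelOptLinear(TorchLinear):
    """Stand-in for ModelOpt's local Megatron Linear (assumed to subclass nn.Linear)."""


def is_modelopt_linear(m: object) -> bool:
    """Return whether m is the (stand-in) ModelOpt Linear."""
    return isinstance(m, ModelOptLinear)


def pick_adapter_path(m: object) -> str:
    """Choose which adapter path a module should take."""
    # Exclude the ModelOpt subclass explicitly; otherwise the plain
    # nn.Linear fast path would capture it through inheritance.
    if isinstance(m, TorchLinear) and not is_modelopt_linear(m):
        return "nn_linear_fast_path"
    if is_modelopt_linear(m):
        return "megatron_style_path"
    return "unsupported"


if __name__ == "__main__":
    print(pick_adapter_path(TorchLinear()))
    print(pick_adapter_path(ModelOptLinear()))
```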

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/megatron/bridge/peft/utils.py` | Adds `is_modelopt_linear()` and a ModelOpt-specific `AdapterAttributes` return path. |
| `src/megatron/bridge/peft/lora.py` | Ensures ModelOpt `Linear` does not go through the `nn.Linear`/TE adapter path. |
| `src/megatron/bridge/peft/canonical_lora.py` | Same exclusion for CanonicalLoRA’s `nn.Linear` fast-path. |
| `src/megatron/bridge/models/deepseek/common.py` | Adds DeepSeek `local_experts` mappings and corrects MTP wildcard replacement behavior. |
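
The wildcard issue mentioned for `common.py` can be illustrated with a small sketch. The PR's actual fix is not shown on this page, so the pattern string and the helper `fill_leading_wildcards` below are hypothetical. The point is only that once a pattern carries two `*` groups (layer and expert), a blanket `str.replace("*", ...)` also overwrites the expert wildcard.

```python
from typing import Sequence


def fill_leading_wildcards(pattern: str, values: Sequence[str]) -> str:
    """Fill the first len(values) '*' wildcards left to right; keep the rest as '*'."""
    parts = pattern.split("*")
    n_wildcards = len(parts) - 1
    if len(values) > n_wildcards:
        raise ValueError(f"{len(values)} values for {n_wildcards} wildcards")
    out = parts[0]
    for i, tail in enumerate(parts[1:]):
        out += (values[i] if i < len(values) else "*") + tail
    return out


if __name__ == "__main__":
    # Hypothetical MTP pattern with a layer wildcard and an expert wildcard.
    pattern = "mtp.layers.*.mlp.experts.local_experts.*.linear_fc2.weight"
    print(pattern.replace("*", "0"))               # clobbers both wildcards
    print(fill_leading_wildcards(pattern, ["0"]))  # fills only the layer index
```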

cuichenx · 2026-05-01T23:50:05Z

/claude review #3612

yaoyu-33 · 2026-05-03T03:58:08Z

+def is_modelopt_linear(m: nn.Module) -> bool:
+    """Return whether a module is ModelOpt's local Megatron Linear."""
+    cls = type(m)
+    return cls.__name__ == "Linear" and cls.__module__ == "megatron.core.post_training.modelopt.layers"


This looks a bit fragile. You could put the import inside the check, or guard the import:

from megatron.core.post_training.modelopt.layers import Linear as ModelOptLinear

def is_modelopt_linear(m: nn.Module) -> bool:
    return isinstance(m, ModelOptLinear)

HollowMan6 · 2026-05-03

Thanks for reviewing. I just changed this to use `safe_import_from`, similar to the imports at the top of the file (e.g. `TEColumnParallelLinear`).
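
A guarded import along the lines discussed in this thread might look like the sketch below. It does not reproduce the repository's `safe_import_from` helper; `try_import_from` is a hypothetical stand-in built on `importlib`, and its `(symbol, found)` return shape is an assumption. The module path is the one quoted in the diff above.

```python
import importlib
from typing import Any, Optional, Tuple


def try_import_from(module: str, name: str) -> Tuple[Optional[Any], bool]:
    """Hypothetical safe-import helper: return (symbol, True) or (None, False)."""
    try:
        return getattr(importlib.import_module(module), name), True
    except (ImportError, AttributeError):
        return None, False


# Resolves only when Megatron-Core with ModelOpt support is installed.
ModelOptLinear, HAVE_MODELOPT_LINEAR = try_import_from(
    "megatron.core.post_training.modelopt.layers", "Linear"
)


def is_modelopt_linear(m: object) -> bool:
    """Return whether a module is ModelOpt's local Megatron Linear."""
    # Short-circuits to False when the optional class could not be imported.
    return HAVE_MODELOPT_LINEAR and isinstance(m, ModelOptLinear)


if __name__ == "__main__":
    print(HAVE_MODELOPT_LINEAR, is_modelopt_linear(object()))
```

Compared with matching on `cls.__name__` and `cls.__module__`, the `isinstance` check also covers subclasses and keeps working if the class is re-exported from another module.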

HollowMan6 · 2026-05-05T05:55:59Z

/ok to test dfd1c12

cuichenx

LGTM

Signed-off-by: Hollow Man <hollowman@opensuse.org> Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>

Copilot AI review requested due to automatic review settings April 30, 2026 21:13

Copilot started reviewing on behalf of HollowMan6 April 30, 2026 21:14

Copilot AI reviewed Apr 30, 2026

Comment thread src/megatron/bridge/peft/utils.py

Comment thread src/megatron/bridge/peft/lora.py

Comment thread src/megatron/bridge/peft/canonical_lora.py

HollowMan6 mentioned this pull request Apr 30, 2026

[worker] feat: QAT with FP8 (w8a8 & w8a16) verl-project/verl#6229

Closed

8 tasks

yaoyu-33 reviewed May 3, 2026

yaoyu-33 added the waiting-on-customer label (Waiting on the original author to respond) May 3, 2026

HollowMan6 requested a review from yaoyu-33 May 3, 2026 07:58

copy-pr-bot Bot temporarily deployed to test May 5, 2026 04:33 Inactive

HollowMan6 force-pushed the modelopt branch from dbb138c to e46741e Compare May 5, 2026 05:44

HollowMan6 added 3 commits May 4, 2026 22:54

Add test cases

bfe5dae

Signed-off-by: Hollow Man <hollowman@opensuse.org>

use safe_import_from for importing

dfd1c12

Signed-off-by: Hollow Man <hollowman@opensuse.org>

HollowMan6 force-pushed the modelopt branch from e46741e to dfd1c12 Compare May 5, 2026 05:55

Fix lint

bf8c412

Signed-off-by: Hollow Man <hollowman@opensuse.org>

copy-pr-bot Bot temporarily deployed to test May 5, 2026 15:57 Inactive

cuichenx approved these changes May 5, 2026

cuichenx merged commit f6041a2 into NVIDIA-NeMo:main May 5, 2026
86 checks passed

HollowMan6 deleted the modelopt branch May 5, 2026 18:45

cuichenx mentioned this pull request May 8, 2026

[NeMo FW 26.06 Release] MBridge v0.5.0 Roadmap #3754

Open

gautham-kollu pushed a commit that referenced this pull request May 12, 2026

[PEFT, ckpts] feat: modelopt for LoRA & deepseek arch (#3612)

162da60

Signed-off-by: Hollow Man <hollowman@opensuse.org>

cuichenx merged 4 commits into NVIDIA-NeMo:main from HollowMan6:modelopt
