feat: Implement dynamic position embedding type by RayenTian · Pull Request #2768 · NVIDIA-NeMo/Megatron-Bridge

RayenTian · 2026-03-12T03:21:04Z

Implement dynamic position embedding type for Megatron models based on hf_config; remove hardcoded 'rope' setting from various model bridges.

What does this PR do ?

Add a one line overview of what this PR aims to accomplish.

Changelog

Add specific line by line info of high level changes in this PR.

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items you can still open "Draft" PR.

Additional Information

Related to # (issue)

copy-pr-bot · 2026-03-12T03:21:08Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cuichenx

LGTM

cuichenx · 2026-03-12T18:06:17Z

        # Llama-specific Megatron defaults
        provider.normalization = "RMSNorm"
        provider.gated_linear_unit = True
-        provider.position_embedding_type = "rope"


https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blob/main/config.json
Llama 3 has rope_type = "llama3", so we might need to keep this in the Llama bridge?

I think it's OK here.

I think the code here can handle this situation. If rope_scaling is set in the model config, we will use it directly. If we leave this line out, it will force the use of rope.

Llama has special rope_scaling handling logic here, so this line doesn't matter.
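To make the point in this thread concrete, here is a minimal sketch of how a bridge might derive the position embedding type from an HF config while treating rope_scaling separately. The helper names (`infer_position_embedding_type`, `get_rope_scaling`) and the HF-to-Megatron name mapping are hypothetical and are not taken from this PR; the config object is a stand-in with illustrative values, not a real `transformers` config.

```python
from types import SimpleNamespace

# Hypothetical mapping from HF-style values to Megatron-style names (an assumption).
HF_TO_MEGATRON_PE = {"absolute": "learned_absolute", "rope": "rope", "rotary": "rope"}


def infer_position_embedding_type(hf_config, default="rope"):
    """Pick the embedding family from hf_config instead of hardcoding it."""
    hf_value = getattr(hf_config, "position_embedding_type", None)
    if hf_value is None:
        # Many decoder-only HF configs do not set this field; fall back to a default.
        return default
    return HF_TO_MEGATRON_PE.get(hf_value, default)


def get_rope_scaling(hf_config):
    """Return the rope_scaling settings if present, else None."""
    rope_scaling = getattr(hf_config, "rope_scaling", None)
    if not rope_scaling:
        return None
    # HF configs use "rope_type" in newer versions and "type" in older ones.
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    return {"rope_type": rope_type, "factor": rope_scaling.get("factor")}


# Stand-in for a Llama-3-style config (illustrative values only).
llama_like = SimpleNamespace(rope_scaling={"rope_type": "llama3", "factor": 8.0})
print(infer_position_embedding_type(llama_like))  # rope
print(get_rope_scaling(llama_like))               # {'rope_type': 'llama3', 'factor': 8.0}
```

In this sketch the "llama3" value lives in rope_scaling rather than in the embedding type itself, which is consistent with the observation above that a bridge with its own rope_scaling handling does not depend on the removed line.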

Implement dynamic position embedding type for Megatron models based on hf_config; remove hardcoded 'rope' setting from various model bridges.

318b20d

Signed-off-by: ruit <ruit@nvidia.com>

RayenTian requested a review from yaoyu-33 March 12, 2026 03:21

RayenTian marked this pull request as ready for review March 12, 2026 03:54

yaoyu-33 added labels Mar 12, 2026: feature (New capabilities, enhancements, or enablement work); area:model (Model implementations and HF bridge logic); needs-review (PR is ready for code review and waiting on a reviewer)

cuichenx approved these changes Mar 12, 2026

cuichenx merged commit 4846ce4 into yuya/remove-model-providers Mar 12, 2026
2 checks passed

cuichenx deleted the ruit/rm_pe_hard_code branch March 12, 2026 17:47

cuichenx reviewed Mar 12, 2026

yaoyu-33 pushed a commit that referenced this pull request Mar 12, 2026

feat: Implement dynamic position embedding type (#2768)

56e283c

Signed-off-by: ruit <ruit@nvidia.com>
