Skip to content

rename models/mimo infra to MegatronMIMO#3340

Merged
cuichenx merged 10 commits into
mainfrom
liding/code-reorg
Apr 23, 2026
Merged

rename models/mimo infra to MegatronMIMO#3340
cuichenx merged 10 commits into
mainfrom
liding/code-reorg

Conversation

@liding-nv

@liding-nv liding-nv commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

Rename our heterogeneous-parallelism framework to MegatronMIMO / megatron_mimo to disambiguate
from MCore's upstream MimoModel and the community MiMo model bridge (MimoBridge), both of which
stay unchanged.

Convention

  • Classes: MegatronMIMO* (PascalCase, acronym UPPER)
  • Files / dirs / functions / vars: megatron_mimo_* (snake_case)

Preserved (not renamed):

  • MCore megatron.core.models.mimo.* (MimoModel, MimoOptimizer, MIMO_LANGUAGE_MODULE_KEY,
    .mimo_config)
  • Community bridge src/megatron/bridge/models/mimo/mimo_bridge.py (MimoBridge,
    "MiMoForCausalLM")

Pure structural refactor — no logic changes

Signed-off-by: Li Ding <liding@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Apr 15, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@liding-nv

Copy link
Copy Markdown
Contributor Author

/claude review

Comment thread src/megatron/bridge/training/setup_mimo.py Outdated
Comment thread src/megatron/bridge/training/setup_mimo.py Outdated
Comment thread tests/unit_tests/models/omni_modal/test_omni_modal_ddp.py Outdated
@claude

claude Bot commented Apr 15, 2026

Copy link
Copy Markdown
Contributor

A couple of files outside the PR still reference the old names:

  • src/megatron/bridge/training/pretrain_mimo.py:58 — docstring still says MimoModelProvider (should be OmniModalProvider)
  • tests/unit_tests/training/mimo/test_mimo_checkpointing.py:35 — docstring still says MimoModelInfra (should be OmniModalInfra)

Neither is a code-level breakage (just docstrings), but worth sweeping while you're at it.

Otherwise the mechanical rename looks correct — imports, class names, test patches, and error messages are all consistently updated.

Signed-off-by: Li Ding <liding@nvidia.com>
@liding-nv liding-nv marked this pull request as ready for review April 15, 2026 18:56
@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This pull request migrates the codebase from MIMO-specific providers and configuration classes to OmniModal equivalents throughout examples, core bridge infrastructure, training utilities, and tests. Key renames include MimoModelProviderOmniModalProvider, MimoParallelismConfigOmniModalParallelismConfig, and MimoModelInfraOmniModalInfra.

Changes

Cohort / File(s) Summary
Example and E2E Test Scripts
examples/models/omni_modal/run_hetero_llava.sh, examples/models/omni_modal/test_mimo_*.py (3 files)
Updated test entrypoints and imports to use OmniModal provider/config types instead of MIMO. Replaced MimoModelProvider/MimoParallelismConfig with OmniModalProvider/OmniModalParallelismConfig.
Verification Scripts
examples/models/omni_modal/verify_*.py (2 files)
Added docstrings to main() functions documenting conversion verification purposes.
OmniModal Provider Classes
src/megatron/bridge/models/omni_modal/omni_modal_provider.py, src/megatron/bridge/models/omni_modal/llava_provider.py
Renamed provider classes: MimoModelProviderOmniModalProvider, MimoModelInfraOmniModalInfra, LlavaMimoProviderLlavaOmniModalProvider. Updated internal type annotations and method signatures to reference new OmniModal config type.
OmniModal Configuration and Builder
src/megatron/bridge/models/omni_modal/omni_modal_config.py, src/megatron/bridge/models/omni_modal/omni_modal_builder.py, src/megatron/bridge/models/omni_modal/omni_modal_ddp.py
Renamed dataclass MimoParallelismConfigOmniModalParallelismConfig. Updated function signatures to reference new config/DDP wrapping function names (wrap_omni_modal_model_distributed).
OmniModal Module Exports
src/megatron/bridge/models/omni_modal/__init__.py
New module init file re-exporting OmniModal provider, infrastructure, and configuration classes for public API access.
MIMO Module Exports
src/megatron/bridge/models/mimo/__init__.py
Consolidated re-exports to single MimoBridge symbol, reducing from five exported names to one.
Data Loading Utilities
src/megatron/bridge/data/loaders.py, src/megatron/bridge/data/mimo/loaders.py, src/megatron/bridge/data/mimo/dp_utils.py
Updated provider/config type checks and function signatures to reference OmniModalProvider and OmniModalParallelismConfig instead of MIMO equivalents.
Training Configuration and Setup
src/megatron/bridge/training/config.py, src/megatron/bridge/training/setup_mimo.py, src/megatron/bridge/training/checkpointing.py
Updated ConfigContainer.model union type and function parameters from MimoModelProvider/MimoModelInfra to OmniModal equivalents. Updated related docstrings and comments.
Training Utilities
src/megatron/bridge/training/mimo_parallel_utils.py, src/megatron/bridge/training/train_mimo.py, src/megatron/bridge/training/pretrain_mimo.py, src/megatron/bridge/training/utils/theoretical_memory_utils.py
Updated all function signatures and docstrings to accept OmniModalInfra and OmniModalProvider parameters instead of MIMO equivalents. Updated conditional type checks for provider detection.
Functional Test Suite
tests/functional_tests/test_groups/training/test_pretrain_mimo.py
Replaced provider/config imports and instantiations to use OmniModalProvider and OmniModalParallelismConfig with corresponding module parallelism structure.
Unit Tests - Data
tests/unit_tests/data/mimo/test_dp_utils.py, tests/unit_tests/data/mimo/test_loaders.py
Updated test fixtures and mock providers from MimoParallelismConfig/MimoModelProvider to OmniModal equivalents. Updated error message expectations in assertions.
Unit Tests - Models
tests/unit_tests/models/omni_modal/*.py (5 files)
New test suite targeting OmniModal provider classes. Updated imports, instantiations, and assertions to exercise OmniModalProvider, LlavaOmniModalProvider, and OmniModalParallelismConfig.
Unit Tests - Training
tests/unit_tests/training/mimo/test_mimo_*.py (2 files)
Updated test helpers and configuration instantiations to reference OmniModalInfra and OmniModalParallelismConfig instead of MIMO types.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Test Results For Major Changes ⚠️ Warning PR performs major refactoring renaming core infrastructure classes across 33+ files and 13 test files without documented evidence of testing or CI/CD validation. Update PR description with CI/CD workflow results, confirmation that all 13 modified test files pass, or documentation that changes are pure renames with passing test coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'rename models/mimo infra to models/omni_modal' accurately and concisely summarizes the primary change: renaming the models/mimo infrastructure to models/omni_modal, which is clearly reflected throughout the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 84.21% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch liding/code-reorg

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
src/megatron/bridge/training/pretrain_mimo.py (1)

57-59: Enforce the documented cfg.model contract at runtime.

Line 57 documents OmniModalProvider, but pretrain_mimo() does not validate it before calling setup_mimo(). Add a fail-fast check so invalid configs raise a clear error instead of later attribute errors.

Suggested patch
 from megatron.bridge.training.config import ConfigContainer, mimo_runtime_config_update
+from megatron.bridge.models.omni_modal.omni_modal_provider import OmniModalProvider
 from megatron.bridge.training.pretrain import _maybe_destroy_process_group
 from megatron.bridge.training.setup_mimo import setup_mimo
@@
 def pretrain_mimo(
@@
 ) -> None:
@@
     logger.info("Starting MIMO pretraining")
+    if not isinstance(cfg.model, OmniModalProvider):
+        raise TypeError(
+            f"pretrain_mimo requires cfg.model to be OmniModalProvider, got {type(cfg.model).__name__}."
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/training/pretrain_mimo.py` around lines 57 - 59,
pretrain_mimo currently documents that cfg.model must be an OmniModalProvider
but doesn't validate this at runtime; add a fail-fast type/attribute check at
the start of pretrain_mimo (before calling setup_mimo) to verify cfg is a
ConfigContainer and cfg.model implements the OmniModalProvider contract (e.g.,
check isinstance(cfg.model, OmniModalProvider) or the presence of required
methods/attributes used by setup_mimo), and raise a clear ValueError/TypeError
with a descriptive message if the check fails so callers get an immediate,
actionable error instead of later attribute errors.
src/megatron/bridge/models/omni_modal/__init__.py (1)

14-20: Sort __all__ to satisfy Ruff RUF022.

__all__ is currently unsorted; aligning ordering avoids lint noise and keeps exports deterministic.

♻️ Suggested change
 __all__ = [
     "LlavaOmniModalProvider",
+    "ModuleParallelismConfig",
     "OmniModalInfra",
+    "OmniModalParallelismConfig",
     "OmniModalProvider",
-    "OmniModalParallelismConfig",
-    "ModuleParallelismConfig",
 ]
As per coding guidelines: `**/*.py` must pass Ruff linting/formatting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/models/omni_modal/__init__.py` around lines 14 - 20, The
__all__ list in omni_modal/__init__.py is unsorted and causing Ruff RUF022;
reorder the exported names alphabetically (e.g., "LlavaOmniModalProvider",
"ModuleParallelismConfig", "OmniModalInfra", "OmniModalParallelismConfig",
"OmniModalProvider") so the __all__ sequence is deterministic and
lint-compliant; update the __all__ assignment to contain the same symbols sorted
lexicographically.
src/megatron/bridge/training/mimo_parallel_utils.py (1)

273-277: Add * separator for multiple same-type parameters in validate_data_loader_contract.

Line 273-277 has three int parameters and should enforce keyword-only separation per repo style rule.

♻️ Proposed refactor
 def validate_data_loader_contract(
     infra: OmniModalInfra,
+    *,
     global_batch_size: int,
     micro_batch_size: int,
     num_microbatches: int,
 ):

As per coding guidelines, "**/*.py: Use * separator for functions with multiple same-type parameters".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/training/mimo_parallel_utils.py` around lines 273 - 277,
The function validate_data_loader_contract has three same-typed int params
(global_batch_size, micro_batch_size, num_microbatches) and needs a keyword-only
separator; update the signature to add a '*' before those params (e.g., def
validate_data_loader_contract(infra: OmniModalInfra, *, global_batch_size: int,
micro_batch_size: int, num_microbatches: int):) and adjust any callers to pass
those three arguments by keyword if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/models/omni_modal/run_hetero_llava.sh`:
- Around line 7-10: Replace the distributed-launch command invocation: locate
the shell script that contains the current command starting with "uv run
torchrun" (in examples/models/omni_modal/run_hetero_llava.sh) and update that
invocation to use the repository convention "uv run python -m
torch.distributed.run" while keeping the existing flags/arguments
(--nproc_per_node "$GPUS_PER_NODE", --nnodes "$NUM_NODES", and the target script
examples/models/omni_modal/test_mimo_training_llava.py) unchanged so the
behavior and parameters remain the same.

In `@src/megatron/bridge/data/mimo/loaders.py`:
- Around line 75-78: The raised ValueError message referencing "run
build_model()" is misleading; update the message where OmniModalProvider._grids
is checked (the block raising ValueError for cfg.model._grids is None) to
instruct callers to run the provider initialization path (e.g., setup_mimo() or
the provider infra that triggers build_model()) so it points to the actual setup
entrypoint; modify the error string to mention OmniModalProvider._grids,
build_model(), and setup_mimo()/provider initialization so users know to
initialize the provider via setup_mimo() (or equivalent) before building data
loaders.

In `@tests/unit_tests/data/mimo/test_loaders.py`:
- Line 46: The test assertion using pytest.raises should escape the literal dot
so the regex matches exactly; update the pytest.raises(...) call in
tests/unit_tests/data/mimo/test_loaders.py (the line using
pytest.raises(ValueError, match="cfg.model must be OmniModalProvider")) to use
an escaped/raw regex for the dot (e.g., r"cfg\.model must be OmniModalProvider")
so the period is treated literally and Ruff RUF043 is resolved.

In `@tests/unit_tests/models/omni_modal/test_omni_modal_ddp.py`:
- Line 14: Update the stale docstring that reads "Create a mock MimoModel
(MCore) for testing." in the test_omni_modal_ddp.py test to use OmniModal
terminology instead (e.g., "Create a mock OmniModal (MCore) for testing.") so
the docstring matches the OmniModal test path and naming.

---

Nitpick comments:
In `@src/megatron/bridge/models/omni_modal/__init__.py`:
- Around line 14-20: The __all__ list in omni_modal/__init__.py is unsorted and
causing Ruff RUF022; reorder the exported names alphabetically (e.g.,
"LlavaOmniModalProvider", "ModuleParallelismConfig", "OmniModalInfra",
"OmniModalParallelismConfig", "OmniModalProvider") so the __all__ sequence is
deterministic and lint-compliant; update the __all__ assignment to contain the
same symbols sorted lexicographically.

In `@src/megatron/bridge/training/mimo_parallel_utils.py`:
- Around line 273-277: The function validate_data_loader_contract has three
same-typed int params (global_batch_size, micro_batch_size, num_microbatches)
and needs a keyword-only separator; update the signature to add a '*' before
those params (e.g., def validate_data_loader_contract(infra: OmniModalInfra, *,
global_batch_size: int, micro_batch_size: int, num_microbatches: int):) and
adjust any callers to pass those three arguments by keyword if necessary.

In `@src/megatron/bridge/training/pretrain_mimo.py`:
- Around line 57-59: pretrain_mimo currently documents that cfg.model must be an
OmniModalProvider but doesn't validate this at runtime; add a fail-fast
type/attribute check at the start of pretrain_mimo (before calling setup_mimo)
to verify cfg is a ConfigContainer and cfg.model implements the
OmniModalProvider contract (e.g., check isinstance(cfg.model, OmniModalProvider)
or the presence of required methods/attributes used by setup_mimo), and raise a
clear ValueError/TypeError with a descriptive message if the check fails so
callers get an immediate, actionable error instead of later attribute errors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 96e377e7-7820-4ef1-8c38-df91efb21837

📥 Commits

Reviewing files that changed from the base of the PR and between 44797c1 and 1d467d5.

📒 Files selected for processing (41)
  • examples/models/omni_modal/__init__.py
  • examples/models/omni_modal/convert_hf_clip_to_megatron.py
  • examples/models/omni_modal/convert_hf_llama_to_megatron.py
  • examples/models/omni_modal/run_conversion_verification.sh
  • examples/models/omni_modal/run_hetero_llava.sh
  • examples/models/omni_modal/run_hetero_llava_parallelism_tests.sh
  • examples/models/omni_modal/run_hetero_llava_parallelism_tests_unfrozen_llm.sh
  • examples/models/omni_modal/run_mimo_checkpoint_resume.sh
  • examples/models/omni_modal/run_mimo_parallelism_tests.sh
  • examples/models/omni_modal/test_mimo_checkpoint_resume_e2e.py
  • examples/models/omni_modal/test_mimo_training_e2e.py
  • examples/models/omni_modal/test_mimo_training_llava.py
  • examples/models/omni_modal/verify_clip_conversion.py
  • examples/models/omni_modal/verify_llama_conversion.py
  • src/megatron/bridge/data/loaders.py
  • src/megatron/bridge/data/mimo/dp_utils.py
  • src/megatron/bridge/data/mimo/loaders.py
  • src/megatron/bridge/models/mimo/__init__.py
  • src/megatron/bridge/models/omni_modal/__init__.py
  • src/megatron/bridge/models/omni_modal/llava_provider.py
  • src/megatron/bridge/models/omni_modal/omni_modal_builder.py
  • src/megatron/bridge/models/omni_modal/omni_modal_config.py
  • src/megatron/bridge/models/omni_modal/omni_modal_ddp.py
  • src/megatron/bridge/models/omni_modal/omni_modal_provider.py
  • src/megatron/bridge/training/checkpointing.py
  • src/megatron/bridge/training/config.py
  • src/megatron/bridge/training/mimo_parallel_utils.py
  • src/megatron/bridge/training/pretrain_mimo.py
  • src/megatron/bridge/training/setup_mimo.py
  • src/megatron/bridge/training/train_mimo.py
  • src/megatron/bridge/training/utils/theoretical_memory_utils.py
  • tests/functional_tests/test_groups/training/test_pretrain_mimo.py
  • tests/unit_tests/data/mimo/test_dp_utils.py
  • tests/unit_tests/data/mimo/test_loaders.py
  • tests/unit_tests/models/omni_modal/__init__.py
  • tests/unit_tests/models/omni_modal/test_llava_provider.py
  • tests/unit_tests/models/omni_modal/test_omni_modal_builder.py
  • tests/unit_tests/models/omni_modal/test_omni_modal_ddp.py
  • tests/unit_tests/models/omni_modal/test_omni_modal_provider.py
  • tests/unit_tests/training/mimo/test_mimo_checkpointing.py
  • tests/unit_tests/training/mimo/test_mimo_config.py

Comment on lines +7 to +10
uv run torchrun \
--nproc_per_node "$GPUS_PER_NODE" \
--nnodes "$NUM_NODES" \
tests/e2e/mimo/test_mimo_training_llava.py \
examples/models/omni_modal/test_mimo_training_llava.py \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check whether `uv run torchrun` is still used elsewhere.
# Expected: no remaining occurrences once standardized.
rg -n --type=sh -C1 'uv run torchrun|uv run python -m torch\.distributed\.run'

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 35077


Replace uv run torchrun with uv run python -m torch.distributed.run to align with repository conventions.

The codebase standardizes on uv run python -m torch.distributed.run for distributed execution. Update lines 7–10 accordingly.

Suggested diff
-uv run torchrun \
+uv run python -m torch.distributed.run \
     --nproc_per_node "$GPUS_PER_NODE" \
     --nnodes "$NUM_NODES" \
     examples/models/omni_modal/test_mimo_training_llava.py \
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
uv run torchrun \
--nproc_per_node "$GPUS_PER_NODE" \
--nnodes "$NUM_NODES" \
tests/e2e/mimo/test_mimo_training_llava.py \
examples/models/omni_modal/test_mimo_training_llava.py \
uv run python -m torch.distributed.run \
--nproc_per_node "$GPUS_PER_NODE" \
--nnodes "$NUM_NODES" \
examples/models/omni_modal/test_mimo_training_llava.py \
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/models/omni_modal/run_hetero_llava.sh` around lines 7 - 10, Replace
the distributed-launch command invocation: locate the shell script that contains
the current command starting with "uv run torchrun" (in
examples/models/omni_modal/run_hetero_llava.sh) and update that invocation to
use the repository convention "uv run python -m torch.distributed.run" while
keeping the existing flags/arguments (--nproc_per_node "$GPUS_PER_NODE",
--nnodes "$NUM_NODES", and the target script
examples/models/omni_modal/test_mimo_training_llava.py) unchanged so the
behavior and parameters remain the same.

Comment on lines 75 to 78
if cfg.model._grids is None:
raise ValueError(
"MimoModelProvider._grids is None. Ensure build_model() is called before building data loaders."
"OmniModalProvider._grids is None. Ensure build_model() is called before building data loaders."
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Clarify the _grids error guidance to reference the actual setup path.

Line 77 asks callers to run build_model(), but this path is driven by setup_mimo() / provider infra initialization. The current message can misdirect debugging.

Suggested patch
     if cfg.model._grids is None:
         raise ValueError(
-            "OmniModalProvider._grids is None. Ensure build_model() is called before building data loaders."
+            "OmniModalProvider._grids is None. Ensure setup_mimo() (or OmniModalProvider.build_infra()) "
+            "has run before building data loaders."
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/megatron/bridge/data/mimo/loaders.py` around lines 75 - 78, The raised
ValueError message referencing "run build_model()" is misleading; update the
message where OmniModalProvider._grids is checked (the block raising ValueError
for cfg.model._grids is None) to instruct callers to run the provider
initialization path (e.g., setup_mimo() or the provider infra that triggers
build_model()) so it points to the actual setup entrypoint; modify the error
string to mention OmniModalProvider._grids, build_model(), and
setup_mimo()/provider initialization so users know to initialize the provider
via setup_mimo() (or equivalent) before building data loaders.

@@ -43,7 +43,7 @@ def test_build_mimo_data_loaders_raises_when_model_not_mimo(monkeypatch):
cfg = SimpleNamespace(model=object(), train=SimpleNamespace(micro_batch_size=2))
provider = FakeProvider()

with pytest.raises(ValueError, match="cfg.model must be MimoModelProvider"):
with pytest.raises(ValueError, match="cfg.model must be OmniModalProvider"):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the regex warning is resolved in this file.
uv run ruff check tests/unit_tests/data/mimo/test_loaders.py --select RUF043

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 113


🏁 Script executed:

# Read the file to see the current state at line 46
sed -n '40,52p' tests/unit_tests/data/mimo/test_loaders.py

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 618


🏁 Script executed:

# Check Python regex behavior: does unescaped dot match only dot or any character?
python3 << 'EOF'
import re

# Test the pattern with unescaped dot
pattern_unescaped = "cfg.model must be OmniModalProvider"
pattern_escaped = r"cfg\.model must be OmniModalProvider"

test_string = "cfg.model must be OmniModalProvider"

# Test unescaped (dot matches any char)
match_unescaped = re.search(pattern_unescaped, test_string)
print(f"Unescaped pattern matches: {match_unescaped is not None}")

# Test escaped (dot matches literal dot)
match_escaped = re.search(pattern_escaped, test_string)
print(f"Escaped pattern matches: {match_escaped is not None}")

# Show what unescaped dot matches
test_string2 = "cfgXmodel must be OmniModalProvider"  # Replace . with X
match_unescaped2 = re.search(pattern_unescaped, test_string2)
print(f"Unescaped pattern matches 'cfgXmodel...' (X instead of dot): {match_unescaped2 is not None}")
EOF

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 198


🏁 Script executed:

# Find where the ValueError is raised to see the exact error message
rg 'cfg\.model must be OmniModalProvider' --type py

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 322


🏁 Script executed:

# Try running ruff check with python -m ruff if available
python3 -m ruff check tests/unit_tests/data/mimo/test_loaders.py --select RUF043 2>&1 || echo "ruff not available via python -m"

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 877


🏁 Script executed:

# Check if ruff is installed and callable directly
ruff check tests/unit_tests/data/mimo/test_loaders.py --select RUF043 2>&1 || echo "ruff check failed"

Repository: NVIDIA-NeMo/Megatron-Bridge

Length of output: 862


Escape the dot in the match regex at Line 46.

Line 46 uses match="cfg.model must be OmniModalProvider", where . is treated as a regex wildcard matching any character. Use a raw escaped pattern for exact matching and to clear the RUF043 linting error.

Suggested fix
-    with pytest.raises(ValueError, match="cfg.model must be OmniModalProvider"):
+    with pytest.raises(ValueError, match=r"cfg\.model must be OmniModalProvider"):

Per the coding guidelines, all Python files must pass Ruff checks.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with pytest.raises(ValueError, match="cfg.model must be OmniModalProvider"):
with pytest.raises(ValueError, match=r"cfg\.model must be OmniModalProvider"):
🧰 Tools
🪛 Ruff (0.15.10)

[warning] 46-46: Pattern passed to match= contains metacharacters but is neither escaped nor raw

(RUF043)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit_tests/data/mimo/test_loaders.py` at line 46, The test assertion
using pytest.raises should escape the literal dot so the regex matches exactly;
update the pytest.raises(...) call in tests/unit_tests/data/mimo/test_loaders.py
(the line using pytest.raises(ValueError, match="cfg.model must be
OmniModalProvider")) to use an escaped/raw regex for the dot (e.g., r"cfg\.model
must be OmniModalProvider") so the period is treated literally and Ruff RUF043
is resolved.

def _create_mock_mimo_model(self, has_language_model=True, modality_names=None):
"""Create a mock MimoModel for testing."""
def _create_mock_omni_modal_model(self, has_language_model=True, modality_names=None):
"""Create a mock MimoModel (MCore) for testing."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update stale docstring naming to OmniModal terminology.

Line 14 still mentions MimoModel; this is now testing the OmniModal path and should use consistent naming.

✏️ Suggested text fix
-        """Create a mock MimoModel (MCore) for testing."""
+        """Create a mock OmniModal model (MCore) for testing."""
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"""Create a mock MimoModel (MCore) for testing."""
"""Create a mock OmniModal model (MCore) for testing."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit_tests/models/omni_modal/test_omni_modal_ddp.py` at line 14, Update
the stale docstring that reads "Create a mock MimoModel (MCore) for testing." in
the test_omni_modal_ddp.py test to use OmniModal terminology instead (e.g.,
"Create a mock OmniModal (MCore) for testing.") so the docstring matches the
OmniModal test path and naming.

@liding-nv

Copy link
Copy Markdown
Contributor Author

/ok to test 1d467d5

@yaoyu-33 yaoyu-33 added feature New capabilities, enhancements, or enablement work area:model Model implementations and HF bridge logic needs-review PR is ready for code review and waiting on a reviewer labels Apr 16, 2026
Comment thread src/megatron/bridge/training/setup_megatron_mimo.py
Signed-off-by: Li Ding <liding@nvidia.com>
Signed-off-by: Li Ding <liding@nvidia.com>
@liding-nv

Copy link
Copy Markdown
Contributor Author

/ok to test e8ed80e

@cuichenx

Copy link
Copy Markdown
Contributor

Rename to MegatronMIMO

@cuichenx cuichenx marked this pull request as draft April 21, 2026 20:49
@liding-nv

Copy link
Copy Markdown
Contributor Author

/ok to test 1d58f11

@liding-nv

Copy link
Copy Markdown
Contributor Author

/claude review

@liding-nv liding-nv marked this pull request as ready for review April 23, 2026 16:51
Comment thread src/megatron/bridge/data/megatron_mimo/__init__.py Outdated
Args:
cfg: ConfigContainer with training configuration. ``cfg.model`` must be
a ``MimoModelProvider``. ``cfg.optimizer`` (a ``BridgeOptimizerConfig``)
an ``MegatronMIMOProvider``. ``cfg.optimizer`` (a ``BridgeOptimizerConfig``)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar: "MegatronMIMO" starts with a consonant sound → should be "a", not "an". The old code correctly had `a ``MimoModelProvider```.

Suggested change
an ``MegatronMIMOProvider``. ``cfg.optimizer`` (a ``BridgeOptimizerConfig``)
a ``MegatronMIMOProvider``. ``cfg.optimizer`` (a ``BridgeOptimizerConfig``)

Same issue in setup_megatron_mimo.py lines 178 and 188.

provider = MimoModelProvider(
provider = MegatronMIMOProvider(
language_model_spec=language_spec,
modality_submodules_spec={"images": modality_spec},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same "an" → "a" grammar fix:

Suggested change
modality_submodules_spec={"images": modality_spec},
"""Create a MegatronMIMOParallelismConfig."""

@claude

claude Bot commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Light review — clean mechanical rename overall. Two minor issues:

  1. Grammar ("an" → "a"): The old code correctly used a ``MimoModelProvider``` but the rename introduced an ``MegatronMIMOProvider``` in several docstrings. "MegatronMIMO" starts with a consonant sound, so it should be "a". Spotted in:

    • pretrain_megatron_mimo.py:58
    • setup_megatron_mimo.py:178 and :188
    • test_megatron_mimo_provider.py:46
  2. Copyright year: src/megatron/bridge/data/megatron_mimo/__init__.py is a new file but carries 2025 from the deleted original — should be 2026.

No logic issues found. The old mimo_config / MimoModel / MIMO_LANGUAGE_MODULE_KEY references from MCore are correctly preserved.

@liding-nv

Copy link
Copy Markdown
Contributor Author

/ok to test 43d7a22

@liding-nv

Copy link
Copy Markdown
Contributor Author

/ok to test 43d7a22

@cuichenx cuichenx changed the title rename models/mimo infra to models/omni_modal rename models/mimo infra to MegatronMIMO Apr 23, 2026
@cuichenx

Copy link
Copy Markdown
Contributor

/ok to test 43d7a22

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Signed-off-by: Chen Cui <cxcui@alumni.cmu.edu>
@cuichenx

Copy link
Copy Markdown
Contributor

/ok to test fa4bf87

@cuichenx

Copy link
Copy Markdown
Contributor

/ok to test c8457a6

@cuichenx cuichenx merged commit 994b7f2 into main Apr 23, 2026
376 of 385 checks passed
@cuichenx cuichenx deleted the liding/code-reorg branch April 23, 2026 23:42
vasunvidia pushed a commit to vasunvidia/Megatron-Bridge that referenced this pull request Jun 10, 2026
Signed-off-by: Li Ding <liding@nvidia.com>
Signed-off-by: Chen Cui <cxcui@alumni.cmu.edu>
Co-authored-by: Chen Cui <chcui@nvidia.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:model Model implementations and HF bridge logic feature New capabilities, enhancements, or enablement work full-test-suite needs-review PR is ready for code review and waiting on a reviewer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants