Describe the bug
CI tests in tests/unit_tests/transformer/moe/test_paged_stashing.py are failing with an AttributeError because the test treats a functools.partial as if it were a ModuleSpec:
tests/unit_tests/transformer/moe/test_paged_stashing.py:139: AttributeError
E AttributeError: 'functools.partial' object has no attribute 'submodules'
Failing nodes:
TestPagedStashing::test_forward_backward_4_layers
TestPagedStashingOverBudget::test_overload_factor_and_over_budget
Tag @NVIDIA/mcore-oncall to get oncall's attention to this issue.
Root cause (likely)
At tests/unit_tests/transformer/moe/test_paged_stashing.py:139 the test does:
transformer_layer_spec = get_gpt_layer_with_transformer_engine_spec(
    num_experts=self.config.num_moe_experts, moe_grouped_gemm=True
)
...
MoELayer(self.config, transformer_layer_spec.submodules.mlp.submodules)
In the MoE branch, transformer_layer_spec.submodules.mlp is set to partial(MoELayer, submodules=MoESubmodules(...)) (see megatron/core/models/gpt/moe_module_specs.py:66, introduced by #3435 on 2026-05-10). A functools.partial does not expose .submodules, so the attribute access at line 139 crashes.
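The attribute error can be reproduced without Megatron installed. The sketch below is only an illustration: FakeMoELayer and FakeMoESubmodules are hypothetical stand-ins for MoELayer and MoESubmodules, and the spec is built by hand rather than by get_gpt_layer_with_transformer_engine_spec.

```python
from dataclasses import dataclass
from functools import partial


@dataclass
class FakeMoESubmodules:
    """Hypothetical stand-in for MoESubmodules (not the real class)."""
    experts: str = "grouped_mlp"


class FakeMoELayer:
    """Hypothetical stand-in for MoELayer (not the real class)."""

    def __init__(self, config, submodules=None):
        self.config = config
        self.submodules = submodules


# Same shape as the MoE branch described above: a partial with bound submodules.
mlp_spec = partial(FakeMoELayer, submodules=FakeMoESubmodules())

# A partial stores its bound arguments in .func, .args, and .keywords;
# it does not forward attribute lookups to the bound keyword arguments.
has_submodules_attr = hasattr(mlp_spec, "submodules")

try:
    mlp_spec.submodules  # mirrors the failing access in the test
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(has_submodules_attr)
print(error_message)
```

The bound MoESubmodules instance is still present on the partial, but only under mlp_spec.keywords, which is why the chained .submodules access fails.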
Commit history:
5e31514165 (#3435, 2026-05-10, @nick-schank) switched MoE spec to return a partial.
f007db77b9 (#4247, 2026-05-22, @nanz-nv) added the test that assumes the old ModuleSpec shape.
Failing run
tests/unit_tests/transformer/moe/**/*.py - latest
Error (verbatim, abridged)
tests/unit_tests/transformer/moe/test_paged_stashing.py:139: AttributeError
E AttributeError: 'functools.partial' object has no attribute 'submodules'
FAILED tests/unit_tests/transformer/moe/test_paged_stashing.py::TestPagedStashing::test_forward_backward_4_layers
FAILED tests/unit_tests/transformer/moe/test_paged_stashing.py::TestPagedStashingOverBudget::test_overload_factor_and_over_budget
Steps/Code to reproduce bug
Re-run the failing CI job linked above, or locally inside the dev container:
pytest tests/unit_tests/transformer/moe/test_paged_stashing.py
Additional context
Triaged automatically via /create-issue. The proposed fix is to update _create_moe_layer to construct MoESubmodules directly (or to read it from the partial via .keywords["submodules"]) instead of chaining .submodules.mlp.submodules.
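A minimal sketch of the .keywords["submodules"] option follows. It assumes the partial was created with submodules passed as a keyword argument, as described in the root cause. resolve_moe_submodules, FakeMoELayer, and FakeMoESubmodules are hypothetical names used for illustration, not part of Megatron.

```python
from dataclasses import dataclass
from functools import partial
from types import SimpleNamespace


@dataclass
class FakeMoESubmodules:
    """Hypothetical stand-in for MoESubmodules (not the real class)."""
    experts: str = "grouped_mlp"


class FakeMoELayer:
    """Hypothetical stand-in for MoELayer (not the real class)."""

    def __init__(self, config, submodules=None):
        self.config = config
        self.submodules = submodules


def resolve_moe_submodules(mlp_spec):
    """Hypothetical helper: get submodules from a partial or a spec-like object."""
    if isinstance(mlp_spec, partial):
        # Keyword arguments bound at partial() time live in .keywords.
        return mlp_spec.keywords["submodules"]
    # Older shape: an object that exposes .submodules directly.
    return mlp_spec.submodules


submodules = FakeMoESubmodules()
config = {"num_moe_experts": 4}  # placeholder config for the sketch

# New shape: the spec's mlp entry is a partial.
partial_spec = partial(FakeMoELayer, submodules=submodules)
layer = FakeMoELayer(config, resolve_moe_submodules(partial_spec))

# Old shape: the spec's mlp entry carries .submodules as an attribute.
legacy_spec = SimpleNamespace(submodules=submodules)
legacy_layer = FakeMoELayer(config, resolve_moe_submodules(legacy_spec))
```

Handling both shapes keeps the test independent of which form the spec helper returns. With these stand-ins, calling partial_spec(config) directly would also build a layer with the bound submodules. Whether that works for the real MoELayer depends on its other constructor arguments, which this sketch does not model.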