[mxfp8 moe training] register constant with pytree by danielvegamyhre · Pull Request #3667 · pytorch/ao

danielvegamyhre · 2026-01-21T00:29:10Z

After #3606 @torch._dynamo.nonstrict_trace was added to support autograd functions having different tensor subclasses for forward output and backward input.

The e2e integration tests for compile and eager passed and the change landed, but explicitly specifying a scaling recipe instead of relying on the default breaks with this error:

torch._dynamo.exc.Unsupported: Invalid input type for nonstrict_trace-ed function
  Explanation: For nonstrict_trace-ed functions, only basic types (e.g., torch.Tensor, int, float) or pytree containers of those are allowed as inputs. The provided argument contains an unsupported type.
  Hint: Use one of the following to register the type with pytree:
    * torch.utils._pytree.register_constant
    * torch.utils._pytree.register_dataclass
    * torch.utils._pytree.register_pytree_node

Basically we need to register the enum with pytree, which is done in this PR.

Tests

Added unit test cases which fail without this change.

pytest test/prototype/moe_training/test_scaled_grouped_mm.py -k dq

pytorch-bot · 2026-01-21T00:29:14Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3667

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 4 Unrelated Failures

As of commit 8a86e54 with merge base 3834780:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test (CPU 2.6, linux.4xlarge, torch==2.6.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
RuntimeError: Command docker exec -t 626d1b6c0309090a2d0deaf3d3cf00232e87bbc90660b39717e1299c1db67ce8 /exec failed with exit code 2
Run Regression Tests / test (CUDA 2.6, linux.g5.12xlarge.nvidia.gpu, torch==2.6.0, cuda, 12.6) / linux-job (gh)
RuntimeError: Command docker exec -t bfa5fc78b092cfaa4756ccd1944f94e2bc08e122bc62b2fc64a876049630da7d /exec failed with exit code 2

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run Regression Tests / test (CUDA 2.7, linux.g5.12xlarge.nvidia.gpu, torch==2.7.1, cuda, 12.6) / linux-job (gh) (trunk failure)
test/quantization/quantize_/workflows/int8/test_int8_tensor.py::TestInt8StaticQuant::test_static_activation_per_row_int8_weight_granularity1_bfloat16
Run Regression Tests / test (CUDA 2.8, linux.g5.12xlarge.nvidia.gpu, torch==2.8.0, cuda, 12.6) / linux-job (gh) (trunk failure)
test/quantization/quantize_/workflows/int8/test_int8_tensor.py::TestInt8StaticQuant::test_static_activation_per_row_int8_weight_granularity1_bfloat16
Run Regression Tests / test (CUDA 2.9, linux.g5.12xlarge.nvidia.gpu, torch==2.9.1, cuda, 12.6) / linux-job (gh) (trunk failure)
test/quantization/quantize_/workflows/int8/test_int8_tensor.py::TestInt8StaticQuant::test_static_activation_per_row_int8_weight_granularity1_bfloat16
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh) (trunk failure)
test/quantization/quantize_/workflows/int8/test_int8_tensor.py::TestInt8StaticQuant::test_static_activation_per_row_int8_weight_granularity1_bfloat16

This comment was automatically generated by Dr. CI and updates every 15 minutes.

danielvegamyhre · 2026-01-21T03:45:36Z

test failures unrelated, landing

[mxfp8 moe training] register constant with pytree

8a86e54

meta-cla Bot added the CLA Signed label Jan 21, 2026

danielvegamyhre added the mx, topic: bug fix, and moe labels Jan 21, 2026

liangel-02 approved these changes Jan 21, 2026

danielvegamyhre merged commit 3350b2f into main Jan 21, 2026
17 of 25 checks passed

jcaip pushed a commit that referenced this pull request Jan 22, 2026

[mxfp8 moe training] register constant with pytree (#3667)

b6e80c6

danielvegamyhre mentioned this pull request Jan 23, 2026

fix dynamo trace issue and bug introduced by new float8_dtype param #3712

Closed

danielvegamyhre merged 1 commit into main from pytree