Support gpt-oss mxfp4 format qat

---

### 🔖 Feature description

**Support MXFP4 (Microscaling) Format for QAT and Post-Training Quantization via torchao/Model-Optimizer.**

Axolotl users who want 4-bit floating-point formats can currently run into hardware-specific constraints (e.g., `nvfp4` is exclusive to Blackwell sm100 and raises an error on other GPUs). This feature request proposes adding support for **MXFP4 (E2M1 elements with a shared per-block scale)**, a vendor-neutral OCP standard that is usable on NVIDIA Hopper (H100/H800) and can be emulated on Ampere. To my knowledge, native MXFP4 tensor-core math is a Blackwell feature; on Hopper and Ampere the 4-bit weights are typically upcast inside the kernel.

Implementing MXFP4 QAT will allow:

1. Higher training stability than INT4 or FP4 without per-block scaling.
2. Better post-training weight compression for LLMs like `gpt-oss`.
3. Alignment with NVIDIA's `model-optimizer` and `torchao` roadmaps.

### ✔️ Solution

Integrate `torchao.quantization.quantize_` with MXFP4-specific configs, or use NVIDIA's `modelopt` (Model Optimizer) workflow within Axolotl's quantization CLI.

**Key components:**

* Add `mxfp4` as a valid option for `quantization.weight_dtype` in the YAML config.
* Implement the MXFP4 fake-quantization logic in `axolotl.utils.quantization` during the QAT phase.
* Ensure compatibility with `torchao`'s MX format implementations (specifically `mx_fp4`).
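
To make the fake-quantization step concrete, here is a minimal pure-Python sketch of MXFP4 rounding as I understand it from the OCP MX spec: each block of 32 values shares one power-of-two scale, and each scaled value is snapped to the E2M1 grid. The function name `fake_quantize_mxfp4` and its helper are hypothetical and are not part of Axolotl, `torchao`, or `modelopt`. A real QAT implementation would operate on tensors and pass gradients through the rounding (e.g., with a straight-through estimator).

```python
import math

# Magnitudes representable by an E2M1 element (sign is handled separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_EMAX = 2      # exponent of the largest E2M1 binade (4.0 = 2**2)
BLOCK_SIZE = 32    # elements sharing one scale (assumed from the OCP MX spec)


def _round_to_e2m1(x):
    """Snap x to the nearest E2M1 value, saturating at +/-6.

    Ties resolve toward the smaller magnitude in this sketch; production
    kernels usually round ties to even instead.
    """
    mag = min(abs(x), E2M1_GRID[-1])
    nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
    return math.copysign(nearest, x)


def fake_quantize_mxfp4(values, block_size=BLOCK_SIZE):
    """Quantize-dequantize a flat list of floats block by block (hypothetical helper)."""
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        if amax == 0.0:
            out.extend(0.0 for _ in block)
            continue
        # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) == e - 1.
        _, e = math.frexp(amax)
        shared_exp = (e - 1) - E2M1_EMAX
        # The shared scale is an 8-bit exponent (E8M0), so clamp to its range.
        shared_exp = max(-127, min(127, shared_exp))
        scale = 2.0 ** shared_exp
        out.extend(_round_to_e2m1(v / scale) * scale for v in block)
    return out
```

For example, `fake_quantize_mxfp4([12.0, 0.4])` returns `[12.0, 0.0]`: the block maximum sets the shared scale to 2, so 0.4 falls below half of the smallest nonzero step and is flushed to zero. This is the kind of rounding error QAT lets the model adapt to during training.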

**References:**

* [NVIDIA Model-Optimizer gpt-oss examples](https://github.com/NVIDIA/Model-Optimizer/tree/main/examples/gpt-oss)
* [OCP Microscaling Formats Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf)
* [torchao MX Support](https://github.com/pytorch/ao/tree/main/torchao/prototype/mx_formats)

### ❓ Alternatives

Currently, users are forced to use `int4_weight_only` or `fp8`. The former lacks the dynamic range of MXFP4, and the latter does not provide the same 4-bit memory savings.

### 📝 Additional Context

As LLMs grow to the scale of `gpt-oss` (up to 120B parameters), 4-bit quantization becomes critical for inference. MXFP4 offers a middle ground between 8-bit accuracy and 4-bit efficiency by sharing one scale across each block of elements (32 elements per block in the OCP MX spec; NVFP4 uses blocks of 16).
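
As a rough illustration of the memory argument, the sketch below computes the effective bits per weight for a block-scaled format and the resulting weight footprint. It assumes 4-bit elements with one 8-bit shared scale per 32-element block, and it uses a round 120B parameter count purely for illustration. The function names are hypothetical, and in practice not every tensor in a model is quantized, so real checkpoints will differ.

```python
def mx_bits_per_element(element_bits=4, scale_bits=8, block_size=32):
    """Effective storage per weight: element bits plus the amortized shared scale."""
    return element_bits + scale_bits / block_size


def weight_gigabytes(num_params, bits_per_element):
    """Weight storage in GB (10**9 bytes) if every parameter used this format."""
    return num_params * bits_per_element / 8 / 1e9


mxfp4_bits = mx_bits_per_element()           # 4.25 bits per weight
print(weight_gigabytes(120e9, mxfp4_bits))   # 63.75
print(weight_gigabytes(120e9, 16))           # 240.0 (BF16 baseline)
```

The shared scale adds only 0.25 bits per weight, so under these assumptions the footprint is a little over a quarter of the BF16 baseline.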

https://github.com/axolotl-ai-cloud/axolotl/issues/3333

---

Support gpt-oss mxfp4 format qat #3547