[CPU] Reland qconv fp8 fusion passes by Xia-Weiwen · Pull Request #3433 · pytorch/ao

Xia-Weiwen · 2025-12-04T02:02:41Z

Reland #3418, with the test cases skipped on ROCm.
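The commit titles in this PR mention two test guards: a torch version check for the Qconv FP8 unit tests and a skip on ROCm. The sketch below shows one way such guards could be written with `unittest`; it is not the code from this PR. The helper names (`release_tuple`, `skip_reason`), the test class name, and the `(2, 8)` minimum release are hypothetical placeholders. The only torch attributes relied on are `torch.__version__` and `torch.version.hip`, which is `None` on non-ROCm builds.

```python
import unittest


def release_tuple(version: str) -> tuple:
    """Turn a version string such as '2.8.0.dev20250801+cpu' into (2, 8)."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def skip_reason(torch_version, hip_version, min_release=(2, 8)):
    """Return why a test should be skipped, or None if it should run.

    min_release is a placeholder, not the threshold used by the PR.
    """
    if hip_version is not None:
        # A non-None HIP version indicates a ROCm build of torch.
        return "not supported on ROCm"
    if release_tuple(torch_version) < min_release:
        return "requires torch >= %d.%d" % min_release
    return None


try:
    import torch

    _REASON = skip_reason(torch.__version__, torch.version.hip)
except ImportError:
    # Without torch there is nothing to test, so skip as well.
    _REASON = "torch is not installed"


@unittest.skipIf(_REASON is not None, str(_REASON))
class TestQConvFp8Fusion(unittest.TestCase):  # hypothetical test class
    def test_placeholder(self):
        # Only reached when neither guard applies.
        self.assertIsNone(_REASON)
```

Keeping the decision in a plain function that takes the version strings as arguments lets the skip logic be checked without a ROCm machine or a specific torch build.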


pytorch-bot · 2025-12-04T02:02:45Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3433

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7838789 with merge base a6dbf45:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

* add MXFP8 all gather support
* added TODO for future feature
* remove emoji from comment
* fixed ruff formating
* fixed ruff formatting
* add mxfp8 and nvfp4 to Llama eval scripts (#3394)

  Update [ghstack-poisoned]
* flip mx inference scaling setting to RCEIL (#3428)
* Update [ghstack-poisoned]
* Update [ghstack-poisoned]
* Update [ghstack-poisoned]
* add CLAUDE.local.md to gitignore (#3437)

  Summary: taking claude code for a more thorough spin, will start with local instructions and will see what makes sense to upstream
* bump python version in tutorial ci workflow (#3439)
* [CPU] Reland qconv fp8 fusion passes (#3433)
* [Reland][PT2E][X86] Add Inductor fusion passes of float8 qconv for X86Inductor backend
* add torch version check for Qconv FP8 UTs
* fix format issue
* Skip tests for ROCm

  Co-authored-by: Sun, Jiayi <jiayi.sun@intel.com>
* Int8Tensor migration cleanup (#3407)
* Int8Tensor migration

  Summary: This PR creates a new Int8Tensor and updates the configs to use the new Int8Tensor flow

  Test Plan:

  To ensure BC:

  ```
  pytest test/quantization/test_quant_api.py
  ```

  To test new Int8Tensor:

  ```
  pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py
  ```
* ruff fixes
* add init
* fix ruff again
* update
* wip
* undo update tests
* fix ruff
* fix varname
* fix typing
* add tests
* fix dtype
* fix ci
* address granularity cr
* update _choose_quant_func_and_quantize_tensor
* make block size required attribute
* made dtype required as well
* address nits
* skip per tensor weight only test for now
* [xpu][test] Port 2 test/dtypes_{floatx, bitpacking} UT files to intel XPU (#3368)
* enable test/dtypes/test_bitpacking.py on intel xpu
* enable test/dtypes/test_floatx.py
* enable test/dtypes/test_floatx.py
* fix format issue
* fix format issue
* update _DEVICES
* [xpu][test] Port 2 test/quantization/pt2e/test_{quantize_pt2e, quantize_pt2e_qat} UT files to intel XPU (#3405)
* add test/quantization/pt2e/test_quantize_pt2e.py
* add test/quantization/pt2e/test_quantize_pt2e.py
* test/quantization/pt2e/test_quantize_pt2e_qat.py
* test/quantization/pt2e/test_quantize_pt2e_qat.py
* fix format issue
* update format
* increase timeout for xpu
* [Intel GPU] Enable optim SR test (#3055)
* updated test with rebase changes
* added checks to run only on CUDA with compatibility >=9
* updated test for H100
* added test to workflow

---------

Co-authored-by: Vasiliy Kuznetsov <vkuzo@users.noreply.github.com>
Co-authored-by: Daniel Vega-Myhre <danvm@meta.com>
Co-authored-by: Xia Weiwen <weiwen.xia@intel.com>
Co-authored-by: Sun, Jiayi <jiayi.sun@intel.com>
Co-authored-by: Jesse Cai <jessecai@meta.com>
Co-authored-by: xiangdong <40376367+zxd1997066@users.noreply.github.com>
Co-authored-by: Artur Lesniak <artur.lesniak@intel.com>

* [Reland][PT2E][X86] Add Inductor fusion passes of float8 qconv for X86Inductor backend
* add torch version check for Qconv FP8 UTs
* fix format issue
* Skip tests for ROCm

---------

Co-authored-by: Sun, Jiayi <jiayi.sun@intel.com>


jiayisunx and others added 4 commits December 3, 2025 01:57

[Reland][PT2E][X86] Add Inductor fusion passes of float8 qconv for X86Inductor backend

a333b60

add torch version check for Qconv FP8 UTs

d73ef6f

fix format issue

98268d2

Skip tests for ROCm

7838789

pytorch-bot Bot added the ci-no-td label Dec 4, 2025

meta-cla Bot added the CLA Signed label Dec 4, 2025

Xia-Weiwen marked this pull request as ready for review December 4, 2025 02:02

Xia-Weiwen added the module: not user facing label Dec 4, 2025

Xia-Weiwen requested review from jcaip and jerryzh168 December 4, 2025 03:24

Xia-Weiwen added the ciflow/rocm label Dec 4, 2025

jerryzh168 merged commit 7e0d439 into pytorch:main Dec 4, 2025
24 of 25 checks passed

jerryzh168 merged 4 commits into pytorch:main from Xia-Weiwen:reland_qconv_fp8


3 participants
