
[xpu][feature] Supports woq_int8 inductor pattern on Intel GPU #163615

Closed
xiaowangintel wants to merge 3 commits into pytorch:main from xiaowangintel:xw/woq_int8_inductor

Conversation

@xiaowangintel (Contributor) commented Sep 23, 2025

Summary:

Supports the woq_int8 Inductor pattern on Intel GPU. When using torch.compile, woq_int8 is now lowered to _weight_int8pack_mm instead of falling back to mul().sum(). The Intel GPU backend of _weight_int8pack_mm was added in #160938.
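To make the summary concrete, here is a pure-Python illustration (no PyTorch required, and not the actual kernel code) of the math the pattern targets: the weight is stored as int8 with one floating-point scale per output channel, and the dequantized matmul can be computed either as a broadcast multiply followed by a sum over the reduction dimension (the eager fallback the summary mentions) or as a single matmul against the int8 weight followed by scaling (what _weight_int8pack_mm fuses). The shapes and values below are made up for the example.

```python
# Illustration of the woq_int8 math: y[m][n] = scales[n] * sum_k x[m][k] * w[n][k]
x = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]   # activations, shape [2, 3]
w_int8 = [[10, -20, 30], [-5, 15, 25]]    # int8 weight, shape [2, 3]
scales = [0.01, 0.02]                     # one fp scale per output channel

# fallback form: elementwise mul, then a sum over k (the mul().sum() pattern)
y_mul_sum = [
    [scales[n] * sum(x[m][k] * w_int8[n][k] for k in range(3)) for n in range(2)]
    for m in range(2)
]

# fused form: one matmul against the int8 weight, then per-channel scaling
# (the computation _weight_int8pack_mm performs in a single kernel)
def int8pack_mm(a, b, s):
    return [[s[j] * sum(ai * bj for ai, bj in zip(row, b[j]))
             for j in range(len(b))] for row in a]

y_mm = int8pack_mm(x, w_int8, scales)
assert y_mul_sum == y_mm  # both forms compute the same result
```

The Inductor pattern match rewrites the first form into the second, replacing several elementwise ops with one fused matmul.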

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @chenyang78

pytorch-bot bot commented Sep 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163615

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 4c3d0c5 with merge base 0449e64:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

def matcher_check_fn():
    self.assertEqual(
        counters["inductor"]["woq_matcher_count"], 0 if TEST_ACL else 1
    )

devices = ["cpu", "xpu"]
@etaf (Collaborator) commented Sep 23, 2025

Hi, this will fail on devices without an XPU.
I suggest you move this case to TestPatternMatcherGeneric, which is already designed to be reused by both cpu and xpu.
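The reviewer's point can be sketched generically: hard-coding devices = ["cpu", "xpu"] makes the test fail on machines with no XPU, whereas a device-generic test enumerates only the devices that are actually present. The class and helper names below are hypothetical stand-ins, not the real TestPatternMatcherGeneric machinery (which in PyTorch relies on instantiate_device_type_tests and availability checks like torch.xpu.is_available()).

```python
import unittest

def available_devices(xpu_available):
    # In real code the flag would come from torch.xpu.is_available();
    # it is a plain parameter here so the sketch runs anywhere.
    devices = ["cpu"]
    if xpu_available:
        devices.append("xpu")
    return devices

class TestPatternGeneric(unittest.TestCase):
    def test_runs_on_available_devices_only(self):
        # Iterate only over devices that exist on this machine.
        for device in available_devices(xpu_available=False):
            with self.subTest(device=device):
                # The real test would compile the model on `device` and
                # check the woq matcher counter; here we only check the
                # device string is one of the supported backends.
                self.assertIn(device, ("cpu", "xpu"))
```

With this shape, the same test body covers cpu-only machines and XPU machines without any hard-coded device list.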

@xiaowangintel (Contributor Author) replied:

Got it, I have modified it.

@jbschlosser jbschlosser added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) and intel (This tag is for PRs from Intel) labels Sep 24, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Sep 29, 2025
@etaf etaf added the ciflow/xpu Run XPU CI tasks label Oct 14, 2025
pytorch-bot bot commented Oct 14, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Oct 14, 2025
@etaf etaf added the ciflow/xpu Run XPU CI tasks label Oct 14, 2025
@etaf etaf requested a review from EikanWang October 14, 2025 01:05
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Oct 15, 2025
@etaf etaf added the ciflow/xpu Run XPU CI tasks label Oct 15, 2025
etaf commented Oct 16, 2025

@pytorchbot rebase -b main

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot

Successfully rebased xw/woq_int8_inductor onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout xw/woq_int8_inductor && git pull --rebase)

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Oct 16, 2025
@etaf etaf added the ciflow/xpu Run XPU CI tasks label Oct 16, 2025
etaf commented Oct 20, 2025

@pytorchbot rebase -b main

@pytorchmergebot

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot

Successfully rebased xw/woq_int8_inductor onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout xw/woq_int8_inductor && git pull --rebase)

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Oct 20, 2025
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.


@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/xpu Run XPU CI tasks labels Dec 16, 2025
@xiaowangintel

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 16, 2025
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot

Merge failed

Reason: 1 job has failed: linux-aarch64 / linux-jammy-aarch64-py3.10 / test (default, 3, 3, linux.arm64.m8g.4xlarge)

Details for Dev Infra team Raised by workflow job

@pytorch-bot pytorch-bot bot removed the ciflow/trunk Trigger trunk jobs on your pull request label Dec 18, 2025
@chuanqi129 chuanqi129 added the ciflow/xpu Run XPU CI tasks label Dec 18, 2025
@xiaowangintel

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 19, 2025
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

xgz2 pushed a commit that referenced this pull request Dec 22, 2025
Summary:

Supports the woq_int8 Inductor pattern on Intel GPU. When using torch.compile, woq_int8 is now lowered to _weight_int8pack_mm instead of falling back to mul().sum(). The Intel GPU backend of _weight_int8pack_mm was added in #160938.

Pull Request resolved: #163615
Approved by: https://github.com/etaf, https://github.com/EikanWang, https://github.com/desertfire, https://github.com/jansel
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
[xpu][feature] Supports woq_int8 inductor pattern on Intel GPU (pytorch#163615)

Summary:

Supports the woq_int8 Inductor pattern on Intel GPU. When using torch.compile, woq_int8 is now lowered to _weight_int8pack_mm instead of falling back to mul().sum(). The Intel GPU backend of _weight_int8pack_mm was added in pytorch#160938.

Pull Request resolved: pytorch#163615
Approved by: https://github.com/etaf, https://github.com/EikanWang, https://github.com/desertfire, https://github.com/jansel

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
ciflow/xpu (Run XPU CI tasks)
intel (This tag is for PRs from Intel)
Merged
module: inductor
open source
topic: not user facing (topic category)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants