[Inductor][Intel GPU] Save threads_per_warp from Triton compiled kernel for launching kernel correctly in cpp wrapper. #163388

Merged
atalman merged 1 commit into release/2.9 from cherry-pick-163315-by-pytorch_bot_bot_ on Sep 26, 2025

@pytorchbot
Collaborator

Stack from ghstack (oldest at bottom):

On the Inductor XPU backend, `threads_per_warp` is not always 32; for Intel GEMM Triton kernels it can be 16. This value must be saved for XPU so that the cpp wrapper can launch the kernel with the correct configuration.
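
To illustrate why the value matters, here is a minimal sketch, not the code from this PR. It assumes the launcher derives the work-group size as `num_warps * threads_per_warp`. The names `KernelLaunchMeta`, `save_launch_meta`, `threads_per_block`, and the metadata dictionary keys are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: 32 is the fallback used when no value was saved.
DEFAULT_THREADS_PER_WARP = 32


@dataclass(frozen=True)
class KernelLaunchMeta:
    """Hypothetical record of what a wrapper needs to launch a kernel."""
    num_warps: int
    threads_per_warp: int = DEFAULT_THREADS_PER_WARP


def save_launch_meta(compiled_metadata: dict) -> KernelLaunchMeta:
    """Copy launch parameters out of a (hypothetical) compiled-kernel metadata dict.

    If threads_per_warp is absent, fall back to the default of 32.
    """
    return KernelLaunchMeta(
        num_warps=compiled_metadata["num_warps"],
        threads_per_warp=compiled_metadata.get(
            "threads_per_warp", DEFAULT_THREADS_PER_WARP
        ),
    )


def threads_per_block(meta: KernelLaunchMeta) -> int:
    """Total threads per work-group, assuming num_warps * threads_per_warp."""
    return meta.num_warps * meta.threads_per_warp


# A kernel compiled with 16 threads per warp (as the description says
# Intel GEMM Triton kernels can be) versus one where the value was dropped.
saved = save_launch_meta({"num_warps": 8, "threads_per_warp": 16})
dropped = save_launch_meta({"num_warps": 8})
print(threads_per_block(saved), threads_per_block(dropped))
```

Under these assumptions, dropping the saved value would launch the kernel with twice as many threads per work-group as it was compiled for, which is the kind of mismatch the description refers to.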

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

…rnel for launching kernel correctly in cpp wrapper. (#163315)

On the Inductor XPU backend, `threads_per_warp` is not always 32. For Intel GEMM Triton kernels, it can be 16. This information must be preserved for XPU so that the Cpp wrapper can launch the kernel with the correct configuration.

Pull Request resolved: #163315
Approved by: https://github.com/EikanWang, https://github.com/desertfire

(cherry picked from commit 9f8a311)
@pytorch-bot

pytorch-bot Bot commented Sep 20, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163388

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 1f213fa with merge base 4840a1a:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@etaf etaf requested a review from atalman September 23, 2025 06:14
@etaf
Collaborator

etaf commented Sep 23, 2025

Hi @atalman, this cherry-pick for release/2.9 fixes the new feature that supports flex attention on the Inductor Intel GPU backend. Could you please review it?

@atalman atalman merged commit 7cadf8a into release/2.9 Sep 26, 2025
146 of 150 checks passed
@github-actions github-actions Bot deleted the cherry-pick-163315-by-pytorch_bot_bot_ branch October 27, 2025 02:19
5 participants