xpu: get xpu arch flags at runtime in cpp_extensions #152192
dvrogozh wants to merge 1 commit into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152192
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 96f1e66 with merge base 9608e7f (BROKEN TRUNK - the following job failed but was present on the merge base). 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"

@pytorchbot label "module: xpu"

@pytorchbot label "ciflow/xpu"

To add these label(s) (ciflow/xpu) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.
The last push to the PR is a fix for a linter issue; no other changes.

@pytorchbot rebase

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

Successfully rebased
```python
# for default spir64 target and avoid device specific compilations entirely. Further, kernels
# will be JIT compiled at runtime.
def _get_sycl_target_flags():
    if _get_sycl_arch_list() != '':
```
Why do you build the full arch list just to check if it's empty?
The user can either specify an empty arch list externally (via `TORCH_XPU_ARCH_LIST` when building their extension), or the default arch list can be empty if that's how the user's PyTorch was originally built (with an empty `TORCH_XPU_ARCH_LIST` at PyTorch build time). Further, an empty arch list is still a valid case, since it corresponds to JIT runtime compilation. This `if` handles the option differences between the JIT and AOT cases.
@pytorchbot rebase

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
This commit moves the query for XPU arch flags to runtime when building SYCL extensions, which allows adjusting `TORCH_XPU_ARCH_LIST` at the Python script level. That's handy, for example, in a CI test that tries a few variants of the list. Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
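To illustrate why the runtime query matters: a CI test can now rebuild the same extension under different arch lists simply by changing the environment between builds. The sketch below is hypothetical; the actual `torch.utils.cpp_extension.load(...)` call is only indicated in a comment.

```python
import os

# Hypothetical sketch: since arch flags are now queried when the extension is
# built (rather than frozen at module import), a test can iterate over lists.
tried = []
for arch_list in ['', 'pvc']:  # JIT-only first, then AOT for one device
    os.environ['TORCH_XPU_ARCH_LIST'] = arch_list
    # torch.utils.cpp_extension.load(name=..., sources=[...]) would pick up
    # the new value here and emit matching -fsycl-targets / -Xs flags.
    tried.append(arch_list)
```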
Successfully rebased

@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
```python
    '-fsycl',
    '-fsycl-targets=spir64_gen,spir64' if _get_sycl_arch_list() != '' else '',
]

_SYCL_DLINK_FLAGS = [
    *_COMMON_SYCL_FLAGS,
    '-fsycl-link',
    '--offload-compress',
    f'-Xs "-device {_get_sycl_arch_list()}"' if _get_sycl_arch_list() != '' else '',
]
```
Hi @dvrogozh, I have a question on this PR: why remove '-fsycl-targets=spir64_gen,spir64' from _COMMON_SYCL_FLAGS? This causes '-fsycl-targets' to be lost from _SYCL_DLINK_FLAGS and makes the SYCL kernel link fail like below:

```
RuntimeError: No kernel named _ZTS26AWQDequantizeKernelFunctorIN4sycl3_V16detail9half_impl4halfES4_Li16ELi16ELi16EE was found
```
Per my understanding, -fsycl-targets is a compiler flag, not a linker flag. Tests for the SYCL extension which build test SYCL kernels and execute them do pass after this change. Can you please be more specific about the scenario where you see the noted issue and file a bug?
@dvrogozh According to the SYCL User's Manual, the -fsycl-targets option is typically specified during the linking stage when AOT (Ahead-Of-Time) compilation is enabled: "Normally, '-fsycl-targets' is specified when linking an application, in which case the AOT compiled device binaries are embedded within the application's fat executable."
Thank you for the link, this helps. However, this documentation notes two possible usages: one when linking the application, another when calling the compiler. As of now the SYCL cpp extension in PyTorch follows the second scheme, and I do not observe compilation, linker, or runtime errors associated with that, including the one @pengxin99 notes above. Snapshots which include compile/link command lines from running PyTorch side tests are below.

@pengxin99, is your statement of the error an actual observation which you see using the SYCL cpp extension, or speculation about what might happen? If it's an actual observation, please file an issue here on GitHub and describe the repro case. If you don't see the error but believe there is something to improve in SYCL cpp extension support, again, file an issue and describe what you want to achieve which currently can't be achieved, or what needs to be changed and why. Either way, I strongly believe this discussion needs to continue in a dedicated issue where you can link back to this thread.
```
[1/3] icpx -DTORCH_EXTENSION_NAME=torch_test_xpu_extension_ZLcQS -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1018\" -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include/torch/csrc/api/include -isystem /home/dvrogozh/miniforge3/include/python3.12 -fPIC -std=c++17 -fsycl -fsycl-targets=spir64_gen,spir64 -sycl-std=2020 -fsycl-host-compiler=c++ '-fsycl-host-compiler-options=-DTORCH_EXTENSION_NAME=torch_test_xpu_extension_ZLcQS -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\\"_gcc\\" -DPYBIND11_STDLIB=\\"_libstdcpp\\" -DPYBIND11_BUILD_ABI=\\"_cxxabi1018\\" -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include/torch/csrc/api/include -isystem /home/dvrogozh/miniforge3/include/python3.12 -fPIC -std=c++17' -c -x c++ /home/dvrogozh/git/pytorch/pytorch/test/cpp_extensions/xpu_extension.sycl -o xpu_extension.sycl.o
[2/3] icpx xpu_extension.sycl.o -o sycl_dlink.o -fsycl -fsycl-link --offload-compress -Xs "-device pvc" -Xs "-device pvc"
```
Hi @dvrogozh, you can reproduce this issue by following: https://github.com/pengxin99/sglang/blob/xpu_build_back_issue_repro/sgl-kernel/README.md
Hi @pengxin99, thank you for the reproducer. This helps; I can reproduce the issue. The reason you stepped into the problem is that you've specified a custom "-fsycl-targets=spir64_gen" option. This overrides the default set by the cpp extension, -fsycl-targets=spir64,spir64_gen. If you drop it, the runtime error will go away and tests will start to pass. If your goal is to support SYCL kernels in sglang, then you probably want to drop this option anyway and use the default, which supports both AOT targets (via spir64_gen) and JIT compilation (via spir64). There are some other options which you don't need to set, as these are set by the cpp extension by default: -fsycl and -std=c++17. @pengxin99: can you consider changing this on your side to be like below?

```python
extra_compile_args = {
    "cxx": ["-O3"],
    "sycl": ["-ffast-math", "-fsycl-device-code-split=per_kernel"],
}
```

That being said, the behavior you've noted is indeed an actual issue. Effectively, AOT compilation is currently not happening with the SYCL cpp extension; instead, JIT compilation happens (via the spir64 target). The JIT compilation is the reason why I overlooked the problem: the test did not fail. We need to adjust the test to cover such a case, i.e. we need a case where we explicitly disable JIT compilation and have only AOT.
@dvrogozh, Thanks for the explanation, will follow your suggestion in my project.
Commit fixes AOT compilation in the SYCL cpp extension, which got accidentally dropped in aca2c99 (a fallback to JIT compilation had happened). Commit also fixes the override logic for default SYCL targets, allowing the flexibility to specify targets externally. Further, commit extends test coverage to cover such a case and fixes an issue in the test where subsequent tests executed the same (first) compiled extension due to name conflicts. Fixes: #156249 Fixes: aca2c99 ("xpu: get xpu arch flags at runtime in cpp_extensions (#152192)") Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> CC: @pengxin99, @guangyey Pull Request resolved: #156364 Approved by: https://github.com/ezyang
This commit moves the query for XPU arch flags to runtime when building SYCL extensions, which allows adjusting `TORCH_XPU_ARCH_LIST` at the Python script level. That's handy, for example, in a CI test that tries a few variants of the list. CC: @malfet, @jingxu10, @EikanWang, @guangyey
cc @gujinghui @EikanWang @fengyuan14 @guangyey