[XPU] Update XPU C Shim Header by ratnampa · Pull Request #141086 · pytorch/pytorch

ratnampa · 2024-11-20T01:36:51Z

Caused by these commits: 34b2165 and 34e4205

The Windows XPU builds are failing: https://github.com/pytorch/pytorch/actions/runs/11922274722/job/33228175750
due to a recently merged PR that changed the fallback ops: 34e4205

This PR updates the XPU C shim header file to fix these build failures.

pytorch-bot · 2024-11-20T01:36:55Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141086

❌ 1 New Failure, 3 Unrelated Failures

As of commit 293b93e with merge base 8b4ae29:

NEW FAILURE - The following job has failed:

xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu) (gh)
##[error]An error has occurred while creating the zip file for upload

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 1, 2, linux.rocm.gpu.2) (gh) (similar failure)
##[error]Credentials could not be loaded, please check your action inputs: Could not load credentials from any providers
inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2) (gh) (similar failure)
##[error]Credentials could not be loaded, please check your action inputs: Could not load credentials from any providers

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (trunk failure)
convnext_base

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jansel

Failing tests?

etaf · 2024-11-20T06:49:05Z

Failing tests?

Hi @jansel, the failed job xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu) (gh) is a known issue tracked in #140917 and fixed by PR #140916 (which is itself blocked on this PR to fix the build issue).

The Lint / lintrunner-noclang / linux-job (gh) failure,

Lint for test/test_nestedtensor.py:

should be unrelated, because this PR does not change test/test_nestedtensor.py.

jansel · 2024-11-21T06:06:03Z

Why isn't pytorchbot flagging these as preexisting failures? Can you rebase to viable/strict?

etaf · 2024-11-21T06:08:10Z

@pytorchbot rebase

pytorchmergebot · 2024-11-21T06:09:47Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2024-11-21T06:09:50Z

Successfully rebased ratnampa/update_xpu_c_shim onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout ratnampa/update_xpu_c_shim && git pull --rebase)

dvrogozh

@ratnampa: I suggest updating the PR description and commit message with links to the issue (#141268) and the two commits (34b2165 and 34e4205) that caused it:

Fixes: #141268
Fixes: 34b2165bdb5 ("Insert aten.add into fallback_ops...")
Fixes: 34e420519df ("[Reland] dont decompose baddbmm (#141045)")

dvrogozh · 2024-11-21T20:05:30Z

 AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__addmm_activation(AtenTensorHandle self, AtenTensorHandle mat1, AtenTensorHandle mat2, double beta, double alpha, int32_t use_gelu, AtenTensorHandle* ret0);
 AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__fused_moving_avg_obs_fq_helper_functional(AtenTensorHandle self, AtenTensorHandle observer_on, AtenTensorHandle fake_quant_on, AtenTensorHandle running_min, AtenTensorHandle running_max, AtenTensorHandle scale, AtenTensorHandle zero_point, double averaging_const, int64_t quant_min, int64_t quant_max, int64_t ch_axis, int32_t per_row_fake_quant, int32_t symmetric_quant, AtenTensorHandle* ret0, AtenTensorHandle* ret1, AtenTensorHandle* ret2, AtenTensorHandle* ret3, AtenTensorHandle* ret4, AtenTensorHandle* ret5);
 AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__trilinear(AtenTensorHandle i1, AtenTensorHandle i2, AtenTensorHandle i3, const int64_t* expand1, int64_t expand1_len_, const int64_t* expand2, int64_t expand2_len_, const int64_t* expand3, int64_t expand3_len_, const int64_t* sumdim, int64_t sumdim_len_, int64_t unroll_dim, AtenTensorHandle* ret0);
+AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu_add_Scalar(AtenTensorHandle self, double other, double alpha, AtenTensorHandle* ret0);


I am confused here. Commit 34b2165 added two items to torchgen/aoti/fallback_ops.py:

"aten.add.Scalar",

"aten.add.Tensor"

c_shim_cuda.h and c_shim_cpu.h each gained two lines for these functions. However, `python torchgen/gen.py --update-aoti-c-shim --xpu` adds only aten.add.Scalar. The build passes for me with that and the issue is gone. @xytintel, @fengyuan14, all: can someone explain why only one line was added for XPU? Is that expected? Is there some special logic in the generation for XPU?

EikanWang · 2024-11-22

@dvrogozh aten.add.Tensor is well supported by Inductor. We do not need to fall back to ATen.

etaf · 2024-11-22

@dvrogozh @EikanWang, unlike aten.add.Scalar, aten.add.Tensor is registered in torch-xpu-ops as an out-of-tree op, and its AOTI API is generated in torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h.

desertfire · 2024-11-22

@dvrogozh aten.add.Tensor is well supported by Inductor. We do not need to fall back to ATen.

Annoyingly, it was needed for complex numbers.
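
The relationship discussed in this thread can be illustrated with a minimal sketch. It assumes, based only on the declarations in the diff above, that a fallback op such as aten.add.Scalar corresponds to a shim symbol named aoti_torch_<device>_<op>_<overload>. The helper names `shim_symbol` and `missing_declarations` and the toy header string are hypothetical; this is not torchgen's actual implementation.

```python
import re


def shim_symbol(op: str, device: str) -> str:
    """Build the assumed shim symbol name for a fallback op.

    Example: 'aten.add.Scalar' on 'xpu' -> 'aoti_torch_xpu_add_Scalar'.
    The naming rule is inferred from the header lines in this PR.
    """
    parts = op.split(".")
    if parts[0] != "aten" or len(parts) < 2:
        raise ValueError(f"expected an op like 'aten.name[.overload]', got {op!r}")
    # Op name and optional overload are joined with underscores.
    return f"aoti_torch_{device}_" + "_".join(parts[1:])


def missing_declarations(ops, header_text, device):
    """Return the ops whose assumed shim symbol is not declared in header_text."""
    declared = set(re.findall(r"\b(aoti_torch_\w+)\s*\(", header_text))
    return [op for op in ops if shim_symbol(op, device) not in declared]


# Toy header: two declarations copied from the diff in this PR.
header = """
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__addmm_activation(AtenTensorHandle self, AtenTensorHandle mat1, AtenTensorHandle mat2, double beta, double alpha, int32_t use_gelu, AtenTensorHandle* ret0);
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu_add_Scalar(AtenTensorHandle self, double other, double alpha, AtenTensorHandle* ret0);
"""

fallback_ops = ["aten._addmm_activation", "aten.add.Scalar", "aten.add.Tensor"]
missing = missing_declarations(fallback_ops, header, "xpu")
print(missing)
```

In this toy run only aten.add.Tensor is reported. Per the reply above, that declaration is generated into torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h, so a real check would need to scan both headers before treating the op as missing.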

guangyey · 2024-11-22T02:12:33Z

@pytorchbot rebase

pytorchmergebot · 2024-11-22T02:14:00Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

atalman · 2024-11-22T21:36:34Z

@chuanqi129 please fix lint. The rest of the errors seem to be unrelated.

EikanWang · 2024-11-23T04:27:00Z

@atalman, the lint error is not related to this PR, as this PR does not touch rnn.py. Let me rerun the lint CI.

etaf · 2024-11-23T08:57:00Z

@chuanqi129 please fix lint. The rest of the errors seem to be unrelated.

Hi @atalman, the lint error also occurs on the main branch, so it seems unrelated to this PR: https://hud.pytorch.org/failure?name=Lint%20%2F%20lintrunner-noclang%20%2F%20linux-job&jobName=lintrunner-noclang%20%2F%20linux-job&failureCaptures=%5B%22%3E%3E%3E%20Lint%20for%20torch%2Fao%2Fnn%2Fquantizable%2Fmodules%2Frnn.py%3A%22%5D

guangyey · 2024-11-23T10:32:04Z

@pytorchbot rebase

pytorchmergebot · 2024-11-23T10:34:30Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2024-11-23T10:34:34Z

Successfully rebased ratnampa/update_xpu_c_shim onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout ratnampa/update_xpu_c_shim && git pull --rebase)

guangyey · 2024-11-23T10:35:19Z

We will rebase this PR to the latest viable/strict branch and land this PR once there are no other failures.

guangyey · 2024-11-23T13:37:42Z

@pytorchbot rebase

pytorchmergebot · 2024-11-23T13:39:12Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2024-11-23T13:39:15Z

Successfully rebased ratnampa/update_xpu_c_shim onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout ratnampa/update_xpu_c_shim && git pull --rebase)

EikanWang · 2024-11-24T09:03:38Z

@pytorchbot merge

pytorchmergebot · 2024-11-24T09:05:35Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2024-11-24T10:48:02Z

Merge failed

Reason: 1 job has failed: xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu)

Details for Dev Infra team

Raised by workflow job

EikanWang · 2024-11-24T12:16:19Z

@pytorchbot merge -i

pytorchmergebot · 2024-11-24T12:18:35Z

Merge started

Your change will be merged while ignoring the following 4 checks: inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 1, 2, linux.rocm.gpu.2), inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2), inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu)

Fixes pytorch#141268

Caused by these commits: pytorch@34b2165 and pytorch@34e4205

The windows XPU builds are failing: https://github.com/pytorch/pytorch/actions/runs/11922274722/job/33228175750
due to recent PR merge with changes in fallback ops: pytorch@34e4205

This PR updates the XPU C Shim header file to overcome these build failures.

Pull Request resolved: pytorch#141086
Approved by: https://github.com/etaf, https://github.com/EikanWang, https://github.com/jansel, https://github.com/malfet, https://github.com/dvrogozh, https://github.com/desertfire

ratnampa changed the title ~~Update XPU C Shim Header~~ [XPU] Update XPU C Shim Header Nov 20, 2024

pytorchbot added the open source label Nov 20, 2024

EikanWang added the topic: not user facing and ciflow/xpu labels Nov 20, 2024

EikanWang requested a review from etaf November 20, 2024 02:13

etaf approved these changes Nov 20, 2024

EikanWang approved these changes Nov 20, 2024

EikanWang requested a review from desertfire November 20, 2024 02:54

etaf requested a review from jansel November 20, 2024 05:18

jansel requested changes Nov 20, 2024

etaf requested a review from jansel November 20, 2024 06:49

pytorchmergebot force-pushed the ratnampa/update_xpu_c_shim branch from 2e5a84b to 7ea7b99 Compare November 21, 2024 06:09

chuanqi129 mentioned this pull request Nov 21, 2024

Modify deps for ci/nightly with 2025.0 intel/torch-xpu-ops#1110

Merged

ratnampa mentioned this pull request Nov 21, 2024

[Break XPU] xpu: build fails for XPU backend due to outdated aoti_torch/generated/c_shim_xpu.h #141268

Closed

dvrogozh mentioned this pull request Nov 21, 2024

xpu: fix build after updates of torchgen/aoti/fallback_ops.py #141269

Closed

dvrogozh reviewed Nov 21, 2024

dvrogozh mentioned this pull request Nov 21, 2024

Update torch-xpu-ops commit pin #141201

Closed

malfet approved these changes Nov 21, 2024

dvrogozh approved these changes Nov 21, 2024

jansel approved these changes Nov 21, 2024

pytorch-bot Bot added the ciflow/inductor label Nov 21, 2024

desertfire approved these changes Nov 22, 2024

pytorchmergebot force-pushed the ratnampa/update_xpu_c_shim branch from f633808 to 03a9ec0 Compare November 22, 2024 08:57

pytorchmergebot force-pushed the ratnampa/update_xpu_c_shim branch from 03a9ec0 to aa96325 Compare November 23, 2024 10:34

Update XPU C Shim Header

293b93e

pytorchmergebot force-pushed the ratnampa/update_xpu_c_shim branch from aa96325 to 293b93e Compare November 23, 2024 13:39

pytorchmergebot added the merging label Nov 24, 2024

pytorchmergebot removed the merging label Nov 24, 2024

pytorchmergebot added the merging label Nov 24, 2024

pytorchmergebot added the Merged label Nov 24, 2024

pytorchmergebot closed this in 3614d13 Nov 24, 2024

pytorchmergebot removed the merging label Nov 24, 2024

EikanWang mentioned this pull request Nov 24, 2024

Fix access to _msvccompiler from newer distutils #141363

Closed
