[XPU] Update XPU C Shim Header #141086
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/141086
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 3 Unrelated Failures
As of commit 293b93e with merge base 8b4ae29:
NEW FAILURE - The following job has failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following job failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Hi @jansel, the failed job xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu) (gh) is a known issue tracked in #140917 and fixed by PR #140916 (which is blocked on this PR, since this PR fixes the build issue). The Lint / lintrunner-noclang / linux-job (gh) failure should be unrelated, because this PR does not change test/test_nestedtensor.py. |
|
Why isn't pytorchbot flagging these as preexisting failures? Can you rebase to viable/strict? |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
2e5a84b to 7ea7b99
dvrogozh left a comment:
@ratnampa: I suggest updating the PR description and commit message to add links to the issue (#141268) and the 2 commits (34b2165 and 34e4205) that caused it:
Fixes: #141268
Fixes: 34b2165bdb5 ("Insert aten.add into fallback_ops...")
Fixes: 34e420519df ("[Reland] dont decompose baddbmm (#141045)")
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__addmm_activation(AtenTensorHandle self, AtenTensorHandle mat1, AtenTensorHandle mat2, double beta, double alpha, int32_t use_gelu, AtenTensorHandle* ret0);
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__fused_moving_avg_obs_fq_helper_functional(AtenTensorHandle self, AtenTensorHandle observer_on, AtenTensorHandle fake_quant_on, AtenTensorHandle running_min, AtenTensorHandle running_max, AtenTensorHandle scale, AtenTensorHandle zero_point, double averaging_const, int64_t quant_min, int64_t quant_max, int64_t ch_axis, int32_t per_row_fake_quant, int32_t symmetric_quant, AtenTensorHandle* ret0, AtenTensorHandle* ret1, AtenTensorHandle* ret2, AtenTensorHandle* ret3, AtenTensorHandle* ret4, AtenTensorHandle* ret5);
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__trilinear(AtenTensorHandle i1, AtenTensorHandle i2, AtenTensorHandle i3, const int64_t* expand1, int64_t expand1_len_, const int64_t* expand2, int64_t expand2_len_, const int64_t* expand3, int64_t expand3_len_, const int64_t* sumdim, int64_t sumdim_len_, int64_t unroll_dim, AtenTensorHandle* ret0);
AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu_add_Scalar(AtenTensorHandle self, double other, double alpha, AtenTensorHandle* ret0);
I am confused here. Commit 34b2165 added 2 items to torchgen/aoti/fallback_ops.py:
- "aten.add.Scalar",
- "aten.add.Tensor"
c_shim_cuda.h and c_shim_cpu.h each have 2 lines updated with the above functions. However, `python torchgen/gen.py --update-aoti-c-shim --xpu` adds only aten.add.Scalar. The build passes for me with that and the issue is gone. @xytintel, @fengyuan14, all: can someone explain why only 1 line got updated for XPU? Is that expected? Is there some special logic in the generation for XPU?
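One way to see exactly what the regeneration changed is to list the shim symbols declared in a header before and after running the command. The following is a minimal sketch, assuming declarations follow the `AOTI_TORCH_EXPORT AOTITorchError <name>(...)` shape shown in the diff above. `declared_shims` is a hypothetical helper, not part of torchgen, and the two header strings are shortened stand-ins for the real files.

```python
import re

# Captures the symbol name in declarations shaped like
# "AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu_add_Scalar(...);"
_DECL_RE = re.compile(r"AOTI_TORCH_EXPORT\s+AOTITorchError\s+(\w+)\s*\(")


def declared_shims(header_text: str) -> set:
    """Return the set of shim function names declared in a header's text."""
    return set(_DECL_RE.findall(header_text))


# Stand-in for the header before regeneration (one declaration from the diff).
old_header = (
    "AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu__addmm_activation("
    "AtenTensorHandle self, AtenTensorHandle mat1, AtenTensorHandle mat2, "
    "double beta, double alpha, int32_t use_gelu, AtenTensorHandle* ret0);\n"
)

# Stand-in for the header after regeneration: one more declaration.
new_header = old_header + (
    "AOTI_TORCH_EXPORT AOTITorchError aoti_torch_xpu_add_Scalar("
    "AtenTensorHandle self, double other, double alpha, AtenTensorHandle* ret0);\n"
)

added = declared_shims(new_header) - declared_shims(old_header)
print(sorted(added))  # ['aoti_torch_xpu_add_Scalar']
```

Running the same comparison on the real CPU, CUDA, and XPU headers would show which backends gained a declaration for each new fallback op.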
@dvrogozh aten.add.Tensor is well supported by Inductor, so we do not need to fall back to ATen for it.
@dvrogozh @EikanWang, unlike aten.add.Scalar, aten.add.Tensor is registered in torch-xpu-ops as an out-of-tree op, and its AOTI API is generated in torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h.
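The split described in this comment can be pictured with a small sketch. It is an illustration only and not torchgen's actual logic: the symbol naming is inferred from the declarations in this diff (for example, aten.add.Scalar appears as `aoti_torch_xpu_add_Scalar`), and `OUT_OF_TREE_XPU_OPS`, `shim_symbol`, and `split_by_header` are hypothetical names.

```python
# Hypothetical: ops that torch-xpu-ops registers out of tree, per the comment above.
OUT_OF_TREE_XPU_OPS = {"aten.add.Tensor"}


def shim_symbol(op: str, device: str = "xpu") -> str:
    """Map an op such as 'aten.add.Scalar' to 'aoti_torch_xpu_add_Scalar'.

    The naming pattern is inferred from the header lines in this PR.
    """
    namespace, _, rest = op.partition(".")
    if namespace != "aten" or not rest:
        raise ValueError(f"expected an 'aten.<op>' name, got {op!r}")
    return f"aoti_torch_{device}_" + rest.replace(".", "_")


def split_by_header(ops):
    """Partition ops into (in-tree header symbols, extend header symbols)."""
    in_tree, extend = [], []
    for op in ops:
        target = extend if op in OUT_OF_TREE_XPU_OPS else in_tree
        target.append(shim_symbol(op))
    return in_tree, extend


in_tree, extend = split_by_header(["aten.add.Scalar", "aten.add.Tensor"])
print(in_tree)  # ['aoti_torch_xpu_add_Scalar']
print(extend)   # ['aoti_torch_xpu_add_Tensor']
```

Under these assumptions, only aten.add.Scalar lands in the in-tree XPU header, which matches the single added line observed after regeneration.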
@dvrogozh
> aten.add.Tensor can be well supported by the Inductor. We do not need to fall back to Aten.
Annoyingly, it was needed for complex numbers.
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
f633808 to 03a9ec0
|
@chuanqi129 please fix lint. The rest of the errors seem to be unrelated. |
|
@atalman, the lint error is not related to this PR, since this PR does not touch rnn.py. Let me rerun the lint CI. |
Hi @atalman, the lint error also happens on the main branch, so it seems unrelated to this PR: https://hud.pytorch.org/failure?name=Lint%20%2F%20lintrunner-noclang%20%2F%20linux-job&jobName=lintrunner-noclang%20%2F%20linux-job&failureCaptures=%5B%22%3E%3E%3E%20Lint%20for%20torch%2Fao%2Fnn%2Fquantizable%2Fmodules%2Frnn.py%3A%22%5D |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
03a9ec0 to aa96325
|
We will rebase this PR to the latest |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
aa96325 to 293b93e
|
@pytorchbot merge |
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failed
Reason: 1 jobs have failed, first few of them are: xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu)
Details for Dev Infra team: Raised by workflow job |
|
@pytorchbot merge -i |
Merge started
Your change will be merged while ignoring the following 4 checks: inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 1, 2, linux.rocm.gpu.2), inductor-rocm / rocm6.2-py3.10-inductor / test (inductor, 2, 2, linux.rocm.gpu.2), inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu)
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fixes pytorch#141268
Caused by these commits: pytorch@34b2165 and pytorch@34e4205
The windows XPU builds are failing: https://github.com/pytorch/pytorch/actions/runs/11922274722/job/33228175750 due to recent PR merge with changes in fallback ops: pytorch@34e4205
This PR updates the XPU C Shim header file to overcome these build failures.
Pull Request resolved: pytorch#141086
Approved by: https://github.com/etaf, https://github.com/EikanWang, https://github.com/jansel, https://github.com/malfet, https://github.com/dvrogozh, https://github.com/desertfire

Fixes #141268
Caused by these commits: 34b2165 and 34e4205
The Windows XPU builds are failing (https://github.com/pytorch/pytorch/actions/runs/11922274722/job/33228175750)
due to a recently merged PR that changed the fallback ops: 34e4205
This PR updates the XPU C Shim header file to overcome these build failures.