[Inductor XPU] Fix wrong device check before skip concat linear. by etaf · Pull Request #140916 · pytorch/pytorch

etaf · 2024-11-18T01:04:05Z

Stack from ghstack (oldest at bottom):

[Inductor UT] Refine expected kernel count for XPU in test_compiled_optimizers. #141470
-> [Inductor XPU] Fix wrong device check before skip concat linear. #140916

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot · 2024-11-18T01:04:09Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140916

❌ 2 New Failures, 2 Unrelated Failures

As of commit d09f20d with merge base 3614d13:

NEW FAILURES - The following jobs have failed:

xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 3, 4, linux.idc.xpu)
    inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_xpu
xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 4, 4, linux.idc.xpu)
    inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_xpu

FLAKY - The following job failed but was likely due to flakiness present on trunk:

xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu) (similar failure)
    ##[error]An error has occurred while creating the zip file for upload

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

inductor / cuda12.4-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu) (#141498)
    convnext_base

This comment was automatically generated by Dr. CI and updates every 15 minutes.

eellison · 2024-11-18

The device initialization doesn't really matter here. We ignore devices in the generated pattern matching searches. If you want to skip for cpp, you should update the check functions and check device there.
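
As an illustration of the suggestion above, here is a minimal, self-contained sketch of a per-match device check. It is not the code from this PR or from PyTorch: `FakeTensorMeta`, `FakeMatch`, `concat_linear_check`, and the `enable_concat_linear_on_cpu` parameter are hypothetical stand-ins for the pattern matcher's match object and for the `config.cpp.enable_concat_linear` flag. The point it shows is that the skip decision is taken from the device of the matched input, not from which device was available when the patterns were registered.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTensorMeta:
    """Hypothetical stand-in for the metadata carried by a matched graph node."""

    device: str  # e.g. "cpu", "cuda", "xpu"


@dataclass
class FakeMatch:
    """Hypothetical stand-in for a pattern-matcher match object."""

    inputs: dict = field(default_factory=dict)


def concat_linear_check(match: FakeMatch, enable_concat_linear_on_cpu: bool) -> bool:
    """Return True if the concat-linear rewrite may be applied to this match.

    The decision is made per match, from the device of the matched input,
    so a CPU-only opt-out does not affect matches on other devices.
    """
    inp = match.inputs["inp"]
    if inp.device == "cpu" and not enable_concat_linear_on_cpu:
        return False
    return True


cpu_match = FakeMatch(inputs={"inp": FakeTensorMeta(device="cpu")})
xpu_match = FakeMatch(inputs={"inp": FakeTensorMeta(device="xpu")})

results = {
    "cpu, flag off": concat_linear_check(cpu_match, enable_concat_linear_on_cpu=False),
    "cpu, flag on": concat_linear_check(cpu_match, enable_concat_linear_on_cpu=True),
    "xpu, flag off": concat_linear_check(xpu_match, enable_concat_linear_on_cpu=False),
}
print(results)
```

With this shape, turning the CPU flag off rejects only CPU matches; an XPU or CUDA match is still accepted regardless of the flag.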

etaf · 2024-11-19T01:20:47Z

> The device initialization doesn't really matter here. We ignore devices in the generated pattern matching searches. If you want to skip for cpp, you should update the check functions and check device there.

Thanks @eellison, I've updated the code. Please take another look.

eellison · 2024-11-20T00:09:04Z

-        if not config.cpp.enable_concat_linear:
-            return
-
+    # workaround https://github.com/pytorch/pytorch/issues/97894


This workaround comment is not relevant.

etaf · 2024-11-20

Resolved, thanks.

etaf · 2024-11-20T05:10:02Z

@pytorchbot rebase

pytorchmergebot · 2024-11-20T05:11:30Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2024-11-20T05:11:41Z

Successfully rebased gh/etaf/67/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/140916)

etaf · 2024-11-26T00:33:55Z

xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 3, 4, linux.idc.xpu)
    inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_xpu
xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 4, 4, linux.idc.xpu)
    inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_xpu

The failed jobs are a known issue: #141466

@pytorchbot merge -i

pytorchmergebot · 2024-11-26T00:35:47Z

Merge started

Your change will be merged while ignoring the following 4 checks: xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 3, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 4, 4, linux.idc.xpu), inductor / cuda12.4-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot · 2024-11-26T06:34:17Z

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job waited more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

etaf · 2024-11-26T12:57:34Z

@pytorchbot merge

pytorchmergebot · 2024-11-26T12:59:26Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

pytorchmergebot · 2024-11-26T12:59:47Z

Merge failed

Reason: 2 jobs have failed, first few of them are: xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 3, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 4, 4, linux.idc.xpu)

etaf · 2024-11-26T13:22:25Z

@pytorchbot merge -i

pytorchmergebot · 2024-11-26T13:24:32Z

Merge started

Your change will be merged while ignoring the following 4 checks: xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 1, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 3, 4, linux.idc.xpu), xpu / linux-jammy-xpu-2025_0-py3.9 / test (default, 4, 4, linux.idc.xpu), inductor / cuda12.4-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu)

…orch#140916) Fix pytorch#140917 Pull Request resolved: pytorch#140916 Approved by: https://github.com/EikanWang, https://github.com/eellison

Update

24f582b

pytorch-bot Bot added ciflow/inductor module: inductor labels Nov 18, 2024

etaf changed the title ~~[Inductor XPU] Fix the missed XPU device check before skip enabling concat linear.~~ [Inductor XPU] Fix wrong device assign before skip enabling concat linear. Nov 18, 2024

etaf changed the title ~~[Inductor XPU] Fix wrong device assign before skip enabling concat linear.~~ [Inductor XPU] Fix wrong device assign before skip concat linear. Nov 18, 2024

pytorchbot added the open source label Nov 18, 2024

Update

83352fd

etaf added a commit that referenced this pull request Nov 18, 2024

[Inductor XPU] Fix the missed XPU device check before skip enabling c…

4341459

…oncat linear. ghstack-source-id: e953960 Pull Request resolved: #140916

etaf added the ciflow/xpu (Run XPU CI tasks) and topic: not user facing (topic category) labels Nov 18, 2024

chuanqi129 mentioned this pull request Nov 18, 2024

[CI] Upgrade XPU support packages version to 2025.0 #139775

Closed

etaf requested review from eellison and zhuhaozhe November 18, 2024 02:20

etaf changed the title ~~[Inductor XPU] Fix wrong device assign before skip concat linear.~~ [WIP][Inductor XPU] Fix wrong device assign before skip concat linear. Nov 18, 2024

etaf marked this pull request as draft November 18, 2024 06:11

eellison reviewed Nov 18, 2024

Update

d60b9d2

etaf added a commit that referenced this pull request Nov 19, 2024

[Inductor XPU] Fix the missed XPU device check before skip enabling c…

89357d7

…oncat linear. ghstack-source-id: e2216ad Pull Request resolved: #140916

etaf requested a review from eellison November 19, 2024 01:20

etaf changed the title ~~[WIP][Inductor XPU] Fix wrong device assign before skip concat linear.~~ [Inductor XPU] Fix wrong device assign before skip concat linear. Nov 19, 2024

etaf marked this pull request as ready for review November 19, 2024 01:21

etaf requested review from EikanWang and jgong5 November 19, 2024 01:22

etaf changed the title ~~[Inductor XPU] Fix wrong device assign before skip concat linear.~~ [Inductor XPU] Fix wrong device check before skip concat linear. Nov 19, 2024

EikanWang approved these changes Nov 19, 2024

eellison approved these changes Nov 20, 2024

Update

d2bd3cb

Update

c68bb51

etaf added a commit that referenced this pull request Nov 20, 2024

[Inductor XPU] Fix the missed XPU device check before skip enabling c…

521f372

…oncat linear. ghstack-source-id: e292c7f Pull Request resolved: #140916

etaf mentioned this pull request Nov 20, 2024

[XPU] Update XPU C Shim Header #141086

Closed

Update

d09f20d

etaf added a commit that referenced this pull request Nov 24, 2024

[Inductor XPU] Fix the missed XPU device check before skip enabling c…

4eb003e

…oncat linear. ghstack-source-id: 55b1ca0 Pull Request resolved: #140916

etaf mentioned this pull request Nov 25, 2024

[Inductor UT] Refine expected kernel count for XPU in test_compiled_optimizers. #141470

Closed

pytorch-bot Bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Nov 26, 2024

pytorchmergebot added the merging label Nov 26, 2024

pytorchmergebot removed the merging label Nov 26, 2024

pytorchmergebot added the merging label Nov 26, 2024

pytorchmergebot added the Merged label Nov 26, 2024

pytorchmergebot closed this in 9dd3b85 Nov 26, 2024

pytorchmergebot removed the merging label Nov 26, 2024

github-actions Bot deleted the gh/etaf/67/head branch December 27, 2024 02:06
