Update torch-xpu-ops pin (ATen XPU implementation) #135647

Closed
fengyuan14 wants to merge 1 commit into main from fy/update-xpu

Conversation

@fengyuan14
Collaborator

@fengyuan14 fengyuan14 commented Sep 11, 2024

Release cycle for PyTorch 2.5

  1. Fixes a runtime error on Windows: torch_xpu_ops_unary_binary_kernels.dll fails to load because the binary is too large.
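
For anyone trying to reproduce the symptom described above, a load failure of a large kernel library can be checked outside of PyTorch with a few lines of Python. This is only a diagnostic sketch, not the fix carried by this pin update: the helper name `try_load_library` and the example path are hypothetical, and the sketch only reports whether the loader accepts the file and how large the file is.

```python
import ctypes
import os


def try_load_library(path):
    """Try to load a shared library and report the outcome.

    Hypothetical helper for illustration only; it is not part of
    PyTorch or torch-xpu-ops. Returns a tuple (ok, message).
    """
    size_note = ""
    if os.path.exists(path):
        # Report the on-disk size, since the PR attributes the failure to it.
        size_mib = os.path.getsize(path) / (1024 * 1024)
        size_note = f" (file size: {size_mib:.1f} MiB)"
    try:
        ctypes.CDLL(path)  # raises OSError if the loader rejects the file
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc}{size_note}"
    return True, f"loaded{size_note}"


if __name__ == "__main__":
    # Assumed location; adjust to wherever the DLL lives in your install.
    ok, message = try_load_library("torch_xpu_ops_unary_binary_kernels.dll")
    print(ok, message)
```

A result of `False` here only confirms that the loader rejected the file; it does not by itself show that size is the cause.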

cc @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @gujinghui @EikanWang @guangyey

@fengyuan14 fengyuan14 added the labels open source, module: intel, ciflow/trunk, topic: not user facing, intel, ciflow/xpu, and module: xpu on Sep 11, 2024
@pytorch-bot

pytorch-bot bot commented Sep 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135647

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 137f072 with merge base cd9ee49:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@etaf
Collaborator

etaf commented Sep 11, 2024

The failed job xpu / linux-jammy-xpu-py3.9 / test (default, 4, 4, linux.idc.xpu) is caused by CUDA-biased code introduced by #135530.

@fengyuan14
Collaborator Author

#135656

@EikanWang
Collaborator

@pytorchbot merge -i

@pytorch-bot

pytorch-bot bot commented Sep 11, 2024

This PR needs to be approved by an authorized maintainer before merge.

@EikanWang
Collaborator

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: xpu / linux-jammy-xpu-py3.9 / test (default, 4, 4, linux.idc.xpu)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

Release cycle for PyTorch 2.5
1. Fixes a runtime error on Windows: torch_xpu_ops_unary_binary_kernels.dll fails to load because the binary is too large.

Signed-off-by: Feng Yuan <feng1.yuan@intel.com>
@zejun-chen
Contributor

Hi,

The following CI failure looks random and unrelated to the XPU profiler. The PR that enables the XPU profiler unit tests has not landed yet.

Running 3 items in this shard: test/profiler/test_cpp_thread.py::CppThreadTest::test_profile_memory, test/profiler/test_cpp_thread.py::CppThreadTest::test_with_enable_profiler_in_child_thread, test/profiler/test_cpp_thread.py::CppThreadTest::test_without_enable_profiler_in_child_thread

profiler/test_cpp_thread.py::CppThreadTest::test_profile_memory ERROR: External init callback must run in same thread as registerClient (1739282176 != 1212482176)

Thank you.

@EikanWang
Collaborator

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 2 checks: xpu / linux-jammy-xpu-py3.9 / test (default, 4, 4, linux.idc.xpu), trunk / linux-focal-cuda12.4-py3.10-gcc9-sm86 / test (default, 2, 5, linux.g5.4xlarge.nvidia.gpu)

fengyuan14 added a commit that referenced this pull request Sep 12, 2024
Release cycle for PyTorch 2.5
1. Fixes a runtime error on Windows: torch_xpu_ops_unary_binary_kernels.dll fails to load because the binary is too large.

Pull Request resolved: #135647
Approved by: https://github.com/EikanWang
fengyuan14 added a commit that referenced this pull request Sep 14, 2024
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Sep 20, 2024
kit1980 pushed a commit that referenced this pull request Sep 20, 2024
Update torch-xpu-ops pin (ATen XPU implementation) (#135647)

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
ciflow/xpu (Run XPU CI tasks)
intel (This tag is for PR from Intel)
Merged
module: intel (Specific to x86 architecture)
module: xpu (Intel XPU related issues)
open source
topic: not user facing (topic category)

5 participants