
Adapt dtensor tests to be device agnostic#154840

Closed
amathewc wants to merge 1 commit into pytorch:main from amathewc:dtensor5

Conversation

@amathewc
Contributor

@amathewc amathewc commented Jun 2, 2025

## MOTIVATION
This PR includes minor changes to skip some unsupported tests on Intel Gaudi devices as well as to make some of the tests more device agnostic.
Please refer to this RFC as well: pytorch/rfcs#66

## CHANGES

  • test_dtensor_compile.py: make some of the tests device agnostic (replace "cuda" hard-codings with self.device_type).
  • test_dtensor.py and test_comm_mode_features.py: skip some tests that are unsupported on Intel Gaudi devices.
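As a minimal sketch of the substitution described above (the helper function and variable names are illustrative, not code from the PR):

```python
import torch

# Illustrative sketch of the device-agnostic pattern: instead of
# hard-coding device="cuda", tests read self.device_type, which
# DTensorTestBase sets per backend ("cuda", "hpu", "xpu", or "cpu").
def make_ones(device_type: str) -> torch.Tensor:
    # Before: torch.ones(4, device="cuda")  # fails on non-CUDA backends
    return torch.ones(4, device=device_type)

# Stand-in for what the test base class would provide.
device_type = "cuda" if torch.cuda.is_available() else "cpu"
x = make_ones(device_type)
```

The same test body then runs unchanged on any backend that sets `device_type` appropriately.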

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k, @ankurneog, @EikanWang, @guangyey

@pytorch-bot

pytorch-bot bot commented Jun 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154840

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 2b5c4e2 with merge base 2908c10:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the oncall: distributed and topic: not user facing labels Jun 2, 2025
@amathewc
Contributor Author

amathewc commented Jun 2, 2025

@pytorchbot label "topic: not user facing"

@Skylion007
Collaborator

Should these be skipIfXPU, or should it be xfailIfXPU? We prefer the latter so we can enable them later when the functionality is fixed.
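The distinction raised here can be sketched with stdlib unittest (skipIfXPU/xfailIfXPU are PyTorch-internal decorators; `ON_XPU` is a stand-in for a real backend check):

```python
import unittest

ON_XPU = False  # stand-in for a real "running on XPU" check

class BackendMarkerExample(unittest.TestCase):
    @unittest.skipIf(ON_XPU, "unsupported on this backend")
    def test_skip_style(self):
        # A skipped test is silently dropped on that backend, so nobody
        # notices when the backend later gains support.
        self.assertEqual(1 + 1, 2)

    def test_xfail_style(self):
        # An expectedFailure flips to an "unexpected success" (a visible
        # signal) once the backend starts passing, prompting removal of
        # the marker -- which is why xfail is preferred over skip.
        self.assertEqual(1 + 1, 2)

if ON_XPU:
    # Mirrors a hypothetical conditional xfailIfXPU decorator.
    BackendMarkerExample.test_xfail_style = unittest.expectedFailure(
        BackendMarkerExample.test_xfail_style
    )
```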

@soulitzer soulitzer requested a review from bdhirsh June 2, 2025 15:58
@soulitzer soulitzer added the triaged label Jun 2, 2025
@EikanWang
Collaborator

This PR should be dedicated to Intel Gaudi. For Intel GPU (XPU), the feature is already supported. @zhangxiaoli73

@amathewc
Contributor Author

amathewc commented Jun 3, 2025

> This PR should be dedicated to Intel Gaudi. For Intel GPU (XPU), the feature is already supported.

Yes - this is specific for Intel Gaudi (HPU) devices.

@amathewc
Contributor Author

amathewc commented Jun 3, 2025

@albanD , @atalman : Could you help with merging this ?

@amathewc
Contributor Author

amathewc commented Jun 6, 2025

@albanD , @atalman : Could you help with merging this ? The failures seem to be unrelated to this PR.

@amathewc
Contributor Author

amathewc commented Jun 9, 2025

@pytorchmergebot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

Signed-off-by: Aby Mathew C <aby.mathew.c@intel.com>
@pytorchmergebot
Collaborator

Successfully rebased dtensor5 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout dtensor5 && git pull --rebase)

Collaborator

@albanD albanD left a comment


Sure

@amathewc
Contributor Author

> Sure

@albanD : Could you initiate the merging as well ?

@guangyey
Collaborator

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 10, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

pytorchmergebot pushed a commit that referenced this pull request Jun 21, 2025
## MOTIVATION
This PR is a continuation of #154840 and we are trying to make the tests more device agnostic by removing hard coded references to any particular device.
Please refer to this RFC as well: pytorch/rfcs#66

## CHANGES
1. test_convolution_ops.py:
    - Replace "cuda" with self.device_type
2. test_random_ops.py:
    - Remove the TYPE_DEVICE variable, since device_type is already set per device (environment) in the DTensorTestBase class.
    - Replace "cuda" with self.device_type

Pull Request resolved: #155687
Approved by: https://github.com/EikanWang, https://github.com/d4l3k
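A hedged sketch of the TYPE_DEVICE cleanup described in the commit message above (class names other than the `device_type` attribute are illustrative):

```python
import torch

class FakeDTensorTestBase:
    # Stand-in for DTensorTestBase, which derives device_type from the
    # environment so tests need no module-level TYPE_DEVICE variable.
    @property
    def device_type(self) -> str:
        return "cuda" if torch.cuda.is_available() else "cpu"

class RandomOpsExample(FakeDTensorTestBase):
    def run_random_op(self) -> torch.Tensor:
        # Before: TYPE_DEVICE = "cuda"; torch.rand(8, device=TYPE_DEVICE)
        return torch.rand(8, device=self.device_type)

t = RandomOpsExample().run_random_op()
```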

Labels

ciflow/trunk · Merged · oncall: distributed · open source · topic: not user facing · triaged


8 participants