Fixes assertEqual and assert_close for DTensor #176895

Closed
arkadip-maitra wants to merge 1 commit into pytorch:main from
arkadip-maitra:fix_167549

Conversation

@arkadip-maitra
Collaborator

@arkadip-maitra arkadip-maitra commented Mar 9, 2026

Fixes #167549

`assertEqual` and `assert_close` crashed with an ambiguous message when a DTensor was compared against a plain tensor. They now check the DTensor specs, unwrap to local tensors, and give a clean error message.

Modified some existing test asserts.

Authored with Claude

@arkadip-maitra arkadip-maitra requested a review from a team as a code owner March 9, 2026 17:14
@pytorch-bot

pytorch-bot bot commented Mar 9, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@pytorch-bot

pytorch-bot bot commented Mar 9, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176895

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 7ab42be with merge base 99dee05:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@arkadip-maitra
Collaborator Author

@pytorchbot rebase -b viable/strict

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Tried to rebase and push PR #176895, but it was already up to date. Try rebasing against main by issuing:
@pytorchbot rebase -b main

@arkadip-maitra
Collaborator Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased fix_167549 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix_167549 && git pull --rebase)

# handle DTensor cases explicitly
try:
    from torch.distributed.tensor import DTensor
except ImportError:
Contributor

We shouldn't use the try/except pattern for cases where we have to conditionally import. Search the code for uses of importlib.find.. something like that. Also, it'd be better to do the HAS_DTENSOR check once at common_utils.py import time and then check that constant from here, rather than run the find import function on every call.

Collaborator Author

thanks, updated.
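
For readers following this thread, here is a minimal sketch of the import-time check suggested above. It assumes the elided function is `importlib.util.find_spec` from the standard library. The helper names `module_available` and `is_dtensor` are hypothetical; only the `HAS_DTENSOR` name comes from the review comment, and the code that actually landed in `common_utils.py` may differ.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if every package along dotted_name can be located.

    Hypothetical helper. find_spec() imports the parent of a dotted name,
    so each prefix is checked first to avoid ModuleNotFoundError when a
    parent package (for example torch itself) is missing.
    """
    parts = dotted_name.split(".")
    for i in range(1, len(parts) + 1):
        if importlib.util.find_spec(".".join(parts[:i])) is None:
            return False
    return True


# Evaluated once at import time; call sites only read the constant.
HAS_DTENSOR = module_available("torch.distributed.tensor")


def is_dtensor(obj: object) -> bool:
    """Per-call check that never re-runs the module lookup."""
    if not HAS_DTENSOR:
        return False
    # After the first call this import is served from sys.modules.
    from torch.distributed.tensor import DTensor
    return isinstance(obj, DTensor)
```

Locating a module spec does not guarantee that the import itself succeeds on every build, so a real check may need to be stricter than this sketch.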

    y = y.to_local()
elif x_dt != y_dt:
    non_dt = y if x_dt else x
    if isinstance(non_dt, torch.Tensor):
Contributor

kinda confused what you want to happen here.

The TypeError is a hard error when we do DT+T comparisons right?
Then why do we have the if/else later to 'make it work' by doing .to_local() - should we just always error in this x_dt != y_dt block?

Collaborator Author

It was because I wanted scalar comparison without doing .full_tensor(), but it should fail to be consistent, so I changed it.
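
The behavior agreed on in this thread can be sketched as follows. This is an illustration only, not the PR's code: `FakeDTensor` is a hypothetical stand-in (a real DTensor needs a device mesh and an initialized process group), `unwrap_for_comparison` is an invented name, and raising `AssertionError` on a spec mismatch is an assumption. The thread only confirms that both-DTensor inputs are unwrapped with `.to_local()` and that mixing a DTensor with a non-DTensor is a `TypeError`.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class FakeDTensor:
    """Hypothetical stand-in for a DTensor: a local shard plus a placement spec."""

    local: list
    placements: Tuple[str, ...]

    def to_local(self) -> list:
        return self.local


def unwrap_for_comparison(x: Any, y: Any) -> Tuple[Any, Any]:
    """Return the pair that should actually be compared, or raise a clear error."""
    x_dt = isinstance(x, FakeDTensor)
    y_dt = isinstance(y, FakeDTensor)
    if x_dt and y_dt:
        # Both distributed: specs must agree before local shards are comparable.
        # (The exception type here is an assumption.)
        if x.placements != y.placements:
            raise AssertionError(
                f"DTensor placements differ: {x.placements} vs {y.placements}"
            )
        return x.to_local(), y.to_local()
    if x_dt != y_dt:
        # Exactly one side is distributed: always an error, never a silent unwrap.
        non_dt = y if x_dt else x
        raise TypeError(
            f"Cannot compare a DTensor with {type(non_dt).__name__}; "
            "convert explicitly before comparing"
        )
    # Neither side is distributed: compare as-is.
    return x, y
```

With this shape, a test that wants to compare against a plain value has to convert explicitly first, which keeps the mixed case consistent with the review feedback.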

@arkadip-maitra
Collaborator Author

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased fix_167549 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix_167549 && git pull --rebase)

mlazos previously approved these changes Mar 16, 2026
@arkadip-maitra
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 16, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m2-15)

Details for Dev Infra team Raised by workflow job

@pytorch-bot pytorch-bot bot removed the ciflow/trunk Trigger trunk jobs on your pull request label Mar 16, 2026
@arkadip-maitra
Collaborator Author

Had to make changes to the has_dtensor check because macOS trunk was failing.

@arkadip-maitra arkadip-maitra requested a review from mlazos March 16, 2026 13:14
@pytorchmergebot
Collaborator

Successfully rebased fix_167549 onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix_167549 && git pull --rebase)

@arkadip-maitra
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 24, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (distributed, 3, 3, lf.linux.g4dn.12xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@pytorch-bot pytorch-bot bot removed the ciflow/trunk Trigger trunk jobs on your pull request label Mar 24, 2026
@arkadip-maitra arkadip-maitra requested a review from wconstab March 24, 2026 14:05
@arkadip-maitra
Collaborator Author

Hey @wconstab, this test failure happened because that test was added after my last commit; I rebased and that test failed. Sorry for that. I think this PR should be merged before merging PRs involving DTensor tests. Thanks!

@wconstab
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 24, 2026
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Copilot AI pushed a commit that referenced this pull request Mar 27, 2026
Fixes #167549

`assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message.

modified some existing test asserts

Authored with Claude
Pull Request resolved: #176895
Approved by: https://github.com/mlazos, https://github.com/wconstab

Co-authored-by: Xia-Weiwen <12522207+Xia-Weiwen@users.noreply.github.com>
AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
Fixes pytorch#167549

`assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message.

modified some existing test asserts

Authored with Claude
Pull Request resolved: pytorch#176895
Approved by: https://github.com/mlazos, https://github.com/wconstab
xuhancn pushed a commit to xuhancn/pytorch that referenced this pull request Apr 2, 2026
Fixes pytorch#167549

`assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message.

modified some existing test asserts

Authored with Claude
Pull Request resolved: pytorch#176895
Approved by: https://github.com/mlazos, https://github.com/wconstab
nklshy-aws pushed a commit to nklshy-aws/pytorch that referenced this pull request Apr 7, 2026
Fixes pytorch#167549

`assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message.

modified some existing test asserts

Authored with Claude
Pull Request resolved: pytorch#176895
Approved by: https://github.com/mlazos, https://github.com/wconstab

Labels

ci-no-td: Do not run TD on this PR
ciflow/trunk: Trigger trunk jobs on your pull request
Merged
open source
Reverted
topic: not user facing (topic category)
triaged: This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

assertEqual doesn't work with DTensor

7 participants