Fixes assertEqual and assert_close for DTensor#176895
Fixes assertEqual and assert_close for DTensor#176895arkadip-maitra wants to merge 1 commit intopytorch:mainfrom
Conversation
This PR needs a
|
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176895
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit 7ab42be with merge base 99dee05 ( FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@pytorchbot rebase -b viable/strict |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
8f8c034 to
4b0d99c
Compare
| # handle DTensor cases explicitly | ||
| try: | ||
| from torch.distributed.tensor import DTensor | ||
| except ImportError: |
There was a problem hiding this comment.
We shouldn't use the try/except pattern for cases wehre we have to conditionally import- search the code for uses of importlib.find.. something like that. also, it'd be better to do the HAS_DTENSOR check once at common_utils.py import time and then check that constant from here than run the find import function on every call
There was a problem hiding this comment.
thanks, updated.
| y = y.to_local() | ||
| elif x_dt != y_dt: | ||
| non_dt = y if x_dt else x | ||
| if isinstance(non_dt, torch.Tensor): |
There was a problem hiding this comment.
kinda confused what you want to happen here.
The TypeError is a hard error when we do DT+T comparisons right?
Then why do we have the if/else later to 'make it work' by doing .to_local() - should we just always error in this x_dt != y_dt block?
There was a problem hiding this comment.
it was because i wanted scalar comparison without doing .full_tensor() but it should fail to be consistent. so changed it
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
1e15886 to
612f4af
Compare
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m2-15) Details for Dev Infra teamRaised by workflow job |
|
had to make changes to has_dtensor check cus macos trunk was failing |
|
Successfully rebased |
da3fdb1 to
43b50fc
Compare
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / linux-jammy-cuda13.0-py3.10-gcc11 / test (distributed, 3, 3, lf.linux.g4dn.12xlarge.nvidia.gpu) Details for Dev Infra teamRaised by workflow job |
43b50fc to
7ab42be
Compare
|
hey @wconstab so this test failure was because of that test being added after my last commit. i rebased and that test failed. sorry for that. i think this pr should be merged before merging prs involving dtensor tests. thanks! |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Fixes #167549 `assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message. modified some existing test asserts Authored with Claude Pull Request resolved: #176895 Approved by: https://github.com/mlazos, https://github.com/wconstab Co-authored-by: Xia-Weiwen <12522207+Xia-Weiwen@users.noreply.github.com>
Fixes pytorch#167549 `assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message. modified some existing test asserts Authored with Claude Pull Request resolved: pytorch#176895 Approved by: https://github.com/mlazos, https://github.com/wconstab
This reverts commit b9720a4. Reverted pytorch#176895 on behalf of https://github.com/malfet due to I could be wrong, but it broke some distribtued tests, see https://hud.pytorch.org/hud/pytorch/pytorch/5d775ad0cbde7690840ab9ebe95fd3f75b2fb156/1?per_page=50&name_filter=distributed&mergeEphemeralLF=true ([comment](pytorch#176895 (comment)))
Fixes pytorch#167549 `assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message. modified some existing test asserts Authored with Claude Pull Request resolved: pytorch#176895 Approved by: https://github.com/mlazos, https://github.com/wconstab
Fixes pytorch#167549 `assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message. modified some existing test asserts Authored with Claude Pull Request resolved: pytorch#176895 Approved by: https://github.com/mlazos, https://github.com/wconstab
Fixes pytorch#167549 `assertEqual` and `assert_close` crashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message. modified some existing test asserts Authored with Claude Pull Request resolved: pytorch#176895 Approved by: https://github.com/mlazos, https://github.com/wconstab
Fixes #167549
assertEqualandassert_closecrashed when given DTensors with plain tensor with ambiguous message. Now checks specs, unwraps to local tensors and gives clean error message.modified some existing test asserts
Authored with Claude