Fixed output memory format mismatch for bicubic2d#90470

Closed
vfdev-5 wants to merge 3 commits into pytorch:master from vfdev-5:fix-bicubic-out-mf

Conversation

@vfdev-5
Contributor

@vfdev-5 vfdev-5 commented Dec 8, 2022

Description:

  • Output memory format now matches the input memory format for bicubic2d

Problem: output tensor's memory format does not match input format for bicubic2d

```python
import torch

i = torch.rand(1, 3, 32, 32).contiguous(memory_format=torch.channels_last)
assert i.is_contiguous(memory_format=torch.channels_last)
o = torch.nn.functional.interpolate(i, size=(4, 4), mode="bicubic")
assert o.is_contiguous(memory_format=torch.channels_last), f"Should be channels last but given channels first ({o.is_contiguous(memory_format=torch.contiguous_format)})"
```

> AssertionError: Should be channels last but given channels first (True)

Related PR fixing bilinear ops: #53535 (cc @VitalyFedyunin @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @bdhirsh )

Discovered together with @NicolasHug while working on https://github.com/pytorch/pytorch/tree/interpolate_uint8_images_linear_cpu_support_dev

  • Updated code to match grad input / output memory formats
  • Temporary tensor creation now matches the input memory format in `separable_upsample_generic_Nd_kernel_impl`
  • Updated tests
  • Added missing forward AD support for bicubic with antialiasing
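For context, the mismatch described above is visible purely in the strides: a channels_last tensor keeps its (N, C, H, W) shape but stores its data in NHWC order, so only the strides differ. A minimal pure-Python sketch (no PyTorch required; the function names here are illustrative, not torch API) of the strides each format implies:

```python
# Sketch: compute the strides of an (N, C, H, W)-shaped tensor under the two
# memory formats PyTorch distinguishes. Illustrative helpers, not torch API.

def contiguous_strides(n, c, h, w):
    # torch.contiguous_format: row-major over (N, C, H, W); n never enters
    # the strides because it only scales the outermost dimension
    return (c * h * w, h * w, w, 1)

def channels_last_strides(n, c, h, w):
    # torch.channels_last: data laid out as NHWC, but the logical shape
    # stays (N, C, H, W), so only the strides change
    return (h * w * c, 1, w * c, c)

# The input from the snippet above: (1, 3, 32, 32) in channels_last
print(channels_last_strides(1, 3, 32, 32))  # (3072, 1, 96, 3)
# Before this fix, the (1, 3, 4, 4) output came back contiguous instead:
print(contiguous_strides(1, 3, 4, 4))       # (48, 16, 4, 1)
# whereas a matching channels_last output would have strides:
print(channels_last_strides(1, 3, 4, 4))    # (48, 1, 12, 3)
```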

@pytorch-bot

pytorch-bot Bot commented Dec 8, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90470

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures, 1 Pending

As of commit f2ad2d9:

FLAKY - The following jobs failed but were likely due to flakiness present on master:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added the release notes: nn release notes category label Dec 8, 2022
@github-actions github-actions Bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Dec 8, 2022
@vfdev-5 vfdev-5 changed the title Fixed output memory format for bicubic2d Fixed output memory format mismatch for bicubic2d Dec 8, 2022
@vfdev-5 vfdev-5 force-pushed the fix-bicubic-out-mf branch from 176951a to 3abb3fc Compare December 8, 2022 16:01
Member

@NicolasHug NicolasHug left a comment


Thanks @vfdev-5 !

@vfdev-5
Contributor Author

vfdev-5 commented Dec 9, 2022

The XLA job failure seems to be related: https://github.com/pytorch/pytorch/actions/runs/3649949448/jobs/6165647497#step:10:12105

@JackCaoG can you help with debugging this issue, please?

@JackCaoG
Collaborator

JackCaoG commented Dec 9, 2022

hmm, it seems like the test just takes too long to compile, so it was killed...

@JackCaoG
Collaborator

JackCaoG commented Dec 9, 2022

I opened pytorch/xla#4308; if the GPU test works, then I think it might be a CPU compiler issue. It is a bit concerning that compilation time significantly increased with this change, though.

@JackCaoG
Collaborator

JackCaoG commented Dec 9, 2022

@vfdev-5 Do you mind rebasing this PR? I was not able to build on our CI since an offending PR was merged on the PyTorch side.

@vfdev-5
Contributor Author

vfdev-5 commented Dec 12, 2022

@JackCaoG
Collaborator

@wonjoolee95 can you follow up on this one? I triggered the GPU CI in pytorch/xla#4308 again. If the GPU test passes, we can conclude that this PR somehow generates a graph that's hard to compile for XLA:CPU, which I think is fine; we can disable the test for XLA devices on either the PyTorch end or the XLA end. If the GPU test also fails with a compilation timeout, I think we have a bigger problem, since we do have real users for it.

@wonjoo-wj
Collaborator

> @wonjoolee95 can you follow up on this one? I triggered the GPU CI in pytorch/xla#4308 again. If the GPU test passes, we can conclude that this PR somehow generates a graph that's hard to compile for XLA:CPU, which I think is fine; we can disable the test for XLA devices on either the PyTorch end or the XLA end. If the GPU test also fails with a compilation timeout, I think we have a bigger problem, since we do have real users for it.

Sounds good, I'll monitor the GPU test CI and keep this thread updated.

@wonjoo-wj
Collaborator

Seems like XLA's GPU CI is stuck as well; specifically, we're getting an error (I'm guessing a timeout) for the test test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bilinear_xla.

@vfdev-5
Contributor Author

vfdev-5 commented Dec 13, 2022

Other related failures:

```
======================================================================
ERROR [0.351s]: test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_bicubic_cuda_float32 (__main__.TestMetaCUDA)
...
  File "/var/lib/jenkins/workspace/test/test_meta.py", line 357, in test_assert
    raise RuntimeError(f"output {i}: {msg_callable(msg)}")
RuntimeError: output 0: meta disagrees with real impl:
aten.upsample_bicubic2d.default(
  tensor(..., device='meta', size=(2, 3, 4, 4)) stride=(48, 1, 12, 3),
  [3, 3],
  True,

) = (
  tensor(..., device='meta', size=(2, 3, 3, 3)) stride=(27, 9, 3, 1)
)
but real stride was (27, 1, 9, 3)

======================================================================
FAIL [0.097s]: test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bicubic_cuda (__main__.TestNNDeviceTypeCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2053, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 378, in instantiated_test
    result = test(self, **param_kwargs)
  File "/var/lib/jenkins/workspace/test/test_nn.py", line 9362, in test_upsamplingBiMode2d
    self.assertEqual(a_cuda.grad, a_cpu.grad)
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2859, in assertEqual
    assert_equal(
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_comparison.py", line 1270, in assert_equal
    raise error_metas[0].to_error(msg)
AssertionError: Tensor-likes are not close!

Mismatched elements: 44 / 48 (91.7%)
Greatest absolute difference: 1.3735964907929827 at index (1, 0, 1, 2) (up to 1e-07 allowed)
Greatest relative difference: 37.21617048686028 at index (0, 1, 1, 1) (up to 1e-07 allowed)

======================================================================
FAIL [0.095s]: test_upsamplingBiMode2d_antialias_False_align_corners_True_mode_bicubic_cuda (__main__.TestNNDeviceTypeCUDA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2053, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 378, in instantiated_test
    result = test(self, **param_kwargs)
  File "/var/lib/jenkins/workspace/test/test_nn.py", line 9362, in test_upsamplingBiMode2d
    self.assertEqual(a_cuda.grad, a_cpu.grad)
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2859, in assertEqual
    assert_equal(
  File "/opt/conda/lib/python3.10/site-packages/torch/testing/_comparison.py", line 1270, in assert_equal
    raise error_metas[0].to_error(msg)
AssertionError: Tensor-likes are not close!

Mismatched elements: 12 / 48 (25.0%)
Greatest absolute difference: 2.710664614723657 at index (1, 0, 1, 2) (up to 1e-07 allowed)
Greatest relative difference: inf at index (0, 0, 0, 3) (up to 1e-07 allowed)

----------------------------------------------------------------------
Ran 2303 tests in 180.681s

FAILED (failures=2, skipped=75, expected failures=10)
```

I'll investigate these and keep this thread updated.
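The stride disagreement in the first failure above is exactly a contiguous-vs-channels_last mismatch: the meta kernel produced row-major strides while the real kernel preserved channels_last. A small sketch (plain Python; the helper name is illustrative) verifying this for the (2, 3, 3, 3) output in the log:

```python
# The meta kernel reported stride (27, 9, 3, 1) for a (2, 3, 3, 3) output,
# while the real kernel produced (27, 1, 9, 3). Check which format each is.

def strides_for(shape, channels_last=False):
    n, c, h, w = shape
    if channels_last:
        return (h * w * c, 1, w * c, c)  # NHWC layout, NCHW logical shape
    return (c * h * w, h * w, w, 1)      # plain row-major NCHW

shape = (2, 3, 3, 3)
print(strides_for(shape))                      # (27, 9, 3, 1) -> what meta reported
print(strides_for(shape, channels_last=True))  # (27, 1, 9, 3) -> real kernel's output
```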

@linux-foundation-easycla

linux-foundation-easycla Bot commented Dec 13, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: vfdev-5 / name: vfdev (3abb3fc45c549714de5b2c4f2d1a7fd6a9b37a8e, 449de510c66390c62e737049f0bf25232a9a69af, 2335adec6c85cb2c42f1b9904941394d87ad9ad2, 0dae71fdfb2c9c030ba15b79f54a962030a87465)

Comment thread on torch/_decomp/decompositions.py (outdated)
@vfdev-5 vfdev-5 closed this Dec 13, 2022
@vfdev-5 vfdev-5 reopened this Dec 13, 2022
@vfdev-5 vfdev-5 closed this Dec 13, 2022
@vfdev-5 vfdev-5 reopened this Dec 13, 2022
@vfdev-5
Contributor Author

vfdev-5 commented Dec 14, 2022

@JackCaoG In the recent commit I fixed the issue this PR had with the grad output memory format; I reverted the code and it fixed the issues mentioned in #90470 (comment), but CI is still failing on XLA.

@JackCaoG
Collaborator

JackCaoG commented Dec 14, 2022

@wonjoolee95 can follow up; we can dump the HLO and maybe check with XLA folks why it took so long to compile. This back and forth might take a few days; is this PR urgent?

@vfdev-5
Contributor Author

vfdev-5 commented Dec 14, 2022

@JackCaoG thanks for the feedback; it is not urgent and we can wait for some time. However, #90771 depends on this code change.

@JackCaoG
Collaborator

Thanks for the context, we will try to move a bit faster to unblock this PR. Thank you for your patience!

@wonjoo-wj
Collaborator

wonjoo-wj commented Dec 19, 2022

@vfdev-5, apologies for the delay. From XLA's side, we have disabled the test for now. Could you update the XLA pin (https://github.com/pytorch/pytorch/blob/master/.github/ci_commit_pins/xla.txt) to 66c2c15df992c9a683e3b08811a7c08ebeda0a2f and re-trigger the CI? As you do that, rebasing this PR onto master would be helpful, too. Thanks!

Please refer to Jack's comment below to use onlyNativeDeviceTypes (ex: https://github.com/pytorch/pytorch/blob/master/test/test_torch.py#L1277). Thanks!

@JackCaoG
Collaborator

Maybe you can use onlyNativeDeviceTypes to prevent it from running on XLA devices, so you don't need to update the XLA pin.

@lezcano lezcano added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 12, 2023
Collaborator

@lezcano lezcano left a comment


Cool!

@lezcano
Collaborator

lezcano commented Jan 12, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. You can rebase by leaving the following comment on this PR:
@pytorchbot rebase

Details for Dev Infra team Raised by workflow job

@vfdev-5
Contributor Author

vfdev-5 commented Jan 12, 2023

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased fix-bicubic-out-mf onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fix-bicubic-out-mf && git pull --rebase)

@lezcano
Collaborator

lezcano commented Jan 12, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 additional jobs have failed; the first few of them are: trunk, trunk / linux-focal-rocm5.3-py3.8 / test (default, 1, 2, linux.rocm.gpu)

Details for Dev Infra team Raised by workflow job

@lezcano
Collaborator

lezcano commented Jan 12, 2023

@pytorchbot merge -f "timed out, unrelated"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
Pull Request resolved: pytorch#90470
Approved by: https://github.com/NicolasHug, https://github.com/lezcano