sum and roll on cuda for complex dtypes #37959

Closed
anjali411 wants to merge 3 commits into gh/anjali411/15/base from gh/anjali411/15/head

Conversation

@anjali411 (Contributor) commented May 6, 2020

Stack from ghstack:

Resolves #37925

[ghstack-poisoned]
anjali411 added a commit that referenced this pull request May 6, 2020
ghstack-source-id: 6f3754f
Pull Request resolved: #37959
@dr-ci bot commented May 6, 2020

💊 CI failures summary and remediations

As of commit 97868ce (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_windows_vs2019_py36_cuda10.1_test2 (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

AssertionError: Not within tolerance rtol=0 atol=1e-05 at input[0, 2] (0.0 vs. -2.25) and 9 other locations (40.00%)
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 974, in assertEqual 
    assertTensorsEqual(x, y) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 934, in assertTensorsEqual 
    atol=atol, rtol=rtol, message=message) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 974, in assertEqual 
    assertTensorsEqual(x, y) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\_internal\common_utils.py", line 936, in assertTensorsEqual 
    torch.testing.assert_allclose(a, b, atol=atol, rtol=rtol, equal_nan=True, msg=message) 
  File "C:\Users\circleci\project\build\win_tmp\build\torch\testing\__init__.py", line 60, in assert_allclose 
    raise AssertionError(msg) 
AssertionError: Not within tolerance rtol=0 atol=1e-05 at input[0, 2] (0.0 vs. -2.25) and 9 other locations (40.00%) 
 
---------------------------------------------------------------------- 
Ran 5195 tests in 473.005s 
 
FAILED (failures=5, skipped=207) 
 
Generating XML reports... 
Generated XML report: test-reports\python-unittest\TEST-TestDevicePrecisionCUDA-20200507173325.xml 
Generated XML report: test-reports\python-unittest\TEST-TestTensorDeviceOpsCPU-20200507173325.xml 
Generated XML report: test-reports\python-unittest\TEST-TestTensorDeviceOpsCUDA-20200507173325.xml 

❄️ 1 failure tentatively classified as flaky, but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun) ❄️

May 07 17:25:09 ConnectionResetError: [Errno 104] Connection reset by peer
May 07 17:25:09   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 455, in accept 
May 07 17:25:09     deliver_challenge(c, self._authkey) 
May 07 17:25:09   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 722, in deliver_challenge 
May 07 17:25:09     response = connection.recv_bytes(256)        # reject large message 
May 07 17:25:09   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes 
May 07 17:25:09     buf = self._recv_bytes(maxlength) 
May 07 17:25:09   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes 
May 07 17:25:09     buf = self._recv(4) 
May 07 17:25:09   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 379, in _recv 
May 07 17:25:09     chunk = read(handle, remaining) 
May 07 17:25:09 ConnectionResetError: [Errno 104] Connection reset by peer 
May 07 17:25:09 /opt/conda/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 14 leaked semaphores to clean up at shutdown 
May 07 17:25:09   len(cache)) 
May 07 17:25:11 Process ErrorTrackingProcess-122: 
May 07 17:25:11 Traceback (most recent call last): 
May 07 17:25:11   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap 
May 07 17:25:11     self.run() 
May 07 17:25:11   File "/var/lib/jenkins/workspace/test/test_dataloader.py", line 362, in run 
May 07 17:25:11     super(ErrorTrackingProcess, self).run() 
May 07 17:25:11   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run 
May 07 17:25:11     self._target(*self._args, **self._kwargs) 

This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

See how this bot performed.

This comment has been revised 7 times.

anjali411 added a commit that referenced this pull request May 7, 2020
ghstack-source-id: 2c7f9ea
Pull Request resolved: #37959
anjali411 requested review from ezyang and zasdfgbnm on May 7, 2020, 16:06
-  AT_DISPATCH_ALL_TYPES_AND2(at::ScalarType::Half, at::ScalarType::Bool, in_tensor.scalar_type(), "roll_cuda", [&] {
+  AT_DISPATCH_ALL_TYPES_AND_C10_COMPLEX_AND3(at::ScalarType::Half, at::ScalarType::Bool, at::ScalarType::BFloat16,
+      in_tensor.scalar_type(), "roll_cuda", [&] {
+    using value_t = typename ztype<scalar_t>::value_t;

Collaborator:

Why do we need a value_t here? CPU's ztype<scalar_t>::value_t is a no-op for c10::complex.

Contributor Author:

Yeah, I forgot to remove it after replacing AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3 with AT_DISPATCH_ALL_TYPES_AND_C10_COMPLEX_AND3.
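
For context, here is a minimal sketch of the dispatch pattern these AT_DISPATCH_* macros implement (the ScalarType enum, TypeTag helper, and function names below are illustrative stand-ins, not ATen's actual definitions): the macro switches on the runtime dtype and instantiates the lambda body once per supported C++ type, with scalar_t bound to the matching type, which is how adding complex entries to the macro gets the same kernel body compiled for the complex dtypes.

```cpp
#include <complex>
#include <stdexcept>
#include <string>

// Illustrative stand-ins, not ATen's real definitions.
enum class ScalarType { Float, Double, ComplexFloat, ComplexDouble, Bool };

template <typename T> struct TypeTag { using type = T; };

// The dispatch pattern in miniature: switch on the runtime dtype and call
// the lambda with a tag carrying the matching C++ type.
template <typename F>
void dispatch_all_types_and_complex(ScalarType t, const char* name, F&& f) {
  switch (t) {
    case ScalarType::Float:         f(TypeTag<float>{}); break;
    case ScalarType::Double:        f(TypeTag<double>{}); break;
    case ScalarType::ComplexFloat:  f(TypeTag<std::complex<float>>{}); break;
    case ScalarType::ComplexDouble: f(TypeTag<std::complex<double>>{}); break;
    case ScalarType::Bool:          f(TypeTag<bool>{}); break;
    default: throw std::runtime_error(std::string(name) + ": unsupported dtype");
  }
}

// Usage mirrors the diff above: one lambda body, compiled per dispatched type.
void sum_dispatch_example(ScalarType dtype) {
  dispatch_all_types_and_complex(dtype, "sum_cuda", [&](auto tag) {
    using scalar_t = typename decltype(tag)::type;
    (void)sizeof(scalar_t);  // stand-in for a sum_kernel_impl<scalar_t>(iter) call
  });
}
```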

-  AT_DISPATCH_ALL_TYPES_AND(ScalarType::Bool, iter.dtype(), "sum_cuda", [&]() {
-    sum_kernel_impl<scalar_t>(iter);
+  AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND(ScalarType::Bool, iter.dtype(), "sum_cuda", [&]() {
+    using value_t = typename ztype<scalar_t>::value_t;

Collaborator:

I guess using AT_DISPATCH_ALL_TYPES_AND_C10_COMPLEX_AND and removing the ztype will just work.

Contributor Author:

Hmm, there was an issue with __shfl_up_sync for c10::complex. I'll look into it more.

Contributor Author:

use ::thrust_t
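
For reference, a minimal sketch of the workaround pattern behind this exchange (the shfl_up_complex helper below is hypothetical, not this PR's code): CUDA's __shfl_up_sync only provides overloads for scalar types, so a complex value has to be shuffled across warp lanes component-wise and then reassembled.

```cuda
#include <c10/util/complex.h>

// Hypothetical helper: __shfl_up_sync has no overload for c10::complex<T>,
// so shuffle the real and imaginary parts as separate scalar lanes.
template <typename T>
__device__ c10::complex<T> shfl_up_complex(unsigned mask,
                                           c10::complex<T> value,
                                           unsigned int delta) {
  T re = __shfl_up_sync(mask, value.real(), delta);
  T im = __shfl_up_sync(mask, value.imag(), delta);
  return c10::complex<T>(re, im);
}
```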

   auto total_dims = in_tensor.dim();

-  AT_DISPATCH_ALL_TYPES_AND2(at::ScalarType::Half, at::ScalarType::Bool, in_tensor.scalar_type(), "roll_cuda", [&] {
+  AT_DISPATCH_ALL_TYPES_AND_C10_COMPLEX_AND3(at::ScalarType::Half, at::ScalarType::Bool, at::ScalarType::BFloat16,

Collaborator:

This will conflict with #37977; whichever lands first, the other will need to change.

@zasdfgbnm (Collaborator):

Test failure looks real.

ezyang changed the title from "sum and roll on cuda" to "ComplexFloat sum and roll on cuda" on May 8, 2020
anjali411 changed the title from "ComplexFloat sum and roll on cuda" to "sum and roll on cuda for complex dtypes" on May 12, 2020
@anjali411 (Contributor Author):

roll and sum are now supported on CUDA for complex tensors (this was added in a different PR).
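
For anyone landing here later, a minimal usage sketch via the libtorch C++ API (assuming a recent CUDA-enabled build; illustrative only):

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // A complex-valued tensor on the GPU.
  auto t = torch::randn({2, 3},
      torch::dtype(torch::kComplexFloat).device(torch::kCUDA));

  auto s = t.sum();                                   // complex sum reduction on CUDA
  auto r = torch::roll(t, /*shifts=*/1, /*dims=*/0);  // roll along dim 0 on CUDA

  std::cout << s << "\n" << r << "\n";
}
```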

anjali411 closed this on Jan 15, 2021
facebook-github-bot deleted the gh/anjali411/15/head branch on February 15, 2021, 15:18