Move where cuda implementation to TensorIterator #32984
zasdfgbnm wants to merge 18 commits into pytorch:master from
Conversation

| } // namespace modern |
| template<typename func_t, int nargs=function_traits<func_t>::arity> |

moved to Loops.cuh
| #include <ATen/native/cuda/ROCmLoops.cuh> |
| #endif |
| namespace at { namespace native { |

Moved from CUDALoops.cuh and ROCmLoops.cuh; this part of the code is identical for CUDA and ROCm.
| namespace at { namespace native { namespace modern { namespace detail { |
| template<typename func_t, int remaining=function_traits<func_t>::arity-1> |

this part is newly added
| arg0_t result = legacy::invoke(f, &data.data[1], &strides.data[1], &dtypes.data[1], idx); |
| c10::cast_and_store<arg0_t>(dtypes[0], out, result); |
| }); |
| } else if (iter.has_contiguous_first_dim() && modern::detail::has_same_arg_types<func_t>::value) { |

This is the only line changed in this copy-pasted chunk of code.
I hope this enables #9190
@vadimkantorov Unfortunately this doesn't...
💊 CircleCI build failures summary (Dr. CI, as of commit f5e1114): none of the build failures appear to be your fault. 1 upstream failure recognized by patterns: these builds matched patterns, but were probably caused by upstream breakages.
aten/src/ATen/native/cuda/Loops.cuh (Outdated)
| }; |
| // simple compile time test for has_same_arg_types: |
| using func1_t = int (*)(float, float); |

This belongs in tests, not in actual source?

Yes, this is a compile-time unit test for has_same_arg_types. Maybe I should remove it from Loops.cuh and move it somewhere else?
| using traits = function_traits<func_t>; |
| static constexpr bool value = std::is_same< |
|   typename traits::template arg<remaining>::type, |
|   typename traits::template arg<remaining-1>::type |

Out of curiosity, how does this work with -1?

It is specialized as true, as in the code below. For a nullary function, arity == 0, so has_same_arg_types<func_t> becomes has_same_arg_types<func_t, function_traits<func_t>::arity-1>, which is has_same_arg_types<func_t, -1>.
| namespace at { namespace native { |
| // `needs_dynamic_casting` compares the types expected by iterator |

Cool, so it'll be good to go once you move the tests out of Loops.cuh.

Test moved to cuda_vectorized_test.cu
facebook-github-bot left a comment

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: `where` is special because its arguments do not have the same type, which violates the assumption made by the modern code path in pytorch#32383. I migrate it to TensorIterator so that there is something that tests that this case is not broken. Currently, this case falls back to the legacy (not vectorized, not unrolled) code. It should be supported in the future when I clean up `Loops.cuh`. I also move the shared parts of `CUDALoops.cuh` and `ROCmLoops.cuh` into `Loops.cuh` so that the logic for checking whether `func_t` has the same arg types can be shared. Pull Request resolved: pytorch#32984 Differential Revision: D19825127 Pulled By: ngimel fbshipit-source-id: bbf4682349d96b4480c4d657f3c18a3a67a9bf17
Summary: Reopen of pytorch#32984 Pull Request resolved: pytorch#33228 Differential Revision: D19850862 Pulled By: ngimel fbshipit-source-id: b92446a49b4980188fa4788220a2164650e905c2