
implement NumPy-like functionality column_stack, row_stack#46313

Closed
RockingJavaBean wants to merge 16 commits into pytorch:master from RockingJavaBean:numpy_column_row_stack

Conversation

@RockingJavaBean
Contributor

@RockingJavaBean RockingJavaBean commented Oct 14, 2020

Related #38349

This PR implements column_stack as the composite ops of torch.reshape and torch.hstack, and makes row_stack as the alias of torch.vstack.

Todo

  • docs
  • alias pattern for row_stack
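The composite-op approach described above can be sketched in NumPy terms (`column_stack_sketch` is a hypothetical helper for illustration, not the PR's actual C++ implementation):

```python
import numpy as np

def column_stack_sketch(arrays):
    # Reshape each 1-D input into an (n, 1) column, leave 2-D inputs
    # untouched, then stack everything horizontally.
    cols = [a.reshape(-1, 1) if a.ndim == 1 else a for a in arrays]
    return np.hstack(cols)

a = np.array([1, 2, 3])
b = np.array([[4, 5], [6, 7], [8, 9]])
result = column_stack_sketch((a, b))
```

This matches `np.column_stack((a, b))`; `row_stack` needs no such transform, since it is simply an alias of `vstack`.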

@RockingJavaBean RockingJavaBean marked this pull request as draft October 14, 2020 11:43
@facebook-github-bot
Contributor

facebook-github-bot commented Oct 14, 2020

💊 CI failures summary and remediations

As of commit 42e674aadf (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 3 times.

@dr-ci

dr-ci Bot commented Oct 14, 2020

💊 CI failures summary and remediations

As of commit dceaf17 (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_test (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Oct 29 04:11:58 sccache: error: couldn't connect to server
Oct 29 04:11:58 +++ eval 'extract_trap_cmd ' 
Oct 29 04:11:58 ++++ extract_trap_cmd 
Oct 29 04:11:58 ++++ printf '%s\n' '' 
Oct 29 04:11:58 +++ printf '%s\n' cleanup 
Oct 29 04:11:58 ++ trap -- ' 
Oct 29 04:11:58 cleanup' EXIT 
Oct 29 04:11:58 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test != *pytorch-win-* ]] 
Oct 29 04:11:58 ++ which sccache 
Oct 29 04:11:58 ++ sccache --stop-server 
Oct 29 04:11:58 Stopping sccache server... 
Oct 29 04:11:58 sccache: error: couldn't connect to server 
Oct 29 04:11:58 sccache: caused by: Connection refused (os error 111) 
Oct 29 04:11:58 ++ true 
Oct 29 04:11:58 ++ rm /var/lib/jenkins/sccache_error.log 
Oct 29 04:11:58 ++ [[ pytorch-xla-linux-bionic-py3.6-clang9-test == *rocm* ]] 
Oct 29 04:11:58 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Oct 29 04:11:58 ++ SCCACHE_IDLE_TIMEOUT=1200 
Oct 29 04:11:58 ++ RUST_LOG=sccache::server=error 
Oct 29 04:11:58 ++ sccache --start-server 
Oct 29 04:11:58 sccache: Starting the server... 
Oct 29 04:11:58 ++ sccache --zero-stats 

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build pytorch_bazel_test (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun) ❄️

Oct 29 03:20:12 TIMEOUT: //:optim_test (Summary)
Oct 29 03:19:44 [       OK ] AnyValueTest.TypeInfoIsCorrectForStringLiteral (0 ms) 
Oct 29 03:19:44 [ RUN      ] AnyValueTest.TypeInfoIsCorrectForString 
Oct 29 03:19:44 [       OK ] AnyValueTest.TypeInfoIsCorrectForString (0 ms) 
Oct 29 03:19:44 [----------] 12 tests from AnyValueTest (1 ms total) 
Oct 29 03:19:44  
Oct 29 03:19:44 [----------] Global test environment tear-down 
Oct 29 03:19:44 [==========] 30 tests from 2 test suites ran. (6 ms total) 
Oct 29 03:19:44 [  PASSED  ] 30 tests. 
Oct 29 03:19:44 ================================================================================ 
Oct 29 03:20:12  
Oct 29 03:20:12 TIMEOUT: //:optim_test (Summary) 
Oct 29 03:20:12       /var/lib/jenkins/.cache/bazel/_bazel_jenkins/fdf6d09bf4b4f04a71e2a7dfceb40620/execroot/pytorch/bazel-out/k8-fastbuild/testlogs/optim_test/test.log 
Oct 29 03:20:12 INFO: From Testing //:optim_test: 
Oct 29 03:20:12 ==================== Test output for //:optim_test: 
Oct 29 03:20:12 Running main() from gmock_main.cc 
Oct 29 03:20:12 Note: Google Test filter = -*CUDA 
Oct 29 03:20:12 [==========] Running 33 tests from 1 test suite. 
Oct 29 03:20:12 [----------] Global test environment set-up. 
Oct 29 03:20:12 [----------] 33 tests from OptimTest 
Oct 29 03:20:12 [ RUN      ] OptimTest.OptimizerAccessors 
Oct 29 03:20:12 [       OK ] OptimTest.OptimizerAccessors (3 ms) 

This comment was automatically generated by Dr. CI (expand for details). Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 56 times.

@RockingJavaBean RockingJavaBean marked this pull request as ready for review October 15, 2020 12:09
@RockingJavaBean RockingJavaBean changed the title [WIP] implement NumPy-like functionality column_stack, row_stack implement NumPy-like functionality column_stack, row_stack Oct 15, 2020
@RockingJavaBean
Contributor Author

@mruberry This PR implements the NumPy-like functions column_stack and row_stack; please kindly review it.
test/test_op_aliases.py is also extended to support testing alias functions whose input is a sequence of tensors; this helps test torch.row_stack, as it is implemented as the alias of torch.vstack.

@malfet malfet added the module: numpy Related to numpy support, and also numpy compatibility of our operators label Oct 15, 2020
@malfet malfet requested a review from mruberry October 15, 2020 19:23
@malfet malfet added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Oct 15, 2020
@mruberry mruberry removed the request for review from apaszke October 16, 2020 06:06
Comment thread torch/_torch_docs.py Outdated
r"""
column_stack(tensors, *, out=None) -> Tensor

Stack 1D tensors as columns into a 2D tensor, 2D tensors are stacked just like :func:`torch.hstack`.
Collaborator

@mruberry mruberry Oct 16, 2020

This description is already good, but I think something like this would be a little clearer:

Creates a new 2-dimensional tensor by horizontally stacking the tensors in :attr:`tensors`.

:attr:`tensors` must be composed entirely of 1-dimensional and 2-dimensional tensors whose first dimension is the same size. The 1-dimensional tensors in :attr:`tensors` are reshaped into (*, 1) columns before being stacked horizontally.
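A small NumPy example of the reshaping behavior the suggested wording describes (the merged torch docs follow the same semantics):

```python
import numpy as np

x = np.arange(3)     # [0, 1, 2]
y = np.arange(3, 6)  # [3, 4, 5]

# 1-D inputs of length n become (n, 1) columns before horizontal stacking.
stacked = np.column_stack((x, y))
manual = np.hstack((x.reshape(-1, 1), y.reshape(-1, 1)))
```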

Collaborator

Not related to the documentation, but we should be sure we error out if a tensor and not a list of tensors is given. NumPy has some odd behavior when given a single array, probably because it's so aggressive at interpreting its inputs:

# This seems correct
a = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a,))
:  array([[1, 2],
       [3, 4],
       [5, 6]])

# Maybe technically correct, but confusing
a = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a))
: array([[1, 3, 5],
       [2, 4, 6]])

Contributor Author

Thanks so much for the advice on the docs; the description has been updated accordingly.
A test case for single-array input is also added to make sure the appropriate errors are raised.

Comment thread torch/_torch_docs.py Outdated
tensor([[1, 4],
[2, 5],
[3, 6]])
>>> a = torch.tensor([[1],[2],[3]])
Collaborator

I get what this example is trying to do, but I think it could be more distinct from the previous example by creating two tensors (using arange, for example), reshaping them to be 2D, and then column_stacking them plus a 1D tensor or two. That would highlight that both 1D and 2D tensors can appear in tensors.

Contributor Author

Thanks for pointing this out.
The latest example uses torch.arange to generate the 1D tensor, torch.arange with torch.reshape for the 2D tensors, and constructs a tuple with one 1D tensor and two 2D tensors as the input of torch.column_stack.
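In NumPy terms, the described example is roughly the following (the merged torch docs use `torch.arange` and `torch.reshape` analogously):

```python
import numpy as np

one_d = np.arange(5)                   # 1-D input
two_d_a = np.arange(10).reshape(5, 2)  # 2-D inputs whose first dim matches
two_d_b = np.arange(10).reshape(5, 2)

# Mixing 1-D and 2-D inputs: the 1-D input becomes the first column.
mixed = np.column_stack((one_d, two_d_a, two_d_b))
```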

Comment thread test/test_torch.py
def test_hstack(self, device, dtype):
self._test_special_stacks(1, 1, torch.hstack, np.hstack, device, dtype)
def test_hstack_column_stack(self, device, dtype):
ops = ((torch.hstack, np.hstack), (torch.column_stack, np.column_stack))
Collaborator

@mruberry mruberry Oct 16, 2020

This is a great way to extend this test, but hstack != column_stack, right? Where are the differences between them tested?

Contributor Author

Yes, hstack != column_stack, as hstack requires all inputs to have the same number of dimensions.
The latest commit extends this test method by adding test cases where input tensors are combinations of 1D and 2D tensors for the torch.column_stack function.
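The difference the reviewer is asking about shows up for 1-D inputs, where the two functions disagree:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

h = np.hstack((a, b))        # 1-D inputs are concatenated end to end -> 1-D
c = np.column_stack((a, b))  # 1-D inputs become columns of a 2-D result
```

For inputs that are all 2-D, the two functions agree, which is why the shared test cases above are still valid.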

Comment thread test/test_op_aliases.py
_(aten, rnn_tanh_cell) \
_(aten, rot90) \
_(aten, round) \
_(aten, row_stack) \
Collaborator

Since row_stack is an alias, see the instructions for adding an alias here:

# Note [Adding an alias]

Its entry needs to go in aten/src/ATen/core/interned_strings.h and torch/csrc/jit/passes/normalize_ops.cpp needs an update, too.

Contributor Author

Thanks for the notes on adding aliases. The vstack entry is removed from aten/src/ATen/core/aten_interned_strings.h, and both vstack and row_stack entries are added in aten/src/ATen/core/interned_strings.h.
torch/csrc/jit/passes/normalize_ops.cpp is updated as well.

Comment thread aten/src/ATen/native/TensorShape.cpp Outdated
"column_stack expects a non-empty TensorList");

auto rep = at::atleast_1d(tensors);
if (rep[0].dim() == 1) {
Collaborator

Is this assuming that every tensor in the list of tensors is 1D, or that every tensor in the list of tensors is 2D? I don't think this has to be the case:

a = np.array((1, 2, 3))
b = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a, b))
: array([[1, 1, 2],
       [2, 3, 4],
       [3, 5, 6]])

Collaborator

one = np.array((-1, -2, -3))
two = np.array(((1, 2), (3, 4), (5, 6)))
result_0 = np.column_stack((one, two, two))
result_1 = np.hstack((one.reshape(3, 1), two, two))
result_0 == result_1
: array([[ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])

I think the transform just needs to be made conditional on the tensor being 1D?

Contributor Author

Thanks so much for the detailed explanations; the transform here only needs to be applied to 1D tensors.

Comment thread aten/src/ATen/native/TensorShape.cpp Outdated
TORCH_CHECK(tensors.size() > 0,
"column_stack expects a non-empty TensorList");

auto rep = at::atleast_1d(tensors);
Collaborator

Is this to try and support 0D tensors?

Contributor Author

Yes, it is intended to support 0D tensors.
atleast_1d is no longer needed in the latest commit as the transform code will reshape 0D tensors into (1, 1).
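For reference, NumPy handles 0-D inputs the same way, promoting them to shape (1, 1) before stacking, which the (1, 1) reshape described here mirrors:

```python
import numpy as np

zero_d = np.array(7)  # 0-D array
# Each 0-D input is promoted to shape (1, 1), giving a (1, 2) result.
out = np.column_stack((zero_d, np.array(8)))
```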

Collaborator

@mruberry mruberry left a comment

Hey @RockingJavaBean!

Thanks for submitting this PR, I'm excited to add {row, column}_stack to PyTorch!

I made a few comments on the documentation. The tests may need to be expanded for column_stack and hstack. A couple things need to be done for the row_stack alias (nice use of aliasing, btw). And I have a question about column_stack's implementation.

Looking forward to hearing more from you! Ping me when you're ready for another review or if you have a question.

@RockingJavaBean
Contributor Author

@mruberry I sincerely appreciate the insights and guidance you provided throughout this pull request.
Math keys are added for both the column_stack and row_stack functions, and the description of column_stack has been updated and formatted.

@RockingJavaBean
Contributor Author

The test_streaming_backward_sync_graph_root of test/test_cuda.py failed in the pr/pytorch-linux-bionic-rocm3.8-py3.6 check.

20:05:45 FAIL: test_streaming_backward_sync_graph_root (__main__.TestCuda)
20:05:45 ----------------------------------------------------------------------
20:05:45 Traceback (most recent call last):
20:05:45   File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 833, in wrapper
20:05:45     method(*args, **kwargs)
20:05:45   File "test_cuda.py", line 1775, in test_streaming_backward_sync_graph_root
20:05:45     self.assertEqual(a.grad, grad * b)
20:05:45   File "/var/lib/jenkins/.local/lib/python3.6/site-packages/torch/testing/_internal/common_utils.py", line 1144, in assertEqual
20:05:45     self.assertTrue(result, msg=msg)
20:05:45 AssertionError: False is not true : Tensors failed to compare as equal! With rtol=1.3e-06 and atol=1e-05, found 1000 element(s) (out of 1000) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 3.0 (6.0 vs. 9.0), which occurred at index 0.

It seems unrelated to this PR, and I cannot reproduce it locally, as test_streaming_backward_sync_graph_root passed on my local Windows 10 and Ubuntu machines.

@mruberry mruberry self-requested a review October 22, 2020 16:47
Collaborator

@mruberry mruberry left a comment

Nice work, @RockingJavaBean!

I also don't think the test failure is related to your PR. Sorry for the confusion.

@codecov

codecov Bot commented Oct 22, 2020

Codecov Report

Merging #46313 into master will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #46313      +/-   ##
==========================================
+ Coverage   68.94%   68.96%   +0.02%     
==========================================
  Files         434      434              
  Lines       56188    55998     -190     
==========================================
- Hits        38740    38621     -119     
+ Misses      17448    17377      -71     

@RockingJavaBean
Contributor Author

@mruberry Thank you so much for your kind guidance throughout this PR. I have updated the pull request with the latest commit and the previous failed ROCM check passed.
Please kindly confirm whether this PR is ready for landing, or should I further update the code.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry
Collaborator

@mruberry Thank you so much for your kind guidance throughout this PR. I have updated the pull request with the latest commit and the previous failed ROCM check passed.
Please kindly confirm whether this PR is ready for landing, or should I further update the code.

Thank you. I think we just need to wait for the tests.

@RockingJavaBean
Contributor Author

@mruberry Thanks for importing this PR; I saw the Facebook Internal check failed due to fbcode-target-determinator.
Could you please check the logs of fbcode-target-determinator so that I can update the code with a fix?

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry
Collaborator

@mruberry Thanks for importing this PR; I saw the Facebook Internal check failed due to fbcode-target-determinator.
Could you please check the logs of fbcode-target-determinator so that I can update the code with a fix?

It's just an internal failure where I need to retrigger the build. It's not related to the PR.

@mruberry
Collaborator

Hey @RockingJavaBean, would you rebase this, please? The internal tool is hitting an issue that a rebase should solve.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@RockingJavaBean
Contributor Author

@mruberry I'm so thankful for your kind help in landing this PR.
I saw that the recent PR #46970 desugars a missing dispatch field into a singleton Math entry. According to #46970 (comment), the Math key will be the default dispatch entry.
Should we consider removing the Math keys for the torch.row_stack and torch.column_stack functions, then import this PR for the Facebook Internal check?

@mruberry
Collaborator

@mruberry I'm so thankful for your kind help in landing this PR.
I saw that the recent PR #46970 desugars a missing dispatch field into a singleton Math entry. According to #46970 (comment), the Math key will be the default dispatch entry.
Should we consider removing the Math keys for the torch.row_stack and torch.column_stack functions, then import this PR for the Facebook Internal check?

It's OK. This should be landing soon and it's just a style difference.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 74d730c.

@RockingJavaBean RockingJavaBean deleted the numpy_column_row_stack branch October 30, 2020 01:56
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
…6313)

Summary:
Related pytorch#38349

This PR implements `column_stack` as the composite ops of `torch.reshape` and `torch.hstack`, and makes `row_stack` as the alias of `torch.vstack`.

Todo

- [x] docs
- [x] alias pattern for `row_stack`

Pull Request resolved: pytorch#46313

Reviewed By: ngimel

Differential Revision: D24585471

Pulled By: mruberry

fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c

Labels

Merged · module: numpy (Related to numpy support, and also numpy compatibility of our operators) · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants