implement NumPy-like functionality column_stack, row_stack#46313
implement NumPy-like functionality column_stack, row_stack#46313RockingJavaBean wants to merge 16 commits intopytorch:masterfrom
Conversation
💊 CI failures summary and remediationsAs of commit 42e674aadf (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 3 times. |
💊 CI failures summary and remediationsAs of commit dceaf17 (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
42e674a to
f46275a
Compare
|
@mruberry This PR implements the NumPy-like functions |
| r""" | ||
| column_stack(tensors, *, out=None) -> Tensor | ||
|
|
||
| Stack 1D tensors as columns into a 2D tensor, 2D tensors are stacked just like :func:`torch.hstack`. |
There was a problem hiding this comment.
This description is already good, but I think something like this would be a little clearer:
Creates a new 2-dimensional tensor by horizontally stacking the tensors in :attr:`tensors`.
:attr:`tensors` must be composed entirely of 1-dimensional and 2-dimensional tensors whose first dimension is the same size. The 1-dimensional tensors in :attr:`tensors` are reshaped into (*, 1) columns before being stacked horizontally.
There was a problem hiding this comment.
Not related to the documentation, but we should be sure we error out if a tensor and not a list of tensors is given. NumPy has some odd behavior when given a single array, probably because it's so aggressive at interpreting its inputs:
# This seems correct
a = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a,))
: array([[1, 2],
[3, 4],
[5, 6]])
# Maybe technically correct, but confusing
a = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a))
: array([[1, 3, 5],
[2, 4, 6]])
There was a problem hiding this comment.
Thanks so much for the advice of the docs, the description has been updated accordingly.
And the test case of the single array input is added o make sure the appropriate errors are raised.
| tensor([[1, 4], | ||
| [2, 5], | ||
| [3, 6]]) | ||
| >>> a = torch.tensor([[1],[2],[3]]) |
There was a problem hiding this comment.
I get what this example is trying to do but I think it could more distinct from the previous example by creating two tensors (using arange, for example), then reshaping them to be 2D, then column_stacking them plus a 1D tensor or two. That would highlight that 1D and 2D tensors can both appear in tensors.
There was a problem hiding this comment.
Thanks for pointing this out.
The latest example manipulates torch.arange to generate the 1D tensor, torch.arange with torch.reshape for the 2D tensor, and construct a tuple with one 1D tensor and two 2D tensors as the input of torch.column_stack.
| def test_hstack(self, device, dtype): | ||
| self._test_special_stacks(1, 1, torch.hstack, np.hstack, device, dtype) | ||
| def test_hstack_column_stack(self, device, dtype): | ||
| ops = ((torch.hstack, np.hstack), (torch.column_stack, np.column_stack)) |
There was a problem hiding this comment.
This is a great way to extend this test, but hstack != column_stack, right? Where are the differences between them tested?
There was a problem hiding this comment.
Yes, the hstack != column_stack as the hstack requires all input to have the same dim.
The latest commit extends this test method by adding test cases where input tensors are combinations of 1D and 2D tensors for the torch.column_stack function.
| _(aten, rnn_tanh_cell) \ | ||
| _(aten, rot90) \ | ||
| _(aten, round) \ | ||
| _(aten, row_stack) \ |
There was a problem hiding this comment.
Since row_stack is an alias, see the instructions for adding an alias here:
Its entry needs to go in aten/src/ATen/core/interned_strings.h and torch/csrc/jit/passes/normalize_ops.cpp needs an update, too.
There was a problem hiding this comment.
Thanks for the notes on adding aliases. the vstack entry is removed from aten/src/ATen/core/aten_interned_strings.h and both vstack and row_stack entries are added in the aten/src/ATen/core/interned_strings.h.
torch/csrc/jit/passes/normalize_ops.cpp is updated as well.
| "column_stack expects a non-empty TensorList"); | ||
|
|
||
| auto rep = at::atleast_1d(tensors); | ||
| if (rep[0].dim() == 1) { |
There was a problem hiding this comment.
Is this assuming that every tensor in the list of tensors is 1D, or that every tensor in the list of tensors is 2D? I don't think this has to be the case:
a = np.array((1, 2, 3))
b = np.array(((1, 2), (3, 4), (5, 6)))
np.column_stack((a, b))
: array([[1, 1, 2],
[2, 3, 4],
[3, 5, 6]])
There was a problem hiding this comment.
one = np.array((-1, -2, -3))
two = np.array(((1, 2), (3, 4), (5, 6)))
result_0 = np.column_stack((one, two, two))
result_1 = np.hstack((one.reshape(3, 1), two, two))
result_0 == result_1
: array([[ True, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, True]])
I think the transform just needs to be made conditional on the tensor being 1D?
There was a problem hiding this comment.
Thanks so much for the detailed explanations, the transform here only needs to be applied to 1D tensors.
| TORCH_CHECK(tensors.size() > 0, | ||
| "column_stack expects a non-empty TensorList"); | ||
|
|
||
| auto rep = at::atleast_1d(tensors); |
There was a problem hiding this comment.
Is this to try and support 0D tensors?
There was a problem hiding this comment.
Yes, it is intended to support 0D tensors.
atleast_1d is no longer needed in the latest commit as the transform code will reshape 0D tensors into (1, 1).
mruberry
left a comment
There was a problem hiding this comment.
Hey @RockingJavaBean!
Thanks for submitting this PR, I'm excited to add {row, column}_stack to PyTorch!
I made a few comments on the documentation. The tests may need to be expanded for column_stack and hstack. A couple things need to be done for the row_stack alias (nice use of aliasing, btw). And I have a question about column_stack's implementation.
Looking forward to hearing more from you! Ping me when you're ready for another review or if you have a question.
|
@mruberry I sincerely appreciate the insights and guidance you provided throughout this pull request. |
|
The It seems unrelated to this PR, and I cannot reproduce locally as the |
mruberry
left a comment
There was a problem hiding this comment.
Nice work, @RockingJavaBean!
I also don't think the test failure is related to your PR. Sorry for the confusion.
Codecov Report
@@ Coverage Diff @@
## master #46313 +/- ##
==========================================
+ Coverage 68.94% 68.96% +0.02%
==========================================
Files 434 434
Lines 56188 55998 -190
==========================================
- Hits 38740 38621 -119
+ Misses 17448 17377 -71 |
…y_column_row_stack
…y_column_row_stack
|
@mruberry Thank you so much for your kind guidance throughout this PR. I have updated the pull request with the latest commit and the previous failed |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thank you. I think we just need to wait for the tests. |
|
@mruberry Thanks for importing this PR, I saw the |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
It's just an internal failure where I need to retrigger the build. It's not related to the PR. |
|
Hey @RockingJavaBean, would you rebase this, please? The internal tool is hitting an issue that a rebase should solve. |
…y_column_row_stack
…y_column_row_stack
facebook-github-bot
left a comment
There was a problem hiding this comment.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
facebook-github-bot
left a comment
There was a problem hiding this comment.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…y_column_row_stack
facebook-github-bot
left a comment
There was a problem hiding this comment.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
@mruberry I’m so thankful for your kind help of landing this PR. |
It's OK. This should be landing soon and it's just a style difference. |
…6313) Summary: Related pytorch#38349 This PR implements `column_stack` as the composite ops of `torch.reshape` and `torch.hstack`, and makes `row_stack` as the alias of `torch.vstack`. Todo - [x] docs - [x] alias pattern for `row_stack` Pull Request resolved: pytorch#46313 Reviewed By: ngimel Differential Revision: D24585471 Pulled By: mruberry fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c
Related #38349
This PR implements
column_stackas the composite ops oftorch.reshapeandtorch.hstack, and makesrow_stackas the alias oftorch.vstack.Todo
row_stack