Implement hstack, vstack, dstack#42799
Conversation
💊 CI failures summary and remediations — As of commit a03e866 (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
@mruberry PTAL
Nit: these examples are excellent but maybe
a = torch.tensor([[1],[2],[3]])
b = torch.tensor([[4],[5],[6]])
would be clearer?
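As a sketch of why column-shaped inputs make the doc examples clearer, here are the NumPy equivalents (which these torch ops mirror) on the suggested tensors; each op's concatenation axis becomes visible in the result shape:

```python
import numpy as np

# Suggested doc-example inputs as (3, 1) column arrays
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])

h = np.hstack((a, b))  # joins along axis 1 -> shape (3, 2)
v = np.vstack((a, b))  # joins along axis 0 -> shape (6, 1)
d = np.dstack((a, b))  # joins along axis 2 -> shape (3, 1, 2)
```

The torch versions produce the same shapes on `torch.tensor` inputs.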
See numbering suggestion below.
See numbering suggestion above.
These tests are good, but this test's case generation is limited to replicating the same tensor shape for each element of the input list. Here are some cases I was thinking about:
- op(t)
  - the behavior of `np.hstack(a)` is strange and `np.hstack(a) != np.hstack((a,))` (the same is true for `np.dstack`) - do we even support non-tuple arguments? if not we should validate this throws a runtime error
  - if we support single tensor arguments, is `np.hstack(a)`'s and `np.dstack(a)`'s behavior correct?
- op((a, b, c, ...))
  - validating that if they differ on an unexpected dim an error is thrown (maybe `_test_special_stacks` should take a dim argument corresponding to the op?)
  - validating that if they differ only on the expected dim the result is equivalent to NumPy
- are tensors with a size zero dim handled correctly? (if not that's OK, but let's assert it doesn't work)
- `np.hstack` has special handling of 1D tensors (as your implementation does) - does `test_hstack` need a custom elaboration to test that behavior, in particular?
- validating that tensors with different shapes, but the same shapes after `atleast_Xd`, work
For an example of the last bullet:
```python
a = np.array([[[1],[2],[3]]])
b = np.array((4, 5, 6))
np.dstack((a, b))
# array([[[1, 4],
#         [2, 5],
#         [3, 6]]])
```
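The single-tensor case from the first bullet can be checked the same way; a sketch in NumPy of why `np.hstack(a)` is strange — a bare array is treated as a sequence of its rows rather than as a one-element tuple:

```python
import numpy as np

a = np.array([[1], [2], [3]])  # shape (3, 1)

wrapped = np.hstack((a,))  # one-element tuple: returns the array unchanged, (3, 1)
bare = np.hstack(a)        # bare array: its rows are concatenated into 1-D, (3,)
```

So `np.hstack(a)` and `np.hstack((a,))` disagree on both shape and contents, which is why the review asks whether the torch ops should accept non-tuple arguments at all.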
This is a good number of cases but validating each one by hand shouldn't be too laborious, I hope.
What are your thoughts? Are there other cases I missed?
mruberry left a comment
Overall looks excellent. A couple minor nits about the doc examples and questions about test coverage.
Force-pushed from 315a28d to 18e9013
@mruberry I have added tests that I think cover all of the cases. They cover:
For the last two, those are tested from 1 to 4 dimensions, so the special behavior for hstack is included with that. I have also added some autograd tests in a similar manner to the existing stack autograd test. Does this sound good, or do I need more tests?
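The hstack special case being exercised across 1 to 4 dimensions can be sketched with a hypothetical NumPy reference (`hstack_ref` is illustrative, not the PR's actual helper): for 1-D inputs hstack joins along axis 0, otherwise along axis 1.

```python
import numpy as np

def hstack_ref(tensors):
    # Hypothetical reference: hstack concatenates along axis 1,
    # except for 1-D inputs, which are joined along axis 0.
    tensors = [np.atleast_1d(t) for t in tensors]
    axis = 0 if tensors[0].ndim == 1 else 1
    return np.concatenate(tensors, axis=axis)

# Check the reference against np.hstack from 1 to 4 dimensions
for ndim in range(1, 5):
    shape = (2,) * ndim
    a, b = np.ones(shape), np.zeros(shape)
    assert np.array_equal(hstack_ref((a, b)), np.hstack((a, b)))
```

A test parametrized this way covers the 1-D branch without a separate elaboration for `test_hstack`.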
```python
else:
    # Invalid dimensions, test for error
    with self.assertRaisesRegex(RuntimeError, "Sizes of tensors must match except in dimension"):
        torch_fn(torch_input)
```
Would you add an assert that NumPy also throws a runtime error in this case? You don't need to assert a string is thrown:
```python
with self.assertRaises(RuntimeError):
    np_fn(np_input)
```
mruberry left a comment
Nice work, @muthuArivoli!
Would you just fix that one minor nit on the tests and we'll get this merged?
Let me know if you're interested in working on a new problem.
@mruberry I added the numpy error check. Is it OK that numpy throws a ValueError, while we throw a RuntimeError? Yes, I'm interested in working on a new problem, do you have any recommendations?
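A minimal sketch of the exception mismatch being discussed — NumPy signals incompatible shapes with `ValueError`, while the torch ops raise `RuntimeError` for the same inputs:

```python
import numpy as np

# Arrays that differ on an unexpected dim for hstack (axis-0 sizes 2 vs 3)
a = np.ones((2, 2))
b = np.ones((3, 2))

try:
    np.hstack((a, b))
    raised = False
except ValueError:  # torch.hstack would raise RuntimeError here instead
    raised = True
```

A cross-library test can simply assert each exception type separately, as the review suggests.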
facebook-github-bot left a comment
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Absolutely OK. Nice work.
For symmetry there are the split functions, hsplit, vsplit, and dsplit. A slightly more challenging binary function is divmod, because it returns two tensors. There are the "polynomial" functions, like polyadd and polyder, but I'm hoping someone will write all of them near-simultaneously because they have a lot of common structure. There are also unary functions, like nan_to_num, that would be very helpful. If you'd like something more exotic or especially numerically challenging, there are also functions like the Kaiser windowing function.
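The stack/split symmetry mentioned here can be sketched in NumPy (whose semantics the proposed torch functions would mirror): hsplit undoes hstack when the pieces have equal width.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

stacked = np.hstack((a, b))          # shape (2, 6)
left, right = np.hsplit(stacked, 2)  # split along axis 1 back into two (2, 3) halves
```

vsplit and dsplit relate to vstack and dstack the same way, along axes 0 and 2.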
Two questions:
Excellent questions.
Summary: Related to pytorch#38349
Pull Request resolved: pytorch#42799
Reviewed By: izdeby
Differential Revision: D23140704
Pulled By: mruberry
fbshipit-source-id: 6a36363562c50d0abce87021b84b194bb32825fb
Related to #38349