Rewrite kron with broadcasting at::mul #50927

Closed

zasdfgbnm wants to merge 3 commits into master from kron

Conversation

@zasdfgbnm (Collaborator) commented Jan 22, 2021

Because it is shorter, faster, and does not have the TF32 issue.

Benchmark: https://github.com/zasdfgbnm/things/blob/master/2021Q1/kron.ipynb
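For context, a minimal Python sketch of the broadcasting trick the C++ change implements (the helper name and the code are illustrative, not the PR's actual code). The TF32 accuracy concern applies to matmul-backed kernels, which a plain broadcasting mul avoids:

import torch

def kron_via_mul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Pad the lower-rank input with leading 1s, then interleave dimensions:
    # a becomes (a0, 1, a1, 1, ...), b becomes (1, b0, 1, b1, ...).
    maxdim = max(a.dim(), b.dim())
    pad_a, pad_b = maxdim - a.dim(), maxdim - b.dim()
    a_shape = [1] * (2 * maxdim)
    b_shape = [1] * (2 * maxdim)
    out_shape = []
    for i in range(maxdim):
        a_i = a.size(i - pad_a) if i >= pad_a else 1
        b_i = b.size(i - pad_b) if i >= pad_b else 1
        a_shape[2 * i] = a_i
        b_shape[2 * i + 1] = b_i
        out_shape.append(a_i * b_i)
    # The broadcasting mul materializes the outer product of each dimension
    # pair; collapsing each (a_i, b_i) pair yields the Kronecker layout.
    return (a.reshape(a_shape) * b.reshape(b_shape)).reshape(out_shape)

a, b = torch.randn(3, 4), torch.randn(5, 6)
assert torch.allclose(kron_via_mul(a, b), torch.kron(a, b))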

@facebook-github-bot (Contributor) commented Jan 22, 2021

💊 CI failures summary and remediations

As of commit f1aa841 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)


@zasdfgbnm changed the title from "Rewrite kron with elementwise op" to "Rewrite kron with broadcasting at::mul" Jan 22, 2021
@codecov (bot) commented Jan 22, 2021

Codecov Report

Merging #50927 (f1aa841) into master (8e9ed27) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #50927      +/-   ##
==========================================
- Coverage   81.00%   80.99%   -0.01%     
==========================================
  Files        1916     1916              
  Lines      209481   209473       -8     
==========================================
- Hits       169690   169671      -19     
- Misses      39791    39802      +11     

@IvanYashchuk (Collaborator) left a comment:

Tests are passing and it is faster now, so I'm happy with these changes.
Thank you!

Comment thread on aten/src/ATen/native/LinearAlgebra.cpp (outdated):
  return result;

// functional variant: delegates to the out= overload with an undefined result tensor
Tensor kron(const Tensor& self, const Tensor& other) {
  at::Tensor result;
  return at::kron_out(result, self, other);
Collaborator:

we silently enabled type promotion for kron :-)
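For illustration, this is what routing kron through at::mul implies (a minimal sketch; the result dtype follows mul's standard type promotion rules):

>>> import torch
>>> a = torch.ones(2, 2, dtype=torch.float32)
>>> b = torch.ones(2, 2, dtype=torch.float64)
>>> torch.kron(a, b).dtype  # promoted by the underlying broadcasting mul
torch.float64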

Comment thread on aten/src/ATen/native/LinearAlgebra.cpp (outdated):
  // pad the lower-rank input with leading singleton dimensions
  auto maxdim = std::max(self.dim(), other.dim());
  auto pad_self = maxdim - self.dim();
  auto pad_other = maxdim - other.dim();
  DimVector a_reshape(2 * maxdim);
Collaborator:

This will be worse than std::vector for 3+d tensors: DimVector is a SmallVector with a 5-element inline buffer, so 2 * maxdim >= 6 forces it to grow onto the heap and copy the original inline elements.

@zasdfgbnm (Author):

Changed to c10::SmallVector<int64_t, 10>.

@facebook-github-bot (Contributor) left a comment:

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel (Collaborator) commented Jan 22, 2021

If you have them handy, can you post final benchmarking numbers for this implementation instead of the Python one?

@zasdfgbnm (Author) commented Jan 22, 2021

@ngimel
CPU: 64.4 µs
CUDA: 25.1 µs

@facebook-github-bot (Contributor):

@ngimel merged this pull request in ab331da.

  // interleave: even slots take self's (padded) sizes, odd slots take other's
  a_reshape[2 * i] = i >= pad_self ? self.sizes()[i - pad_self] : 1;
  a_reshape[2 * i + 1] = 1;
  b_reshape[2 * i] = 1;
  b_reshape[2 * i + 1] = i >= pad_other ? other.sizes()[i - pad_other] : 1;
@mruberry (Collaborator) commented Jan 25, 2021:

style nit: parentheses around the condition after the equals sign, for readability, i.e. a_reshape[2 * i] = (i >= pad_self) ? self.sizes()[i - pad_self] : 1;

c10::SmallVector<int64_t, 10> a_reshape(2 * maxdim);
c10::SmallVector<int64_t, 10> b_reshape(2 * maxdim);
c10::SmallVector<int64_t, 10> result_reshape(maxdim);
for (int i = 0; i < maxdim; i++) {
Collaborator:

nit: declare i with maxdim's type (e.g., decltype(maxdim)) to stay consistent with the auto usage above.

    // functional path: view the mul result directly into the kron shape
    result = at::_unsafe_view(at::mul(self_view, other_view), result_reshape);
  } else {
    // out= path: write into the caller's tensor, then fix up its shape
    at::mul_out(result, self_view, other_view);
    result.resize_(result_reshape);
Collaborator:

Does this need to use resize_output? Actually, in this case things seem trickier than that.

What kind of tensor can a user pass to out= here without this complaining? If a user passes a tensor with shape result_reshape, can mul_out yell that it received a tensor of the incorrect size? And if a user passes a tensor of the shape that mul_out likes, can it be resized without warning? That'd be bad if out= were a view, for example.
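For reference, the kind of complaining in question (a sketch; the exact warning text may vary by version):

>>> import torch
>>> out = torch.empty(3)                 # wrong size for the (2, 2) result
>>> torch.mul(torch.ones(2, 2), 2., out=out).shape
UserWarning: An output with one or more elements was resized since it had shape [3], which does not match the required output shape [2, 2]. ...
torch.Size([2, 2])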

@zasdfgbnm (Author):

Is passing a view as out= defined behavior?

Collaborator:

Yes. Typically it uses resize_ on out, so if out is the correct size it is used directly and the view relationship persists; if it is not, anything could happen, so in that sense it's not defined. Here, though, preserving this behavior would require first re-viewing out with the expected size of mul if possible (for reductions with similar problems I inserted unsqueezes and didn't care about the actual sizes; if they are wrong, resize_ handles it), and then re-viewing it back.
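A minimal sketch of the "correct size, so the view relationship persists" case (assumes the standard out= resize_ behavior described above):

>>> import torch
>>> base = torch.zeros(8)
>>> out = base[:4]                      # out is a view into base
>>> torch.add(torch.ones(4), torch.ones(4), out=out)
tensor([2., 2., 2., 2.])
>>> out.data_ptr() == base.data_ptr()   # no reallocation happened
True
>>> base
tensor([2., 2., 2., 2., 0., 0., 0., 0.])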

@zasdfgbnm (Author) commented Jan 25, 2021:

Looks like the current approach preserves storage, but erases discontiguity:

>>> import torch

>>> a = torch.randn(4,6,8,10).contiguous(memory_format=torch.channels_last)
>>> a.stride()
(480, 1, 60, 6)
>>> a.view(2,2,2,3,2,4,2,5).stride()
(960, 480, 3, 1, 240, 60, 30, 6)
>>> a.resize_(2,2,2,3,2,4,2,5).stride()
(960, 480, 240, 80, 40, 10, 5, 1)

>>> a = torch.randn(4,6,8,10).contiguous(memory_format=torch.channels_last)
>>> a.view(2,2,2,3,2,4,2,5).resize_(2,2,2,3,2,4,2,5).stride()
(960, 480, 3, 1, 240, 60, 30, 6)

>>> a = torch.randn(4,6,8,10).contiguous(memory_format=torch.channels_last)
>>> a.storage().data_ptr()
94728832336256
>>> a.resize_(2,2,2,3,2,4,2,5).storage().data_ptr()
94728832336256

@zasdfgbnm (Author):

>>> a = torch.randn(2,3,4,5).contiguous(memory_format=torch.channels_last)
>>> b = torch.kron(a, a)
>>> b.stride()
(3600, 1, 225, 9)
>>> torch.kron(a, a, out=b).stride()
(3600, 400, 25, 1)

@zasdfgbnm (Author) commented Jan 25, 2021:

Should we do

result.resize_(result_reshape, self.suggest_memory_format());

instead of resize_output?

@mruberry (Collaborator) commented Jan 25, 2021:

Sorry, why? We want to be consistent with PyTorch, where the out= semantics are that the operation is performed and then its output is "safe copied" to the tensor passed to out=. If the tensor has to be altered, a warning is thrown, unless the tensor had no elements.

Collaborator:

No, we should preserve any permutation, as we automatically do now, not just channels-last.

For a warning, we can do result.view(...with inserted 1 dimensions...) and rely on mul_out to throw a warning if the sizes are wrong, or, as Xiang suggested a few comments up, use resize_output there instead of resize_; it doesn't really matter.
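Roughly like this (a hedged Python sketch of the re-viewing idea; shapes are illustrative):

>>> import torch
>>> m, n, p, q = 2, 3, 4, 5
>>> A, B = torch.randn(m, n), torch.randn(p, q)
>>> out = torch.empty(m * p, n * q)
>>> # Re-view out with the interleaved structure mul produces, so mul_out
>>> # writes straight into the caller's storage and out stays a plain view.
>>> out_view = out.view(m, p, n, q)
>>> torch.mul(A.reshape(m, 1, n, 1), B.reshape(1, p, 1, q), out=out_view).shape
torch.Size([2, 4, 3, 5])
>>> torch.allclose(out, torch.kron(A, B))
True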

@zasdfgbnm (Author) commented Jan 25, 2021:

Or maybe add an argument c10::optional<MemoryFormat> optional_memory_format to resize_output, and inside resize_output, check for both shape and memory format.

Collaborator:

Please no; using the memory_format argument is wrong in most cases outside of torch.nn.

  result.copy_(result_tmp);
  return result;

Tensor kron(const Tensor& self, const Tensor& other) {
  at::Tensor result;
Collaborator:

We should use this pattern more often.

Collaborator:

No, Sebastian wants to disallow it.

@mruberry (Collaborator):

@zasdfgbnm Would you check a broader set of tensor sizes in your benchmark, too? For example, what are the smallest vs. the biggest tensors we can reasonably benchmark? Are odd shapes an issue?
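For example, a size sweep along these lines (a sketch using torch.utils.benchmark; the sizes are illustrative):

import torch
from torch.utils import benchmark

for n in (4, 16, 64, 256):           # small through reasonably big
    a = torch.randn(n, n)
    t = benchmark.Timer("torch.kron(a, a)", globals={"a": a})
    print(n, t.timeit(100).median)   # median seconds per call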

@zasdfgbnm mentioned this pull request Jan 25, 2021
@zasdfgbnm (Author):

Let's move the discussion to #51045.

facebook-github-bot pushed a commit that referenced this pull request Jan 27, 2021
Summary:
Followup of #50927

Pull Request resolved: #51045

Reviewed By: mruberry

Differential Revision: D26089204

Pulled By: ngimel

fbshipit-source-id: 77291dd83fba32d6f80a8540910b112a1d85a892