Implement NumPy-like function torch.float_power() #44937
Kiyosora wants to merge 12 commits into pytorch:master
Conversation
Force-pushed from b4b6ae7 to 3d60565.
💊 CI failures summary (Dr. CI): as of commit 453287e, there are no failures.
Force-pushed from 4c75c49 to 6f5552f.
Force-pushed from 6f5552f to a98486e.
Hi, @mruberry! A new NumPy-like function torch.float_power() has been implemented as discussed in #38349. Please take a look when you have a chance!
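For reference, a quick sketch of the intended behavior (output values here are illustrative): float_power always computes in double precision, returning float64 for real inputs and complex128 for complex inputs, regardless of the input dtypes.
>>> torch.float_power(torch.tensor([2, 3, 4]), 2)
tensor([ 4.,  9., 16.], dtype=torch.float64)
>>> torch.float_power(torch.tensor([2., 3.]), torch.tensor([-1., 0.5]))
tensor([0.5000, 1.7321], dtype=torch.float64)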
This is awesome, @Kiyosora! Just an FYI that we're finishing the PyTorch 1.7 release this week, so this review may be delayed a few days. Sorry for not being more responsive.
Thanks for the rapid reply, @mruberry!
Force-pushed from a98486e to bd88b41.
Codecov Report
@@ Coverage Diff @@
## master #44937 +/- ##
=======================================
Coverage 80.92% 80.93%
=======================================
Files 1855 1855
Lines 200156 200181 +25
=======================================
+ Hits 161981 162011 +30
+ Misses 38175 38170 -5
Use `resize_output` to resize the output instead of `resize_as`. (`pow` also needs this fix, but it doesn't need to be updated in this PR.)
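For context, a small illustration of the out= resizing behavior this refers to (a sketch assuming the final float_power signature with an out= keyword, not the actual test code):
import torch

out = torch.empty(0, dtype=torch.double)            # deliberately the wrong shape
torch.float_power(torch.tensor([1., 2., 3.]), 2, out=out)
print(out.shape)                                    # resized to torch.Size([3]) to hold the result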
Actually, would you please fix this for pow, too? Sample code demonstrating the error:
>>> torch.tensor((1 + 1j, 2 + 2j)) ** complex(1, 0)
RuntimeError: fill_ only supports 0-dimension value tensor but got tensor with 1 dimensions.
Do these need explicit dispatches? I think the system will work out which functions to call. Explicitly defining the dispatch will, perhaps surprisingly, prevent float_power from deriving its derivative if it's implemented as a composite operation (as suggested above).
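If the composite path is used, one way to sanity-check the derived derivative is gradcheck; a minimal sketch (tensor values are arbitrary):
import torch
from torch.autograd import gradcheck

# Compare the autograd-derived gradient against numerical finite differences.
base = (torch.rand(3, dtype=torch.double) + 0.5).requires_grad_()   # keep base positive
exp = torch.tensor(2.5, dtype=torch.double)
print(gradcheck(lambda b: torch.float_power(b, exp), (base,)))      # expected: True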
pow also has an in-place variant, pow_. Those entries should probably be moved next to the other pow entries, too.
Once I remove the explicit dispatches, the following error comes out:
AssertionError: There's a formula for float_power(or its functional variant) in derivatives.yaml. It's required to add a dispatch section for it with explicit supported backends e.g CPU/CUDA or DefaultBackend in native_functions.yaml. Please see https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native#choosing-the-right-dispatch-keyword for instructions to choose the right dispatch keyword.
So, I guess we need the explicit dispatches here for autograd.
In addition, the improvement for pow_ is now in progress in a separate PR, #46830.
The way dispatch happens has actually changed. I think the correct dispatch for these is now Math: instead of CPU, CUDA, since float_power is implemented using pow.
It seems that when using Math: as the dispatch, both the autograd test and the XLA test suffer from a loss of precision. Even if I call pow directly without any dtype casting, as below, the problem still exists.
Tensor float_power(const Tensor& base, const Tensor& exp) {
return at::pow(base, exp);
}
Since using CPU, CUDA avoids the precision loss, maybe we should revert to it?
I am not familiar with autograd yet, so maybe I have missed something... 😕
Sorry, can you elaborate on the lack of precision you're seeing, and how changing the dispatch can help with it?
Sorry for the late reply, @mruberry.
When I use Math as the dispatch, an assertEqual failure occurs in test_autograd: the gradients computed by the in-place variant are inconsistent with those of the out-of-place variant, for example:
>>> a=torch.tensor([1.,2.,3.], dtype=torch.double, requires_grad=True)
>>> b=torch.tensor([1.,2.,3.], dtype=torch.double, requires_grad=True)
>>> grad=torch.randn_like(a).double()
>>> a.float_power(2.).backward(grad)
>>> a.grad
tensor([-4.0256, -1.6108, 1.2338], dtype=torch.float64)
>>> b.float_power_(2.).backward(grad)
>>> b.grad
tensor([-6.0385, -2.0134, 1.4394], dtype=torch.float64)
But in fact, in-place variants usually do not allow computing gradients on a leaf tensor; the original pow behaves the same way.
>>> a.pow_(2.).backward(grad)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
And when I changed the dispatch from Math to CPU, CUDA and added an entry in tools/autograd/derivatives.yaml (as in the previous version), the abnormal behavior above was eliminated.
It seems that no in-place variant has used Math as its dispatch so far, so I suspect that may be related to this phenomenon...
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
This error is unrelated to the pow formula, it only happens because you modify your leaf inplace. Doing a.clone().pow_(2.) should work just fine.
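For instance, a minimal sketch of that workaround:
>>> a = torch.tensor([1., 2., 3.], dtype=torch.double, requires_grad=True)
>>> grad = torch.randn_like(a)
>>> a.clone().pow_(2.).backward(grad)               # in-place op on a non-leaf clone: no error
>>> torch.allclose(a.grad, 2 * a.detach() * grad)   # gradient of x**2 is 2*x
True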
saying that the gradients computed by the in-place variant are inconsistent with those of the out-of-place variant
If you don't provide a formula directly in derivatives.yaml, you need to make sure to only ever call functions that do from your generic ATen implementation. In particular, always call the at:: version of ops and not the native:: version.
This reverts commit f4b646eb8aa2db98ddb696036f59af26925e41d6.
Instead of just continuing here, attempt to perform the operation and verify it throws an error using self.assertRaisesRegex.
As above, verify this throws an error instead of continuing.
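A rough sketch of the suggested pattern, inside a unittest.TestCase method (the tensors and the error-message regex below are illustrative, not the actual test code):
# Instead of skipping unsupported out= dtypes, assert that they raise.
base = torch.randn(3)
wrong_out = torch.empty(3, dtype=torch.float32)     # float_power's result requires float64
with self.assertRaisesRegex(RuntimeError, "dtype"):
    torch.float_power(base, 2, out=wrong_out)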
I would remove this paragraph. It's correct, but a little misleading since this PR also allows :attr:`input` to be a scalar. The type support can be mentioned below, too, instead of here.
This mathematical portion is also correct but can be removed. The description above it is already very clear.
"{input}" -> "input (Tensor or Number): the base value(s)"
mruberry left a comment:
Hey @Kiyosora!
Sorry for the delay in reviewing this PR. This week is a holiday for the PyTorch team.
This PR is very good and close to being ready. I made a few small final suggestions. Looking forward to reviewing the final update!
Force-pushed from edb2c7e to 453287e.
Happy Thanksgiving, @mruberry! I really appreciate your consistent help! I have updated this PR per your suggestions.
facebook-github-bot left a comment:
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
- Related with pytorch#38349
- Implementing the NumPy-like function `torch.float_power()`.
Pull Request resolved: pytorch#44937
Reviewed By: ngimel
Differential Revision: D25192119
Pulled By: mruberry
fbshipit-source-id: 2e446b8e0c2825f045fe057e30c9419335557a05
Summary:
- Related with pytorch#44937
- Use `resize_output` instead of `resize_as`
- Tuning the `native_functions.yaml`: move the in-place variant `pow_` next to the other `pow` entries
Pull Request resolved: pytorch#46830
Reviewed By: mrshenli
Differential Revision: D24567702
Pulled By: anjali411
fbshipit-source-id: a352422c9d4e356574dbfdf21fb57f7ca7c6075d