Stop using c10::scalar_to_tensor in float_power. #50105
gchanan wants to merge 6 commits into gh/gchanan/348/base
Conversation
There should be no functional change here. A couple of reasons:
1) This function is generally an anti-pattern (#49758) and it is good to minimize its usage in the code base.
2) pow itself has a fair amount of smarts, like not broadcasting scalar/tensor combinations, and we should defer to it.
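To illustrate the two reasons above, here is a hedged sketch of the change's intent (assumed helper name, not the exact PR diff), for a Tensor base and a Scalar exponent as in float_power's signature:

```cpp
#include <ATen/ATen.h>

at::Tensor float_power_sketch(const at::Tensor& base, const c10::Scalar& exp) {
  // Anti-pattern: wrap the Scalar in a 0-dim tensor, forcing pow down the
  // tensor/tensor path and tying the scalar to base's device:
  //   return at::pow(base.to(at::kDouble),
  //                  c10::scalar_to_tensor(exp, base.device()));

  // Preferred: promote the Scalar itself and defer to pow's Tensor/Scalar
  // overload, which already knows not to broadcast scalar/tensor pairs.
  auto dtype = (at::isComplexType(base.scalar_type()) || exp.isComplex())
      ? at::kComplexDouble
      : at::kDouble;
  c10::Scalar promoted = (dtype == at::kComplexDouble)
      ? c10::Scalar(exp.toComplexDouble())
      : c10::Scalar(exp.toDouble());
  return at::pow(base.to(dtype), promoted);
}
```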
| Tensor& float_power_out(Tensor& result, const Tensor& base, Scalar exp) { | ||
| return at::float_power_out(result, base, c10::scalar_to_tensor(exp, base.device())); | ||
| auto dtype = (at::isComplexType(base.scalar_type()) || exp.isComplex()) ? at::kComplexDouble : at::kDouble; | ||
| exp = dtype == at::kComplexDouble ? exp.toComplexDouble() : exp.toDouble(); |
Parens after the "=" might clarify this expression
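For reference, the parenthesized form being suggested (illustrative):

```cpp
exp = (dtype == at::kComplexDouble) ? exp.toComplexDouble() : exp.toDouble();
```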
```diff
-  return base.float_power_(c10::scalar_to_tensor(exp, base.device()));
+  auto dtype = (at::isComplexType(base.scalar_type()) || exp.isComplex()) ? at::kComplexDouble : at::kDouble;
+  TORCH_CHECK(base.scalar_type() == dtype,
+      "self tensor type ", base.scalar_type(), "is not the desired type ", dtype);
```
We're inconsistent in our checks (note: we should pick a style and be consistent), but we typically want to mention the operation and use the name from the docs, not "self", which is typically an internal name. Also, "tensor type" is ambiguous; this should refer to dtype.
Maybe an error message like:
"the base given to float_power_ has dtype DTYPE, but the operation's result requires dtype DTYPE"
?
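Applied to the check above, that wording would look something like this (illustrative):

```cpp
TORCH_CHECK(base.scalar_type() == dtype,
    "the base given to float_power_ has dtype ", base.scalar_type(),
    ", but the operation's result requires dtype ", dtype);
```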
Note also that there's no space between the first base.scalar_type() and "is" in the string. A similar printing error can be seen in the error messages below.
Is this even correct behavior? Don't we generally check that the computation type is castable to the result type, not that they are the same?
Good point. I extended the Developer FAQ entry to address this explicitly (see here).
So yes, a corrected version of the check would validate that the operation's common dtype is "safe castable" to base's dtype.
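A sketch of what the corrected check might look like, assuming c10::canCast is the right helper for "safe castable":

```cpp
// Require only that the operation's common dtype can be safely cast to the
// base tensor's dtype, rather than requiring an exact match.
TORCH_CHECK(c10::canCast(dtype, base.scalar_type()),
    "float_power_: the operation's result dtype ", dtype,
    " cannot be safely cast to the base tensor's dtype ", base.scalar_type());
```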
In this case, delegating to the out= variant of float_power would probably save adding special logic here. That logic needs to be added to the out= variant (per other comments) anyway.
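A hypothetical sketch of that delegation, reusing the out= overload from this diff:

```cpp
// The in-place variant forwards to the out= overload with result == self,
// so the dtype/casting checks live in one place.
Tensor& float_power_(Tensor& self, Scalar exp) {
  return at::float_power_out(self, self, exp);
}
```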
Ok, since this is a different issue, I filed #50213.
While many of the failures here are upstream, some are real:

The error message, with its incorrect spacing, comes from pytorch/aten/src/ATen/native/Pow.cpp, line 30 (at 97c17b4).

It's not obvious to me where this failure is coming from. We could change float_power to use an OpInfo if there's a test issue.
```cpp
// Tensor base, Tensor exponent:
auto dtype = (at::isComplexType(base.scalar_type()) || at::isComplexType(exp.scalar_type())) ? at::kComplexDouble : at::kDouble;
return at::pow(base.to(dtype), exp.to(dtype));

// Scalar base, Tensor exponent:
auto dtype = (at::isComplexType(exp.scalar_type()) || base.isComplex()) ? at::kComplexDouble : at::kDouble;
base = dtype == at::kComplexDouble ? base.toComplexDouble() : base.toDouble();
```
Ideally we'd add an output-checking helper (or maybe one exists that we can reuse) here (and in the above _out variant). Otherwise, if result has the wrong dtype or the wrong device type, the error will be thrown in pow and will not correctly identify float_power as the source of the issue. This could be confusing to users.
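One possible shape for such a helper (hypothetical names; an existing ATen utility could serve instead):

```cpp
// Fail fast with a float_power-specific message instead of letting pow
// report the mismatch.
static void check_float_power_result(const at::Tensor& result,
                                     at::ScalarType dtype,
                                     c10::Device device) {
  TORCH_CHECK(result.scalar_type() == dtype,
      "float_power: the output tensor must have dtype ", dtype,
      ", but got ", result.scalar_type());
  TORCH_CHECK(result.device() == device,
      "float_power: the output tensor must be on device ", device,
      ", but got ", result.device());
}
```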
Summary:
Pull Request resolved: pytorch#50105

There should be no functional change here. A couple of reasons:
1) This function is generally an anti-pattern (pytorch#49758) and it is good to minimize its usage in the code base.
2) pow itself has a fair amount of smarts, like not broadcasting scalar/tensor combinations, and we should defer to it.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D25786172

Pulled By: gchanan

fbshipit-source-id: 89de03aa0b900ce011a62911224a5441f15e331a
Stack from ghstack:
Differential Revision: D25786172