Vectorize copysign on CPU#51792
Conversation
[ghstack-poisoned]
[ghstack-poisoned]
ghstack-source-id: 4089993 Pull Request resolved: pytorch#51792
💊 CI failures summary and remediationsAs of commit cc2a5eb (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
ghstack-source-id: 76b0813 Pull Request resolved: pytorch#51792
[ghstack-poisoned]
ghstack-source-id: bef766e Pull Request resolved: pytorch#51792
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
| store(tmp); | ||
| exp.store(tmp_exp); | ||
| for (int64_t i = 0; i < size(); i++) { | ||
| for (int i = 0; i < size(); i++) { |
There was a problem hiding this comment.
type i like size():
auto i = decltype(size()){0}
There was a problem hiding this comment.
I really don't like decltype(size()) so added a using size_type to Vec256 that I can use instead.
| namespace std { | ||
|
|
||
| /// Used by vec256<c10::Half>::map | ||
| inline c10::Half acos(c10::Half a) { return std::acos(float(a));} |
There was a problem hiding this comment.
I'm not sure what the history of half math is. @ngimel, do you know?
There was a problem hiding this comment.
No, I don't. I once wondered how math overloads work, but didn't get a full picture, I wonder why it's necessary, shouldn't there be an implicit Half->float conversion (that's defined)? I thought that's how std math functions work in cuda kernels.
There was a problem hiding this comment.
I assumed BFloat-math.h was an accepted pattern, but if that's not true then I've just removed the Half-math.h file and focused on copysign.
| std::is_floating_point<T>::value || | ||
| std::is_same<T, at::Half>::value> { | ||
| std::is_same<T, at::Half>::value || | ||
| std::is_same<T, at::BFloat16>::value> { |
There was a problem hiding this comment.
Can it be put into separate PR (as it have nothing to do with vectorization goal)
| }); | ||
| cpu_kernel_vec(iter, | ||
| [](scalar_t a, scalar_t b) -> scalar_t { | ||
| return std::copysign(a, b); |
There was a problem hiding this comment.
Again, we can not use std::copysign otherwise it will cause an ICE
[ghstack-poisoned]
[ghstack-poisoned]
|
Actually, going to wait on @malfet, as I don't quite understand the ICE concerns. |
|
Looks good to me, thank you for the updates |
Summary: Pull Request resolved: pytorch#51792 Test Plan: Imported from OSS Reviewed By: agolynski Differential Revision: D27769007 Pulled By: mruberry fbshipit-source-id: 65fceb9f59ed6afee4452278992340da104ed5fe
Summary: Pull Request resolved: pytorch#51792 Test Plan: Imported from OSS Reviewed By: agolynski Differential Revision: D27769007 Pulled By: mruberry fbshipit-source-id: 65fceb9f59ed6afee4452278992340da104ed5fe
Stack from ghstack:
Differential Revision: D27769007