[Pytorch] Specialize guts of c10::optional for 32-bit scalars#47015
[Pytorch] Specialize guts of c10::optional for 32-bit scalars#47015swolchok wants to merge 8 commits intogh/swolchok/4/basefrom
Conversation
c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) ghstack-source-id: 115395876 Pull Request resolved: #47015
💊 CI failures summary and remediationsAs of commit c6d8a51 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 26 times. |
|
I tried to replicate the benchmark on #46598 so as to provide an apples-to-apple comparison, but I couldn't run the benchmark with the most recent version of that branch checked out. |
…r 32-bit scalars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115425175 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
…tional for 32-bit scalars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115447317 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
| std::is_nothrow_move_assignable<T>::value && | ||
| std::is_nothrow_move_constructible<T>::value) { | ||
| if (init_ && !rhs.init_) { | ||
| clear(); |
There was a problem hiding this comment.
clear() itself has a branch checking init_ again. The compiler might be able to optimize it away and branch prediction can probably handle it quite well too, but it might be safer to have a separate private method for clearing when we already know that init_==true.
Same above in the copy constructor and for the constexpr version.
c10/util/Optional.h
Outdated
| constexpr_optional_base<typename std::remove_const< | ||
| T>::type>, // use base with trivial destructor | ||
| optional_base<typename std::remove_const<T>::type>>::type; | ||
| std::is_scalar<T>::value && sizeof(T) <= 4, |
There was a problem hiding this comment.
c10::ScalarType and c10::Layout are enums, so they'll hit the optimization here, but we'd also like to hit the optimization for c10::Device, which is a 32bit struct that can for most purposes be treated like a Scalar.
Should we use checks for trivial movability/copyability here instead of std::is_scalar? That might hit c10::Device too. Also not sure if we need the 32bit sizeof constraint. If a type is trivially movable/copyable, then the optimization could make sense even if the type is larger.
We might need to rewrite the optimization a bit though. What it currently does is it unconditionally calls the copy/move constructor, possibly on uninitialized elements. With scalars, that is likely within the standard. I'm not sure if doing that for trivially copy/movable types is within the standard or UB. Do you happen to know?
There was a problem hiding this comment.
it unconditionally calls the copy/move constructor, possibly on uninitialized elements. With scalars, that is likely within the standard. I'm not sure if doing that for trivially copy/movable types is within the standard or UB. Do you happen to know?
We're not using a union, so all we require is that we can copy and move default-constructed objects of type T. If a type can't do that, it's broken. I'm not sure what's possibly uninitialized by the optional implementation; initialization is left up to T's default constructor.
There was a problem hiding this comment.
We're not using a union
Ah, and that's a problem because T may not have a default constructor. I'll see what I can do.
There was a problem hiding this comment.
I feel like, for all the types we care about this optimization applying to, we have default constructors
…ars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115514082 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
|
This was a very interesting and informative PR to review, learned something. Thanks! |
| class optional : private OptionalBase<T> { | ||
| template <class U> // re-declaration for nvcc on Windows. | ||
| using OptionalBase = typename std::conditional< | ||
| std::is_trivially_destructible<U>::value, // if possible |
There was a problem hiding this comment.
We could use this PR to fix the duplication here. But not mandatory, up to you.
…ars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115769681 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
…tional for 32-bit scalars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
… on "[Pytorch] Specialize guts of c10::optional for 32-bit scalars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115804292 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
…pecialize guts of c10::optional for 32-bit scalars" c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/) [ghstack-poisoned]
Pull Request resolved: #47015 c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use `std::optional` (or implement that functionality ourselves). We can't use `std::optional` because we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step. ghstack-source-id: 115886743 Differential Revision: [D24552280](https://our.internmc.facebook.com/intern/diff/D24552280/)
Codecov Report
@@ Coverage Diff @@
## gh/swolchok/4/base #47015 +/- ##
======================================================
- Coverage 60.82% 60.81% -0.02%
======================================================
Files 2749 2750 +1
Lines 254038 254009 -29
======================================================
- Hits 154531 154481 -50
- Misses 99507 99528 +21 |
|
This pull request has been merged in df5b469. |
Stack from ghstack:
c10::optional has non-trivial copy and move operations always. This change specializes it for 32-bit scalars so that it has trivial copy and move operations in that case. Ideally, we would instead rely on P0602 "variant and optional should propagate copy/move triviality" and use
std::optional(or implement that functionality ourselves). We can't usestd::optionalbecause we are stuck with C++14. Implementing the full P0602 ourselves would add even more complexity. We could do it, but this should be a helpful first step.Differential Revision: D24552280