Overload _get_operation_for_overload_or_packet & friends to accept ArrayRef#162219
swolchok wants to merge 4 commits into gh/swolchok/827/base from
Conversation
…rayRef Avoids requiring vector allocation to call this. [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162219
Note: links to docs will display an error until the docs builds have completed. ✅ No failures as of commit d5990af with merge base 2b8a839. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
          op, symbol, args, kwargs, /*is_overload*/ true, dk_);
        });
    return py::make_tuple(
        func, func_dk, py::cast(op->getTags().vec()));
These values should be std::move'd to prevent refcount increases.
I don't see anything movable here; if we move op from the lambda capture then the lambda won't work if called a second time.
This just moves the Python reference to the cpp_function pybind11 object. Isn't that reference created every time?
This just avoids an increment/decrement of the Python reference counter; it shouldn't move the function inside the Python object, since it's effectively a shared pointer, right?
To clarify, suggesting moving func and func_dk
…o accept ArrayRef" Avoids requiring vector allocation to call this. cc EikanWang jgong5 wenzhe-nrv sanchitintel [ghstack-poisoned]
Starting merge as part of PR stack under #162220
Per @Skylion007 on #162219 Pull Request resolved: #162428 Approved by: https://github.com/Skylion007
These seem to have been costing us 5-10 usec per detach (out of ~95 usec total). If they need to ship, let's talk about requirements and how we can make this more efficient, given that we would prefer that an entire DTensor op finish in 10 usec. Differential Revision: [D81530106](https://our.internmc.facebook.com/intern/diff/D81530106) Pull Request resolved: #161596 Approved by: https://github.com/ezyang, https://github.com/Skylion007 ghstack dependencies: #161591, #161595, #161633, #161634, #161692, #162219, #162220, #162218
We control DTensor, so we can just guarantee there isn't a programming error with __torch_dispatch__. (The guard is already less-than-perfect; see the note that the deleted comment refers to.) Pull Request resolved: #162337 Approved by: https://github.com/Skylion007 ghstack dependencies: #161591, #161595, #161633, #161634, #161692, #162219, #162220, #162218, #161596
…rayRef (pytorch#162219) Avoids requiring vector allocation to call this. Pull Request resolved: pytorch#162219 Approved by: https://github.com/Skylion007 ghstack dependencies: pytorch#161591, pytorch#161595, pytorch#161633, pytorch#161634, pytorch#161692
Optimize for common case and remove a pair of refcount operations (see new comments.) Pull Request resolved: pytorch#162220 Approved by: https://github.com/jansel, https://github.com/williamwen42 ghstack dependencies: pytorch#161591, pytorch#161595, pytorch#161633, pytorch#161634, pytorch#161692, pytorch#162219
…orch#162218) It returns a const reference to a vector. Pull Request resolved: pytorch#162218 Approved by: https://github.com/Skylion007 ghstack dependencies: pytorch#161591, pytorch#161595, pytorch#161633, pytorch#161634, pytorch#161692, pytorch#162219, pytorch#162220
Stack from ghstack (oldest at bottom):
Avoids requiring vector allocation to call this.
cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel