RecordFunction in Dispatcher#37587
Conversation
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
💊 CI failures summary and remediationsAs of commit 8983654 (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 699 times. |
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
|
Rebase |
|
(looks like changes in tracer introduced some build errors, looking into them) |
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
|
addressing comments, adding filtering of observed ops |
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
| const auto& kernel = entry.lookup(dispatchKey); | ||
| kernel.callBoxed(op, stack); | ||
|
|
||
| if (entry.isObserved()) { |
There was a problem hiding this comment.
Note: the absence of useRecordFunction here introduces an API inconsistency between callBoxed and callWithDispatchKey - one lets you control RecordFunction and the other doesn't. If we go with the useRecordFunction approach, I'm thinking we should add it to callBoxed as well (we can avoid callsite churn by defaulting it).
But also, if you're inclined to dislike call-site-decides because of leakage, here's some more :P
There was a problem hiding this comment.
I was wondering in what cases callBoxed would be called with BackendSelect, or is it for the symmetry
There was a problem hiding this comment.
Yeah, just for the symmetry. Unless there's some history that I'm missing, there's no reason to think we don't want the same knobs on boxed calls as unboxed.
Note: I wouldn't necessarily burden your PR with the addition - I just wanted to note it in the context of the discussion about filtering BackendSelect et al.
There was a problem hiding this comment.
Hi Ilia - so after discussing with @ezyang, we came to two conclusions:
- at some point in the future, we might want to move to a "true" callee-profiling setup, which wouldn't be in the dispatcher at all - instead, we'd use metaprogramming behind the registration API to instrument kernels as they were registered. This would have a few benefits besides incrementally better precision (full per-kernel granularity with no dispatch overhead; direct calls wouldn't be missed), but it's not compelling enough to disrupt the current approach and sink a bunch of work into now.
- for now, the best approximation to this future state is to revert the
useRecordFunctionparam and go back to (a variation of) your original approach, wherecallWithDispatchKeydoes a dynamic test on dispatch key. Two changes seem worthwhile though:- the new test, along with the test on
op.isObserved(), can be done inside theif (C10_UNLIKELY(guard.active)) { ... }block, as noted inline. This will at least keep the testing overhead to a minimum. - instead of doing a direct test on
BackendSelect, let's wrap it in a (inline) filter function that takes a dispatch key and returns a bool, sayshouldRecord()or something. That'll give us a place to put additional filtered keys as they come up.
- the new test, along with the test on
Otherwise LGTM (I happened to notice a nit from @smessmer as I scrolled past, don't know if that's a remaining TODO...) - just re-request or gimme a ping once these changes are up and we should be good to go!
| // Note: for perf reasons we wouldn't want to pass arguments into | ||
| // the function call or prematurely box them | ||
| at::RecordFunction guard(at::RecordScope::FUNCTION); | ||
| if (C10_UNLIKELY(guard.active)) { |
There was a problem hiding this comment.
I think you can put the dispatch key filter test inside this if, and also move the op.isObserved() test down here.
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
|
(rebase) |
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
bhosmer
left a comment
There was a problem hiding this comment.
Looks good! one ~trivial naming idea and one nontrivial question that might be a suggested change (the redispatch thing). Other changes look great, thanks!
|
|
||
| template <typename T, std::enable_if_t<!c10::impl::ok_to_box<T>::value>* = nullptr> | ||
| inline bool pushIValueOrCannotBox(std::vector<c10::IValue>& stack, const T& v) { | ||
| torch::jit::push(stack, "cannot box"); |
There was a problem hiding this comment.
@ezyang @ilia-cher to tie up the points that Ed raises here:
- pushing a string means a dynamic allocation: this is true, though we'll stop incurring the cost once everything is supported. So I'd probably let it go in, unless there's a super easy way to push it in a non-owning way, like make "cannot box" a static std::string and push references to it or something. cc @smessmer any obvious way to do this?
- move these functions closer to the callers: this is another case where it's a matter of how much work to do for a temporary situation. These functions use locally-defined templates that shouldn't really be public, so I think we can treat these functions as ephemeral bits of the API presented by
boxing.hto the rest of the system. - naming: I agree it's a little confusing rn, since
boxArgumentsIntoStacksounds like it performs a function that's actually done inline inboxAndCallBoxedFunc, but in fact does something subtly different :D ...but it's not too big a deal, and I can tweak names in my upcoming PR. Ilia if you're pushing another commit anyway, you might consider renamingboxArgumentsIntoStackto beboxArgumentsOrCannotBoxIntoStackto clarify what's happening, but otherwise I wouldn't worry about it.
| args... | ||
| ); | ||
| return callWithDispatchKey<Return, Args...>(op, dispatchKey, args...); | ||
| // do not use RecordFunction on redispatch |
There was a problem hiding this comment.
Hmm is there a reason to break symmetry here? I think we want redispatch to be exactly equivalent to callWithDispatchKey on the computed key - basically just a convenience method. Introducing this logic change seems like a potential source of hard-to-track-down inconsistencies in recording results.
If the change is to avoid double-counting some key other than BackendSelect, maybe that key should be added to shouldRecord? cc @smessmer maybe you guys have talked about this already...
There was a problem hiding this comment.
the reason I do it this way here is that I don't want to run RecordFunction and we don't want to pass a parameter into callWithDispatchKey (as we discussed). Since it's two lines of code I just called the kernel itself
There was a problem hiding this comment.
Yeah, I definitely have no problem with the implementation. My reaction was to the semantics - my default assumption was that we want redispatch(op, args) to have the exactly same behavior as the equivalent callWithDispatchKey(op, key, args) call - w.r.t. RecordFunction and everything else.
But the other POV makes sense too - that a natural consequence of the meaning of "redispatch" is: no (re)recording, so they really are different.
It's not totally clear to me that the suppression shouldn't move into shouldRecord, but it's not a discussion we need to have before you land this 😄 thanks for being patient!
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Summary: Lifting RecordFunction up into the dispatcher code Differential Revision: [D21374246](https://our.internmc.facebook.com/intern/diff/D21374246) [ghstack-poisoned]
Stack from ghstack:
Summary:
Lifting RecordFunction up into the dispatcher code
Differential Revision: D21374246