Move argument grouping into FunctionSchema #48195

ezyang wants to merge 5 commits into gh/ezyang/869/base
## Conversation
Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
Sorry, short on time, no description; coming soon.
💊 CI failures summary and remediations — as of commit 00c8118 (more details on the Dr. CI page):

🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
Just a quick drive-by based on the PR description, front-running a full review later: the choices you're describing in your last bullet point kind of point to the brittleness of modeling the sequence in a maximally partitioned way like this. An alternative (I've pitched this before) is to model self and tensor options as optional pairs of (thing, insertion index) - it's true that the indexes wouldn't be correct by construction, but it avoids the pre-post splintering, and the arbitrary choices (and dynamic checks) for degenerate cases. Still worth modeling args and kwargs as separate lists ofc.
@bhosmer Engh, I still don't like it. I gave one reason (tricky bookkeeping if you ever need multiple indices, though tbf we don't have this right now) in our original conversation. But another reason why I don't really like splicing arguments into the lists is that the list prior to splicing isn't really usable for... anything, really. Imagine: [code example elided in transcript]. If it's just getting rid of illegal states that we want, the pre/post version can be made to play ball like this: [code example elided in transcript], but the extra typing here didn't seem worth it to me, at least.
@ezyang thanks for writing out the example, it hits the nail on the head. You're exactly right: the taste/distaste for this approach comes down to whether you feel the ground representation should be "useful" on its own, or whether it's just raw material for derived views. Will a full review introduce me to client code that uses the pre/post fragments directly (in a semantically meaningful way)? If so, then I'd agree that what I'm pitching is probably a net loss, because it'd have to add derived views for both the full list and the fragments. I'd assumed the full list was the only view that mattered, which I think would put the two representations roughly at parity in terms of view complexity. You might also be reacting to (per your example) [code elided in transcript].
```python
if self.tensor_options is not None:
    ret.extend(self.tensor_options.all())
ret.extend(self.post_tensor_options_kwarg_only)
return ret
```
This, and `positional` above, are uses of the fields.
Ah ok, this is what I meant by derived views. By parity I meant that the above and something like below (mod typos) are basically the same complexity, and you wouldn't have to tiebreak the degenerate representations.
```python
non_self_positional: Tuple[Argument, ...]
self_arg: Optional[Tuple[SelfArgument, int]]
non_tensor_options_kwarg_only: Tuple[Argument, ...]
tensor_options: Optional[Tuple[TensorOptionsArguments, int]]

def positional(self) -> Sequence[Argument]:
    ret: List[Argument] = []
    ret.extend(self.non_self_positional)
    if self.self_arg is not None:
        arg, i = self.self_arg
        ret.insert(i, arg)
    return ret

def kwarg_only(self) -> Sequence[Argument]:
    ret: List[Argument] = []
    ret.extend(self.non_tensor_options_kwarg_only)
    if self.tensor_options is not None:
        args, i = self.tensor_options
        ret[i:i] = args.all()
    return ret
```
But TBH it's not worth litigating; the way it is currently is fine by me if you just like it better. My main motivation here is to sketch the alternative out concretely, not to push super hard for its adoption.
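For concreteness, here is a minimal runnable sketch of the insertion-index idea (with hypothetical stand-in `Argument`/`TensorOptionsArguments` classes, not the actual codegen types): the TensorOptions group is stored off to the side as a `(thing, insertion index)` pair, and the full kwarg-only list is a derived view that splices it back in.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class Argument:
    name: str

@dataclass(frozen=True)
class TensorOptionsArguments:
    # Stand-in for the real class; all() returns the grouped arguments.
    args: Tuple[Argument, ...]
    def all(self) -> List[Argument]:
        return list(self.args)

def kwarg_only(
    non_tensor_options_kwarg_only: Tuple[Argument, ...],
    tensor_options: Optional[Tuple[TensorOptionsArguments, int]],
) -> List[Argument]:
    # Derived view: splice the TensorOptions group back in at its
    # recorded insertion index.
    ret: List[Argument] = list(non_tensor_options_kwarg_only)
    if tensor_options is not None:
        args, i = tensor_options
        ret[i:i] = args.all()
    return ret

opts = TensorOptionsArguments((Argument("dtype"), Argument("device")))
flat = kwarg_only((Argument("non_blocking"), Argument("copy")), (opts, 1))
print([a.name for a in flat])  # ['non_blocking', 'dtype', 'device', 'copy']
```

Note the degenerate cases need no tiebreak here: a missing group is simply `None`, and there is no pre/post split to decide between.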
```python
args.extend(func.arguments.pre_tensor_options_kwarg_only)
if func.arguments.tensor_options is not None:
    args.append(func.arguments.tensor_options)
args.extend(func.arguments.post_tensor_options_kwarg_only)
```
This is another use of the fields, although the intention is to kill this function and make use of the fields directly.
I went ahead and marked the relevant spots. But I didn't really try to refactor anything else after this, so most of the usage is just "squashing" the individual parts back into the list-of-union which is what the rest of the code uses right now. Projecting ahead to further refactors, these are the other things most likely to happen: [list elided in transcript]
You got me, I was intentionally being a little hyperbolic in the example naming ;)
The general approach is to change Arguments, splitting `positional`, `kwarg_only` and `out` into `pre_self_positional`, `self_arg`, `post_self_positional`, and `pre_tensor_options_kwarg_only`, `tensor_options` and `post_tensor_options_kwarg_only`. The splits are as you'd expect: we extract out the self argument and the tensor options arguments, and record the other arguments that came before and after. To do this, we move the logic in `group_arguments` to the parsing process.

Some fuzz in the process:

* I renamed `ThisArgument` to `SelfArgument`, since we don't actually use the terminology "this" outside of C++ (and the model is Python biased).
* I kept the `group_arguments` function, which now just reads out the arguments from the structured model in the correct order. In the long term, we should get rid of this function entirely, but for now I kept it as is to reduce churn.
* I decided to arbitrarily say that when self is missing, everything goes in "post-self", but when tensor options is missing, everything goes in "pre-tensor-options". This was based on where you typically find the argument in question: self is usually at the front (so most args are after it), while tensor options are typically at the end (so most args go before it).

Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
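To make the shape of the split concrete, here is a minimal sketch of a pre/post `Arguments` model with derived flattening views (simplified stand-in types, not the actual `tools.codegen.model` classes; field and method names follow the description above but the details are illustrative):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class Argument:
    name: str

@dataclass(frozen=True)
class SelfArgument:
    argument: Argument

@dataclass(frozen=True)
class TensorOptionsArguments:
    dtype: Argument
    layout: Argument
    device: Argument
    pin_memory: Argument
    def all(self) -> List[Argument]:
        return [self.dtype, self.layout, self.device, self.pin_memory]

@dataclass(frozen=True)
class Arguments:
    pre_self_positional: Tuple[Argument, ...]
    self_arg: Optional[SelfArgument]
    post_self_positional: Tuple[Argument, ...]
    pre_tensor_options_kwarg_only: Tuple[Argument, ...]
    tensor_options: Optional[TensorOptionsArguments]
    post_tensor_options_kwarg_only: Tuple[Argument, ...]

    def positional(self) -> List[Argument]:
        # Read the split fields back out in schema order.
        ret = list(self.pre_self_positional)
        if self.self_arg is not None:
            ret.append(self.self_arg.argument)
        ret.extend(self.post_self_positional)
        return ret

    def kwarg_only(self) -> List[Argument]:
        ret = list(self.pre_tensor_options_kwarg_only)
        if self.tensor_options is not None:
            ret.extend(self.tensor_options.all())
        ret.extend(self.post_tensor_options_kwarg_only)
        return ret

# Per the convention above: self missing => everything is "post-self";
# tensor options missing => everything is "pre-tensor-options".
args = Arguments(
    pre_self_positional=(),
    self_arg=SelfArgument(Argument("self")),
    post_self_positional=(Argument("other"),),
    pre_tensor_options_kwarg_only=(Argument("alpha"),),
    tensor_options=None,
    post_tensor_options_kwarg_only=(),
)
print([a.name for a in args.positional() + args.kwarg_only()])
```

With this shape, `group_arguments` really is just `positional() + kwarg_only() + out`, which is why it becomes a candidate for deletion.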
```python
# If there is enough space...
if i <= len(kwarg_only) - len(predicates):
    # And the next len(predicates) arguments look like TensorOptions arguments
    if all(p(a) for p, a in zip(predicates, kwarg_only[i : i + len(predicates)])):
```
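A self-contained toy version of this window check (simplified to strings and hypothetical helper names, not the real codegen) makes the "silent near miss" behavior easy to see: a partial TensorOptions run simply fails to match and nothing is reported.

```python
from typing import Callable, List

def find_tensor_options_window(kwarg_only: List[str],
                               predicates: List[Callable[[str], bool]]) -> int:
    """Return the index where a TensorOptions-shaped run starts, or -1.

    Toy version of the pattern match quoted above: slide over the
    kwarg-only arguments and test whether the next len(predicates)
    entries satisfy the per-position predicates.
    """
    for i in range(len(kwarg_only)):
        # If there is enough space...
        if i <= len(kwarg_only) - len(predicates):
            # ...and the window matches, predicate by predicate
            if all(p(a) for p, a in zip(predicates, kwarg_only[i:i + len(predicates)])):
                return i
    return -1

predicates = [
    lambda a: a == "dtype",
    lambda a: a == "layout",
    lambda a: a == "device",
    lambda a: a == "pin_memory",
]
print(find_tensor_options_window(
    ["out", "dtype", "layout", "device", "pin_memory"], predicates))  # 1
# A near miss (only dtype/device present) silently yields no match:
print(find_tensor_options_window(["dtype", "device"], predicates))    # -1
```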
It still feels like a small devex pothole to be failing silently on near misses here, but probably out of scope for this work.
Yeah, this is preexisting code, so out of scope for this PR. Note that we do have plenty of "near-ish misses" (e.g., one or two arguments from TensorOptions) in various functions, including:

```yaml
- func: to.device(Tensor self, Device device, ScalarType dtype, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor
  use_c10_dispatcher: full
  variants: method
  device_guard: False
```
so it's not altogether clear to me what a "you mistyped this" error message would look like.
Yeah, the line isn't trivially easy to draw. I guess if the goal state is "nobody uses TensorOptions except for factory functions in the user C++ API", then the most robust setup might be to drive the gather logic from an explicit annotation in native_functions rather than a pattern match; it could even just use `category_override: factory`.
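A hedged sketch of that annotation-driven alternative (entries modeled as plain dicts; the helper name and shape are hypothetical, not existing codegen API): gathering becomes an explicit opt-in per entry, so a near miss on an unannotated function can stay untouched, and a failed gather on an annotated one can become a hard error instead of silence.

```python
from typing import Dict, List

def should_gather_tensor_options(entry: Dict[str, str]) -> bool:
    # Explicit opt-in: only entries annotated as factories get the
    # TensorOptions grouping treatment.
    return entry.get("category_override") == "factory"

entries: List[Dict[str, str]] = [
    {"func": "empty(...)", "category_override": "factory"},
    {"func": "to.device(...)"},  # near-ish miss: left alone by design
]
print([should_gather_tensor_options(e) for e in entries])  # [True, False]
```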
Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 249fbfa Pull Request resolved: pytorch#48195
Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: cd0aa6e Pull Request resolved: pytorch#48195
Summary: Pull Request resolved: pytorch#48195

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: zhangguanheng66
Differential Revision: D25231166
Pulled By: ezyang
fbshipit-source-id: 25d77ad8319c4ce0bba4ad82e451bf536ef823ad
Previously, this function had nontrivial algorithmic content, but after #48195 it was just a Swiss Army knife for pasting together arguments while maintaining structure. I added some more properties to Arguments for convenient access in this way, and then inlined the implementation of group_arguments into all of its call sites, simplifying wherever contextual. This might be controversial, but I think the resulting code is easier to understand.

You may notice that there is some modest code duplication between `dispatcher.cpparguments_exprs` and `CppSignature.argument_packs`. This is a known problem and I will be attempting to fix it in a follow-up PR.

Confirmed to be byte-for-byte compatible.

Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]
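A sketch of what "more properties for convenient access" could look like (illustrative names like `flat_positional`/`flat_all` under stand-in types; the real model lives in tools/codegen/model.py and may differ): call sites that used to go through group_arguments instead read a flattened property directly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class Argument:
    name: str

@dataclass(frozen=True)
class Arguments:
    pre_self_positional: Tuple[Argument, ...]
    self_arg: Optional[Argument]
    post_self_positional: Tuple[Argument, ...]
    kwarg_only: Tuple[Argument, ...]
    out: Tuple[Argument, ...]

    @property
    def flat_positional(self) -> List[Argument]:
        # Flatten the self split back into one positional list.
        ret = list(self.pre_self_positional)
        if self.self_arg is not None:
            ret.append(self.self_arg)
        ret.extend(self.post_self_positional)
        return ret

    @property
    def flat_all(self) -> List[Argument]:
        # What call sites previously got from group_arguments:
        # everything in schema order, structure squashed away.
        return self.flat_positional + list(self.kwarg_only) + list(self.out)

args = Arguments((), Argument("self"), (Argument("other"),),
                 (Argument("alpha"),), (Argument("out"),))
print([a.name for a in args.flat_all])  # ['self', 'other', 'alpha', 'out']
```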
Summary: Pull Request resolved: #49043

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D25455885
Pulled By: ezyang
fbshipit-source-id: 8fbe066e8c3cb7ee8adb5b87296ec5bd7b49e01f
Stack from ghstack:
The general approach is to change Arguments, splitting `positional`, `kwarg_only` and `out` into `pre_self_positional`, `self_arg`, `post_self_positional`, and `pre_tensor_options_kwarg_only`, `tensor_options` and `post_tensor_options_kwarg_only`. The splits are as you'd expect: we extract out the self argument and the tensor options arguments, and record the other arguments that came before and after. To do this, we move the logic in `group_arguments` to the parsing process.

Some fuzz in the process:

* I renamed `ThisArgument` to `SelfArgument`, since we don't actually use the terminology "this" outside of C++ (and the model is Python biased).
* I kept the `group_arguments` function, which now just reads out the arguments from the structured model in the correct order. In the long term, we should get rid of this function entirely, but for now I kept it as is to reduce churn.
* I decided to arbitrarily say that when self is missing, everything goes in "post-self", but when tensor options is missing, everything goes in "pre-tensor-options". This was based on where you typically find the argument in question: self is usually at the front (so most args are after it), while tensor options are typically at the end (so most args go before it).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D25231166