Make named tensor implementations more robust#26968
zou3519 wants to merge 1 commit into gh/zou3519/193/base from
Conversation
To make implementations of an operator more robust, we should have a separate "named area" where name propagation happens and an "unnamed area" where the implementation is. Right now, many functions are implemented without an "unnamed area". The problem with that is that if someone modifies the implementation, it is very easy to break named tensor support by using a helper function that does not propagate names correctly. The test coverage for named tensors is also insufficient to catch such breakages.

This PR modifies some named tensor implementations to have separate "named area" and "unnamed area". The following implementations were changed:
- dropout, softmax, log_softmax, bernoulli
- dot, mm, addmm, addmv, mv

Test Plan:
- [namedtensor ci]
ghstack-source-id: f758aba
Pull Request resolved: #26968
```cpp
  }
}();
#ifdef BUILD_NAMEDTENSOR
namedinference::propagate_names(result, input_);
```
is this adding named tensor support that didn't previously exist? from the description it sounds like the intent is just to refactor but I don't know where this is getting moved from.
Does at::_softmax have propagated names without the name guard? if so should we remove it since we're name guarding it anyway?
at::_softmax propagates names "coincidentally" because it calls empty_like under the hood. Named tensor support for softmax existed before this PR but still exists after this PR.
We need to explicitly propagate names now because the returned tensor from the NoNamesGuard block has no names.
ah, I was expecting to see a propagate_names removed from _softmax but that makes sense
```cpp
  Tensor converted = dtype.has_value() ? input_.toType(dtype.value()) : input_;
  return at::_softmax(converted, dim_, false);
}
}();
```
Curious: does C++ have other ways to do this that don't require creating a function object? (e.g. in some other languages I could just write result = { whatever; }).
Are there cleaner options that only work in newer versions of C++?
Not that I know of. The alternative is to create a new scope with { } and assign to variable outside the scope. This is a little less nice because sometimes there are multiple return statements in a block.
nairbv left a comment
might want to update the description if not just a refactor but also changing behavior for softmax / log_softmax, otherwise looks good.
Summary: Pull Request resolved: pytorch/pytorch#26968
Differential Revision: D17627920
Pulled By: zou3519
fbshipit-source-id: 9300ac3962219b1fcd8c4c8705a2cea6f8c9d23d