Change return type of Tensor::dtype() from ScalarType to TypeMeta #12766
Conversation
Differential Revision: D10232118 Differential Version: 60878911
I see a bunch of call sites changed to use tensor.scalar_type() instead of tensor.dtype(), but this seems like a pessimization to me. Ideally, at::kDouble and friends would be TypeMetas, actually. A lot of sites that I see here could be "fixed" if we added something like operator==(const TypeMeta&, ScalarType) (and the other direction), prior to actually switching the constants to be DataType themselves (which I agree shouldn't be done in this patch). BTW: I'm not sure about this. Happy to talk about this some more. We can also get feedback from other people, like @gchanan and @goldsborough
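For illustration, here is a minimal sketch of what the proposed mixed comparison operators could look like. It assumes a TypeMeta-to-ScalarType conversion helper, spelled here as `at::typeMetaToScalarType()`; that name and the surrounding details are assumptions for the sketch, not code from this PR:

```cpp
#include <ATen/ATen.h>

// Hypothetical sketch of the mixed comparisons proposed above. Assumes a
// conversion helper like at::typeMetaToScalarType() is available; the
// exact name and header may differ in the actual codebase.
inline bool operator==(const caffe2::TypeMeta& meta, at::ScalarType type) {
  return at::typeMetaToScalarType(meta) == type;
}
inline bool operator==(at::ScalarType type, const caffe2::TypeMeta& meta) {
  return meta == type;
}
inline bool operator!=(const caffe2::TypeMeta& meta, at::ScalarType type) {
  return !(meta == type);
}
inline bool operator!=(at::ScalarType type, const caffe2::TypeMeta& meta) {
  return !(meta == type);
}
```

With operators like these in place, call sites such as `tensor.dtype() == at::kFloat` could keep compiling even before the `at::kDouble`-style constants themselves become TypeMetas.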
I agree with @ezyang, operator== is your friend.
@Roy-Li keep in mind that every line you change in the C++ frontend equals one user who gets a confusing build error when they update, and costs us GitHub issues and friction with the community.
Differential Revision: D10232118 Differential Version: 60941370
Differential Revision: D10232118 Differential Version: 61134801
```diff
-/// Returns a `Tensor`'s dtype (`ScalarType`). Defined in Type.h
-ScalarType dtype() const noexcept;
+/// Returns a `Tensor`'s dtype (`TypeMeta`). Defined in TensorMethods.h
+const caffe2::TypeMeta& dtype() const noexcept;
```
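For context, a hedged before/after of what this signature change means at call sites (illustrative code, not taken from the PR):

```cpp
#include <ATen/ATen.h>

bool is_float(const at::Tensor& t) {
  // Before this PR, dtype() returned at::ScalarType, so this compiled:
  //   return t.dtype() == at::kFloat;
  // Afterwards dtype() returns const caffe2::TypeMeta&, so either go
  // through scalar_type() for ScalarType-based logic...
  return t.scalar_type() == at::kFloat;
}

bool holds_float(const at::Tensor& t) {
  // ...or compare TypeMetas directly.
  return t.dtype() == caffe2::TypeMeta::Make<float>();
}
```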
ezyang left a comment:

Approved but with comments.
Differential Revision: D10232118 Differential Version: 61149897
```diff
 at::Tensor a;
 pop(stack, a);
-push(stack, static_cast<int64_t>(a.dtype()));
+push(stack, static_cast<int64_t>(a.scalar_type()));
```
```diff
 auto new_data = data.to(
     device.value_or(data.device()),
-    dtype.value_or(data.dtype()),
+    dtype.value_or(data.scalar_type()),
```
Differential Revision: D10232118 Differential Version: 61337003
Differential Revision: D10232118 Differential Version: 61345559
Differential Revision: D10232118 Differential Version: 61438578
Differential Revision: D10232118 Differential Version: 61590769
Differential Revision: D10232118 Differential Version: 61603922
Differential Revision: D10232118 Differential Version: 61616763
Differential Revision: D10232118 Differential Version: 61685407
Differential Revision: D10232118 Differential Version: 61759287
…2766)
Summary: Pull Request resolved: pytorch/pytorch#12766
In preparation for using TypeMeta in TensorOptions.
Reviewed By: ezyang
Differential Revision: D10232118
fbshipit-source-id: 5c69a524fa38e50aa555fb9feb87540bc3575a63
This change broke the following code: https://github.com/uber/horovod/blob/master/horovod/torch/adapter_v2.cc#L40. Is there a plan to make it backwards compatible?
@alsrgv The terminal state for DataType is that it will be an open data type, so yeah, that particular use is not going to work. If you're tracking master, use `scalar_type()` instead.
(Let us know if we can add anything that will help out your use case.)
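To make the breakage concrete, here is a hedged sketch of the Horovod-style pattern and its fix, modeled loosely on the linked adapter code; the function name and cases are illustrative, not the actual Horovod source:

```cpp
#include <ATen/ATen.h>
#include <stdexcept>

// Illustrative dispatch on a tensor's element type. Before this PR,
// `switch (tensor.dtype())` compiled because dtype() returned the enum
// at::ScalarType; now that dtype() returns caffe2::TypeMeta, the switch
// must go through scalar_type() instead.
const char* data_type_name(const at::Tensor& tensor) {
  switch (tensor.scalar_type()) {  // was: switch (tensor.dtype())
    case at::ScalarType::Byte:   return "uint8";
    case at::ScalarType::Float:  return "float32";
    case at::ScalarType::Double: return "float64";
    default:
      throw std::logic_error("Unsupported tensor data type.");
  }
}
```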
@ezyang, it would be great to stabilize the interface or to make those breaking changes in bursts. This is the fourth break since we switched to the C++ API on Aug 30th.
@alsrgv Unfortunately I don't think we can promise a lot of stability for the C++ interface in the next few months. We are undertaking large, really important changes to unify ATen with Caffe2 and improve its performance and portability to new platforms (like AMD or custom accelerators) over the long term, and guaranteeing a stable C++ interface would stand in the way of this effort. For this reason we never actually released a stable version of the C++ API, and we currently only release the C++ API in "beta" with the right to make frequent and breaking changes. This is also documented hopefully quite loudly in our docs at https://pytorch.org/cppdocs. In a sense, this should have been clear when you decided to depend on this C++ API (which, just to highlight this, is really the PyTorch backend).

I think for your case, it would make the most sense if you pinned particular nightlies of PyTorch and only updated on a monthly-or-so basis. In that case it would be more clear what has changed over the last month, and we could also work together to update your code (which is fortunately not that much).

Of course, we do also try our best not to break BC if possible, since many users do depend on the C++ API, and we want to avoid friction unless that friction stands in the way of our progress. Let me know if you like the idea of doing less frequent upgrades of PyTorch.
@goldsborough, thanks for the comprehensive write-up! Unfortunately, pinning doesn't work for me because I don't bundle PyTorch, and Horovod is supposed to work with the PyTorch that users got from you "today." We migrated to the C++ API primarily because the C API was being broken as well, and we weren't allowed to make changes, e.g., add Half support. I suggest we work together on maintaining compatibility. I'm not strongly against having a couple of #ifdef's here and there in the code for the sake of supporting PyTorch 0.4.1, 1.0, 1.1, etc., as long as they're used sparingly. Would you be willing to add a version macro that such #ifdef's could check against?
@alsrgv I think the tricky thing is that we don't actually have a release of the C++ API yet, so there is no version guarding we could do.

We're also a bit concerned about the idea that Horovod is supposed to work with PyTorch of "today". If I understand correctly, you are guaranteeing that Horovod works with PyTorch master? But then, isn't any project's master branch supposed to be the land of bold, unconstrained breaking changes? I don't think we can ever guarantee any stability for the C++ API on master, because that's fundamentally really not what master is intended for.

We discussed adding the Horovod build to our CI, but we don't like the idea of particularly favoring one partner repository to run in our CI. If other folks see that we added some kind of higher-class support for one project, I don't see it taking long until lots of other projects expect us to run their build in our CI and ensure backwards compatibility, and that will just slow us down a whole deal.

I still think you should just pin PyTorch to some particular nightly and have users depend on that version. I feel like depending on released (or in this case at least fixed) versions of projects is a well-established practice for ensuring stability, and depending on master is always dangerous and comes with responsibility for the dependent to keep up to date with the changes of upstream.
@goldsborough, the reality is that you keep adding great new features to PyTorch, and subsequently users keep upgrading to the latest version of PyTorch, which may be incompatible with plugins like Horovod. Horovod only supporting a certain pin of PyTorch master won't work, because users wouldn't be able to use those great new features (and they're typically very impatient!). If making a formal release of the C++ API is not an option at this point, the ask is to make breaking changes in bursts, rather than trickling them through. This way, plugin authors would have to make "catch-up" releases less frequently.
@alsrgv between 0.4.1 and 1.0 stable there is a big gap, close to 5 months. We also urged users to try the 1.0 preview. This was a one-time situation as we transitioned to our first stable release. Usually our release cycle is 3 months, and we always urge users to be on stable.
@soumith, great! It'd still be great if breaking changes could be batched - even before 0.4.1 I knew a lot of researchers who were sitting on the master branch. :-)
Summary: Fixes colliding changes in pytorch#12766 and pytorch#12368
Pull Request resolved: pytorch#13171
Differential Revision: D12109430
Pulled By: li-roy
fbshipit-source-id: f068c7df227d920aa3840762e892ce6e9c109237

Stack:
:black_circle: #12766 Change return type of Tensor::dtype() from ScalarType to TypeMeta 💛
:white_circle: #12767 reduce Device to 32bits 💛
:white_circle: #13172 use TypeMeta instead of ScalarType in TensorOptions 💛
In preparation for using TypeMeta in TensorOptions.
Differential Revision: D10232118