Refactor tensor_new.cpp to use TensorOptions instead of DispatchKey by ezyang · Pull Request #54034 · pytorch/pytorch

ezyang · 2021-03-15T22:32:36Z

Stack from ghstack:

Restore storage on meta tensors; increase meta coverage #53973 Add some trivial meta functions for commonly used inplace operations
Fix inaccurate dispatch tables #54127 Fix inaccurate dispatch tables
Add Tensor::is_cpu, genericize TensorIterator #54079 Add Tensor::is_cpu, genericize TensorIterator
Refactor tensor_new.cpp to use TensorOptions instead of DispatchKey #54034 Refactor tensor_new.cpp to use TensorOptions instead of DispatchKey

I had to touch a bunch of lines but the refactoring was fairly
mechanical. Here's how it works.

The basic concept behind this PR is that tensor_new.cpp was previously
abusing DispatchKey when it actually meant TensorOptions. The provided
DispatchKey argument to most of the constructor functions typically
comes from torch::tensors::get_default_dispatch_key(); it doesn't
really make sense for people to set the default dispatch key, but
this got grandfathered in due to the old API set_default_tensor_type
(where the "Type" concept got refactored into "DispatchKey" concept
over time). See also #53124. But the upshot is that, semantically,
what we refer to as the default dispatch key really is more like
torch.set_default_tensor_type(torch.Tensor) versus
torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user
wants to do something about construction of the tensor, and
TensorOptions captures that exactly.

So, how exactly to translate from one to the other?

Sources (things that used to PRODUCE DispatchKey)
- Most top level functions take a DispatchKey as their argument. I
  use the new function dispatchKeyToTensorOptions to convert it into
  a TensorOptions
- typeIdWithDefault now produces a TensorOptions (probably could do
  with a rename, though I didn't)
Sinks (things that used to CONSUME DispatchKey)
- Previously, the function options() was typically used to convert the
  DispatchKey into a TensorOptions. Now its replacement build_options
  just takes a TensorOptions and sets some extra fields on it.
  Irritatingly, I can't just replace
  build_options(options, scalar_type, device) with
  options.dtype(scalar_type).device(device) because the semantics
  are slightly different: if device is nullopt, we should preserve
  the usage of the device specified in options (what options.device()
  does is overwrite the device unconditionally; e.g., if device is
  nullopt, unset device from options)
- The other major sink for DispatchKey was internal_new_from_data,
  but it turns out it only really extracts the device type from
  the dispatch key. Now it just pulls out the device from
  TensorOptions.
To actually do the translation of DispatchKey to TensorOptions, I
introduce new functions dispatchKeyToLayout (replicating
layout_from_backend--there are still a few uses of this function
so I couldn't delete it) and dispatchKeyToDeviceType (replacing
computeDeviceType)
In all internal functions, whenever DispatchKey is taken as an argument,
I instead take TensorOptions as an argument, and pass it along.
Anywhere legacyExtractDispatchKey(other.key_set()) equality was
previously used, I now do other.options().type_equal(), which
is the intended BC for doing "backend to backend" comparisons
There are a few places in the sparse constructors where we allocated
a tensor for values, and then read out the dispatch key from the
result to allocate the keys. As best as I can tell, this is totally
equivalent to just passing in the options to both values and indices
(the only difference is dtype, which is captured via a separate
argument)

This refactor doesn't really go far enough: for example, there are now
functions that take both TensorOptions and ScalarType, when really
the TensorOptions can capture this all. I kept it solely just
s/DispatchKey/TensorOptions/ to reduce the number of possible bugs;
also, a lot of this will be mooted by a proper fix to #53124.

Even with this limited refactor, the payoff is sweet. I can delete:

backendToCPU
backendToXPU
backendToCUDA
backendToHIP
backendToBackendOfDeviceType

The reason I can do this is because I can simply overwrite layout in TensorOptions
to do the conversion, rather than having to type out each backend case
explicitly.

Signed-off-by: Edward Z. Yang ezyang@fb.com

Differential Revision: D27109509

Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]

facebook-github-bot · 2021-03-15T22:32:45Z

💊 CI failures summary and remediations

As of commit a48b54a (more details on the Dr. CI page):

1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

clang-tidy (1/1)

Step: "Add annotations" (full log | diagnosis details | 🔁 rerun)

2021-03-18T16:25:47.3819241Z update-alternatives: error: no alternatives for nsight-sys

2021-03-18T16:25:47.3641658Z Setting up x11proto-damage-dev (1:2018.4-4) ...
2021-03-18T16:25:47.3642543Z Setting up cuda-nvjpeg-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.3643460Z Setting up libxcb-dri3-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-18T16:25:47.3644343Z Setting up libcublas10 (10.2.3.254-1) ...
2021-03-18T16:25:47.3645131Z Setting up libcublas-dev (10.2.3.254-1) ...
2021-03-18T16:25:47.3645951Z Setting up cuda-cufft-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.3646868Z Setting up cuda-nsight-compute-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.3647900Z Setting up nsight-systems-2020.4.3 (2020.4.3.7-10543b6) ...
2021-03-18T16:25:47.3725774Z update-alternatives: using /opt/nvidia/nsight-systems/2020.4.3/target-linux-x64/nsys to provide /usr/local/bin/nsys (nsys) in auto mode
2021-03-18T16:25:47.3813115Z update-alternatives: error: alternative path /opt/nvidia/nsight-systems/2020.4.3/host-linux-x64/nsight-sys doesn't exist
2021-03-18T16:25:47.3819241Z update-alternatives: error: no alternatives for nsight-sys
2021-03-18T16:25:47.3876172Z update-alternatives: using /opt/nvidia/nsight-systems/2020.4.3/host-linux-x64/nsys-ui to provide /usr/local/bin/nsys-ui (nsys-ui) in auto mode
2021-03-18T16:25:47.3912851Z Setting up libxcb-shape0-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-18T16:25:47.3936232Z Setting up cuda-cusparse-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.4003838Z Setting up cuda-cuobjdump-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.4027566Z Setting up libxcb-glx0-dev:amd64 (1.13-2~ubuntu18.04) ...
2021-03-18T16:25:47.4049401Z Setting up libgles2:amd64 (1.0.0-2ubuntu2.3) ...
2021-03-18T16:25:47.4070486Z Setting up libglu1-mesa:amd64 (9.0.0-2.1build1) ...
2021-03-18T16:25:47.4090571Z Setting up libegl-mesa0:amd64 (20.0.8-0ubuntu1~18.04.1) ...
2021-03-18T16:25:47.4162420Z Setting up cuda-sanitizer-api-10-2 (10.2.89-1) ...
2021-03-18T16:25:47.4228243Z Setting up cuda-nvjpeg-dev-10-2 (10.2.89-1) ...

This comment was automatically generated by Dr. CI (expand for details).

Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

…spatchKey" Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]

Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: 798965d Pull Request resolved: #54034

ezyang · 2021-03-16T15:36:32Z

Apparently I broke torch.DoubleTensor(torch.DoubleTensor([0.0])). Whacky!

ezyang · 2021-03-16T16:41:11Z

-                   toString(dispatch_key), " (got ", toString(legacyExtractDispatchKey(other.key_set())), ")");
+Tensor new_with_tensor(c10::TensorOptions options, at::ScalarType scalar_type, const Tensor& other) {
+  TORCH_CHECK_TYPE(other.options().type_equal(options), "expected ",
+                   options, " (got ", other.options(), ")");


This isn't right, because other.options() also includes dtype

a) I'm not sure which thing isn't right here b) regardless, is it worth putting in a comment?

This comment is out of date; the code is updated!

ezyang · 2021-03-16T16:51:03Z

There are a few places in the sparse constructors where we allocated
a tensor for values, and then read out the dispatch key from the
result to allocate the keys. As best as I can tell, this is totally
equivalent to just passing in the options to both values and indices
(the only difference is dtype, which is captured via a separate
argument)

The CUDA failures suggest I was wrong here.

…spatchKey" Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]

Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> ghstack-source-id: b1ecfa7 Pull Request resolved: #54034

…spatchKey" Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> [ghstack-poisoned]

…spatchKey" Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D27109509](https://our.internmc.facebook.com/intern/diff/D27109509) [ghstack-poisoned]

ezyang · 2021-03-18T16:25:19Z

This needs reviews!

bhosmer

Nice cleanup - one nontrivial thing inline (if I'm reading the code right)

bhosmer · 2021-03-19T05:20:34Z

-c10::DispatchKey typeIdWithDefault(PythonArgs& r, int64_t device_idx, c10::DispatchKey dispatch_key) {
-  auto device_type = r.isNone(device_idx) ? computeDeviceType(dispatch_key) : r.device(device_idx).type();
-  return backendToDispatchKey(backendToBackendOfDeviceType(dispatchKeyToBackend(dispatch_key), device_type));
+c10::TensorOptions typeIdWithDefault(PythonArgs& r, int64_t device_idx, c10::DispatchKey dispatch_key) {


This really is overdue for a rename but yeah there's no obvious upgrade. What about just optionsWithDefaultDevice or something?

bhosmer · 2021-03-19T05:21:08Z

+  auto options = dispatchKeyToTensorOptions(dispatch_key);
+  if (!r.isNone(device_idx)) {
+    // TODO: This line doesn't seem to be exercised at all in tests
+    options = options.device(r.device(device_idx).type());


Doesn't this still have the problem you described in the "janky" note? Index gets scraped off in the same way as before I think.

Yes, I think you're right. It's too bad I can't tell if it matters or not cuz it's not exercised by tests LOL

bhosmer · 2021-03-19T05:22:04Z

-                   toString(dispatch_key), " (got ", toString(legacyExtractDispatchKey(other.key_set())), ")");
+Tensor new_with_tensor(c10::TensorOptions options, at::ScalarType scalar_type, const Tensor& other) {
+  TORCH_CHECK_TYPE(other.options().type_equal(options), "expected ",
+                   options, " (got ", other.options(), ")");


a) I'm not sure which thing isn't right here b) regardless, is it worth putting in a comment?

bhosmer · 2021-03-19T05:26:24Z

+// tensor should go to CUDA, so we have to get the information in somehow.
+//
+// This code is kind of jank.  For one, the dtype in options is silently ignored
+// by internal_new_from_data.  Also, in classic janky code style, it used to


This is a choice we're making in adopting the new scheme though, right? The relevant subset of stuff carried by an options argument will be context-dependent.

Not... really? Previously the code accepted DispatchKey, but ignored everything besides the Device associated with that DispatchKey. Options is a bigger type, but the mismatch in size was always there in the code.

ezyang · 2021-03-19T13:52:04Z

Gonna see if I can land this stack ready to go (if it lands, I'll fix these in follow up), otherwise I'll address these comments

facebook-github-bot · 2021-03-19T16:10:48Z

@ezyang merged this pull request in e0aebe2.

…ytorch#54034) Summary: Pull Request resolved: pytorch#54034 Fixes pytorch#53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also pytorch#53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about *construction* of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to pytorch#53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D27109509 Pulled By: ezyang fbshipit-source-id: 91d16cfbc390127770362ac04fb43f7e070077e9

ezyang requested review from albanD and soulitzer as code owners March 15, 2021 22:32

This was referenced Mar 15, 2021

Also propagate storage_access_should_throw_ when copying tensor metadata #53816

Closed

Compute type_equal() without reference to backend() #53823

Closed

Make storage access error NotImplementedError #53972

Closed

facebook-github-bot added the cla signed label Mar 15, 2021

This was referenced Mar 15, 2021

Delete defunct ComplexCPU/ComplexCUDA dispatch keys #54013

Closed

Delete denseTypeIdWithDefault and toDense #54016

Closed

ezyang commented Mar 16, 2021

View reviewed changes

ezyang mentioned this pull request Mar 16, 2021

Sparse constructors don't work with cuda:1 values and non-CUDA indices #54075

Closed

ezyang requested a review from smessmer March 16, 2021 18:17

ezyang mentioned this pull request Mar 16, 2021

Add Tensor::is_cpu, genericize TensorIterator #54079

Closed

ezyang requested a review from bhosmer March 16, 2021 18:44

This was referenced Mar 17, 2021

Fix bug in assertRaises NotImplemented handling when no exception is thrown #54126

Closed

Fix inaccurate dispatch tables #54127

Closed

Restore storage on meta tensors; increase meta coverage #53973

Closed

ezyang added 3 commits March 17, 2021 07:55

ezyang requested a review from anjali411 March 18, 2021 17:10

bhosmer approved these changes Mar 19, 2021

View reviewed changes

facebook-github-bot closed this in e0aebe2 Mar 19, 2021

facebook-github-bot added the Merged label Mar 19, 2021

facebook-github-bot deleted the gh/ezyang/949/head branch March 23, 2021 14:16

Conversation

ezyang commented Mar 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

facebook-github-bot commented Mar 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

💊 CI failures summary and remediations

🕵️ 1 new failure recognized by patterns

clang-tidy (1/1)

Uh oh!

ezyang commented Mar 16, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ezyang commented Mar 16, 2021

Uh oh!

ezyang commented Mar 18, 2021

Uh oh!

bhosmer left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ezyang commented Mar 19, 2021

Uh oh!

facebook-github-bot commented Mar 19, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ezyang commented Mar 15, 2021 •

edited

Loading

facebook-github-bot commented Mar 15, 2021 •

edited

Loading