Add memory format support to typecasting shortcuts `byte`, `char`, `double`, `bool`, `half`, `int`, `long`, `short`, `float`, `bfloat16` #27228
Conversation
Add memory format support to typecasting shortcuts `byte`, `char`, `double`, `bool`, `half`, `int`, `long`, `short`, `float`, `bfloat16`. Adds a memory_format keyword argument (positional for C++). 'Preserve' behavior now follows these rules: 1) If the tensor is non-overlapping and dense, the output tensor will have the same strides as the input tensor. 2) If not (1) and the tensor is stored in the channels last format, the output tensor will have the channels last format. 3) The output tensor will be contiguous in all other cases. --- A dense tensor is a tensor that stores its values in a contiguous block of memory. A non-overlapping tensor is a tensor in which each element occupies its own distinct memory location. [ghstack-poisoned]
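The 'preserve' decision described above hinges on the "non-overlapping and dense" check. A minimal sketch of that check in plain Python (note: `is_non_overlapping_and_dense` here is an illustrative re-implementation for this discussion, not PyTorch's internal function):

```python
def is_non_overlapping_and_dense(sizes, strides):
    """Illustrative check: True when the tensor's elements occupy one
    packed block of memory and no element aliases another."""
    # Walk dims from smallest stride outward; each stride must equal the
    # product of the sizes of all inner (already visited) dims.
    dims = sorted(range(len(sizes)), key=lambda d: strides[d])
    expected = 1
    for d in dims:
        if sizes[d] == 0:
            return True  # empty tensors are trivially dense
        if sizes[d] != 1:
            if strides[d] != expected:
                return False
            expected *= sizes[d]
    return True

# Rule 1 applies when this returns True (e.g. contiguous or transposed
# tensors keep their input strides); rules 2 and 3 are the channels-last
# and contiguous fallbacks.
```

For example, a contiguous (2, 3, 4) tensor with strides (12, 4, 1) is dense, as is its transpose with strides (1, 3); a strided view with sizes (2, 2) and strides (4, 2) has no overlapping elements but leaves gaps, so it is not dense and 'preserve' falls back to rules 2/3.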
@ailzhang breaks XLA
static PyObject * THPVariable_char(PyObject* self, PyObject* args) {
  return THPVariable_to_type(self, ScalarType::Char);
static PyObject * THPVariable_to_type_char(PyObject* self, PyObject* args, PyObject* kwargs) {
We should keep the name THPVariable_char. Most, if not all, autogenerated functions follow the naming format THPVariable_{api_name}; I also rely on this to search through the code base (though I'm not sure if anyone else does).
static PyObject * THPVariable_to_type_float(PyObject* self, PyObject* args, PyObject* kwargs) {
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "float(*, MemoryFormat? memory_format=None)"
Are there JIT tests that check that this works without MemoryFormat? We should also add the following:
- If the JIT supports float(MemoryFormat), add a test to check this.
- If the JIT does not support float(MemoryFormat), add a test and file an issue.
  return THPVariable_Wrap(dispatch_to(self_, scalarType, false, false, optional_memory_format));
  END_HANDLE_TH_ERRORS
}
static PyObject * THPVariable_byte(PyObject* self, PyObject* args) {
I am a bit skeptical that these should be manually bound. They seem like fairly simple functions that the code generator should be able to understand...
Eh, if we made byte, char, ..., into native functions, then they'd call at::to. This would do two dispatches, once for byte and once for to, which could be annoying. I don't know how much time dispatches take nowadays, or whether it is important that we micro-manage performance of these functions.
If we care about the performance, we can call native::to and specify the derivative for each of byte, char, ...
I don't think the dispatching really matters; casting isn't super common, and these functions don't even support non_blocking, so you are going to pay for the copy immediately anyway.
def test_memory_format_type_shortcuts(self, device):
    def input_generator_fn(device):
        return torch.randn((10, 3, 32, 32), device=device, dtype=torch.float32).clamp(0, 1).round().contiguous(memory_format=torch.channels_last)
Does the tensor really need to be this big in the test?
Not really, will make it smaller.
ezyang left a comment
I concur with Richard; the implications for the JIT must be checked, as you have manually edited the Python bindings. However, I do think that this can come in a later PR.
Add memory format support to typecasting shortcuts `byte`,`char`,`double`,`bool`,`half`,`int`,`long`,`short`,`float`,`bfloat16` (#27228)
Summary: Pull Request resolved: #27228
Test Plan: Imported from OSS
Differential Revision: D17980315
Pulled By: VitalyFedyunin
fbshipit-source-id: fd5615621bc4968aa4ef2a26430c492c552ed671
@VitalyFedyunin merged this pull request in 15df371.
Stack from ghstack:
- #28076 Kill `operator==` of TensorOptions as confusing one
- #27979 Add memory format support to `resize_as_` operator
- #27890 Add memory format support to `randn_like` operator
- #27889 Add memory format support to `randint_like` operator
- #27562 Add memory format support to `zeros_like` operator
- #27561 Add memory format support to `rand_like` operator
- #27270 Add memory format support to `ones_like` operator
- #27262 Add memory format support to `full_like` operator
- #27228 Add memory format support to typecasting shortcuts `byte`,`char`,`double`,`bool`,`half`,`int`,`long`,`short`,`float`,`bfloat16` (this PR)
- #27223 Add memory format support to `cpu` and `cuda` operators

Adds a memory_format keyword argument (positional for C++).
'Preserve' behavior now follows these rules:
1) If the tensor is non-overlapping and dense, the output tensor will have the same strides as the input tensor.
2) If not (1) and the tensor is stored in the channels last format, the output tensor will have the channels last format.
3) The output tensor will be contiguous in all other cases.
A dense tensor is a tensor that stores its values in a contiguous block of memory.
A non-overlapping tensor is a tensor in which each element occupies its own distinct memory location.
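For rule 2, "channels last" for a 4-d NCHW tensor means NHWC ordering in memory. A minimal sketch of the corresponding stride computation (an illustrative helper for this discussion, not a PyTorch API):

```python
def channels_last_strides(sizes):
    """Strides of an NCHW-sized tensor laid out NHWC in memory
    (what torch.channels_last describes)."""
    n, c, h, w = sizes
    # The channel dim is innermost (stride 1); W, H, N wrap around it.
    return (h * w * c, 1, w * c, c)
```

For instance, the (10, 3, 32, 32) test input above would carry strides (3072, 1, 96, 3) under channels last, rather than the contiguous (3072, 1024, 32, 1).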
Differential Revision: D17980315