Serialization for per channel qtensor #26339
dzhulgakov wants to merge 10 commits into gh/dzhulgakov/5/base
Conversation
Serializes per-channel tensors in both torch.serialization and jit. Since we didn't bind Quantizer properly yet, I chose to save a tuple representing the quantizer settings. To avoid recursive tensor serialization calls, I'm using a tuple instead of a tensor to store scales and zero points.

@driazati - please check the serialization logic. Is there a good test that compares that JIT serialization and python serialization are equivalent? (I haven't tested it yet)
driazati
left a comment
I'm generally apprehensive about the serialization changes; there's a lot of special casing / deeply nested paths in the tensor serialization, and it's going to be hard to ensure there are no forward/backward compatibility problems.
pushInt(tensor.q_zero_point());
push<PickleOpCode>(PickleOpCode::MARK);
pushGlobal("torch", toString(tensor.qscheme()));
// tuple of (qscheme, scale, zp) or (qscheme, scales, zps, axis)
Why are there two different schemas here? Is it possible to always do it as (qscheme, scales, zps, axis)? Specializing is going to make the (de)serialization logic much more complicated
It depends on the quantizer (which can be per-tensor or per-channel) - the per-tensor one doesn't have an axis. I could make axis None in the other case, but I'm not sure that makes it cleaner. The specialization still has to be there, as scale vs. scales is an int vs. a tuple of ints.
case at::kPerChannelAffine: {
  const auto* quantizer = static_cast<at::PerChannelAffineQuantizer*>(
      tensor.quantizer().get());
  push<PickleOpCode>(PickleOpCode::MARK);
You can shorten all of these by creating a c10::tuple and using pushIValue
axis.push_back(x.toInt());
}
result = _empty_per_channel_affine_quantized(
    {0}, scales, zero_points, axis, storage_tensor.options());
Why are all the parameters for _empty_per_channel_affine_quantized Tensors instead of a list?
Because scales is a float array - we don't have a float array type, and we don't have plans to support one.
Also, very theoretically, we might have multi-axis quantization, in which case scales would be a multidimensional tensor. It's good to have support for that in the API, but it's not likely to appear soon. (Note that in serialization it's still fine to fold the scales into a tuple, as the shape is deducible from the original tensor.)
As for tests that compare, you can try to save with the pickler (e.g. use Line 14797 in 06c69ad).
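A minimal round-trip check on the settings tuple can be done with Python's own pickle module. This only illustrates the round-trip idea; the JIT pickler is a separate C++ implementation, so equivalence between the two still needs a dedicated test.

```python
import pickle

# Round-trip the per-channel quantizer-settings tuple through pickle.
# The tuple layout matches the (qscheme, scales, zps, axis) schema
# discussed above; values are illustrative.
per_channel_settings = ("per_channel_affine", (0.1, 0.2), (0, 1), 1)
restored = pickle.loads(pickle.dumps(per_channel_settings))
```

A real equivalence test would additionally load the JIT-pickled bytes and compare against the Python-pickled result.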
@@ -722,11 +722,21 @@ def assertTensorsEqual(a, b):
    elif x.is_quantized and y.is_quantized:
nit: we can remove and y.is_quantized here
zero_points = torch.round(torch.rand(10) * 20 + 1).to(torch.long)
# quint32 is not supported yet
for dtype in [torch.quint8, torch.qint8]:
    qr = torch.quantize_linear_per_channel(r, scales, zero_points, [1], dtype)
What is the behavior for slicing (view) on a per-channel quantized tensor right now?
Any slicing / stride-setting operation today basically errors out with a nice "can't do it" message. We could support a subset of these operations, but I doubt it's worth it at this point.
Is this true for permute also? We could conceptually support it, but again it doesn't seem like it is really needed.
Yes, permute is implemented as stride setting, so it gives the same error.
jerryzh168
left a comment
Checked the code except jit/pickler.cpp.
# See Note [Don't serialize hooks]
torch.utils.hooks.warn_if_has_hooks(self)
if self.is_quantized:
    if self.qscheme() == torch.per_tensor_affine:
We still support torch.per_tensor_symmetric and torch.per_channel_symmetric as valid qschemes. They are currently used only by the observer. We should add checks for these too whenever we have a qscheme-based check.
This code will never be hit by observers - you can't create a tensor with that quantizer.
result = at::_empty_affine_quantized(
    {0}, storage_tensor.options(), q_scale, q_zero_point);
} break;
case at::kPerChannelAffine: {
We should also handle kPerChannelSymmetric and kPerTensorSymmetric here
Again, we have only two types of quantizers present today.
raghuramank100
left a comment
Please add support for symmetric qschemes.
@dzhulgakov merged this pull request in ebc2365.
Summary: Pull Request resolved: pytorch#26339

Serializes per-channel tensors in both torch.serialization and jit. Since we didn't bind Quantizer properly yet, I chose to save a tuple representing the quantizer settings. To avoid recursive tensor serialization calls, I'm using a tuple instead of a tensor to store scales and zero points.

Test Plan: Imported from OSS
Differential Revision: D17443222
Pulled By: dzhulgakov
fbshipit-source-id: a34758de1ffd2ec1cdc5355f5baf95284a4ccf4b