Implement more support for per-channel quantization #26240
dzhulgakov wants to merge 7 commits into gh/dzhulgakov/1/base from
Conversation
```cpp
use_memory_format);
auto qscheme = self.qscheme();
if (qscheme == kPerTensorAffine) {
  return at::_empty_affine_quantized(self.sizes(), options,
```
we have different ordering of arguments here... maybe we can have:

```cpp
at::_empty_affine_quantized(q_scale, q_zero_point, sizes, options, use_memory_format);
```

and

```cpp
at::_empty_per_channel_affine_quantized_like(q_per_channel_scales, q_per_channel_zero_points, q_per_channel_axis, sizes, options, use_memory_format);
```

But we should do that in a separate PR: @dskhudia, could you change the order for _empty_per_channel_affine_quantized_like?
oh, q_per_channel_axis is added in this PR; then @dzhulgakov, could you change the order as well?
looks like this is addressed in later PRs
So the order and name for _empty_per_channel_affine_quantized_like (note the _like!) were weird as a workaround for a codegen peculiarity; #26243 fixes it.
The order is still not the same: TensorOptions and the additional arguments are in the wrong order. It's C++-only, since the args are marked as kwargs in Python, but we should fix it anyway. Since it's an intrusive change, I'll send a separate PR for it after these ones land.
In particular adds support for empty/empty_like which is needed for memory layouts to work. [ghstack-poisoned]
```python
# change memory format
qlast = qr.contiguous(memory_format=torch.channels_last)
self.assertEqual(qr.stride(), list(reversed(sorted(qr.stride()))))
self.assertNotEqual(qlast.stride(), list(reversed(sorted(qlast.stride()))))
```
what is this for? to check the contiguous operation actually happened?
yeah, just making sure. I can remove it, but it doesn't hurt
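For context, here is why the sorted-stride check verifies that the layout change actually happened (a minimal standalone sketch, not code from this PR): a contiguous NCHW tensor has strictly decreasing strides, and channels_last breaks that ordering by moving the channel stride to the end.

```python
import torch

# Contiguous NCHW: strides decrease monotonically.
x = torch.empty(2, 3, 4, 5)
assert list(x.stride()) == sorted(x.stride(), reverse=True)   # (60, 20, 5, 1)

# channels_last rearranges physical memory to NHWC, so the stride
# sequence is no longer decreasing.
y = x.contiguous(memory_format=torch.channels_last)
assert list(y.stride()) != sorted(y.stride(), reverse=True)   # (60, 1, 15, 3)
```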
```python
self.assertTrue(torch.equal(qr.int_repr(), qlast.int_repr()))
self.assertTrue(qr.q_scale() == qlast.q_scale())
self.assertTrue(qr.q_zero_point() == qlast.q_zero_point())
self.assertTrue(np.array_equal(qlast.dequantize().numpy(), qr.dequantize().numpy()))
```
I think we have self.assertEqual on PyTorch tensors as well, so

```python
self.assertEqual(qlast.dequantize(), qr.dequantize())
```

should work.
ah, good point! when I tried it I typed assertEquals instead of assertEqual and we don't override that one :-P
```python
qlast = qr.contiguous(memory_format=torch.channels_last)
self.assertEqual(qr.stride(), list(reversed(sorted(qr.stride()))))
self.assertNotEqual(qlast.stride(), list(reversed(sorted(qlast.stride()))))
self.assertTrue(torch.equal(qr.int_repr(), qlast.int_repr()))
```
FYI: int_repr() will do contiguous right now, but I think it should preserve the strides, will fix in #25429 when I have some time.
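For reference, int_repr() exposes the underlying integer values of a quantized tensor; a minimal sketch against the current API (not code from this PR):

```python
import torch

# Quantize a float tensor, then look at the raw stored values.
q = torch.quantize_per_tensor(torch.rand(2, 2), scale=0.1, zero_point=0,
                              dtype=torch.quint8)
print(q.int_repr().dtype)  # torch.uint8: the raw quantized values
```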
```cpp
auto result = detail::make_tensor<QTensorImpl>(Storage(self.storage()), self.type_set(), get_qtensorimpl(self)->quantizer());
auto quantizer = get_qtensorimpl(self)->quantizer();
TORCH_CHECK(
    quantizer->qscheme() == QScheme::PER_TENSOR_SYMMETRIC ||
```
We don't have QScheme::PER_TENSOR_SYMMETRIC in the backend; please remove.
See: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/quantized/Quantizer.h#L121
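To illustrate what the backend actually records on a quantized tensor, a hedged sketch against the present-day Python API (not code from this PR):

```python
import torch

# Per-tensor quantization: the resulting tensor reports per_tensor_affine,
# the scheme the backend implements.
x = torch.rand(4)
xq = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
assert xq.qscheme() == torch.per_tensor_affine
```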
```cpp
    self.q_scale(),
    self.q_zero_point(),
    use_memory_format);
} else if (qscheme == kPerChannelAffine) {
```
Do we also need to support kPerChannelSymmetric and kPerTensorSymmetric here?
again, they are not present in the backend
```cpp
auto result = detail::make_tensor<QTensorImpl>(Storage(self.storage()), self.type_set(), get_qtensorimpl(self)->quantizer());
auto quantizer = get_qtensorimpl(self)->quantizer();
TORCH_CHECK(
    quantizer->qscheme() == QScheme::PER_TENSOR_AFFINE,
```
Similar comments as above for QScheme::PER_TENSOR_SYMMETRIC.
```cpp
  return get_qtensorimpl(self)->quantizer().get();
}

IntArrayRef q_per_channel_axis_quant(const Tensor& self) {
  auto quantizer = get_qtensorimpl(self)->quantizer();
  TORCH_CHECK(quantizer->qscheme() == kPerChannelAffine);
```
Support for PerChannelSymmetric?
```cpp
Tensor quantized_clone(const Tensor& self) {
  // TODO: add per channel support
  TORCH_INTERNAL_ASSERT(
```
Similar comments as above
```python
q = torch._empty_per_channel_affine_quantized_like(scales, zero_points, [numel], [ch_axis], dtype=torch.quint8)
self.assertEqual(scales, q.q_per_channel_scales())
self.assertEqual(zero_points, q.q_per_channel_zero_points())
self.assertEqual([ch_axis], q.q_per_channel_axis())
```
Both per_channel and qtensor creation tests should sweep over symmetric and affine quant.
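A hedged sketch of such a sweep, using the present-day torch.quantize_per_channel name (the PR-era API was torch.quantize_linear_per_channel); symmetric schemes would additionally need backend support, per the replies above:

```python
import torch

scales = torch.tensor([0.5, 0.25])
zero_points = torch.tensor([0, 0], dtype=torch.long)

# Sweep the creation test over dtypes (and, once the backend supports it,
# over affine vs. symmetric schemes as well).
for dtype in (torch.quint8, torch.qint8):
    r = torch.rand(2, 3) * 4 - 2
    q = torch.quantize_per_channel(r, scales, zero_points, axis=0, dtype=dtype)
    assert q.q_per_channel_axis() == 0
```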
```python
self.assertEqual(qlast.dequantize(), qr.dequantize())
```

```python
def test_qtensor_per_channel_permute(self):
    r = torch.rand(20, 10, 2, 2, dtype=torch.float) * 4 - 2
```
Can we allow permutes that do not change the axis that is quantized? Would this work?

```python
# x is in NCHW
x_q = torch.quantize_linear_per_channel(..., axis=[0])
# x_q is also in NCHW
x_nhwc = x_q.permute([0, 2, 3, 1])
```
as I mentioned in another place, we could, but I think we can add it later in a separate diff. And memory format works today already so it should be sufficient
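For illustration, a minimal sketch of the memory-format path that works today (hedged: this uses the modern torch.quantize_per_channel name and is not code from this PR):

```python
import torch

# Per-channel quantize along the channel axis of an NCHW tensor.
x = torch.rand(2, 3, 4, 4)
scales = torch.tensor([0.1, 0.2, 0.3])
zero_points = torch.tensor([0, 0, 0], dtype=torch.long)
xq = torch.quantize_per_channel(x, scales, zero_points, axis=1, dtype=torch.quint8)

# channels_last only rearranges strides; the logical dimension order is
# unchanged, so axis=1 still refers to channels.
xq_nhwc = xq.contiguous(memory_format=torch.channels_last)
assert xq_nhwc.q_per_channel_axis() == 1
```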
```
q_per_channel_zero_points() -> tuple of ints

Given a Tensor quantized by linear (affine) per-channel quantization,
returns a indices of dimensions on which per-channel quantization is applied.
```
Do we really support a tuple for axis? Why not restrict axis to be scalar/list of length 1 and clarify this in the API.
Also NIT: the doc string should say q_per_channel_axis()
How likely are we to have multidim in the future? Having both APIs would be confusing.
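For reference, the API as it eventually settled returns a plain int for the axis; a hedged sketch against current PyTorch:

```python
import torch

q = torch.quantize_per_channel(torch.rand(3, 2),
                               torch.tensor([0.1, 0.2, 0.3]),
                               torch.tensor([0, 0, 0], dtype=torch.long),
                               axis=0, dtype=torch.quint8)
print(q.q_per_channel_axis())  # 0, a single int rather than a tuple
```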
raghuramank100 left a comment:
Please see comments, thanks
In particular adds support for empty/empty_like which is needed for memory layouts to work. Differential Revision: [D17443220](https://our.internmc.facebook.com/intern/diff/D17443220) [ghstack-poisoned]
Summary: Pull Request resolved: pytorch/pytorch#26240
In particular adds support for empty/empty_like which is needed for memory layouts to work.
Test Plan: Imported from OSS
Differential Revision: D17443220
Pulled By: dzhulgakov
fbshipit-source-id: 9c9e25981999c0edaf40be104a5741e9c62a1333
@dzhulgakov merged this pull request in 8c1354c.
```cpp
Quantizer* quantizer(const Tensor& self) {
  return get_qtensorimpl(self)->quantizer().get();
}

IntArrayRef q_per_channel_axis_quant(const Tensor& self) {
```
This is the very first occurrence of an IntArrayRef at the return site of a native function, and I don't like it. The returned ref is non-owning, so what is its lifetime tied to? This is not documented anywhere. (In fact, the lifetime is tied to the quantizer, not the tensor itself, so the semantics don't even match sizes() and strides(), whose lifetime is tied to the tensor.) Furthermore, this would have to be supported properly in the JIT, as all values on the stack must be owning. C.f. the implementation of sizes():

```cpp
Operator(
    "aten::size(Tensor self) -> int[]",
    [](Stack& stack) {
      RECORD_FUNCTION("size", last(stack, 1));
      auto t = std::move(pop(stack)).toTensor();
      pack(stack, t.sizes().vec());
      return 0;
    },
    aliasAnalysisFromSchema()),
```

I suppose, from first principles, the ability to return a primitive array from a function is desirable functionality, so I think we should think carefully about how we want to design this. As a first proposal, the return type of this function should be a std::vector<>, not an IntArrayRef. As a second proposal, this shouldn't be a native function, and should just be hard-coded the same way sizes/strides are.