Adding quantized::linear function for pytorch mobile in c10 by supriyar · Pull Request #26135 · pytorch/pytorch

supriyar · 2019-09-12T22:32:31Z

Stack from ghstack:

#26335 Unify Quantization APIs for add, pool and relu
#26307 Changes to support int8 weight and fp32 bias in QNNPACK
#26211 Add support to call unpack for pytorch mobile quantized FC and Conv
#26152 Adding quantized::conv2d function for pytorch mobile in c10
#26135 Adding quantized::linear function for pytorch mobile in c10

Summary:
This change adds support for calling QNNPACK through the refactored API for Linear (fully connected) operators.
It also includes CMake changes that enable building and using pytorch_qnnpack inside ATen.
I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it causes kernels to be picked from third_party/QNNPACK at runtime, because the function names are the same.

Test Plan:
python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack
Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D17434885

ljk53 · 2019-09-13T15:03:24Z

 option(USE_OPENMP "Use OpenMP for parallel code" ON)
 option(USE_PROF "Use profiling" OFF)
-option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" ON)
+option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" OFF)


Can we set it to OFF on line 300 instead?

I am currently testing QNNPACK on pytorch server builds too (x86) using the python tests. If I turn off USE_QNNPACK only for mobile then I won't be able to test it on x86.
I think the main issue is that we didn't rename the kernels when we forked QNNPACK so at runtime it picks up kernels from third_party instead.

Hm... we'd need to fix it (rename the kernels), otherwise buck build will be broken too

Yeah Ashkan has a PR out for this. #26238

dzhulgakov · 2019-09-14T07:00:16Z

 option(USE_OPENMP "Use OpenMP for parallel code" ON)
 option(USE_PROF "Use profiling" OFF)
-option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" ON)
+option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" OFF)


Hm... we'd need to fix it (rename the kernels), otherwise buck build will be broken too

dzhulgakov · 2019-09-14T07:06:39Z

+}
+#endif

+inline uint8_t QuantizeUint8(float scale, int32_t zero_point, float value) {


btw, this seems to be similar to fbgemm implementation used in quantize_val. Can they be the same?

Looks similar, but that function calls fbgemm::Quantize function. I wanted to avoid calling fbgemm functions from pytorch mobile build because I don't think they will be built on mobile.
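For readers unfamiliar with what such a helper does: the diff above shows only the signature of `QuantizeUint8`, not its body. The following is a minimal Python sketch of standard affine quantization to an unsigned 8-bit value, taking the same three parameters. The names `quantize_uint8` and `dequantize_uint8`, the round-half-to-even rounding, and the `[0, 255]` clamp are assumptions for illustration and are not taken from the PR.

```python
def quantize_uint8(scale: float, zero_point: int, value: float) -> int:
    """Affine-quantize one float to an unsigned 8-bit integer (illustrative only)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Map the real value onto the integer grid, then shift by the zero point.
    # Python's round() rounds halves to the nearest even integer (assumed mode).
    q = round(value / scale) + zero_point
    # Saturate to the representable uint8 range.
    return max(0, min(255, q))


def dequantize_uint8(scale: float, zero_point: int, q: int) -> float:
    """Inverse mapping: recover the approximate real value from a quantized one."""
    return scale * (q - zero_point)
```

Values outside the representable range saturate rather than wrap, which is the usual behavior expected of a quantize helper like this.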

dzhulgakov · 2019-09-14T07:07:37Z

        use_channelwise=st.booleans())
    def test_qlinear(self, batch_size, input_channels, output_channels,
                     use_bias, use_relu, use_multi_dim_input, use_channelwise):
+        torch.backends.quantized.engine = torch.fbgemm


same comment that you'd need a context manager probably

I think it'd be even better to lift it into the test's base class to make sure that we run tests against both engines without duplicating the tests
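The two suggestions in this thread (a context manager, and running the same test against both engines) could be sketched roughly as follows. This is not the code that landed. `override_engine`, `run_on_engines`, and `fake_backend` are hypothetical names, and `fake_backend` is a stand-in object so the sketch runs without PyTorch. In the real test the attribute being set is `torch.backends.quantized.engine`.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Hypothetical stand-in for torch.backends.quantized, so this runs without PyTorch.
fake_backend = SimpleNamespace(engine="fbgemm")


@contextmanager
def override_engine(backend, engine):
    """Temporarily set backend.engine, restoring the previous value on exit."""
    previous = backend.engine
    backend.engine = engine
    try:
        yield
    finally:
        # Restore even if the test body raised, so later tests are unaffected.
        backend.engine = previous


def run_on_engines(backend, engines, check):
    """Run the same check once per engine and return the engines it ran under."""
    seen = []
    for engine in engines:
        with override_engine(backend, engine):
            check()
            seen.append(backend.engine)
    return seen
```

Restoring the previous value in a `finally` block is what prevents the global setting from leaking into later tests, which is the problem with a bare assignment at the top of a test method.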

dzhulgakov

It looks good, but I'd fix the build mode issue before landing

dzhulgakov · 2019-09-15T20:04:18Z

        use_channelwise=st.booleans())
    def test_qlinear(self, batch_size, input_channels, output_channels,
                     use_bias, use_relu, use_multi_dim_input, use_channelwise):
+        torch.backends.quantized.engine = torch.fbgemm


I think it'd be even better to lift it into the test's base class to make sure that we run tests against both engines without duplicating the tests

facebook-github-bot · 2019-09-17T23:42:33Z

This pull request has been merged in bb1efb3.

Summary: Pull Request resolved: pytorch/pytorch#26135
This change adds support for calling QNNPACK through the refactored API for Linear (fully connected) operators.
It also includes CMake changes that enable building and using pytorch_qnnpack inside ATen.
I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it causes kernels to be picked from third_party/QNNPACK at runtime, because the function names are the same.
Test Plan: python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack
Imported from OSS
Differential Revision: D17434885
fbshipit-source-id: 084698026938f4529f61d12e86dfe82534ec73dd

pytorchbot added the module: build, module: operators, and oncall: quantization labels Sep 12, 2019

supriyar mentioned this pull request Sep 12, 2019

[WIP] Integrate forked QNNPACK into mobile PyTorch builds. #26134

Closed

supriyar requested review from AshkanAliabadi, ljk53 and raghuramank100 September 12, 2019 22:52

supriyar mentioned this pull request Sep 13, 2019

Adding quantized::conv2d function for pytorch mobile in c10 #26152

Closed

ljk53 reviewed Sep 13, 2019

supriyar requested a review from ljk53 September 13, 2019 19:18

supriyar mentioned this pull request Sep 13, 2019

Add support to call unpack for pytorch mobile quantized FC and Conv #26211

Closed

dzhulgakov reviewed Sep 14, 2019

supriyar requested a review from dzhulgakov September 15, 2019 00:31

dzhulgakov approved these changes Sep 15, 2019

pytorchbot added the module: internals label Sep 16, 2019

supriyar mentioned this pull request Sep 16, 2019

Changes to support int8 weight and fp32 bias in QNNPACK #26307

Closed

supriyar mentioned this pull request Sep 17, 2019

Unify Quantization APIs for add, pool and relu #26335

Closed

supriyar added 5 commits September 16, 2019 22:07

facebook-github-bot closed this in bb1efb3 Sep 17, 2019

facebook-github-bot added the merged label Sep 17, 2019

facebook-github-bot deleted the gh/supriyar/14/head branch October 28, 2019 22:20

mruberry added the Merged label Oct 28, 2020
