Adding quantized::linear function for pytorch mobile in c10#26135
supriyar wants to merge 13 commits into gh/supriyar/14/base
Summary:
This change adds support for calling QNNPACK through the refactored API for Linear (fully connected) operators.
It also includes CMake changes that enable building and using pytorch_qnnpack inside aten.
I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it causes kernels from third_party/QNNPACK to be picked up at runtime, because the function names are the same.
Test Plan:
python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack
ghstack-source-id: 55e37d5
Pull Request resolved: #26135
  option(USE_OPENMP "Use OpenMP for parallel code" ON)
  option(USE_PROF "Use profiling" OFF)
- option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" ON)
+ option(USE_QNNPACK "Use QNNPACK (quantized 8-bit operators)" OFF)
Can we set it to OFF on line 300 instead?
I am currently testing QNNPACK on PyTorch server builds (x86) too, using the Python tests. If I turn off USE_QNNPACK only for mobile, I won't be able to test it on x86.
I think the main issue is that we didn't rename the kernels when we forked QNNPACK, so at runtime it picks up the kernels from third_party instead.
Hm... we'd need to fix it (rename the kernels), otherwise buck build will be broken too
Yeah Ashkan has a PR out for this. #26238
}
#endif

inline uint8_t QuantizeUint8(float scale, int32_t zero_point, float value) {
btw, this seems to be similar to fbgemm implementation used in quantize_val. Can they be the same?
Looks similar, but that function calls fbgemm::Quantize. I wanted to avoid calling fbgemm functions from the PyTorch mobile build because I don't think they will be built on mobile.
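For readers following the thread, the affine quantization that a helper like QuantizeUint8 usually performs can be sketched in Python. This is an assumption about the general scheme, q = clamp(round(value / scale) + zero_point, 0, 255), and not the body of the function in this PR, which is not shown in the diff excerpt. The name quantize_uint8 is hypothetical.

```python
def quantize_uint8(scale: float, zero_point: int, value: float) -> int:
    """Affine-quantize a float to the unsigned 8-bit range (illustrative sketch)."""
    qmin, qmax = 0, 255
    # Python's round() rounds ties to even; the real C++ helper may round differently.
    q = round(value / scale) + zero_point
    # Saturate to the representable uint8 range instead of wrapping around.
    return max(qmin, min(qmax, q))


# A value of 0.0 maps to the zero point; large magnitudes saturate.
zero_q = quantize_uint8(0.1, 128, 0.0)
high_q = quantize_uint8(0.1, 128, 100.0)
low_q = quantize_uint8(0.1, 128, -100.0)
```

A dependency-free helper of this shape needs nothing from fbgemm, which is the property the author wants for the mobile build.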
       use_channelwise=st.booleans())
def test_qlinear(self, batch_size, input_channels, output_channels,
                 use_bias, use_relu, use_multi_dim_input, use_channelwise):
    torch.backends.quantized.engine = torch.fbgemm
same comment that you'd need a context manager probably
I think it'd be even better to lift it into the test's base class to make sure that we run tests against both engines without duplicating the tests
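One way to read the two suggestions above is a context manager that sets the engine for the duration of a test and restores the previous value afterwards, plus a loop that runs the same test body once per engine. The sketch below is only an illustration: override_attr, run_against_engines, and the quantized_backend stand-in object are hypothetical names, and plain strings stand in for the real engine values so that the example runs without importing torch.

```python
import contextlib
import types


@contextlib.contextmanager
def override_attr(obj, name, value):
    """Temporarily set obj.<name> to value and restore the old value on exit."""
    previous = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        # Restore even if the test body raises, so later tests are unaffected.
        setattr(obj, name, previous)


# Hypothetical stand-in for torch.backends.quantized (assumption, not the real module).
quantized_backend = types.SimpleNamespace(engine="fbgemm")


def run_against_engines(test_fn, engines=("fbgemm", "qnnpack")):
    """Run test_fn once per engine and collect its return values."""
    results = []
    for engine in engines:
        with override_attr(quantized_backend, "engine", engine):
            results.append(test_fn())
    return results


# Each call sees the engine selected for that iteration.
observed = run_against_engines(lambda: quantized_backend.engine)
```

Placing a helper like run_against_engines in a shared test base class would let each test body be written once and exercised against every engine.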
dzhulgakov left a comment
It looks good, but I'd fix the build mode issue before landing
ghstack-source-id: 90b7304
Pull Request resolved: pytorch#26135
Differential Revision: [D17422588](https://our.internmc.facebook.com/intern/diff/D17422588)
This pull request has been merged in bb1efb3.
Pull Request resolved: pytorch/pytorch#26135
Imported from OSS
Differential Revision: D17434885
fbshipit-source-id: 084698026938f4529f61d12e86dfe82534ec73dd