Integrate forked QNNPACK into mobile PyTorch builds. #25844
AshkanAliabadi wants to merge 2 commits into pytorch:master from AshkanAliabadi:qnnpack
Conversation
Important change, please double check. Enable caffe2/quantized only when BUILD_CAFFE2_MOBILE is set.
operators/quantized is only built for server or libcaffe2 mobile (line 67) - but in both cases BUILD_CAFFE2_MOBILE is true (right? it defaults to true and we don't unset it for non-mobile builds). The additional condition should be a no-op - so what case are you trying to protect against here?
In general, I saw you already added the source-code macro USE_PYTORCH_QNNPACK; maybe we can do the same thing in cmake. That is, instead of changing the cmake USE_QNNPACK's behavior with the BUILD_CAFFE2_MOBILE macro, I'm thinking of a more decoupled setup:
- don't change the cmake USE_QNNPACK logic;
- add a USE_PYTORCH_QNNPACK cmake macro for the cloned pytorch_qnnpack. For now we can simply copy the USE_QNNPACK cmake logic - we should gradually refactor pytorch_qnnpack/CMakeLists to make it more natively integrated with aten;
- for the caffe2 mobile build, always set USE_PYTORCH_QNNPACK to false; USE_QNNPACK is still a user-facing macro;
- for the pytorch mobile build, always set USE_QNNPACK to false (as we don't build caffe2 ops), and always set USE_PYTORCH_QNNPACK to true - unless the cmake/Dependencies.cmake logic determines it's not supported (check the places where USE_QNNPACK is currently turned off).
Maybe we can call it INTERN_USE_PYTORCH_QNNPACK to keep it a private flag, as eventually it will be part of aten natively. I think we can always build it unless the architecture is not supported by the qnnpack implementation. Does that sound reasonable? (A sketch follows.)
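For concreteness, a minimal sketch of the decoupled setup proposed above; the exact wiring is an assumption for illustration, not code from this PR:

```cmake
# Sketch only: decouple the two QNNPACK flags per the proposal above.
if(BUILD_CAFFE2_MOBILE)
  # caffe2 mobile build: never build the fork; USE_QNNPACK remains the
  # user-facing switch for the third-party QNNPACK.
  set(USE_PYTORCH_QNNPACK OFF)
else()
  # pytorch mobile build: caffe2 ops are not built, so skip the
  # third-party QNNPACK and always build the fork, unless
  # cmake/Dependencies.cmake later detects an unsupported architecture.
  set(USE_QNNPACK OFF)
  set(USE_PYTORCH_QNNPACK ON)
endif()
```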
I am a little unclear on the distinction between USE_QNNPACK and USE_PYTORCH_QNNPACK. It looks like for pytorch mobile they will both be set, but for caffe2 only USE_QNNPACK is set?
USE_QNNPACK is the only user-facing flag the user of the library needs to worry about at build time, so that we avoid exposing two flags at that level. It tells whether the user wants to build PyTorch / Caffe2 with QNNPACK at all. Beyond that, though, we internally need to distinguish the scenario where we build the third-party QNNPACK (for Caffe2) from the one where we build the fork in PyTorch; otherwise we end up with two versions of the library compiled in. That would not cause linker errors, because the symbols are prefixed with pytorch_ in the PyTorch version, but it would mean an unnecessary increase in library size.
One way to differentiate the two use cases is to treat USE_QNNPACK && !BUILD_CAFFE2_MOBILE as a PyTorch mobile build (and consequently use the forked version of QNNPACK), and USE_QNNPACK && BUILD_CAFFE2_MOBILE as a Caffe2 build (and build the third-party version of QNNPACK instead). As an aside, if USE_QNNPACK is OFF to begin with, neither version is used.
That brings us to USE_PYTORCH_QNNPACK, which is defined whenever USE_QNNPACK && !BUILD_CAFFE2_MOBILE is true, to avoid having to check the combined flag every time in the source files. If this is not a PyTorch build, that preprocessor symbol is simply not defined. Again, USE_QNNPACK alone will not cut it, because there is no way to differentiate three scenarios (no QNNPACK, QNNPACK for Caffe2, and QNNPACK for PyTorch) with a switch that can only hold two values. The other option is to write out USE_QNNPACK && !BUILD_CAFFE2_MOBILE everywhere, which I believe is more error-prone. (Sketch below.)
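A sketch of that derivation, assuming the dependency wiring lives in cmake/Dependencies.cmake (variable names other than the three flags are illustrative):

```cmake
# Sketch only: three scenarios from two flags.
if(USE_QNNPACK)
  if(BUILD_CAFFE2_MOBILE)
    # Caffe2 build: pull in the third-party QNNPACK as before.
    list(APPEND Caffe2_DEPENDENCY_LIBS qnnpack)
  else()
    # PyTorch build: link the forked copy (symbols prefixed pytorch_) and
    # define the dedicated preprocessor symbol so source files can test a
    # single flag instead of the combined condition.
    list(APPEND Caffe2_DEPENDENCY_LIBS pytorch_qnnpack)
    add_definitions(-DUSE_PYTORCH_QNNPACK)
  endif()
endif()
# If USE_QNNPACK is OFF to begin with, neither branch runs: no QNNPACK,
# and the preprocessor symbol stays undefined.
```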
Do we also set this macro for CI builds and runs in the jenkins setup? It may be getting set automatically, but just making sure.
ljk53
left a comment
Thanks for working on this! Please let me know whether the proposal of decoupling the cmake USE_PYTORCH_QNNPACK and USE_QNNPACK flags makes sense.
nit: shouldn't we use 2-space indent and avoid these changes?
As I proposed, the pytorch_qnnpack-related logic can be put after the qnnpack block, gated with the USE_PYTORCH_QNNPACK macro (see the sketch below).
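A sketch of that layout; the fork's source path and the binary directory are assumptions for illustration:

```cmake
# Sketch only: placed immediately after the existing QNNPACK block in
# cmake/Dependencies.cmake, gated on its own flag.
if(USE_PYTORCH_QNNPACK)
  add_subdirectory(
    "${PROJECT_SOURCE_DIR}/aten/src/ATen/native/quantized/cpu/qnnpack"  # assumed fork location
    "${CMAKE_BINARY_DIR}/pytorch_qnnpack")
  list(APPEND Caffe2_DEPENDENCY_LIBS pytorch_qnnpack)
endif()
```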
AshkanAliabadi
left a comment
@ljk53 Please check if I addressed your concerns properly. Thank you.
Adding a new user-facing option.
Can we give this a different description from USE_QNNPACK's? It may get confusing for the user.
When do we plan to set this new macro to ON? Do we know if the tests pass with it on? For the pytorch build (both mobile and server) we should ideally set USE_QNNPACK to false and USE_PYTORCH_QNNPACK to true. @ljk53 does that sound reasonable?
Good point regarding a better description. I had to set USE_PYTORCH_QNNPACK to true in a subsequent commit to make a bunch of failing tests pass. I left USE_QNNPACK ON, but I can change it if that's what we want.
Using a separate preprocessor symbol.
Adding a separate block.
facebook-github-bot
left a comment
@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
ljk53
left a comment
Looks good to me. Do you know why the imported phabricator diff doesn't have an xplat sync?
We can probably remove this and instead set USE_PYTORCH_QNNPACK on line #300 (inside the INTERN_BUILD_MOBILE AND NOT BUILD_CAFFE2_MOBILE block). pytorch_qnnpack will be merged into libtorch as the default quantization engine for mobile builds, so I feel we don't have to expose the option to users. (Rough sketch below.)
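Rough shape of that suggestion (the line number and the enclosing condition are quoted from the discussion, not verified against the tree):

```cmake
# Sketch only: inside the mobile-build block around line #300 of the
# top-level CMakeLists.txt.
if(INTERN_BUILD_MOBILE AND NOT BUILD_CAFFE2_MOBILE)
  set(USE_PYTORCH_QNNPACK ON)
endif()
```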
@ljk53 I think it would be nice to enable this macro for pytorch server builds too. It is possible to run QNNPACK on x86, and it would be useful for debugging, running the python tests, comparing with FBGEMM, etc.
@ljk53 No, I don't know why xplat doesn't sync. This happened with the previous PR as well, and I had to manually amend the commit locally just to force the pre-push hook to perform the copy. I'll upload another revision with the changes you requested soon.
@ljk53 I made the change you requested and noticed that tests/test_quantized.py now fails, because USE_PYTORCH_QNNPACK is only enabled for mobile builds while that test apparently is not gated. Hence QNNPACKRelu and family resolve to an empty implementation on non-mobile builds, which throws off the test. How do you want to move forward: disable the tests for non-mobile builds, or enable USE_PYTORCH_QNNPACK for all builds?
facebook-github-bot
left a comment
@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Can we try to stack this PR on top of #26238 and see if we can leave both USE_QNNPACK and USE_PYTORCH_QNNPACK enabled?
AshkanAliabadi
left a comment
Merging the two related PRs (the current one and the one linked here). There was probably a way to use ghstack to the same effect that I am not familiar with.
facebook-github-bot
left a comment
@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@@ -10,6 +10,7 @@ set(libtorch_FILE ${CMAKE_CURRENT_LIST_DIR}/src/main/jniLibs/${ANDROID_ABI}/libt
set(libc10_FILE ${CMAKE_CURRENT_LIST_DIR}/src/main/jniLibs/${ANDROID_ABI}/libc10.a)
Thanks for the help, Ivan. This file contains the Android-related changes.
Hopefully this'll work for you.
ljk53
left a comment
Looks good to me - do you want to land both PRs together, or try splitting them into two PRs with ghstack?
Summary: Enable forked QNNPACK builds in PyTorch mobile.
Pull Request resolved: pytorch/pytorch#25844
Differential Revision: D17336458
Pulled By: AshkanAliabadi
fbshipit-source-id: 6ea09dd6c114b64313e9159bf7f17253bc87bfdb
@AshkanAliabadi merged this pull request in dc851ab.