
Integrate forked QNNPACK into mobile PyTorch builds. #25844

Closed
AshkanAliabadi wants to merge 2 commits into pytorch:master from AshkanAliabadi:qnnpack

Conversation

@AshkanAliabadi
Contributor

Enable forked QNNPACK builds in PyTorch mobile.

@pytorchbot pytorchbot added the caffe2, module: build (Build system issues), module: operators, and oncall: quantization (Quantization support in PyTorch) labels Sep 9, 2019
Comment thread caffe2/CMakeLists.txt Outdated
Contributor Author

Important change, please double check. Enable caffe2/quantized only when BUILD_CAFFE2_MOBILE is set.

Contributor

operators/quantized is only built for server or libcaffe2 mobile (line 67) - but in both cases BUILD_CAFFE2_MOBILE is true (right? It defaults to true and we don't unset it for non-mobile builds). The additional condition should be a no-op - so what case are you trying to protect against here?

In general, I saw you already added the source-code macro USE_PYTORCH_QNNPACK; maybe we can do the same thing in cmake. Instead of changing the behavior of the cmake USE_QNNPACK flag with the BUILD_CAFFE2_MOBILE macro, I'm thinking of a more decoupled setup:

  • don't change the cmake USE_QNNPACK logic;
  • add a USE_PYTORCH_QNNPACK cmake macro for the cloned pytorch_qnnpack. For now we can simply copy the USE_QNNPACK cmake logic - we should gradually refactor pytorch_qnnpack/CMakeLists to make it more natively integrated with aten;
  • for the caffe2 mobile build, always set USE_PYTORCH_QNNPACK to false; USE_QNNPACK remains a user-facing macro;
  • for the pytorch mobile build, always set USE_QNNPACK to false (as we don't build caffe2 ops) and always set USE_PYTORCH_QNNPACK to true - unless the cmake/Dependencies.cmake logic determines it's not supported (check the places where USE_QNNPACK is currently set off).

Maybe we can call it INTERN_USE_PYTORCH_QNNPACK to keep it a private flag, as eventually it will be part of aten natively. I think we can always build it unless the architecture is not supported by the qnnpack implementation. Does that sound reasonable?
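The decoupled setup sketched in the bullets above might look roughly like this in cmake (a hypothetical sketch, not the actual cmake/Dependencies.cmake code; the architecture check is assumed to live elsewhere):

```cmake
# Sketch of the proposed decoupled flag setup; illustrative only.
option(USE_QNNPACK "Build with the third-party QNNPACK" ON)

if(BUILD_CAFFE2_MOBILE)
  # caffe2 mobile build: never build the forked copy;
  # USE_QNNPACK remains the user-facing switch.
  set(USE_PYTORCH_QNNPACK OFF)
else()
  # pytorch mobile build: we don't build caffe2 ops, so always prefer
  # the forked pytorch_qnnpack - unless Dependencies.cmake later
  # determines the target architecture is unsupported.
  set(USE_QNNPACK OFF)
  set(USE_PYTORCH_QNNPACK ON)
endif()
```

The point of the shape above is that the two flags never interact: each build flavor hard-sets the flag it doesn't use, so neither copy of the library can be compiled in twice.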

Comment thread cmake/Dependencies.cmake Outdated
Contributor

I am a little unclear on the distinction between USE_QNNPACK and USE_PYTORCH_QNNPACK. It looks like for pytorch mobile they will both be set, but for caffe2 only USE_QNNPACK is set?

Contributor Author

@AshkanAliabadi AshkanAliabadi Sep 9, 2019

USE_QNNPACK is the only user-facing flag a user of the library needs to worry about at build time, so that we avoid exposing two flags at that level. It tells us whether the user wants to build PyTorch / Caffe2 with QNNPACK at all. Beyond that, though, we internally need to distinguish between the scenario where we build the third-party QNNPACK (for Caffe2) and the one where we build the fork in PyTorch; otherwise we would end up with two copies of the library compiled in. That would not cause linker errors, because the symbols are prefixed with pytorch_ in the PyTorch version, but it would mean an unnecessary increase in library size.

One way to differentiate the two use cases is to treat USE_QNNPACK && !BUILD_CAFFE2_MOBILE as a PyTorch mobile build (and consequently use the forked version of QNNPACK), and USE_QNNPACK && BUILD_CAFFE2_MOBILE as a Caffe2 build (and build the third-party version of QNNPACK instead). As an aside, if USE_QNNPACK is OFF to begin with, neither version is used.

That brings us to USE_PYTORCH_QNNPACK, which is defined whenever (USE_QNNPACK && !BUILD_CAFFE2_MOBILE) is true, to avoid checking the combined condition everywhere in the source. If this is not a PyTorch build, the preprocessor symbol is simply not defined. Again, USE_QNNPACK alone will not cut it, because there is no way to differentiate three scenarios (no QNNPACK, QNNPACK for Caffe2, and QNNPACK for PyTorch) with a switch that can only hold two values. The other option is to spell out the combination USE_QNNPACK && !BUILD_CAFFE2_MOBILE at every use site, which I believe is more error-prone.
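The three-way distinction described above could be derived from the two existing switches along these lines (a sketch of the idea, not the PR's exact code):

```cmake
# Sketch only: derive the internal symbol from the two existing switches.
if(USE_QNNPACK AND NOT BUILD_CAFFE2_MOBILE)
  # PyTorch mobile build: compile the forked, pytorch_-prefixed QNNPACK
  # and expose the dedicated preprocessor symbol to the sources.
  add_definitions(-DUSE_PYTORCH_QNNPACK)
elseif(USE_QNNPACK AND BUILD_CAFFE2_MOBILE)
  # Caffe2 mobile build: compile the third-party QNNPACK as before.
  add_definitions(-DUSE_QNNPACK)
endif()
# With USE_QNNPACK OFF, neither copy of the library is compiled in.
```

Deriving the symbol once in cmake keeps the source files down to a single `#ifdef USE_PYTORCH_QNNPACK` check instead of repeating the two-flag condition everywhere.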

Contributor

Do we also set this macro for CI builds and runs in the Jenkins setup? It may be getting set automatically, but just making sure.

Contributor

@ljk53 ljk53 left a comment

Thanks for working on this! Please let me know whether the proposal of decoupling the cmake USE_PYTORCH_QNNPACK and USE_QNNPACK flags makes sense.

Comment thread cmake/Dependencies.cmake Outdated
Contributor

nit: shouldn't this use 2-space indent? And let's avoid these unrelated whitespace changes.


Comment thread cmake/Dependencies.cmake Outdated
Contributor

As I proposed, the pytorch_qnnpack-related logic can be put after the qnnpack block, gated with the USE_PYTORCH_QNNPACK macro.

Contributor Author

@AshkanAliabadi AshkanAliabadi left a comment

@ljk53 Please check if I addressed your concerns properly. Thank you.

Comment thread CMakeLists.txt Outdated
Contributor Author

Adding a new user-facing option.

Contributor

Can we give this a different description from USE_QNNPACK's? It may get confusing for the user otherwise.

Contributor

@supriyar supriyar Sep 11, 2019

When do we plan to set this new macro to ON? Do we know whether the tests pass with it on? For the pytorch build (both mobile and server) we should ideally set USE_QNNPACK to false and USE_PYTORCH_QNNPACK to true. @ljk53 does that sound reasonable?

Contributor Author

Good point regarding a better description. I had to set USE_PYTORCH_QNNPACK to ON in a subsequent commit to make a bunch of failing tests pass. I left USE_QNNPACK ON, but can change it if that's what we want.

Comment thread CMakeLists.txt Outdated
Contributor Author

Using a separate preprocessor symbol.

Comment thread cmake/Dependencies.cmake Outdated
Contributor Author

Adding a separate block.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@ljk53 ljk53 left a comment

Looks good to me. Do you know why the imported Phabricator diff doesn't have an xplat sync?

Comment thread CMakeLists.txt Outdated
Contributor

We can probably remove this and set USE_PYTORCH_QNNPACK on line #300 (in the INTERN_BUILD_MOBILE AND NOT BUILD_CAFFE2_MOBILE block). pytorch_qnnpack will be merged into libtorch as the default quantization engine for mobile builds, so I feel we don't have to expose the option to users.
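If the user-facing option were dropped as suggested, the flag could simply be forced on inside the existing mobile block instead (a hypothetical sketch; the surrounding condition is the one cited at line #300):

```cmake
# Sketch: drop the user-facing option and enable the forked QNNPACK
# only for pytorch mobile builds, inside the existing mobile block.
if(INTERN_BUILD_MOBILE AND NOT BUILD_CAFFE2_MOBILE)
  set(USE_PYTORCH_QNNPACK ON)
endif()
```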

Contributor

@ljk53 I think it would be nice to enable this macro for pytorch server builds too. It is possible to run QNNPACK on x86, and it would be useful for debugging, running the python tests, comparing with FBGEMM, etc.

@AshkanAliabadi
Contributor Author

@ljk53 No, I don't know why xplat doesn't sync. This happened with the previous PR as well, and I had to manually amend the commit locally just to force the pre-push hook to perform the copy. I'll upload another revision with the changes you requested soon.

@AshkanAliabadi
Contributor Author

@ljk53 I made the change you requested and noticed that tests/test_quantized.py now fails, because USE_PYTORCH_QNNPACK is only enabled for mobile builds while that test apparently is not gated. Hence QNNPACKRelu and family resolve to an empty implementation on non-mobile builds, which throws off the test. How do you want to move forward: disable the tests for non-mobile builds, or enable USE_PYTORCH_QNNPACK for all builds?

Contributor

@facebook-github-bot facebook-github-bot left a comment

@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ljk53
Contributor

ljk53 commented Sep 15, 2019

Can we try to stack this PR on top of #26238 and see if we can leave both USE_QNNPACK and USE_PYTORCH_QNNPACK enabled?

Contributor Author

@AshkanAliabadi AshkanAliabadi left a comment

Merging the two related PRs (the current one and this one). There was probably a way to use ghstack to the same effect that I am not familiar with.

Contributor Author

Renamed.

@pytorchbot pytorchbot added the module: android (Related to Android support) label Sep 16, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment

@AshkanAliabadi has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -10,6 +10,7 @@ set(libtorch_FILE ${CMAKE_CURRENT_LIST_DIR}/src/main/jniLibs/${ANDROID_ABI}/libt
set(libc10_FILE ${CMAKE_CURRENT_LIST_DIR}/src/main/jniLibs/${ANDROID_ABI}/libc10.a)
Contributor Author

@AshkanAliabadi AshkanAliabadi Sep 16, 2019

Thanks for the help Ivan. This file contains the Android related changes.

@supriyar supriyar self-requested a review September 17, 2019 00:06
Contributor

@supriyar supriyar left a comment

Looks great. Thanks!

@AshkanAliabadi
Contributor Author

Hopefully this'll work for you.

Contributor

@ljk53 ljk53 left a comment

Looks good to me - do you want to land both PRs together, or try splitting them into two PRs with ghstack?

zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 17, 2019
Summary:
Enable forked QNNPACK builds in PyTorch mobile.
Pull Request resolved: pytorch/pytorch#25844

Differential Revision: D17336458

Pulled By: AshkanAliabadi

fbshipit-source-id: 6ea09dd6c114b64313e9159bf7f17253bc87bfdb
@facebook-github-bot
Contributor

@AshkanAliabadi merged this pull request in dc851ab.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Enable forked QNNPACK builds in PyTorch mobile.
Pull Request resolved: pytorch#25844

Differential Revision: D17336458

Pulled By: AshkanAliabadi

fbshipit-source-id: 6ea09dd6c114b64313e9159bf7f17253bc87bfdb

Labels

caffe2, Merged, module: android (Related to Android support), module: build (Build system issues), oncall: quantization (Quantization support in PyTorch)

6 participants