[pytorch][mobile] support for custom mobile build with dynamic dispatch #34055
ljk53 wants to merge 6 commits into gh/ljk53/110/base from …
Conversation
Summary:

Enable custom mobile build with dynamic dispatch for the OSS build. It calls a Python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then passes the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time.

For custom build with dynamic dispatch to work correctly, it's critical to have an accurate list of used ops. The current assumption is that only those ops referenced by the TorchScript model are used. This works well if client code doesn't call the libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py.

Also, if the JIT starts calling extra ops independent of any specific model, then those extra ops need to be added to the whitelist as well.

Verified the correctness of the whole process with MobileNetV2:
```
TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh
```
Differential Revision: D20193327
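(For readers unfamiliar with the closure step the summary describes, here is a minimal sketch, not the actual util script: the function name and the dependency-graph format are assumptions for illustration.)

```python
from collections import deque

def transitive_closure(dep_graph, root_ops):
    """Return all ops reachable from root_ops in the op dependency graph.

    dep_graph: dict mapping an op name to the ops it may call internally,
               e.g. {'aten::add': ['aten::add_'], ...} (hypothetical format).
    root_ops:  op names extracted from the model, e.g. via
               torch.jit.export_opnames().
    """
    visited = set()
    queue = deque(root_ops)
    while queue:
        op = queue.popleft()
        if op in visited:
            continue
        visited.add(op)
        # Any op reachable from a used op must also stay registered.
        queue.extend(dep_graph.get(op, []))
    # The sorted result becomes the op registration whitelist for aten codegen.
    return sorted(visited)
```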
…amic dispatch" Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` [ghstack-poisoned]
Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` ghstack-source-id: d571825 Pull Request resolved: #34055
💊 CircleCI build failures summary and remediations

As of commit 6cf4d94 (more details on the Dr. CI page):

🕵️ 1 new failure recognized by patterns, not due to upstream breakages.
```python
# Dump root ops used by the model (for custom build optimization).
ops = torch.jit.export_opnames(traced_script_module)
ops.append('aten::ones')  # HACK because predictor.cpp explicitly calls this!
```
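(For context, a hedged sketch of how such a root-op dump fits together end to end; the torchvision model constructor and the output path are assumptions for illustration, not necessarily what prepare_model.py does:)

```python
import torch
import torchvision
import yaml

# Trace the model to TorchScript (MobileNetV2 is what the test uses).
model = torchvision.models.mobilenet_v2(pretrained=True)
model.eval()
traced_script_module = torch.jit.trace(model, torch.rand(1, 3, 224, 224))

# Dump root ops used by the model (for custom build optimization).
ops = torch.jit.export_opnames(traced_script_module)
ops.append('aten::ones')  # HACK because predictor.cpp explicitly calls this!

# Write the list out so the build script can feed it to the dependency-closure
# script, and from there to aten codegen as the registration whitelist.
with open('MobileNetV2.yaml', 'w') as f:  # output path is illustrative
    yaml.dump(ops, f)
```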
Just a thought: if our eventual setup still needs to explicitly push roots into `ops` like this, we'll probably want to institutionalize it, e.g. have a global list called EXTRA_ROOT_OPS or whatever.
Yes, as I replied to iseeyuan, we need to run static analysis against the JIT code to find out the common EXTRA_ROOT_OPS, if any.

However, this case is different: aten::ones is directly called from client code (the dummy predictor.cpp calls it to create an all-ones tensor for testing purposes). Generally, a user could write pure C++ client code that directly calls functions or tensor methods, which won't be captured by extracting ops from the model's bytecode.

This is less of a problem for Android, where we expect users to use the limited set of Java APIs. To actually solve this problem, we probably need to ask users to run the code analyzer against their client code to dump these extra root ops - that's why I feel there will be more support work to switch to dynamic dispatch; in the static dispatch case these extra ops are kept by the linker automatically.

For CI purposes this one-off hack is probably fine? :)
Question: does this mean every custom build is gonna get aten::ones whether they use it or not? Or just binaries that use the dummy predictor.cpp? (I'm guessing the former since this is a general script, but plz correct :)

A single op isn't that big a deal I guess, but I think it's important to describe the situation more explicitly - something like your description above, in a comment above a global that declares the op explicitly. The text below is just your comment from above, so it may not be entirely correct, but this is the kind of thing I mean:
```python
# We need to run static analysis against the JIT code to find out the common
# EXTRA_ROOT_OPS, if any.
#
# However, this case is different: aten::ones is directly called from client
# code (the dummy predictor.cpp calls it to create an all-ones tensor for
# testing purposes). Generally, a user could write pure C++ client code that
# directly calls functions or tensor methods, which won't be captured by
# extracting ops from the model's bytecode.
#
# This is less of a problem for Android, where we expect users to use the
# limited set of Java APIs. To actually solve this problem, we probably need
# to ask users to run the code analyzer against their client code to dump
# these extra root ops - that's why I feel there will be more support work to
# switch to dynamic dispatch; in the static dispatch case these extra ops are
# kept by the linker automatically.
#
# For CI purposes this one-off hack is probably fine? :)
EXTRA_CI_ROOT_OPS = ['aten::ones']
...
ops.extend(EXTRA_CI_ROOT_OPS)
```
> Question: does this mean every custom build is gonna get aten::ones whether they use it or not? Or just binaries that use the dummy predictor.cpp? (I'm guessing the former since this is a general script, but plz correct :)
It's the latter - the script is under test/mobile/..., which is for mobile CI specifically. That's why I don't feel so guilty about adding this hack :)
Added the comment to the test script to keep a record.
…amic dispatch" Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` Differential Revision: [D20193327](https://our.internmc.facebook.com/intern/diff/D20193327) [ghstack-poisoned]
…amic dispatch" Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` Differential Revision: [D20193327](https://our.internmc.facebook.com/intern/diff/D20193327) [ghstack-poisoned]
Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` ghstack-source-id: c94a3db Pull Request resolved: #34055
The problem you mentioned should be fixed (regardless of static dispatch custom build or dynamic dispatch custom build) :) What I meant was the potential case where the model's bytecode doesn't fully cover all ops called from the JIT, e.g. what if at model loading time some JIT code decides to call some ops directly? I can imagine some common preprocessing pass might do this. One way to address this problem is to run static analysis against the JIT codebase and find out. It will require a little work.
Are load-time operations embedded in …
bhosmer left a comment

One inline thing about commenting the aten::ones dependency, but LGTM :)
…amic dispatch" Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` Differential Revision: [D20193327](https://our.internmc.facebook.com/intern/diff/D20193327) [ghstack-poisoned]
…amic dispatch" Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` Differential Revision: [D20193327](https://our.internmc.facebook.com/intern/diff/D20193327) [ghstack-poisoned]
Summary: Enable custom mobile build with dynamic dispatch for OSS build. It calls a python util script to calculate transitive dependencies from the op dependency graph and the list of used root ops, then pass the result as the op registration whitelist to aten codegen, so that only these used ops are registered and kept at link time. For custom build with dynamic dispatch to work correctly, it's critical to have the accurate list of used ops. Current assumption is that only those ops referenced by TorchScript model are used. It works well if client code doesn't call libtorch API (e.g. tensor methods) directly; otherwise the extra used ops need to be added to the whitelist manually, as shown by the HACK in prepare_model.py. Also, if JIT starts calling extra ops independent of specific model, then the extra ops need to be added to the whitelist as well. Verified the correctness of the whole process with MobileNetV2: ``` TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh ``` ghstack-source-id: 9bf078d Pull Request resolved: #34055
Hey @ljk53, looks like either this PR or the one below it has broken master. Do we have any fix coming for it? https://app.circleci.com/jobs/github/pytorch/pytorch/4683824
@mrshenli - looking - this might affect the build but shouldn't break tests.
@iseeyuan Here is one example of calling ops during model loading time: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/serialization/unpickler.cpp#L425 It calls ops directly there.

Today it works because these ops are pretty common and are also indirectly called by regular models - but there is no guarantee it will stay this way. For example, #34641 breaks the dynamic dispatch CI in a subtle way. We probably need to run the code analyzer against torch/csrc to find out these common ops used by the JIT.
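(Until such an analysis exists, one hedged workaround sketch, reusing the transitive_closure sketch above: maintain a hand-curated list of ops the runtime itself may call and merge it into every model's root-op list before computing the closure. The list contents and names below are assumptions for illustration, not an audited set.)

```python
# Hypothetical hand-maintained list of ops the JIT runtime may call on its
# own (e.g. from unpickler.cpp at model loading time), independent of any model.
JIT_RUNTIME_ROOT_OPS = [
    'aten::empty',  # assumption for illustration; audit torch/csrc to confirm
]

def make_whitelist(model_root_ops, dep_graph):
    # Union the model's root ops with the runtime's own root ops, then take
    # the transitive closure over the op dependency graph.
    roots = set(model_root_ops) | set(JIT_RUNTIME_ROOT_OPS)
    return transitive_closure(dep_graph, roots)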