Python binding api to optimize for mobile model on script module. #36357

xcheng16 wants to merge 5 commits into gh/xcheng16/17/base
Conversation
Differential Revision: [D20946076](https://our.internmc.facebook.com/intern/diff/D20946076/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D20946076/)! ghstack-source-id: 101896315 Pull Request resolved: #36357
💊 Dr. CI build failures summary as of commit 81504d3: 💚 Looks good so far! There are no failures yet. 💚
AshkanAliabadi left a comment:

Many thanks for doing this, Xingying! This innocent PR enables some of the most important FP32 optimizations we've been working on these past few months. :) Exciting.
@@ -0,0 +1,26 @@
"""
This module contains utility methods for mobile model optimization and lint.
"""
I would still like to evangelize for this function to be placed somewhere more accessible, ideally in the torch.jit Python module itself, given its impact on overall performance on mobile. The difference between running and neglecting to run this function on a model (potentially because the user is not aware of it) is at this point a factor of at least 2x on FP32. Otherwise we will be mostly reliant on documentation and samples to publicize the presence of this function. In my opinion this function is more than a utility; it is more like a necessity.
I am not convinced that placing it in the torch.jit namespace will require less evangelization. We will still have to advertise it in best practices for mobile, etc. And of course we can move it if that seems more expedient.

I agree that this should be more of a necessity than a utility, though.
How about we create a new mobile folder, like one of the other proposals @dreiss had? We would then have a path like torch.mobile.optimize_for_mobile instead of utility, if you both feel utility does not convey enough necessity.
buffer = io.BytesIO()
torch.jit.save(scripted_model, buffer)
buffer.seek(0)
optimized_script_model = torch.jit.load(buffer)
Curious why we have to save and load the model. Does torch._C._jit_pass_optimize_for_mobile "enable" the pass, so that it is only executed on save?
Yes. @xcheng16, are you doing this because _jit_pass_optimize_for_mobile does an in-place transformation of the script module? I think it is probably best for us to change that. Then you could just do return torch._C._jit_pass_optimize_for_mobile(scripted_model._c).
torch._C._jit_pass_optimize_for_mobile(scripted_model._c) does not provide an in-place transformation of the script module; that is why I save and load again. I can check the _jit_pass_optimize_for_mobile method to see why that is the case.
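For context, the save/load round trip above is effectively a deep copy through serialization, so the caller's module is left untouched. A minimal stand-in sketch of the same pattern using only the standard library (pickle in place of torch.jit.save/load, and a plain dict in place of the ScriptModule; none of these names come from the PR):

```python
import io
import pickle

def copy_via_roundtrip(obj):
    """Serialize to an in-memory buffer and deserialize again, yielding an
    independent copy -- the same pattern as the torch.jit.save/load round
    trip used to avoid mutating the caller's module."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer)
    buffer.seek(0)
    return pickle.load(buffer)

original = {"weights": [1.0, 2.0, 3.0]}
copy = copy_via_roundtrip(original)
copy["weights"][0] = 99.0
print(original["weights"][0])  # -> 1.0, the original is unaffected
```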
batch_size = 2
input_channels_per_group = 6
height = 16
width = 16
output_channels_per_group = 6
groups = 4
kernel_h = kernel_w = 3
stride_h = stride_w = 1
pad_h = pad_w = 1
dilation = 1
input_channels = input_channels_per_group * groups
output_channels = output_channels_per_group * groups
strides = (stride_h, stride_w)
paddings = (pad_h, pad_w)
dilations = (dilation, dilation)
conv_weight_shape = (output_channels, input_channels_per_group, kernel_h, kernel_w)
conv_bias_shape = (output_channels,)

input_data = torch.rand((batch_size, input_channels, height, width))
conv_weight = torch.rand(conv_weight_shape)
conv_bias = torch.rand(conv_bias_shape)
result = F.conv2d(input_data, conv_weight, conv_bias, strides, paddings, dilations, groups)
weight_output_dim = 24
linear_input_shape = result.shape[1]
linear_weight_shape = (weight_output_dim, linear_input_shape)
Can you move all of this into the test_optimize_for_mobile function? We shouldn't have any global variable initialization that is not necessary.
@@ -0,0 +1,26 @@
"""
Should we name this file mobile_optimizer? True, our prepacking and packed-op optimizations are not strictly mobile specific, but I wonder whether we will want to put other non-mobile-specific optimizations in the torch.utils.optimizer namespace.

If we do, maybe we should get a consensus from others who will want to have their optimizations placed here, maybe the quantization team? @AshkanAliabadi @dreiss, your opinions?
If we move it to the mobile folder, is it still necessary to name the file mobile_optimizer?
import torch


def optimize_for_mobile(scripted_model):
Also, I think it is important to point out that the current optimization pass does the optimization in place. Maybe we can change that so we return a new module.
linear_weight_shape = (weight_output_dim, linear_input_shape)


class MyTestModule(torch.nn.Module):
    def __init__(self, activation_fn=F.relu):
Since this test is really used for relu only, let's remove activation_fn and just call relu directly in forward.
pattern_count_map = {"Tensor = aten::conv2d": -1,
                     "Tensor = prim::CallFunction": -1,
                     "prepacked::conv2d_clamp_prepack": -1,
                     "prepacked::conv2d_clamp_run": 1,
                     "prepacked::linear_clamp_prepack": -1,
                     "prepacked::linear_clamp_run": 1}
for pattern, v in pattern_count_map.items():
    if v == 0:
        FileCheck().check(pattern).run(optimized_scripted_model.graph)
    elif v == -1:
        FileCheck().check_not(pattern).run(optimized_scripted_model.graph)
    else:
        FileCheck().check_count(pattern, v, exactly=True).run(optimized_scripted_model.graph)
I think you can replace this part with FileCheck() directly. I used the map earlier for convenience; here you can just have a series of FileChecks:

FileCheck().check_not("Tensor = aten::conv2d") \
    .check_not("Tensor = prim::CallFunction") \
    .check_not("prepacked::conv2d_clamp_prepack") \
    .check("prepacked::conv2d_clamp_run") ... \
    .run(optimized_scripted_model.graph)
    scripted_model: An instance of torch script module with type of ScriptModule

Returns:
    scripted_model: A new optimized torch script module, the method does not do
You don't need to say it does not do an in-place transformation. Given that it already returns a new module, I think that is expected.
kimishpatel left a comment:

Looks great. Thanks, Xingying.
This pull request has been merged in 86f354c.
Merged commit for pytorch#36357: ghstack-source-id: 101907180, Differential Revision: D20946076, fbshipit-source-id: 93cb4a5bb2371128f802d738eb26d0a4f3b2fe10
Stack from ghstack:
Summary:
Creating a Python API entry point to optimize a mobile model. It takes a scripted module as argument and returns an optimized scripted module. The initial optimization features include inserting and folding prepack ops.
Test Plan:
python test/test_optimizer.py
Differential Revision: D20946076
NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on Phabricator!
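A sketch of how a caller might use the new entry point end to end. The import path torch.utils.mobile_optimizer and the tiny model below are assumptions for illustration (the exact namespace was still under discussion in this PR), and the prepacked ops require a build with XNNPACK support:

```python
import torch
import torch.nn.functional as F

class TinyConvModel(torch.nn.Module):
    # Hypothetical example model, not the one from the PR's test.
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return F.relu(self.conv(x))

model = TinyConvModel().eval()
scripted = torch.jit.script(model)

# Assumed import path; at the time of this PR the helper lived in a
# utility module, and the final location was still being debated above.
from torch.utils.mobile_optimizer import optimize_for_mobile
optimized = optimize_for_mobile(scripted)

# The optimized module is a new ScriptModule and should be numerically
# equivalent to the original on the same input.
x = torch.rand(1, 3, 16, 16)
print(torch.allclose(scripted(x), optimized(x), atol=1e-4))
```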