[Caffe2]Enable fusion for IDEEP in optimizeForIdeep#8105
[Caffe2]Enable fusion for IDEEP in optimizeForIdeep#8105gujinghui wants to merge 7 commits intopytorch:masterfrom
Conversation
|
@yinghai |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
caffe2/ideep/operators/conv_op.cc
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
caffe2/opt/optimize_ideep.cc
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
caffe2/opt/optimize_ideep.cc
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
caffe2/opt/optimize_ideep.cc
Outdated
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
Add @jgong5 |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
|
hi @yinghai |
|
@gujinghui Sorry for the late reply. I'm working on verifying the fix for group convolution and next item is to check the memory increase issue. I'll get back to this one once the above outstanding issues are resolved as this one is not a blocker to anything. Makes sense? |
|
@yinghai Please let us know anything we can help to troubleshoot the memory footprint issue. I guess we can reproduce the problem with any model like ResNet-50, right? |
|
@jgong5 @gujinghui yeah, to start, just try resnet50 with ideep and mkl and check the difference. |
|
@gujinghui Actually do you have mask RCNN case to run? That might repo the memory issue better. |
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Use optimizeForIdeep to convert filter format, instead. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
|
@yinghai |
| if (MKLDNN_INCLUDE_DIR) | ||
| list(APPEND IDEEP_INCLUDE_DIR ${MKLDNN_INCLUDE_DIR}) | ||
| list(APPEND __ideep_looked_for ${MKLDNN_INCLUDE_DIR}) | ||
| list(APPEND __ideep_looked_for MKLDNN_INCLUDE_DIR) |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
yinghai
left a comment
There was a problem hiding this comment.
Thanks for pushing this. I have some comments that needs to be addressed. In addition, I think this PR is losing focus, as it is supposed to be transformation. However, I saw
- Transformation
- Fixes in group size and FC
- Fixes in CMakefile
I suggest we split this PR into 3. Actually, let me take the CMakefile fix. And you can work on to split this into 2. Sounds good?
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| auto* Y = Output(OUTPUT); | ||
| auto Y_dims = CalcOutputDims(X, filter.get_dim(0)); | ||
| auto grouped = filter.is_grouped() ? 1 : 0; | ||
| auto Y_dims = CalcOutputDims( |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| const auto& filter = Input(FILTER); | ||
| auto* Y = Output(OUTPUT); | ||
|
|
||
| auto newDims = [&](itensor::dims adims, size_t axis) { |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
| if (MKLDNN_INCLUDE_DIR) | ||
| list(APPEND IDEEP_INCLUDE_DIR ${MKLDNN_INCLUDE_DIR}) | ||
| list(APPEND __ideep_looked_for ${MKLDNN_INCLUDE_DIR}) | ||
| list(APPEND __ideep_looked_for MKLDNN_INCLUDE_DIR) |
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
facebook-github-bot
left a comment
There was a problem hiding this comment.
@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: The reason is that we are referencing `__ideep_looked_for` here: https://github.com/pytorch/pytorch/blob/77484d91db052dfcfa22a38408349853b6246f8a/cmake/Modules/FindMKL.cmake#L350 This was first flushed out in #8105 and probably can help with #9024 Pull Request resolved: #9217 Reviewed By: houseroad Differential Revision: D8754491 Pulled By: yinghai fbshipit-source-id: 70aecc2d60684b9ea522403dc98a0a1a2c3db7e6
Summary: The reason is that we are referencing `__ideep_looked_for` here: https://github.com/pytorch/pytorch/blob/77484d91db052dfcfa22a38408349853b6246f8a/cmake/Modules/FindMKL.cmake#L350 This was first flushed out in pytorch/pytorch#8105 and probably can help with pytorch/pytorch#9024 Pull Request resolved: pytorch/pytorch#9217 Reviewed By: houseroad Differential Revision: D8754491 Pulled By: yinghai fbshipit-source-id: 70aecc2d60684b9ea522403dc98a0a1a2c3db7e6
Summary: The reason is that we are referencing `__ideep_looked_for` here: https://github.com/pytorch/pytorch/blob/77484d91db052dfcfa22a38408349853b6246f8a/cmake/Modules/FindMKL.cmake#L350 This was first flushed out in pytorch/pytorch#8105 and probably can help with pytorch/pytorch#9024 Pull Request resolved: pytorch/pytorch#9217 Reviewed By: houseroad Differential Revision: D8754491 Pulled By: yinghai fbshipit-source-id: 70aecc2d60684b9ea522403dc98a0a1a2c3db7e6
|
@gujinghui Is this PR still relevant? If yes, could you rebase on master? |
|
@yinghai This PR is not needed now. Will close it. |
Summary: The reason is that we are referencing `__ideep_looked_for` here: https://github.com/pytorch/pytorch/blob/77484d91db052dfcfa22a38408349853b6246f8a/cmake/Modules/FindMKL.cmake#L350 This was first flushed out in pytorch#8105 and probably can help with pytorch#9024 Pull Request resolved: pytorch#9217 Reviewed By: houseroad Differential Revision: D8754491 Pulled By: yinghai fbshipit-source-id: 70aecc2d60684b9ea522403dc98a0a1a2c3db7e6
Enable fusion for IDEEP in optimizeForIdeep, including Conv+ReLU, Conv+Sum, Conv+Sum+ReLU, Conv+BN, and pre-convert filter format