Skip to content

[caffe2] Support group filter in IDEEP fusion#9767

Closed
gujinghui wants to merge 3 commits intopytorch:masterfrom
gujinghui:group_fusion
Closed

[caffe2] Support group filter in IDEEP fusion#9767
gujinghui wants to merge 3 commits intopytorch:masterfrom
gujinghui:group_fusion

Conversation

@gujinghui
Copy link
Collaborator

The grouped filter has been supported in mkldnn.
This commit removed the check to enable it in IDEEP fusion.

@gujinghui
Copy link
Collaborator Author

@yinghai @jgong5

@yinghai
Copy link
Contributor

yinghai commented Jul 24, 2018

We need to update mkl-dnn module, no?

@gujinghui
Copy link
Collaborator Author

@yinghai
pls do not merge it now.
waiting for one patch from mkldnn.

@jgong5
Copy link
Collaborator

jgong5 commented Jul 25, 2018

@yinghai MKL-DNN supports CONV+SUM fusion for all conv kernels except for GEMM conv. We reported the issue and they are fixing the problem now.

@ezyang
Copy link
Contributor

ezyang commented Oct 12, 2018

@gujinghui What's the status of this PR now?

@gujinghui
Copy link
Collaborator Author

@ezyang
depends on #12164

@gujinghui
Copy link
Collaborator Author

@yinghai
rebased. pls help review.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any unit tests for this?

@gujinghui
Copy link
Collaborator Author

gujinghui commented Dec 15, 2018

@yinghai
unit test is added. pls help review.

failed test cases should be not relevant.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@gujinghui
Copy link
Collaborator Author

@yinghai
So sorry. I forgot one patch for this PR.
Just appended it. Pls help review.

Thanks a lot.

@gujinghui
Copy link
Collaborator Author

@yinghai
rebased. pls help review.

@gujinghui
Copy link
Collaborator Author

case failed due to this PR. will fix it soon.

Copy link
Contributor

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any update?

@gujinghui
Copy link
Collaborator Author

root-caused. need time to figure out how to fix it. :(

@gujinghui
Copy link
Collaborator Author

@yinghai
The failure should not be related.
Pls help review.

E: Could not get lock /var/lib/apt/lists/lock - open (11: Resource temporarily unavailable)
E: Unable to lock directory /var/lib/apt/lists/
Exited with code 100

#include "caffe2/opt/fusion.h"

#ifdef CAFFE2_USE_MKLDNN
#include <cpuinfo.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this come from?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the src of cpuinfo can be found under third_party/cpuinfo.
it is static built in to caffe2 lib by default.
it is introduced here to check if the avx512 is supported on current machine.
because the conv + sum fusion is only available on avx512 path so far.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fails ASAN tests in our CI.

@gujinghui
Copy link
Collaborator Author

hi @yinghai
could you give more info on the ASAN failure?

@yinghai
Copy link
Contributor

yinghai commented Feb 27, 2019

Yeah, the ASAN msg is not very obvious, complaining wide pointer access. Somewhere in the Ideep code it's corrupting the memory.

@gujinghui gujinghui closed this Mar 25, 2019
@gujinghui gujinghui reopened this Mar 25, 2019
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
…port

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
@gujinghui
Copy link
Collaborator Author

@yinghai
The patch caused ASAN in ideep has been reverted.
Pls have a try to check this PR.

@jgong5

Thanks,
Jinghui

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants