Skip to content

Fix build issues and test failures in MKLDNN, and refine cmake files#11853

Closed
gujinghui wants to merge 16 commits intopytorch:masterfrom
gujinghui:fix_build
Closed

Fix build issues and test failures in MKLDNN, and refine cmake files#11853
gujinghui wants to merge 16 commits intopytorch:masterfrom
gujinghui:fix_build

Conversation

@gujinghui
Copy link
Collaborator

@gujinghui gujinghui commented Sep 19, 2018

  1. Fix undefined symbol issue in MKLDNN after caffe2 rebase
  2. Fix build error for MKLDNN due to cmake global var corrupted
  3. Remove avx2 support in MKLDNN
  4. Seperate mkl, MKLDNN and mklml in cmake
  5. Fix random failure in test case convfusion_test

@gujinghui
Copy link
Collaborator Author

gujinghui commented Sep 19, 2018

@yinghai @jgong5
Pls review.

Hi @yinghai
The MKLDNN upgrade in this PR is only to fix the missing symbol after rebasing caffe2.
There is no other changes any more in MKLDNN.
Pls help merge it ASAP.

Copy link
Collaborator

@soumith soumith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gujinghui if we want to build MKLDNN functionality, but without ideep, it's no longer possible after this PR. We need to make sure that MKLDNN-only build (i.e. without ideep) is not removed / regressed

@gujinghui
Copy link
Collaborator Author

hi @soumith

ideep and mkldnn are exactly same thing. ideep just is the integration bridge of mkldnn for caffe2.
In this PR, we just help to build mkldnn from source code for both caffe2 and aten, instead of finding the pre-installed mkldnn.
Is it OK to you? Or, could you tell your concern ?

Thanks,

This comment was marked as off-topic.

This comment was marked as off-topic.

@soumith
Copy link
Collaborator

soumith commented Sep 20, 2018

thanks. this is helpful context. I didn't have the context before-hand, so was confused as to what it was doing.

I requested that you revert some tab / whitespace changes in build_pytorch_libs.sh / bat. After that this PR is good to go from my side.

And gracefully include omp.h

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Mapping between cmake var and c macro:
USE_MKL    -->  CAFFE2_USE_MKL
USE_IDEEP  -->  CAFFE2_USE_IDEEP
USE_MKLML  -->  CAFFE2_USE_MKLML

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
@jiecaoyu
Copy link

I think this PR will solve my problem. I am spending days to force caffe2 to use mkldnn and ideep. I doubt whether current version is really using mkldnn at all since I cannot find where mkldnn primitives are called.

@jgong5
Copy link
Collaborator

jgong5 commented Sep 21, 2018

@jiecaoyu PyTorch repo is changing very rapidly these days. We are trying hard to get MKL-DNN build back live these days...

@jiecaoyu
Copy link

@jgong5 Thanks so much!

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

add_library(Caffe2_ideep_operators OBJECT ${avx2_srcs})
add_dependencies(Caffe2_ideep_operators Caffe2_PROTO)
set_target_properties(Caffe2_ideep_operators PROPERTIES COMPILE_FLAGS "-mavx2")

This comment was marked as off-topic.

This comment was marked as off-topic.


BaseStaticContext* GetIDEEPStaticContext();
#ifdef _WIN32
#define CAFFE2_IDEEP_EXPORT __declspec(dllexport)

This comment was marked as off-topic.

This comment was marked as off-topic.

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
@gujinghui gujinghui force-pushed the fix_build branch 2 times, most recently from 47403eb to 2c056b6 Compare September 21, 2018 14:58
Copy link
Contributor

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the build errors look weird. Have you trying rebasing to master?

Copy link
Contributor

@yinghai yinghai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff still tries to do too many things and now it's painful to debug the build errors. Let's further localize the changes. Let's make a PR to do just one job:

  • Fix undefined symbol issue in iDEEP after caffe2
    rebase

And see how things go. It's OK that initially we have build errors.

ELSE (BLAS_USE_CBLAS_DOT)
SET(BLAS_USE_CBLAS_DOT FALSE)
ENDIF (BLAS_USE_CBLAS_DOT)
SET(CMAKE_REQUIRED_LIBRARIES)

This comment was marked as off-topic.

This comment was marked as off-topic.

@gujinghui
Copy link
Collaborator Author

@yinghai

split to #11975

@gujinghui gujinghui changed the title Fix build issues and test failures in iDEEP, and refine cmake files Fix build issues and test failures in MKLDNN bridge, and refine cmake files Sep 28, 2018
@gujinghui gujinghui changed the title Fix build issues and test failures in MKLDNN bridge, and refine cmake files Fix build issues and test failures in MKLDNN, and refine cmake files Sep 28, 2018
@pjh5
Copy link
Contributor

pjh5 commented Oct 23, 2018

@gujinghui what's the status of this? Are there other PRs to follow up the other changes that were included here?

@gujinghui
Copy link
Collaborator Author

gujinghui commented Oct 24, 2018

@pjh5
Changes in this PR has been split into several PRs. And some of them has been merged in master branch.
We are continously co-working with pytorch/caffe2 team to make this happen ASAP.
If done, this PR will be closed.

BTW, we are working on #12170 now. After this got merged, you can run below test cmd to build with MKLDNN.

NO_MKLDNN=NO python setup.py build_deps

@yinghai
Copy link
Contributor

yinghai commented Oct 29, 2018

Closing this as #12170 is closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants