[Caffe2] Enabling AMD GPU Backend for Caffe2 #7566
…e2_core_hip * 'caffe2_core_hip' of github.com:petrex/pytorch: caffe2 PB update for AMD/ROCM HIP device
USE_GLOO "Use Gloo" ON
"BUILD_CAFFE2" OFF)
option(USE_GLOO_IBVERBS "Use Gloo IB verbs for distributed support" OFF) # New option
option(USE_HIP "Use HIP" ON)
FIND_PACKAGE(HIP 1.0 REQUIRED)
FIND_PACKAGE(HIP 1.0)
IF(HIP_FOUND)
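The two `FIND_PACKAGE` lines in this fragment differ only in the `REQUIRED` keyword. As a minimal sketch of how an optional lookup followed by an `IF(HIP_FOUND)` guard can behave, assuming a standalone `CMakeLists.txt` (the project name and the status messages are illustrative, not taken from this PR):

```cmake
cmake_minimum_required(VERSION 3.5)
project(hip_guard_sketch NONE)  # hypothetical project name

option(USE_HIP "Use HIP" ON)

if(USE_HIP)
  # Without REQUIRED, a missing HIP package does not abort configuration;
  # find_package() simply leaves HIP_FOUND false. QUIET suppresses the
  # "package not found" warning.
  find_package(HIP 1.0 QUIET)
  if(HIP_FOUND)
    message(STATUS "HIP found: enabling the HIP backend")
  else()
    message(STATUS "HIP not found: building without the HIP backend")
    set(USE_HIP OFF)
  endif()
endif()
```

With `REQUIRED`, the same lookup would stop configuration with an error on machines that have no HIP installation.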
@@ -1,42 +1,86 @@
set(PYTORCH_FOUND_HIP FALSE)
set(Caffe2_HIP_INCLUDES
  ${hip_INCLUDE_DIRS} ${rocrand_INCLUDE_DIRS} ${hiprand_INCLUDE_DIRS} ${rocblas_INCLUDE_DIRS} ${miopen_INCLUDE_DIRS} ${Caffe2_HIP_INCLUDES} ${thrust_INCLUDE_DIRS})
set(Caffe2_HIP_DEPENDENCY_LIBS
  ${rocrand_LIBRARIES} ${hiprand_LIBRARIES} ${PYTORCH_HIP_HCC_LIBRARIES} ${PYTORCH_MIOPEN_LIBRARIES})
# Find the HIP package, set the HIP paths, load the HIP CMake.
IF(WITH_ROCM)
  include(LoadHIP)
  if (NOT PYTORCH_FOUND_HIP)
…e2_core_hip
* 'caffe2_core_hip' of github.com:petrex/pytorch: (40 commits)
  [auto] Update onnx to 52f7528 - add more shape inference tests (onnx/onnx#971) onnx/onnx@52f7528
  JIT cleanup (pytorch#7631)
  fix to build sleef when using cmake 3.11.1 (pytorch#7679)
  Fix typo in document (pytorch#7725)
  [auto] Update onnx to 6f4b1b1 - Tests for Gemm operator (onnx/onnx#885) onnx/onnx@6f4b1b1
  [auto] Update onnx to c6c6aad - Enhance the 1-element broadcast case (onnx/onnx#902) onnx/onnx@c6c6aad
  serialization for torch.device (pytorch#7713)
  Fix compile flags for MSVC (pytorch#7703)
  Fix exporting Sum to onnx (pytorch#7685)
  Renanme ZFNet to ZFNet512 (pytorch#7723)
  Implement __reduce__ for torch.dtype (pytorch#7699)
  Remove unnecessary include in vec256_float.h (pytorch#7711)
  Update from facebook (pytorch#7696)
  fix for cuda 9.2 builds (pytorch#7709)
  make BatchSampler subclass of Sampler, and expose (pytorch#7707)
  Dont emit warning for ABI incompatibility when PyTorch was built from source (pytorch#7681)
  remove index from python bindings (fixes: pytorch#7639) (pytorch#7690)
  Update _torch_docs.py (pytorch#7700)
  Fix the wrong usage of environment variables detection in cmake
  Changes from D7881937 and D7963936 plus an edit (pytorch#7605)
  ...
If we're working around a bug in the upstream HIP files, we should say so in the code that is implementing the workaround, so that when HIP fixes their cmake we know what to eliminate.
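One way to follow this suggestion is to put a comment directly on the workaround that names the upstream problem and states when the block can be deleted. A minimal sketch, assuming a hypothetical workaround: the described upstream behavior and the fallback path are illustrative placeholders, not taken from this PR.

```cmake
# WORKAROUND (illustrative placeholder): suppose the upstream HIP CMake files
# do not provide HIP_PATH when the HIP_PATH environment variable is unset.
# TODO: delete this block once upstream HIP fixes its CMake files.
if(NOT DEFINED ENV{HIP_PATH})
  set(HIP_PATH /opt/rocm/hip)   # assumed default install location
else()
  set(HIP_PATH $ENV{HIP_PATH})  # respect an explicit user setting
endif()
```

A comment of this shape makes the workaround easy to find with a text search for `WORKAROUND` once the upstream fix lands.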
@ezyang @Jorghi12 Ok let me explain here again,
@petrex Let's first get this initial version in so we can parallelize the work of polishing the core and adding HIP ops.
…e2_core_hip
* 'caffe2_core_hip' of github.com:petrex/pytorch: (24 commits)
  Allow empty storage for the 'Edge' class. (pytorch#7595)
  Process group base class and Gloo implementation (pytorch#7628)
  _LRSchedulers getstate include optimizer info (pytorch#7757)
  [PyTorch] [gradcheck] change backward() to grad() (pytorch#7710)
  Update test_nn.py (pytorch#7787)
  Define general default scheduler for TBB and fix ppc64le bug (pytorch#7761)
  Add support for accepting Tensor as input in clip_grad_* functions. (pytorch#7769)
  [Easy] Remove unused code (pytorch#7782)
  Update tbb (pytorch#7734)
  Add @generated annotation (pytorch#7780)
  fix legacy comment after variable tensor merge (pytorch#7771)
  Revert pytorch#7750 and pytorch#7762 to fix Windows CI on master (pytorch#7772)
  Temporarily disable build env check (pytorch#7768)
  Add missing brace (pytorch#7762)
  [C++ API] Add backward() to Tensor and Variable (pytorch#7750)
  [auto] Update onnx to d43b550 - Fix .gitignore and add missing files (onnx/onnx#1005) onnx/onnx@d43b550
  [auto] Update onnx to ea1aa13 - add tests for reduce ops (onnx/onnx#675) onnx/onnx@ea1aa13
  include cudnn_h (pytorch#7749)
  [C++ API] Using new registration mechanism (pytorch#7663)
  [auto] Update onnx to 5dd68e6 - Add a util function: polish_model (onnx/onnx#1000) onnx/onnx@5dd68e6
  ...
This reverts commit 6e89ad4.
@bddppq Just reverted the change for the operators. Let's keep this PR for Caffe2 core and CI only.
* Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) onnx/onnx@4898c9e"
  This reverts commit 9c679da.
* Revert "Add BiasCHW fallback for GPU (#7738)"
  This reverts commit 14ad2e7.
* Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)"
  This reverts commit 2ebcf4b.
* origin:
  [Caffe2] Enabling AMD GPU Backend for Caffe2 (pytorch#7566)
  Call grad_mode.py context managers as decorators (pytorch#7737)
  catch CPU tensors in checkSameGPU (fixes pytorch#7689) (pytorch#7767)
  Mark stack as non-executable in NNPACK (pytorch#7752)
  small fixes in fusion_compiler (pytorch#7776)
  Run clang-format on c10d (pytorch#7791)
* Add hip support for caffe2 core
* Add MIOPEN header/wrapper to caffe2 core
* Add HIP device into caffe2 PB
* top level makefile change for rocm/hip
* makefile scaffolding for AMD/RocM/HIP
* Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files
* caffe2 PB update for AMD/ROCM HIP device
* Add AMD/RocM/Thrust dependency
* HIP threadpool update
* Fix makefile macro
* makefile fix: duplicate test/binary name
* makefile clean-up
* makefile clean-up
* add HIP operator registry
* add utilities for hip device
* Add USE_HIP to config summary
* makefile fix for BUILD_TEST
* merge latest
* Fix indentation
* code clean-up
* Guard builds without HIP and use the same cmake script as PyTorch to find HIP
* Setup rocm environment variables in build.sh (ideally should be done in the docker images)
* setup locale
* set HIP_PLATFORM
* Revert "set HIP_PLATFORM"
  This reverts commit 8ec58db.
* continue the build script environment variables mess
* HCC_AMDGPU_TARGET
* Cleanup the mess, has been fixed in the lastest docker images
* Assign protobuf field hip_gpu_id a new field number for backward compatibility
* change name to avoid conflict
* Fix duplicated thread pool flag
* Refactor cmake files to not add hip includes and libs globally
* Fix the wrong usage of environment variables detection in cmake
* Add MIOPEN CNN operators
* Revert "Add MIOPEN CNN operators"
  This reverts commit 6e89ad4.
* Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) onnx/onnx@4898c9e"
  This reverts commit 9c679da.
* Revert "Add BiasCHW fallback for GPU (pytorch#7738)"
  This reverts commit 14ad2e7.
* Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (pytorch#7566)"
  This reverts commit 2ebcf4b.
The goal of this PR is to enable the AMD GPU backend for Caffe2.
Major changes include: