
Enable Detectron model inference for CPU and MKL-DNN paths #10157

Closed
jgong5 wants to merge 10 commits into pytorch:master from jgong5:pr/detectron

Conversation


jgong5 (Collaborator) commented Aug 2, 2018

  1. Support the ops needed for Faster-RCNN/Mask-RCNN inference in Detectron, mostly as direct fallbacks.
  2. Use the CPU device to hold 0-dim tensors and integer tensors in both the fallback op and the blob feeder, as required by Detectron models.
  3. Ignore 0-dim tensors in the MKL-DNN concat operator.
  4. Generate a dynamic library of the Detectron module for the CPU device.

This PR obsoletes #9164.
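
The "direct fallback" pattern from point 1 can be sketched in standalone C++: copy the device inputs to CPU, run the existing CPU kernel unchanged, and copy the result back. The tensor types and the `run_with_fallback` helper below are illustrative stand-ins, not caffe2's actual `IDEEPFallbackOp` API.

```cpp
#include <functional>
#include <vector>

struct CpuTensor { std::vector<float> data; };
struct DeviceTensor { std::vector<float> data; };  // stand-in for an ideep tensor

using CpuKernel = std::function<CpuTensor(const std::vector<CpuTensor>&)>;

DeviceTensor run_with_fallback(const std::vector<DeviceTensor>& inputs,
                               const CpuKernel& kernel) {
  // 1. Copy device inputs into CPU tensors.
  std::vector<CpuTensor> cpu_inputs;
  for (const auto& in : inputs) cpu_inputs.push_back({in.data});
  // 2. Run the existing CPU implementation unchanged.
  CpuTensor out = kernel(cpu_inputs);
  // 3. Copy the result back to the device representation.
  return {out.data};
}
```

In the real operator the copies are reorders between ideep and CPU tensor layouts; the sketch only shows the control flow that makes a CPU op usable from the MKL-DNN path.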

auto& tensor_cpu = OperatorBase::Input<Tensor>(i, CPU);
CAFFE_ENFORCE(
    tensor_cpu.dims().size() == 0 || tensor_cpu.size_from_dim(0) == 0,
    "Expect zero dim tensor");

} else if (
    InputIsType<itensor>(i) &&
    Input(i).get_data_type() == itensor::data_type::s32) {
  auto& input = Input(i);

} else {
  input.reorder_to(dtensor->template mutable_data<float>());
  if (input_share_[i]) {
    local_input_blobs_[i]->Reset();

      const_cast<void*>(src.raw_data()));
} else {
  dtensor->set_data_handle(const_cast<void*>(src.raw_data()));
}

else if (meta == TypeMeta::Make<int>())
return itensor::data_type::s32;
else if (meta == TypeMeta::Make<float16>())
return itensor::data_type::s16;
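
The mapping above can be exercised standalone. The sketch below uses `std::type_index` in place of caffe2's `TypeMeta` and a simplified `idata_type` enum (all names here are illustrative, not the PR's code); note how a 16-bit float payload is carried in the 16-bit integer format `s16`.

```cpp
#include <typeindex>
#include <typeinfo>

enum class idata_type { f32, s32, s16, undef };

// Hypothetical 16-bit float placeholder type.
struct float16 { unsigned short bits; };

idata_type to_ideep_type(const std::type_index& t) {
  if (t == std::type_index(typeid(float)))   return idata_type::f32;
  if (t == std::type_index(typeid(int)))     return idata_type::s32;
  // fp16 data has no dedicated tag here, so it rides in the s16 format.
  if (t == std::type_index(typeid(float16))) return idata_type::s16;
  return idata_type::undef;
}
```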


jgong5 commented Aug 10, 2018

@yinghai @wesolwsk Any hint on the build failure? It doesn't seem related to my changes... Also any review comments?


yinghai commented Aug 10, 2018

Could you check the one test failure?

facebook-github-bot left a comment

yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


jgong5 commented Aug 13, 2018

@yinghai Rebased and now the pre-ci test passed. Please check. Thanks.

if (OperatorBase::InputBlob(i).template IsType<itensor>()) {
  inputs.emplace_back(Input(i));
} else {
  CAFFE_ENFORCE(OperatorBase::InputBlob(i).template IsType<TensorCPU>(),
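
The dispatch above accepts a blob holding either of two tensor payload types. A minimal standalone sketch of the same idea, with `std::variant` standing in for caffe2's `Blob` (illustrative names, not the PR's code):

```cpp
#include <variant>
#include <vector>

struct ITensor   { std::vector<float> data; };  // stand-in for an ideep tensor
struct TensorCPU { std::vector<float> data; };
using Blob = std::variant<ITensor, TensorCPU>;

// Return the payload regardless of which tensor type the blob holds;
// throws std::bad_variant_access if it is neither (the ENFORCE analogue).
const std::vector<float>& payload(const Blob& b) {
  if (auto* it = std::get_if<ITensor>(&b)) return it->data;
  return std::get<TensorCPU>(b).data;
}
```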


jgong5 commented Aug 13, 2018

@yinghai Update. Please check.

facebook-github-bot left a comment

yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

yinghai requested a review from xiaomengy, Aug 14, 2018 05:03

yinghai commented Aug 14, 2018

@BIT-silence Could you take a look at the detectron related ops?

fmassa added the caffe2 label Aug 14, 2018

jgong5 commented Aug 17, 2018

@BIT-silence Any comments? OK to merge?

// No CPU implementation for now
CAFFE_NOT_IMPLEMENTED;
auto translate_idx = [](int ii, int d1, int d2, int d3, int scale_factor) {
  int x, y, z, w;
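
The `translate_idx` lambda maps a flattened index in the upsampled NCHW output back to its nearest-neighbor source element in the input. A complete standalone sketch of that index math (an assumption about the intent, not the PR's exact code):

```cpp
#include <vector>

// Map a flattened output index back to the flattened input index it
// samples from. d1 = C, d2 = H_out, d3 = W_out of the output tensor.
int translate_idx(int ii, int d1, int d2, int d3, int scale) {
  int w = ii % d3; ii /= d3;   // output column
  int z = ii % d2; ii /= d2;   // output row
  int y = ii % d1; ii /= d1;   // channel
  int x = ii;                  // batch
  w /= scale;                  // nearest neighbor: integer-divide the
  z /= scale;                  // spatial coordinates by the scale factor
  int d2_in = d2 / scale, d3_in = d3 / scale;
  return ((x * d1 + y) * d2_in + z) * d3_in + w;
}

std::vector<float> upsample_nearest(const std::vector<float>& in,
                                    int N, int C, int H, int W, int s) {
  std::vector<float> out(N * C * H * s * W * s);
  for (int i = 0; i < (int)out.size(); ++i)
    out[i] = in[translate_idx(i, C, H * s, W * s, s)];
  return out;
}
```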

auto& X = Input(0);
auto* Y = Output(0);

vector<TIndex> out_shape;

d3 = Y->dim32(3);
}

const T* input_data = X.template data<T>();

}

const T* input_data = X.template data<T>();
T* output_data = Y->template mutable_data<T>();

if (ndim == 0) {
  return true;
}
for (int i = 0; i < ndim; i++) {

auto g = MakeGuard([&]() { Py_XDECREF(array); });
const auto npy_type = PyArray_TYPE(array);
const TypeMeta& meta = NumpyTypeToCaffe(npy_type);
CAFFE_ENFORCE(

}

switch (npy_type) {
  case NPY_OBJECT:


#include "upsample_nearest_op.h"
#ifdef CAFFE2_USE_IDEEP
#include <caffe2/ideep/operators/operator_fallback_ideep.h>

#include "upsample_nearest_op.h"
#ifdef CAFFE2_USE_IDEEP
#include <caffe2/ideep/operators/operator_fallback_ideep.h>
#include <caffe2/ideep/utils/ideep_operator.h>


jgong5 commented Aug 17, 2018

@BIT-silence Thanks much for the review comments. Please check the updated patch.

jgong5 added 2 commits August 17, 2018 21:39

jgong5 commented Aug 18, 2018

@BIT-silence OK to merge? Thanks.

xiaomengy left a comment

LGTM

facebook-github-bot left a comment

yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


jgong5 commented Aug 28, 2018

@yinghai OK to merge?

facebook-github-bot left a comment

yinghai has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

PenghuiCheng pushed a commit to PenghuiCheng/pytorch that referenced this pull request Sep 11, 2018
Enable Detectron model inference for CPU and MKL-DNN paths (pytorch#10157)

Summary:
1. Support the ops needed for Faster-RCNN/Mask-RCNN inference in Detectron, mostly as direct fallbacks.
2. Use the CPU device to hold 0-dim tensors and integer tensors in both the fallback op and the blob feeder, as required by Detectron models.
3. Ignore 0-dim tensors in the MKL-DNN concat operator.
4. Generate a dynamic library of the Detectron module for the CPU device.

This PR obsoletes pytorch#9164.
Pull Request resolved: pytorch#10157

Differential Revision: D9276837

Pulled By: yinghai

fbshipit-source-id: dc364932ae4a2e7fcefdee70b5fce3c0cee91b6f