cv::dnn::Net::forward() returns wrong timings #20077

@pauljurczak

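I tried OpenCV 4.5.2-dev on Ubuntu 18.04 and 20.04 with gcc-9, CUDA 11 and NVIDIA GPUs. The snippet below prints the time reported by net.getPerfProfile() next to the wall-clock time measured around net.forward(). This code snippet: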

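#include <chrono>
#include <vector>
#include <fmt/core.h>  // assumption: the {}-style format strings come from the {fmt} library
#include <opencv2/core.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main() {
  using namespace cv;
  using namespace cv::dnn;
  using namespace std::chrono;

  Net net = readNetFromDarknet("/home/paul/darknet/cfg/yolov3.cfg", "/home/paul/darknet/yolov3.weights");

  net.setPreferableBackend(DNN_BACKEND_CUDA);
  net.setPreferableTarget(DNN_TARGET_CUDA);

  for (int i = 0; i < 7; i++) {
    // fmt::format is qualified explicitly because cv::format is printf-style
    Mat frame = imread(fmt::format("/home/paul/data/s{}.jpg", i));
    std::vector<double> timings;  // per-layer timings in ticks, filled by getPerfProfile()
    std::vector<Mat> preds;
    Mat blob = blobFromImage(frame, 1/255.0, Size(608, 608), Scalar(0,0,0), true, false);

    net.setInput(blob);
    auto t0{high_resolution_clock::now()};
    net.forward(preds, net.getUnconnectedOutLayersNames());
    auto t1{high_resolution_clock::now()};
    // First column: time reported by getPerfProfile(); second column: wall-clock time around forward()
    fmt::print("{:.4f}s vs. {:.4f}s    {} preds\n", net.getPerfProfile(timings)/getTickFrequency(), duration<float>{t1-t0}.count(), preds.size());
  }
}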

produces this output on GTX 1650 Super GPU:

0.2857s vs. 1.2992s    3 preds
0.0038s vs. 0.0565s    3 preds
0.0038s vs. 0.0589s    3 preds
0.0039s vs. 0.0571s    3 preds
0.0039s vs. 0.0520s    3 preds
0.0040s vs. 0.0544s    3 preds
0.0037s vs. 0.0510s    3 preds

and this one on GT 730 GPU:

0.1767s vs. 1.8613s    3 preds
0.0019s vs. 0.8146s    3 preds
0.0023s vs. 0.7725s    3 preds
0.0023s vs. 0.8310s    3 preds
0.0021s vs. 0.8263s    3 preds
0.0021s vs. 0.8203s    3 preds
0.0021s vs. 0.7628s    3 preds
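
The gap between the two columns can be quantified with a few lines of arithmetic. The sketch below is only an illustration: `mean_ratio` and the `k...` constants are hypothetical names, the numbers are copied from the two outputs above, and the first iteration is skipped on the assumption that it is dominated by one-time initialization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Values copied from the outputs above (seconds).
// "Reported" is the getPerfProfile() column, "Measured" is the wall-clock column.
const std::vector<double> kGtx1650Reported{0.2857, 0.0038, 0.0038, 0.0039, 0.0039, 0.0040, 0.0037};
const std::vector<double> kGtx1650Measured{1.2992, 0.0565, 0.0589, 0.0571, 0.0520, 0.0544, 0.0510};
const std::vector<double> kGt730Reported{0.1767, 0.0019, 0.0023, 0.0023, 0.0021, 0.0021, 0.0021};
const std::vector<double> kGt730Measured{1.8613, 0.8146, 0.7725, 0.8310, 0.8263, 0.8203, 0.7628};

// Hypothetical helper: mean of measured/reported over all iterations
// except the first one, which is treated as warm-up (an assumption).
double mean_ratio(const std::vector<double>& reported,
                  const std::vector<double>& measured) {
    assert(reported.size() == measured.size());
    if (reported.size() < 2) return 0.0;  // nothing left after dropping warm-up
    double sum = 0.0;
    for (std::size_t i = 1; i < reported.size(); ++i) {
        sum += measured[i] / reported[i];
    }
    return sum / static_cast<double>(reported.size() - 1);
}
```

Calling `mean_ratio(kGtx1650Reported, kGtx1650Measured)` from `main` should give roughly 14, and `mean_ratio(kGt730Reported, kGt730Measured)` roughly 380.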

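The timings reported by cv::dnn::Net::getPerfProfile() after cv::dnn::Net::forward() are far shorter than the wall-clock time spent in forward(). Ignoring the first (warm-up) iteration, they are roughly 13 to 16 times too short on the fast GPU (more than an order of magnitude) and roughly 340 to 430 times too short on the slow GPU (more than two orders of magnitude).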
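
To rule out the wall-clock side of the comparison as the source of the gap, the measurement can be isolated in a small helper that uses a monotonic clock and keeps the warm-up call separate. This is a minimal sketch under stated assumptions: `time_calls` and `steady_state_mean` are hypothetical names and not part of OpenCV, and `std::chrono::steady_clock` is swapped in for the `high_resolution_clock` used in the snippet above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls fn() `iterations` times and returns the
// wall-clock duration of each call in seconds.
template <typename Fn>
std::vector<double> time_calls(Fn&& fn, int iterations) {
    assert(iterations >= 0);
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    std::vector<double> seconds;
    seconds.reserve(static_cast<std::size_t>(iterations));
    for (int i = 0; i < iterations; ++i) {
        const auto t0 = clock::now();
        fn();
        const auto t1 = clock::now();
        seconds.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return seconds;
}

// Hypothetical helper: mean of the samples after the first `warmup` ones.
// Returns 0.0 if no samples remain.
double steady_state_mean(const std::vector<double>& samples,
                         std::size_t warmup = 1) {
    if (samples.size() <= warmup) return 0.0;
    double sum = 0.0;
    for (std::size_t i = warmup; i < samples.size(); ++i) {
        sum += samples[i];
    }
    return sum / static_cast<double>(samples.size() - warmup);
}
```

With the network from the snippet above, the callable would be something like `[&] { net.forward(preds, net.getUnconnectedOutLayersNames()); }`, and the result of `steady_state_mean` can then be compared against `net.getPerfProfile(timings)/getTickFrequency()`.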
