cuda4dnn(region): optimize kernels by YashasSamaga · Pull Request #16096 · opencv/opencv

YashasSamaga · 2019-12-08T15:40:29Z

This pull request changes

- optimizations to region kernels based on profiling results
- the optimizations mainly target YOLOv3's region pathway

The CUDA part of the region layer took nearly 700us for single-image inference on a GTX 1050. It now takes around 270us. That's roughly a 2.6x improvement.
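For reference, the speedup figure follows directly from the two timings quoted above. The snippet below is only a worked check of that arithmetic. The helper name `speedup` is hypothetical, and the inputs are the approximate values from the description ("nearly 700us", "around 270us"), so the result is approximate too.

```python
def speedup(before, after):
    """Ratio of the old time to the new time (hypothetical helper)."""
    return before / after

# Approximate region-layer timings quoted in the description, in microseconds.
before_us = 700.0
after_us = 270.0

region_speedup = speedup(before_us, after_us)
saved_us = before_us - after_us

print(f"{region_speedup:.2f}x faster, {saved_us:.0f}us saved per inference")
# prints "2.59x faster, 430us saved per inference"
```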

The YOLOv2 path is still poorly optimized, but it is better than before. It can be optimized further if required (I don't think anybody uses YOLOv2 anyway).

Benchmark:

This PR + PR16092
GTX 1050 and 7700HQ

Warmup runs: 3
Benchmark runs: 100

Model	CUDA backend	Darknet
YOLOv3	54.154ms	57.384ms

Benchmark code: https://gist.github.com/YashasSamaga/26eb2eb16be2cc749e3394d300a7585e
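The linked gist contains the actual benchmark code. As a separate illustration of the "3 warmup runs, 100 benchmark runs" scheme, here is a minimal sketch in Python. It is not the author's code: `benchmark` and `dummy_forward` are hypothetical names, and `dummy_forward` is a CPU stand-in for a network forward pass. For real GPU work, the timed call must block until the results are available, otherwise the measurement only covers the launch.

```python
import statistics
import time

def benchmark(fn, warmup_runs=3, benchmark_runs=100):
    """Time fn() and return (mean_ms, samples_ms).

    Warmup runs are executed first and discarded, so one-time costs
    (lazy initialization, allocations, caches) do not skew the result.
    """
    for _ in range(warmup_runs):
        fn()

    samples_ms = []
    for _ in range(benchmark_runs):
        start = time.perf_counter()
        fn()  # assumed to return only once the work is finished
        samples_ms.append((time.perf_counter() - start) * 1e3)

    return statistics.mean(samples_ms), samples_ms

def dummy_forward():
    # Hypothetical stand-in for a forward pass; replace with the real call.
    return sum(i * i for i in range(1000))

mean_ms, samples_ms = benchmark(dummy_forward)
print(f"mean over {len(samples_ms)} runs: {mean_ms:.3f}ms")
```

Reporting the mean matches the single per-model figure in the table above. Keeping the raw samples also makes it possible to look at the median or the spread when run-to-run noise is a concern.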

DISCLAIMER: I am not very comfortable editing darknet code but I hope it's correct.

force_builders=Custom,docs
buildworker:Custom=linux-4
docker_image:Custom=ubuntu-cuda:16.04

alalek

Well done! Looks good to me 👍

optimize region kernels

dd3f517

YashasSamaga force-pushed the cuda4dnn-region-optimize branch from 9103f52 to dd3f517 on December 8, 2019 17:38

alalek approved these changes Dec 9, 2019


opencv-pushbot pushed a commit that referenced this pull request Dec 9, 2019

Merge pull request #16096 from YashasSamaga:cuda4dnn-region-optimize

b505cf8

opencv-pushbot merged commit dd3f517 into opencv:master Dec 9, 2019

HagegeR referenced this pull request in AlexeyAB/darknet Dec 10, 2019

Accelerated [Gaussian_yolo] layer

dbe34d7

AlexeyAB mentioned this pull request Dec 10, 2019

Measure execution time of [yolo] and [Gaussian_yolo] and optimize it if necessary. AlexeyAB/darknet#4497

Open
