boost NMS performance by WeiChungChang · Pull Request #19613 · opencv/opencv

WeiChungChang · 2021-02-24T09:49:16Z

To boost NMS performance, this PR applies early termination.

Note that the current detection output layer is also used internally by the proposal layer.
In that case, the proposal layer ultimately outputs only the top proposals (for example, the top 100) among all candidates.

Typically, there are several thousand candidates.
The current flow seriously degrades performance by applying NMS to all of them.
However, to output the top K NMS results, we only need to check whether enough candidates have been picked and terminate once we have collected what we need.

As the table below shows, to output the top K = 100 candidates we need to check fewer than one thousand bboxes,
but the current flow checks all of them.

pic	# of candidates calculated before	# of candidates calculated after
1	6000	573
2	6000	371
3	6000	240
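
The early-termination idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the C++ code changed by this PR: the names `iou` and `nms_top_k`, the box layout `(x1, y1, x2, y2)`, and the toy data are all assumptions made for the example. The loop visits candidates in descending score order and stops as soon as `top_k` boxes have been kept, so the remaining candidates are never compared.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_top_k(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    top_k: int,
) -> Tuple[List[int], int]:
    """Greedy NMS that stops once top_k boxes are kept.

    top_k <= 0 means "no limit" (every candidate is examined).
    Returns (kept indices, number of candidates examined).
    """
    # Visit candidates from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    examined = 0
    for idx in order:
        # Early termination: enough boxes collected, skip the rest.
        if top_k > 0 and len(keep) >= top_k:
            break
        examined += 1
        # Keep the box only if it does not overlap any kept box too much.
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in keep):
            keep.append(idx)
    return keep, examined


# Toy data (made up for illustration, not taken from the PR's test images).
demo_boxes = [
    (0, 0, 10, 10),
    (1, 1, 11, 11),
    (20, 20, 30, 30),
    (40, 40, 50, 50),
    (41, 41, 51, 51),
]
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.5]

limited = nms_top_k(demo_boxes, demo_scores, 0.5, 2)
unlimited = nms_top_k(demo_boxes, demo_scores, 0.5, 0)
```

On this toy input, the limited call examines 3 of the 5 candidates before stopping, while the unlimited call examines all 5. The same effect, at a much larger scale, is what the "calculated before/after" columns in the table report.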

Since NMS is the most time-consuming part of the proposal (detection output) layer, this degrades performance considerably.
For example, the experiment below shows NMS itself running roughly 6x to 11x faster.

pic	NMS exe time before (ms)	NMS exe time after (ms)
1	16.989	1.566
2	30.75	5.095
3	50.68	4.446

As a result, the inference time of the proposal (detection output) layer improves by, for example, about 2x to 3x over the original flow.

pic	infer time before (ms)	infer time after (ms)
1	31.755	15.111
2	54.335	23.932
3	69.057	23.85
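
The speedup figures quoted for NMS and for the whole layer can be rechecked directly from the two timing tables. The short sketch below only divides the "before" time by the "after" time for each picture. The numbers are copied from the tables, and the variable and function names are made up for this example.

```python
# Timings in ms, copied from the tables above: pic -> (before, after).
nms_ms = {1: (16.989, 1.566), 2: (30.75, 5.095), 3: (50.68, 4.446)}
infer_ms = {1: (31.755, 15.111), 2: (54.335, 23.932), 3: (69.057, 23.85)}


def speedups(timings):
    """Return pic -> before/after ratio for a timing table."""
    return {pic: before / after for pic, (before, after) in timings.items()}


nms_speedup = speedups(nms_ms)
infer_speedup = speedups(infer_ms)

for pic in sorted(nms_ms):
    print(f"pic {pic}: NMS x{nms_speedup[pic]:.1f}, layer x{infer_speedup[pic]:.1f}")
```

The NMS ratios come out between about 6x and 11x, and the layer-level ratios between about 2.1x and 2.9x. The layer gain is smaller because NMS is only part of the layer's total time.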

Finally, the table below shows that, to output the top 100, the original flow does unnecessary work by selecting extra candidates (far more than 100).
Skipping that work greatly improves layer inference performance.

pic	selected before	selected after
1	1286	100
2	1151	100
3	3369	100

The test images can be downloaded from the following link:

testImage.zip

The test model is:
opencv/opencv_extra/testdata/dnn/mask_rcnn_inception_v2_coco_2018_01_28.pbtxt
mask_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
The PR is proposed to proper branch
There is reference to original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

alalek · 2021-02-26T17:59:14Z

@WeiChungChang Thank you for the contribution!

Please fix whitespace issues to make CI happy.

alalek

Well done! Thank you 👍

asmorkalov added category: dnn optimization labels Feb 25, 2021

boost NMS performance

47337e2

alalek approved these changes Mar 10, 2021

opencv-pushbot merged commit e4692ac into opencv:3.4 Mar 10, 2021

alalek mentioned this pull request Mar 13, 2021

(4.x) Merge 3.4 #19722

Merged

alalek mentioned this pull request Apr 9, 2021

(5.x) Merge 4.x #19885

Merged

opencv-pushbot merged 1 commit into opencv:3.4 from WeiChungChang:NMS_refine
