boost NMS performance#19613

Merged
opencv-pushbot merged 1 commit into opencv:3.4 from WeiChungChang:NMS_refine
Mar 10, 2021

WeiChungChang (Contributor) commented Feb 24, 2021

To boost NMS performance, this PR applies early termination.

Note that the detection output layer is also used internally by the proposal layer.
In that case the proposal layer ultimately outputs only the top K proposals (for example, the top 100) among all candidates.

Typically there are several thousand candidates.
The current flow seriously degrades performance by applying NMS to all of them.
However, to output the top K NMS results, we only need to check whether enough candidates have been picked and terminate as soon as we have collected what we need.
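
The idea can be illustrated with a minimal sketch. This is not the code from this PR: the names `iou` and `nms_top_k`, the `(x1, y1, x2, y2)` box format, and the sample data are assumptions made for illustration. The sketch runs greedy NMS over candidates sorted by score and stops as soon as `keep_top_k` boxes have been kept, also reporting how many candidates were examined.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_top_k(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    keep_top_k: int,
) -> Tuple[List[int], int]:
    """Greedy NMS that terminates once keep_top_k boxes are kept.

    Returns (kept indices in score order, number of candidates examined).
    """
    # Visit candidates from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    examined = 0
    for idx in order:
        if len(kept) >= keep_top_k:
            break  # early termination: we already have what we need
        examined += 1
        # Keep the candidate only if it does not overlap any kept box too much.
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in kept):
            kept.append(idx)
    return kept, examined


# Hypothetical sample data: two overlapping pairs plus two isolated boxes.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30),
         (21, 21, 31, 31), (40, 40, 50, 50), (60, 60, 70, 70)]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

top2, examined_top2 = nms_top_k(boxes, scores, 0.5, keep_top_k=2)
full, examined_full = nms_top_k(boxes, scores, 0.5, keep_top_k=len(boxes))
print(top2, examined_top2)
print(full, examined_full)
```

Because candidates are visited in descending score order, the early-terminated result is a prefix of the full result, so stopping early changes how much work is done but not which top K boxes are returned.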

As the table below shows, to output the top K = 100 candidates we need to check fewer than one thousand bboxes, but the current flow checks all of them.

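| pic | # of candidates evaluated before | # of candidates evaluated after |
|-----|----------------------------------|---------------------------------|
| 1   | 6000                             | 573                             |
| 2   | 6000                             | 371                             |
| 3   | 6000                             | 240                             |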

Since NMS is the most time-consuming part of the proposal (detection output) layer, this hurts performance significantly.
For example, the experiment shows that NMS itself speeds up by about 6x to 11x on these images (more than 10x on two of the three).

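| pic | NMS execution time before (ms) | NMS execution time after (ms) |
|-----|--------------------------------|-------------------------------|
| 1   | 16.989                         | 1.566                         |
| 2   | 30.75                          | 5.095                         |
| 3   | 50.68                          | 4.446                         |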

As a result, inference for the proposal (detection output) layer is about 2x to 3x faster than with the original flow.

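| pic | inference time before (ms) | inference time after (ms) |
|-----|----------------------------|---------------------------|
| 1   | 31.755                     | 15.111                    |
| 2   | 54.335                     | 23.932                    |
| 3   | 69.057                     | 23.85                     |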
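
For reference, the speedup ratios implied by the two timing tables above can be recomputed with a few lines of Python. The timings are copied from the tables; the names `nms_ms`, `layer_ms`, and `speedups` are only illustrative. On these three images the ratios come out at roughly 6.0x to 11.4x for NMS and 2.1x to 2.9x for the whole layer.

```python
# Timings copied from the tables above, in milliseconds: pic -> (before, after).
nms_ms = {1: (16.989, 1.566), 2: (30.75, 5.095), 3: (50.68, 4.446)}
layer_ms = {1: (31.755, 15.111), 2: (54.335, 23.932), 3: (69.057, 23.85)}


def speedups(timings):
    """Return {pic: before / after} for a table of (before, after) timings."""
    return {pic: before / after for pic, (before, after) in timings.items()}


for name, table in (("NMS", nms_ms), ("layer", layer_ms)):
    for pic, ratio in speedups(table).items():
        print(f"{name} pic {pic}: {ratio:.2f}x")
```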

Finally, the table below shows that, to output the top 100, the original flow does unnecessary work by selecting extra candidates (more than 100).
Skipping that work greatly improves layer inference performance.

| pic | selected before | selected after |
|-----|-----------------|----------------|
| 1   | 1286            | 100            |
| 2   | 1151            | 100            |
| 3   | 3369            | 100            |

The test images can be downloaded from the following link:

testImage.zip

The test model is:
opencv/opencv_extra/testdata/dnn/mask_rcnn_inception_v2_coco_2018_01_28.pbtxt
mask_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

alalek (Member) commented Feb 26, 2021

@WeiChungChang Thank you for the contribution!

Please fix whitespace issues to make CI happy.

alalek (Member) left a comment

Well done! Thank you 👍

@opencv-pushbot opencv-pushbot merged commit e4692ac into opencv:3.4 Mar 10, 2021
@alalek alalek mentioned this pull request Mar 13, 2021
@alalek alalek mentioned this pull request Apr 9, 2021