boost NMS performance #19613
Merged
opencv-pushbot merged 1 commit into opencv:3.4 from WeiChungChang:NMS_refine on Mar 10, 2021
Member:
@WeiChungChang Thank you for the contribution! Please fix the whitespace issues to make CI happy.
To boost NMS performance, this PR applies early termination.
Note that the detection output layer is used internally by the proposal layer.
In that case, the proposal layer ultimately outputs only the top K proposals (for example, the top 100) among all candidates.
Typically, the number of candidates is more than several thousand.
The current flow seriously degrades performance by applying NMS to all candidates.
However, to output the top K NMS results, we only need to check whether enough candidates have been picked and terminate once we have collected what we need.
As the table below shows, to output the top K = 100 candidates, we need to check fewer than one thousand bboxes.
The current flow, however, checks all of them.
Since NMS is the most time-consuming part of the proposal (detection output) layer, this degrades performance considerably.
For example, the experiment shows a speedup of more than 10x for NMS.
As a result, the inference time of the proposal (detection output) layer becomes, for example, 2x to 3x faster than with the original flow.
Finally, the table below shows that, to output the top 100, the original flow does unnecessary work by processing extra candidates beyond the 100 that are needed.
Skipping that work greatly improves layer inference performance.
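
The early-termination idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code from this PR: the function names `iou` and `nms_top_k`, the `(x1, y1, x2, y2)` box layout, and the `keep_top_k` parameter are hypothetical and chosen for the example. The sketch also returns how many candidates were examined, so the saved work can be observed directly.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 >= x1, y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_top_k(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    keep_top_k: Optional[int] = None,
) -> Tuple[List[int], int]:
    """Greedy NMS over candidates sorted by descending score.

    Returns (kept indices, number of candidates examined).
    With keep_top_k=None every candidate is examined (no early termination).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    examined = 0
    for idx in order:
        # Early termination: stop once enough boxes have been collected.
        if keep_top_k is not None and len(kept) >= keep_top_k:
            break
        examined += 1
        # Keep the candidate only if it does not overlap any kept box too much.
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in kept):
            kept.append(idx)
    return kept, examined
```

Because greedy NMS decides on each candidate using only the higher-scored boxes already kept, stopping after K kept boxes yields the same result as running NMS over all candidates and then taking the first K.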
The test image can be downloaded from the following link:
testImage.zip
The test model is:
opencv/opencv_extra/testdata/dnn/mask_rcnn_inception_v2_coco_2018_01_28.pbtxt
mask_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.