improve map allocation check #19529

Merged
alalek merged 4 commits into opencv:3.4 from WeiChungChang:3.4 on Feb 23, 2021

@WeiChungChang WeiChungChang commented Feb 14, 2021

Move the map resize to the beginning of the loop to avoid an unnecessary check within the inner loop.
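The idea can be sketched roughly as follows. This is a minimal illustration, not the actual OpenCV code: the `Box` struct, the `LabelBoxes` alias, both function names, and the flat `data` layout (4 floats per prior and class) are all hypothetical and only meant to show the check being hoisted out of the inner loop.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical box type and container, used only for this illustration.
struct Box { float xmin, ymin, xmax, ymax; };
using LabelBoxes = std::map<int, std::vector<Box>>;

// Before: every (prior, class) iteration searches the map to decide
// whether the per-class vector still has to be allocated.
LabelBoxes fillWithInnerCheck(const std::vector<float>& data,
                              int numPriors, int numClasses)
{
    LabelBoxes out;
    for (int p = 0; p < numPriors; ++p) {
        for (int c = 0; c < numClasses; ++c) {
            if (out.find(c) == out.end())
                out[c].resize(numPriors);
            const float* src =
                &data[(static_cast<std::size_t>(p) * numClasses + c) * 4];
            out[c][p] = Box{src[0], src[1], src[2], src[3]};
        }
    }
    return out;
}

// After: all per-class vectors are sized once up front, so the inner
// loop no longer needs the find() check.
LabelBoxes fillWithHoistedResize(const std::vector<float>& data,
                                 int numPriors, int numClasses)
{
    LabelBoxes out;
    for (int c = 0; c < numClasses; ++c)
        out[c].resize(numPriors);

    for (int p = 0; p < numPriors; ++p) {
        for (int c = 0; c < numClasses; ++c) {
            const float* src =
                &data[(static_cast<std::size_t>(p) * numClasses + c) * 4];
            out[c][p] = Box{src[0], src[1], src[2], src[3]};
        }
    }
    return out;
}
```

Both functions should produce the same map; the second one simply performs `numClasses` allocations before the loops instead of `numPriors * numClasses` lookups inside them.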

Google Benchmark test:
With the VGG16 Faster R-CNN model, the detection output layer is used by:

  1. The proposal output, which has 1 location class and 9 * 14 * 14 = 1764 predictions per class.
  2. The output layer, which has 21 location classes and 100 predictions per class.

In tests on VGG16 Faster R-CNN, the speedups are:
about 65.4% for 1x1764 (17000 ns / 10279 ns)
about 45.2% for 21x100 (30308 ns / 20870 ns)
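For reference, the percentages come from dividing the original CPU time by the new CPU time and subtracting one. A small sketch of that arithmetic (the function name `speedupPercent` is hypothetical):

```cpp
// Percentage speedup of an optimized timing relative to a baseline timing.
// Both arguments must use the same unit (here: nanoseconds).
double speedupPercent(double baselineNs, double optimizedNs)
{
    return (baselineNs / optimizedNs - 1.0) * 100.0;
}
```

With the CPU times from the tables, `speedupPercent(17000, 10279)` and `speedupPercent(30308, 20870)` give the two figures quoted above.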

The benchmark program is available at:
https://drive.google.com/file/d/1sBzXPuCLEekjoWh85k4ZUKlgqQrJcXsU/view?usp=sharing

It was compiled with:

g++ test.cpp -std=c++14 -O3 -I<benchmark>/benchmark/include -L<benchmark>/build/src -lbenchmark -lpthread -o benchmark

Run on (12 X 4600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)
Load Average: 0.22, 0.33, 0.54
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.

1x1764
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_FasterRCNNProposl            10279 ns        10279 ns        70308
BM_FasterRCNNProposlOrigin      17001 ns        17000 ns        41550

21x100
---------------------------------------------------------------------
Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_FasterRCNNProposl            20871 ns        20870 ns        35424
BM_FasterRCNNProposlOrigin      30309 ns        30308 ns        23321


Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [V] I agree to contribute to the project under Apache 2 License.
  • [V] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • [V] The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • [V] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=linux,docs

@alalek alalek left a comment

Thank you 👍

@alalek alalek merged commit d4d1216 into opencv:3.4 Feb 23, 2021
@alalek alalek mentioned this pull request Feb 27, 2021
@alalek alalek mentioned this pull request Apr 9, 2021