Classify and extend convolution and depthwise performance tests by WanliZhong · Pull Request #24547 · opencv/opencv

WanliZhong · 2023-11-15T16:49:12Z

This PR aims to:

1. Extend the test cases with convolutions from these models: YOLOv5, YOLOv8, EfficientNet, YOLOX, YuNet, SFace, MPPalm, MPHand, MPPose, ViTTrack, PPOCRv3, CRNN, PPHumanSeg. (371 new test cases are added.)
2. Classify the existing convolution performance tests into the following cases:
- CONV_1x1
- CONV_3x3_S1_D1 (winograd)
- CONV
- DEPTHWISE
3. Remove unnecessary test cases according to the following 3 rules (366 test cases are pruned):
(i). When test cases are identical in every parameter except the pad- and bias-related ones, only one case is kept.
(ii). When test cases differ only in the channel count of the input shape, only one case is kept in each range [1, 3], [4, 7], [8, 15], [16, 31], [32, 63], [64, 127], [128, 255], [256, 511], [512, 1023], [1024, 2047], [2048, 4095].
(iii). When test cases differ only in the width and height of the input shape, only one case is kept in each range [1, 31], [32, 63], [64, 95]...
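
The three pruning rules can be read as a deduplication key. The sketch below is only an illustration of that reading and is not the script attached to the PR: `ConvCase`, `channelBucket`, `spatialBucket` and `prune` are hypothetical names, width and height are assumed to be bucketed independently, and the three rules are folded into a single key, which prunes slightly more aggressively than applying each rule only when it is the sole difference.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical, reduced description of one convolution test case.
struct ConvCase {
    int kernel, stride, dilation, groups;  // must match exactly
    int outChannels;                       // must match exactly
    int inChannels, inHeight, inWidth;     // input shape, compared by bucket
    int pad;                               // ignored, rule (i)
    bool hasBias;                          // ignored, rule (i)
};

// Rule (ii): [1, 3] -> 0, [4, 7] -> 1, [8, 15] -> 2, ...
int channelBucket(int channels) {
    assert(channels >= 1);
    int bucket = 0;
    for (int upper = 3; channels > upper; upper = upper * 2 + 1)
        ++bucket;
    return bucket;
}

// Rule (iii): [1, 31] -> 0, [32, 63] -> 1, [64, 95] -> 2, ...
int spatialBucket(int size) {
    assert(size >= 1);
    return size / 32;
}

// Keeps the first case seen for every distinct key.
std::vector<ConvCase> prune(const std::vector<ConvCase>& cases) {
    typedef std::tuple<int, int, int, int, int, int, int, int> Key;
    std::set<Key> seen;
    std::vector<ConvCase> kept;
    for (const ConvCase& c : cases) {
        Key key = std::make_tuple(c.kernel, c.stride, c.dilation, c.groups,
                                  c.outChannels,
                                  channelBucket(c.inChannels),
                                  spatialBucket(c.inHeight),
                                  spatialBucket(c.inWidth));
        if (seen.insert(key).second)
            kept.push_back(c);
    }
    return kept;
}
```

With this key, two 3x3 cases with 16 and 24 input channels at 56x56 and 60x60 collapse into one, while a case with 32 input channels is kept separately.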

Reproduced: 1. Follow the steps in alalek@dnn_dump_conv_kernels to dump all convolution cases from the new models (the declared flops may not be right and need to be checked manually). 2 and 3. Use the Python script in [classify conv.txt](https://github.com/opencv/opencv/files/13522228/classify.conv.txt).

Performance test result on Apple M2

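Test result details: [M2.md](https://github.com/opencv/opencv/files/13379189/M2.md)

Additional test result details with FP16: [m2_results_with_fp16.zip](https://github.com/opencv/opencv/files/13491070/m2_results_with_fp16.zip)

Brief summary for 4.8.1 vs 4.7.0 or 4.6.0:

1. CONV_1x1_S1_D1 dropped significantly with small or large input shapes.
2. DEPTHWISE_5x5 dropped a little compared with 4.7.0.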

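Performance test result on [Intel Core i7-12700K](https://www.intel.com/content/www/us/en/products/sku/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz/specifications.html): 8 Performance-cores (3.60 GHz, turbo up to 4.90 GHz), 4 Efficient-cores (2.70 GHz, turbo up to 3.80 GHz), 20 threads.

Test result details: [INTEL.md](https://github.com/opencv/opencv/files/13374093/INTEL.md)

Brief summary for 4.8.1 vs 4.5.5:

1. CONV_5x5_S1_D1 dropped significantly.
2. CONV_1x1_S1_D1, CONV_3x3_S1_D1, DEPTHWISE_3x3_S1_D1, DEPTHWISE_3x3_S2_D1 dropped with small input shapes.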

TODO:

- [x] Perform tests on ARM with each OpenCV version
- [x] Perform tests on x86 with each OpenCV version
- [x] Split each test classification into a single test config
- [x] Test with FP16 enabled

fengyuentau

Could you give a summary of what has been removed and added?

WanliZhong · 2023-11-16T03:32:51Z

@fengyuentau In total, 284 new cases are added and 279 old cases are removed. You can check each case in the file diff.txt.

WanliZhong · 2023-11-16T15:19:09Z

UPDATE: Test results are attached. I have finished the convolution performance tests with each OpenCV release version. The results show that depthwise may not be the biggest problem. Many performance issues have been fixed in 4.8.1.

zihaomu · 2023-11-17T02:14:59Z

Great job! It looks like we are not good at some cases with a small output shape or a small number of output channels. With these performance tests, we can control the compute branches in more detail.

On the x86 platform, these problems become more serious.

vpisarev · 2023-11-22T06:29:08Z

@WanliZhong, we need to split this test into several cases: 1x1 convolution, 3x3s1d1 (Winograd and im2row-based), depthwise, and generic (the remaining cases). Each convolution case should test FP32 and FP16.
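
One way to picture the requested split is a small classifier over the convolution parameters. This is a sketch under stated assumptions, not the code from the PR: `ConvShape`, `ConvKind` and `classify` are hypothetical names, "depthwise" is assumed to mean one group per input and output channel, and depthwise is checked first so that a depthwise 3x3 convolution does not land in the 3x3s1d1 group.

```cpp
#include <cassert>

// Hypothetical, reduced set of convolution parameters.
struct ConvShape {
    int kernelH, kernelW;
    int strideH, strideW;
    int dilationH, dilationW;
    int groups;
    int inChannels, outChannels;
};

enum class ConvKind { Conv1x1, Conv3x3S1D1, Depthwise, Generic };

ConvKind classify(const ConvShape& s) {
    // Assumption: depthwise means every channel forms its own group.
    if (s.groups > 1 && s.groups == s.inChannels && s.groups == s.outChannels)
        return ConvKind::Depthwise;
    if (s.kernelH == 1 && s.kernelW == 1)
        return ConvKind::Conv1x1;
    // The only group for which the Winograd option is meaningful.
    if (s.kernelH == 3 && s.kernelW == 3 &&
        s.strideH == 1 && s.strideW == 1 &&
        s.dilationH == 1 && s.dilationW == 1)
        return ConvKind::Conv3x3S1D1;
    return ConvKind::Generic;  // the remaining cases
}
```

Each resulting group can then be run with FP32 and FP16 as suggested in the comment above.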

WanliZhong · 2023-11-28T17:14:57Z

UPDATE

Split the test into 4 types.
Uploaded an additional test results file with FP16: m2_results_with_fp16.zip
Not sure why the ABI check fails.

WanliZhong · 2023-11-30T17:08:47Z

UPDATE

Currently, each test case is run in four configurations by default: FP32 with Winograd, FP32 without Winograd, FP16 with Winograd, and FP16 without Winograd.
For each of the 4 test types, only the top 20 cases run by default to save CI time.
Does anyone have other good rules for pruning test cases?

asmorkalov

👍

opencv-alalek · 2023-12-01T07:16:08Z

modules/dnn/perf/perf_convolution.cpp

+    Target targetId = get<1>(get<2>(GetParam()));
+    bool winograd = get<1>(GetParam());
+    Net net = build_net(params, backendId, targetId);
+    net.enableWinograd(winograd);


There is a "warmup" stage in the original test code.

If we change the settings, we should run it again.

WanliZhong · Dec 1, 2023

The warmup happens in the build_net() function. I didn't change this part of the code.

opencv-alalek · Dec 8, 2023

If you change a network configuration setting (like net.enableWinograd(winograd); on line 932), then you should do the "warmup" again.

WanliZhong · Dec 8, 2023

Thanks! That's right. I will do it soon.
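
The reasoning in this thread can be illustrated with a toy model. The sketch below does not use OpenCV: `ToyNet` is a hypothetical stand-in that assumes a configuration change discards whatever the warmup prepared, which is the concern raised here rather than a confirmed description of how `cv::dnn::Net` behaves internally. In the real test the corresponding steps would be `net.enableWinograd(winograd)` followed by another `net.forward()` before timing.

```cpp
#include <cassert>

// Hypothetical stand-in for a network with one-time setup work.
class ToyNet {
public:
    // Changing the setting invalidates the prepared state (assumption).
    void enableWinograd(bool on) {
        if (on != winograd_) {
            winograd_ = on;
            prepared_ = false;
        }
    }

    // Returns true when this call had to redo the one-time setup.
    bool forward() {
        bool didSetup = !prepared_;
        prepared_ = true;
        return didSetup;
    }

private:
    bool winograd_ = true;   // toy default, not a claim about OpenCV
    bool prepared_ = false;
};

// Mimics the test flow: warmup while building, reconfigure, optional second
// warmup, then the run that would be timed. Returns true if the timed run
// still pays the setup cost.
bool timedRunPaysSetup(bool winograd, bool warmupAgain) {
    ToyNet net;
    net.forward();                 // warmup done while building the net
    net.enableWinograd(winograd);  // configuration change after the warmup
    if (warmupAgain)
        net.forward();             // second warmup
    return net.forward();          // the measured run
}
```

In this model the measured run is only free of setup cost when the second warmup is performed after the setting changes.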

vpisarev · 2023-12-03T19:23:35Z

@WanliZhong, Winograd is only valid for 3x3s1d1; for other tests it does not make any sense. Could you please adjust your tests, otherwise we will have many useless cases for 1x1, depthwise, generic etc.

vpisarev · 2023-12-06T06:41:13Z

modules/dnn/perf/perf_convolution.cpp

+    return net;
+}
+
+typedef tuple<ConvParam_t, bool, tuple<Backend, Target> > ConvTestParam_t;


Use the following definitions instead to add the Winograd parameter only to 3x3S1D1:

typedef tuple<ConvParam_t, tuple<Backend, Target> > ConvTestParam_t;
typedef tuple<ConvParam_t, tuple<Backend, Target>, bool> Conv3x3S1D1TestParam_t;
typedef TestBaseWithParam<ConvTestParam_t> Conv;
typedef TestBaseWithParam<ConvTestParam_t> Conv_1x1;
typedef TestBaseWithParam<Conv3x3S1D1TestParam_t> Conv_3x3S1D1;
typedef TestBaseWithParam<ConvTestParam_t> Conv_Depthwise;
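
To show how the two parameter layouts differ when unpacked, here is a standalone sketch built on `std::tuple`. The `ConvParam_t`, `Backend` and `Target` definitions below are simplified placeholders (the real ones live in the OpenCV test sources), and `targetOf` and `winogradOf` are hypothetical helpers. The point is only that the Winograd flag sits at index 2 and exists solely in the 3x3S1D1 tuple.

```cpp
#include <cassert>
#include <tuple>

// Placeholder stand-ins for the real test types (assumptions).
struct ConvParam_t { int kernel; int stride; int dilation; };
enum Backend { SOME_BACKEND };
enum Target { SOME_TARGET };

typedef std::tuple<ConvParam_t, std::tuple<Backend, Target> > ConvTestParam_t;
typedef std::tuple<ConvParam_t, std::tuple<Backend, Target>, bool> Conv3x3S1D1TestParam_t;

// The target is the second element of the nested (Backend, Target) tuple,
// which is element 1 in both layouts.
template <typename Param>
Target targetOf(const Param& p) {
    return std::get<1>(std::get<1>(p));
}

// Only the 3x3S1D1 layout carries the Winograd flag, as its last element.
bool winogradOf(const Conv3x3S1D1TestParam_t& p) {
    return std::get<2>(p);
}

static_assert(std::tuple_size<ConvTestParam_t>::value == 2,
              "generic, 1x1 and depthwise tests have no Winograd flag");
static_assert(std::tuple_size<Conv3x3S1D1TestParam_t>::value == 3,
              "3x3S1D1 tests carry the extra bool");
```

Because the flag is appended last, the accessors for the convolution parameters, backend and target stay the same across all four test fixtures.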

WanliZhong added test category: dnn labels Nov 15, 2023

WanliZhong added this to the 4.9.0 milestone Nov 15, 2023

WanliZhong requested review from asmorkalov, fengyuentau, opencv-alalek and vpisarev November 15, 2023 16:49

fengyuentau reviewed Nov 16, 2023

WanliZhong added 5 commits November 28, 2023 22:32

classify and extend convolution and depthwise performance tests

b3303d6

add comment about this modification

9d474e2

Modify some depthwise declared_flops

2ab4be8

refactor convolution perf test struct. re-classify to 4 types

e2b34fa

resolve conflicts

5bb32e1

WanliZhong force-pushed the refactor_conv_perf_test branch from c38bfd4 to 5bb32e1 Compare November 28, 2023 14:33

make some variables constant

aa7e9b2

add winograd test option

4b77c63

asmorkalov approved these changes Dec 1, 2023

opencv-alalek reviewed Dec 1, 2023

only enable Winograd test for Conv_3x3S1D1

68a437a

vpisarev reviewed Dec 6, 2023

WanliZhong added 2 commits December 6, 2023 15:46

add Conv3x3S1D1TestParam_t

2252474

warmup again after setting winograd in Conv_3x3S1D1 test

0081f35

asmorkalov assigned vpisarev Dec 11, 2023

vpisarev self-requested a review December 11, 2023 18:31

vpisarev approved these changes Dec 11, 2023

asmorkalov merged commit 6ee71fe into opencv:4.x Dec 11, 2023

asmorkalov mentioned this pull request Jan 19, 2024

5.x merge 4.x #24862

Merged
