Mcc add perf tests improve performance by AleksandrPanov · Pull Request #3699 · opencv/opencv_contrib

AleksandrPanov · 2024-03-15T21:12:53Z

Added perf tests to the mcc module.
The following optimizations were also added:

added parallel_for_ to performThreshold()
removed toL/fromL and added dst to avoid copying data
added parallel_for_ to elementWise() (the "batch" optimization improves performance on Windows; Linux performance is unchanged).
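
The batched element-wise pass from the last item can be pictured roughly as follows. This is a minimal sketch, not the code from the PR: processBatches and elementWiseSketch are hypothetical names, the driver is sequential, and it assumes the real change hands a body like this to cv::parallel_for_ over a range of batch indices. The default of 128 matches the batch constant discussed in the review of modules/mcc/src/utils.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: processes batches [firstBatch, lastBatch) of the array.
// Each batch covers `batch` consecutive elements; the last one may be shorter.
// A parallel backend could call this concurrently with disjoint batch ranges,
// because every call writes only to its own slice of pdst.
template <typename Op>
void processBatches(const double* psrc, double* pdst, int numElements,
                    int batch, int firstBatch, int lastBatch, Op op)
{
    assert(batch > 0);
    for (int b = firstBatch; b < lastBatch; ++b) {
        const int begin = b * batch;
        const int end = std::min(begin + batch, numElements);
        for (int i = begin; i < end; ++i)
            pdst[i] = op(psrc[i]);
    }
}

// Sequential driver, for illustration only: results go straight into dst
// instead of being returned as a fresh copy.
template <typename Op>
void elementWiseSketch(const std::vector<double>& src, std::vector<double>& dst,
                       Op op, int batch = 128)
{
    assert(batch > 0);
    dst.resize(src.size());
    const int numElements = static_cast<int>(src.size());
    const int numBatches = (numElements + batch - 1) / batch;  // round up
    processBatches(src.data(), dst.data(), numElements, batch, 0, numBatches, op);
}
```

With 300 elements and a batch of 128, the work splits into three batches covering [0, 128), [128, 256) and [256, 300). Since the batches do not overlap, they can be processed in any order or on different threads.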

Configuration:
Ryzen 5950X, 2x16 GB 3000 MHz DDR4
OS: Windows 10, Ubuntu 20.04.5 LTS

Performance results in milliseconds:

OS and algorithm version	process, ms	infer, ms
win_default	63.09	457.57
win_optimized_without_batch	48.69	111.78
win_optimized_batch	48.42	47.28
linux_default	50.88	300.70
linux_optimized_batch	36.06	41.62
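
To read the table as speedups, the default timings can be divided by the optimized ones. The snippet below is only a small arithmetic sketch; the Timing struct and the constant names are hypothetical, and the numbers are copied from the table above.

```cpp
#include <cassert>

// Speedup of an optimized run relative to a baseline: baseline / optimized.
inline double speedup(double baselineMs, double optimizedMs)
{
    assert(optimizedMs > 0.0);
    return baselineMs / optimizedMs;
}

// Timings copied from the table above (milliseconds).
struct Timing { double processMs; double inferMs; };
constexpr Timing winDefault               = {63.09, 457.57};
constexpr Timing winOptimizedWithoutBatch = {48.69, 111.78};
constexpr Timing winOptimizedBatch        = {48.42, 47.28};
constexpr Timing linuxDefault             = {50.88, 300.70};
constexpr Timing linuxOptimizedBatch      = {36.06, 41.62};
```

For example, speedup(winDefault.inferMs, winOptimizedBatch.inferMs) is about 9.7 and speedup(linuxDefault.inferMs, linuxOptimizedBatch.inferMs) is about 7.2, while the process step improves by roughly 1.3x on Windows and 1.4x on Linux.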

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

AleksandrPanov · 2024-03-15T21:21:57Z

modules/mcc/src/utils.hpp

+        const int num_elements = (int)src.total()*channel;
+        const double *psrc = (double*)src.data;
+        double *pdst = (double*)dst.data;
+        const int batch = 128;


This "batch" optimization improves performance on Windows.

What are typical values of num_elements? We can make batch depend on the number of threads:

const int batch = num_elements / max(1, getNumThreads());

or

const int batch = num_elements / (getNumThreads() > 1 ? getNumThreads() * 4 : 1);

Instead of 4, you may choose another constant to get batch=128 in your configuration.

With your second sample, I got the same performance (47 ms) using a constant of 1024:
const int batch = std::max(1, getNumThreads() > 1 ? num_elements / (1024*getNumThreads()) : num_elements);
// if getNumThreads() == 1 -> batch = num_elements

With your first sample (const int batch = num_elements / max(1, getNumThreads());), performance regresses from 47 ms to 57 ms.

I would suggest using a batch of 128, but your second sample would also work.
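
To compare the batch formulas from this thread side by side, they can be written as small functions. This is a sketch for illustration only: the function names are hypothetical, and the thread count is passed in as a parameter instead of being read from getNumThreads().

```cpp
#include <algorithm>
#include <cassert>

// First suggestion: one batch per thread.
inline int batchPerThread(int numElements, int numThreads)
{
    return numElements / std::max(1, numThreads);
}

// Second suggestion: `factor` batches per thread when running multi-threaded.
inline int batchScaled(int numElements, int numThreads, int factor = 4)
{
    assert(factor > 0);
    return numElements / (numThreads > 1 ? numThreads * factor : 1);
}

// Variant with the constant 1024 and a lower bound of 1; with a single
// thread the whole array forms one batch.
inline int batchClamped(int numElements, int numThreads, int factor = 1024)
{
    assert(factor > 0);
    return std::max(1, numThreads > 1 ? numElements / (factor * numThreads)
                                      : numElements);
}
```

As an illustrative case (the values are assumptions, not measurements from the PR), 1,000,000 elements and 32 threads give batches of 31250, 7812 and 30 respectively. For a small array, such as 100 elements with 32 threads, integer division makes the second formula yield 0, which is why the clamped variant keeps the batch at 1 or more.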

Here, batch is the minimum number of consecutive array elements that a thread processes at one time.

modules/mcc/perf/perf_precomp.hpp

AleksandrPanov added 3 commits March 13, 2024 22:15

add perf test

75f86c7

removed toL/fromL, added dst

a98dcc7

add parallel_for_ to performThreshold

2a8db6d

AleksandrPanov added test optimization category: mcc color calibration module labels Mar 15, 2024

AleksandrPanov commented Mar 15, 2024

AleksandrPanov requested a review from dkurt March 18, 2024 09:59

dkurt reviewed Mar 18, 2024

AleksandrPanov force-pushed the mcc_add_perf_tests_improve_performance branch 4 times, most recently from b77f40d to 8ca90eb, March 22, 2024 07:43

add parallel_for_ to elementWise

5b829da

AleksandrPanov force-pushed the mcc_add_perf_tests_improve_performance branch from 8ca90eb to 5b829da, March 22, 2024 07:51

dkurt approved these changes Mar 22, 2024

AleksandrPanov requested a review from asmorkalov March 25, 2024 08:42

asmorkalov merged commit 5e592c2 into opencv:4.x Mar 26, 2024

asmorkalov assigned dkurt Mar 26, 2024

asmorkalov mentioned this pull request Apr 1, 2024

(5.x) Merge 4.x #3710

Merged

asmorkalov merged 4 commits into opencv:4.x from AleksandrPanov:mcc_add_perf_tests_improve_performance