Parallelize implementation of HDR MergeMertens. by feuerste · Pull Request #22959 · opencv/opencv

feuerste · 2022-12-14T12:46:33Z

With this PR the process function for MergeMertens is parallelized. Also performance tests for all HDR methods are added as well as a determinism test for the parallel version of MergeMertens.

Running the photo performance tests locally on my workstation before this PR:

[----------] 3 tests from HDR
[ RUN      ] HDR.Mertens
[ PERFSTAT ]    (samples=13   mean=216.34   median=216.96   min=204.84   stddev=6.12 (2.8%))
[       OK ] HDR.Mertens (2949 ms)
[ RUN      ] HDR.Debevec
[ PERFSTAT ]    (samples=13   mean=69.54   median=69.15   min=67.06   stddev=1.77 (2.5%))
[       OK ] HDR.Debevec (1016 ms)
[ RUN      ] HDR.Robertson
[ PERFSTAT ]    (samples=13   mean=50.11   median=50.38   min=47.92   stddev=1.28 (2.6%))
[       OK ] HDR.Robertson (763 ms)
[----------] 3 tests from HDR (4728 ms total)

Running with the parallized code of this PR:

[ RUN      ] HDR.Mertens
[ PERFSTAT ]    (samples=100   mean=48.72   median=47.81   min=40.61   stddev=3.24 (6.6%))
[       OK ] HDR.Mertens (5144 ms)

Also all new files have been clang-formatted with https://github.com/opencv/opencv/wiki/.clang_format and the fix proposed in #22960.

Best to be reviewed with Hide whitespace because of the new indentations caused by the surrounding parallel_for_ loops.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

alalek · 2022-12-14T21:15:00Z

Also all edited files have been clang-formatted

Please don't do that for already existed files. This clang-format is recommendation for new files only.
It complicates review due to massive bunch of unnecessary changes (and hiding whitespace changes doesn't really help completely)
It breaks merges between branches.
Some discussions were in this thread: #12588

BTW, you may format added/modified lines.

feuerste · 2022-12-14T21:52:44Z

Also all edited files have been clang-formatted

Please don't do that for already existed files. This clang-format is recommendation for new files only. It complicates review due to massive bunch of unnecessary changes (and hiding whitespace changes doesn't really help completely) It breaks merges between branches. Some discussions were in this thread: #12588

Thanks @alalek, I removed the clang formats for merge.cpp.

alalek

Thank you for the update!

modules/photo/perf/perf_hdr.cpp

alalek · 2022-12-14T21:57:59Z

modules/photo/perf/perf_hdr.cpp

+static vector<float> DEFAULT_VECTOR;
+void loadExposureSeq(const std::string& list_filename, vector<Mat>& images,
+                     vector<float>& times = DEFAULT_VECTOR)


Looks like tests affects each other through the global variable.

It is better to drop times parameter if it is not used.

I removed the static vector and return a helper struct now.

alalek · 2022-12-14T22:03:02Z

modules/photo/src/merge.cpp


-            Laplacian(gray, contrast, CV_32F);
-            contrast = abs(contrast);
+                images[i].convertTo(images[i], CV_32F, 1.0f/255.0f);


Could we keep using img instead of multiple images[i]?

With declaring it in this way:

Mat& img = images[i];

Sure, done.

alalek · 2022-12-14T22:06:20Z

modules/photo/src/merge.cpp

        }
        int maxlevel = static_cast<int>(logf(static_cast<float>(min(size.width, size.height))) / logf(2.0f));
-        std::vector<Mat> res_pyr(maxlevel + 1);
+        std::vector<std::vector<Mat>> img_pyrs(images.size(), std::vector<Mat>(maxlevel + 1));


What is the memory consumption "x-factor"?

The memory consumption increases by the number of images * pyramid size, which should be ok for most use cases, where only a couple of images are used to create an HDR image, but in the worst case of course can explode. I rewrote the code to use a mutex/condition_variable instead, which showed only a very small overhead (~3ms slower) in my local tests. This way we are back to the original memory consumption.

feuerste

Thanks a lot for your review! I addressed all your comments.

modules/photo/perf/perf_hdr.cpp

feuerste · 2022-12-15T08:18:03Z

modules/photo/perf/perf_hdr.cpp

+static vector<float> DEFAULT_VECTOR;
+void loadExposureSeq(const std::string& list_filename, vector<Mat>& images,
+                     vector<float>& times = DEFAULT_VECTOR)


I removed the static vector and return a helper struct now.

feuerste · 2022-12-15T08:18:12Z

modules/photo/src/merge.cpp


-            Laplacian(gray, contrast, CV_32F);
-            contrast = abs(contrast);
+                images[i].convertTo(images[i], CV_32F, 1.0f/255.0f);


Sure, done.

feuerste · 2022-12-15T08:20:31Z

modules/photo/src/merge.cpp

        }
        int maxlevel = static_cast<int>(logf(static_cast<float>(min(size.width, size.height))) / logf(2.0f));
-        std::vector<Mat> res_pyr(maxlevel + 1);
+        std::vector<std::vector<Mat>> img_pyrs(images.size(), std::vector<Mat>(maxlevel + 1));


The memory consumption increases by the number of images * pyramid size, which should be ok for most use cases, where only a couple of images are used to create an HDR image, but in the worst case of course can explode. I rewrote the code to use a mutex/condition_variable instead, which showed only a very small overhead (~3ms slower) in my local tests. This way we are back to the original memory consumption.

feuerste · 2022-12-16T11:39:37Z

@alalek I think I addressed all your comments. Can you please take another look?

alalek · 2022-12-18T03:48:31Z

modules/photo/src/merge.cpp

+                std::unique_lock<std::mutex> lock(weight_sync.mutex);
+                weight_sync.cond.wait(lock, [&]{ return weight_sync.index == i; });
+                weight_sync.sum += weights[i];
+                weight_sync.index++;
+                weight_sync.cond.notify_all();


weight_sync.index == i

Do we really need to preserve summation operations order?

If no, then please use cv::AutoLock and cv::Mutex instead (to keep working OPENCV_DISABLE_THREAD_SUPPORT mode)

Also manual usage of cond variables is not exception-safe.

Do we really need to preserve summation operations order?

I wasn't sure what the policy for OpenCV is, so my first attempt was to preserve deterministic results.

If no, then please use cv::AutoLock and cv::Mutex instead (to keep working OPENCV_DISABLE_THREAD_SUPPORT mode)

If fine with the policy, then we can of course also remove the condition variable and use OpenCV's Mutex/AutoLock. I changed it accordingly, PTAL!
I adjusted the python test_umat_merge_mertens though to only use one thread for testing, as we want to have determinism for the summation here.

feuerste · 2022-12-21T10:58:54Z

@alalek Is there anything else I can do to support your review? Thanks a lot for your support!

alalek

LGTM 👍 Thank you for the contribution!

feuerste · 2022-12-21T12:25:38Z

LGTM +1 Thank you for the contribution!

Thanks a lot @alalek ! Unfortunately I'm not permitted to merge, only authorized users are. Whom can I contact for the merge?

Parallelize implementation of HDR MergeMertens. * Parallelize MergeMertens. * Added performance tests for HDR. * Ran clang-format. * Optimizations. * Fix data path for Windows. * Remove compiiation warning on Windows. * Remove clang-format for existing file. * Addressing reviewer comments. * Ensure correct summation order. * Add test for determinism. * Move result pyramid into sync struct. * Reuse sync for first loop as well. * Use OpenCV's threading primitives. * Remove cout.

feuerste added 5 commits December 14, 2022 10:10

Parallelize MergeMertens.

a2b0d06

Added performance tests for HDR.

3c023d1

Ran clang-format.

141c90d

Optimizations.

ffd51e5

Fix data path for Windows.

026aca5

feuerste force-pushed the parallel_mertens branch from bccb598 to 026aca5 Compare December 14, 2022 14:36

Remove compiiation warning on Windows.

d821769

asmorkalov added optimization category: photo labels Dec 14, 2022

Remove clang-format for existing file.

0f571e1

alalek reviewed Dec 14, 2022

View reviewed changes

Addressing reviewer comments.

391c1e4

feuerste commented Dec 15, 2022

View reviewed changes

feuerste added 4 commits December 15, 2022 11:24

Ensure correct summation order.

db91fe3

Add test for determinism.

09b1305

Move result pyramid into sync struct.

6ec7b7a

Reuse sync for first loop as well.

ae44d96

alalek reviewed Dec 18, 2022

View reviewed changes

feuerste added 2 commits December 19, 2022 10:30

Use OpenCV's threading primitives.

541830f

Remove cout.

2b2ba1e

alalek approved these changes Dec 21, 2022

View reviewed changes

alalek merged commit bc8d494 into opencv:4.x Dec 21, 2022

alalek mentioned this pull request Jan 8, 2023

(5.x) Merge 4.x #23113

Merged

Uh oh!

Conversation

feuerste commented Dec 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Readiness Checklist

Uh oh!

alalek commented Dec 14, 2022

Uh oh!

feuerste commented Dec 14, 2022

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feuerste Dec 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feuerste left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feuerste Dec 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feuerste commented Dec 16, 2022

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

feuerste commented Dec 21, 2022

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

feuerste commented Dec 21, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

feuerste commented Dec 14, 2022 •

edited

Loading

feuerste Dec 15, 2022 •

edited

Loading

feuerste Dec 15, 2022 •

edited

Loading