WUI based implementation to initUndistortRectifyMap #14994
Performance change for InitUndistortMap::Undistort
modules/imgproc/src/undistort.cpp
#if CV_TRY_AVX2
if( useAVX2 )
    j = cv::initUndistortRectifyMapLine_AVX(m1f, m2f, m1, m2,
What is the performance difference between the w.u.i. code below and this "AVX" code branch?
I suppose the AVX2 branch should be executed for an AVX2 baseline. I also tested performance with CPU_BASELINE=AVX2 and CPU_DISPATCH=AVX2 and got a 15% performance improvement. So I think the average performance gain is about 10%, due to manual loop unrolling and a few reused constant values.
I think it is reasonable to replace the AVX2 branch with dynamic dispatching of the w.u.i. code. Does it make sense to make this change part of the PR?
It makes sense to avoid code duplication if we are achieving similar performance.
WUI based implementation to initUndistortRectifyMap (opencv#14994)
* Add initUndistortRectifyMap performance test
* Move cv namespace boundaries
* Add wide universal intrinsics based implementation to initUndistortRectifyMap
* Dispatch undistort
This pull request changes
WUI based implementation to initUndistortRectifyMap