cv::transform SIMD gives incorrect output when done in place (regression in 3.4.6, does not happen in 3.4.5) #14727
Description
System information (version)
General configuration for OpenCV 3.4.6 =====================================
Version control: unknown
Platform:
Timestamp: 2019-05-17T20:01:53Z
Host: Windows 6.1.7601 AMD64
CMake: 3.9.0
CMake generator: Visual Studio 14 2015
CMake build tool: C:/Program Files (x86)/MSBuild/14.0/bin/MSBuild.exe
MSVC: 1900
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE2
required: SSE SSE2 SSE3
disabled: SSSE3 SSE4_1 SSE4_2 POPCNT AVX AVX2 FMA3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2
requested: SSE4_1 SSE4_2 AVX FP16 AVX2
SSE4_1 (12 files): + SSSE3 SSE4_1
SSE4_2 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (0 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (26 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
C/C++:
Built as dynamic libs?: YES
C++11: YES
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/cl.exe (ver 19.0.24215.1)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP4 /MD /O2 /Ob2 /D NDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP4 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /MP4 /MD /O2 /Ob2 /D NDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /MP4 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:X86 /INCREMENTAL:NO
Linker flags (Debug): /machine:X86 /debug /INCREMENTAL
ccache: NO
Precompiled headers: YES
Extra dependencies:
3rdparty dependencies:
OpenCV modules:
To be built: calib3d core dnn features2d flann highgui imgcodecs imgproc ml objdetect photo shape stitching superres ts video videoio videostab
Disabled: world
Disabled by dependency: -
Unavailable: cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev java js python2 python3 viz
Applications: -
Documentation: NO
Non-free algorithms: NO
Windows RT support: NO
GUI:
Win32 UI: YES
VTK support: NO
Media I/O:
ZLib: build (ver 1.2.11)
JPEG: build-libjpeg-turbo (ver 2.0.2-62)
WEBP: build (ver encoder: 0x020e)
PNG: build (ver 1.6.36)
TIFF: build (ver 42 - 4.0.10)
OpenEXR: build (ver 1.7.1)
HDR: YES
SUNRASTER: YES
PXM: YES
Video I/O:
Video for Windows: YES
FFMPEG: YES (prebuilt binaries)
avcodec: YES (ver 57.107.100)
avformat: YES (ver 57.83.100)
avutil: YES (ver 55.78.100)
swscale: YES (ver 4.8.100)
avresample: YES (ver 3.7.0)
PvAPI: NO
DirectShow: YES
Parallel framework: Concurrency
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2019.0.0 Gold [2019.0.0]
at: E:/Documents/opencv/build-86/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2019.0.0)
at: E:/Documents/opencv/build-86/3rdparty/ippicv/ippicv_win/iw
Lapack: NO
Custom HAL: NO
Protobuf: build (3.5.1)
NVIDIA CUDA: NO
OpenCL: YES (NVD3D11)
Include path: E:/Documents/opencv/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Python (for build): NO
Java:
ant: NO
JNI: NO
Java wrappers: NO
Java tests: NO
Install to: E:/Documents/opencv/build-86/install
Detailed description
With OpenCV compiled as 32-bit (x86) on a 64-bit platform, calling cv::transform in place (the same array as source and destination) produces wrong results (see attached picture).
The CPU is an Intel i5-3570.
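For reference, cv::transform is documented as applying the matrix to every array element independently, dst(I) = m * src(I). The sketch below is a plain scalar version of that per-element product with no OpenCV dependency. `Vec3`, `Mat3` and `transformPoints` are hypothetical names used only for illustration, and this is not OpenCV's implementation. It copies each source element before writing the result, so passing the same vector as input and output gives the same values as using a separate output, which is what the in-place call in this report is expected to do.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration; not part of OpenCV.
using Vec3 = std::array<float, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Scalar reference: dst[i] = m * src[i] for every element.
// Safe when src and dst are the same vector, because each input
// element is copied before its slot is overwritten.
void transformPoints(const std::vector<Vec3>& src, std::vector<Vec3>& dst, const Mat3& m)
{
    if (&src != &dst)
        dst.resize(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        const Vec3 p = src[i];  // snapshot of the input element
        for (int r = 0; r < 3; ++r)
            dst[i][r] = static_cast<float>(m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2]);
    }
}
```

Calling `transformPoints(pts, out, m)` and then `transformPoints(pts, pts, m)` should leave `pts` equal to `out`; any optimized path is expected to preserve that property.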
Steps to reproduce
#include <iostream>
#include <vector>
#include <opencv2/core.hpp>

int main()
{
    cv::Mat t = (cv::Mat_<double>(3, 3) << 575.7921, 0, 325.8341, 0, 575.2544, 244.8807, 0, 0, 1);
    cv::Mat tInv = t.inv();  // invert once and reuse for both calls
    std::vector<cv::Vec3f> pts(2), pts_out;
    pts[0] = cv::Vec3f(151226, 364400, 1822);
    pts[1] = cv::Vec3f(1032900, 525840, 1878);
    std::cout << "Input: " << std::endl << pts[0] << std::endl << pts[1] << std::endl << std::endl;
    std::cout << "Transform: " << std::endl << tInv << std::endl << std::endl;
    // Separate output vector: gives the correct result.
    cv::transform(pts, pts_out, tInv);
    std::cout << "Correct output: " << std::endl << pts_out[0] << std::endl << pts_out[1] << std::endl << std::endl;
    // Same vector as source and destination: gives the wrong result in 3.4.6.
    cv::transform(pts, pts, tInv);
    std::cout << "Incorrect output (in place): " << std::endl << pts[0] << std::endl << pts[1] << std::endl << std::endl;
    return 0;
}
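To decide which of the two printed results is the right one without relying on cv::transform at all, the expected values can be worked out by hand. The matrix t above has the form [a 0 b; 0 c d; 0 0 1], whose inverse is [1/a 0 -b/a; 0 1/c -d/c; 0 0 1], so each output point is ((x - b*z)/a, (y - d*z)/c, z). The sketch below evaluates that closed form in double precision; `expectedPoint` and `printExpected` are hypothetical helper names, not OpenCV functions, and the printed values should only be expected to match OpenCV's float output up to rounding.

```cpp
#include <array>
#include <cassert>
#include <iostream>

// Hypothetical helper for illustration; not part of OpenCV.
// Applies the inverse of [a 0 b; 0 c d; 0 0 1] to (x, y, z) in closed form.
std::array<double, 3> expectedPoint(double a, double b, double c, double d,
                                    double x, double y, double z)
{
    assert(a != 0.0 && c != 0.0);  // the matrix must be invertible
    return { (x - b * z) / a, (y - d * z) / c, z };
}

// Prints the expected results for the two input points of the repro.
void printExpected()
{
    const double a = 575.7921, b = 325.8341, c = 575.2544, d = 244.8807;
    const double pts[2][3] = { {151226, 364400, 1822}, {1032900, 525840, 1878} };
    for (const auto& p : pts)
    {
        const std::array<double, 3> e = expectedPoint(a, b, c, d, p[0], p[1], p[2]);
        std::cout << "[" << e[0] << ", " << e[1] << ", " << e[2] << "]" << std::endl;
    }
}
```

Calling `printExpected()` from `main` gives two reference points; the output of the out-of-place call should agree with them, and the in-place output can then be checked against the same values.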