cv::transform (SIMD) gives incorrect output when done in place (occurs in 3.4.6, does not happen in 3.4.5) #14727

@zhumxcq

System information (version)

General configuration for OpenCV 3.4.6 =====================================
Version control: unknown

Platform:
Timestamp: 2019-05-17T20:01:53Z
Host: Windows 6.1.7601 AMD64
CMake: 3.9.0
CMake generator: Visual Studio 14 2015
CMake build tool: C:/Program Files (x86)/MSBuild/14.0/bin/MSBuild.exe
MSVC: 1900

CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE2
required: SSE SSE2 SSE3
disabled: SSSE3 SSE4_1 SSE4_2 POPCNT AVX AVX2 FMA3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2
requested: SSE4_1 SSE4_2 AVX FP16 AVX2
SSE4_1 (12 files): + SSSE3 SSE4_1
SSE4_2 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (0 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (26 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2

C/C++:
Built as dynamic libs?: YES
C++11: YES
C++ Compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/cl.exe (ver 19.0.24215.1)
C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP4 /MD /O2 /Ob2 /D NDEBUG
C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /MP4 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
C Compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/cl.exe
C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /MP4 /MD /O2 /Ob2 /D NDEBUG
C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:fast /arch:SSE /arch:SSE2 /arch:SSE /arch:SSE2 /MP4 /D_DEBUG /MDd /Zi /Ob0 /Od /RTC1
Linker flags (Release): /machine:X86 /INCREMENTAL:NO
Linker flags (Debug): /machine:X86 /debug /INCREMENTAL
ccache: NO
Precompiled headers: YES
Extra dependencies:
3rdparty dependencies:

OpenCV modules:
To be built: calib3d core dnn features2d flann highgui imgcodecs imgproc ml objdetect photo shape stitching superres ts video videoio videostab
Disabled: world
Disabled by dependency: -
Unavailable: cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev java js python2 python3 viz
Applications: -
Documentation: NO
Non-free algorithms: NO

Windows RT support: NO

GUI:
Win32 UI: YES
VTK support: NO

Media I/O:
ZLib: build (ver 1.2.11)
JPEG: build-libjpeg-turbo (ver 2.0.2-62)
WEBP: build (ver encoder: 0x020e)
PNG: build (ver 1.6.36)
TIFF: build (ver 42 - 4.0.10)
OpenEXR: build (ver 1.7.1)
HDR: YES
SUNRASTER: YES
PXM: YES

Video I/O:
Video for Windows: YES
FFMPEG: YES (prebuilt binaries)
avcodec: YES (ver 57.107.100)
avformat: YES (ver 57.83.100)
avutil: YES (ver 55.78.100)
swscale: YES (ver 4.8.100)
avresample: YES (ver 3.7.0)
PvAPI: NO
DirectShow: YES

Parallel framework: Concurrency

Trace: YES (with Intel ITT)

Other third-party libraries:
Intel IPP: 2019.0.0 Gold [2019.0.0]
at: E:/Documents/opencv/build-86/3rdparty/ippicv/ippicv_win/icv
Intel IPP IW: sources (2019.0.0)
at: E:/Documents/opencv/build-86/3rdparty/ippicv/ippicv_win/iw
Lapack: NO
Custom HAL: NO
Protobuf: build (3.5.1)

NVIDIA CUDA: NO

OpenCL: YES (NVD3D11)
Include path: E:/Documents/opencv/3rdparty/include/opencl/1.2
Link libraries: Dynamic load

Python (for build): NO

Java:
ant: NO
JNI: NO
Java wrappers: NO
Java tests: NO

Install to: E:/Documents/opencv/build-86/install

Detailed description

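With OpenCV compiled as 32-bit (x86) on a 64-bit platform, calling cv::transform in place (the same vector passed as both source and destination) gives wrong results (see attached picture). The same call with a separate output vector gives the correct result.
CPU: Intel i5-3570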

Steps to reproduce
#include <iostream>
#include <vector>
#include <opencv2/core.hpp>

int main()
{
    // 3x3 matrix whose inverse is applied to the points below.
    const cv::Mat t = (cv::Mat_<double>(3, 3) << 575.7921, 0, 325.8341, 0, 575.2544, 244.8807, 0, 0, 1);
    const cv::Mat t_inv = t.inv();
    std::vector<cv::Vec3f> pts(2), pts_out;
    pts[0] = cv::Vec3f(151226, 364400, 1822);
    pts[1] = cv::Vec3f(1032900, 525840, 1878);
    std::cout << "Input: " << std::endl << pts[0] << std::endl << pts[1] << std::endl << std::endl;
    std::cout << "Transform: " << std::endl << t_inv << std::endl << std::endl;
    // Separate destination vector: the output is correct.
    cv::transform(pts, pts_out, t_inv);
    std::cout << "Correct output: " << std::endl << pts_out[0] << std::endl << pts_out[1] << std::endl << std::endl;
    // Same vector as source and destination: the output is wrong in 3.4.6.
    cv::transform(pts, pts, t_inv);
    std::cout << "Incorrect output (in place): " << std::endl << pts[0] << std::endl << pts[1] << std::endl << std::endl;
    return 0;
}

image
