cv2.dft segmentation fault with MKL-linked numpy and opencv #16326

@nmaxwell

System information
  • OpenCV => 3.4.4+ (all versions after 3.4.3, including 4.2.0)
  • Operating System / Platform => Reproduced on Ubuntu 18.04 and 19.10
  • Compiler => gcc-8
  • Hardware => Reproduced on several Intel systems, including Skylake and Coffee Lake.
Detailed description

We build numpy and OpenCV from source with MKL support. On Intel hardware, calls to cv2.dft cause a segmentation fault. The error does not occur on AMD hardware (tested on Threadripper). It occurs on every OpenCV version after 3.4.3, including 4.x. It does not appear on 3.4.3, so it was likely introduced in 3.4.4.

I discovered a hack that appears to solve the problem: load the OpenCV libraries before importing numpy (e.g. via ctypes).

The following code causes a segmentation fault:

import numpy, cv2
cv2.dft(numpy.zeros((10,10)))

The following code does not cause a segfault:

import ctypes 
ctypes.CDLL("/usr/local/lib/libopencv_core.so")
ctypes.CDLL("/usr/local/lib/libopencv_imgproc.so")
import numpy, cv2
cv2.dft(numpy.zeros((10,10)))
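
The workaround above could be wrapped in a small helper so the preload happens in one place and missing files are skipped. This is only a sketch, not part of the original report: the name `preload_shared_libs` is hypothetical, and the paths in `OPENCV_LIBS` are the ones from the snippet above, which assume an install prefix of /usr/local/lib.

```python
import ctypes
import os


def preload_shared_libs(paths):
    """Load each existing shared library via ctypes; return the paths that loaded."""
    loaded = []
    for path in paths:
        if not os.path.isfile(path):
            continue  # library is not installed at this location
        try:
            ctypes.CDLL(path)
        except OSError:
            continue  # file exists but the dynamic loader rejected it
        loaded.append(path)
    return loaded


# Paths copied from the report; adjust them for your install prefix (assumption).
OPENCV_LIBS = [
    "/usr/local/lib/libopencv_core.so",
    "/usr/local/lib/libopencv_imgproc.so",
]
```

To mirror the workaround, call `preload_shared_libs(OPENCV_LIBS)` before the first `import numpy` in the process, then import numpy and cv2 as usual. If the returned list is empty, nothing was preloaded and the workaround is not in effect.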

Steps to reproduce

I put together a minimal Dockerfile to reproduce the error:

Dockerfile.txt

Download "Dockerfile.txt" and rename it to "Dockerfile" (the GitHub GUI wouldn't let me upload the file without an extension).

# build the docker image
docker build -t cv_bug .

# run the docker image
docker run -it cv_bug /bin/bash

# (inside the image) run the test script, it should cause segfault
python3 test.py

# to test with the hack
python3 test.py fix

You can also apply the hack with LD_PRELOAD:

LD_PRELOAD="/usr/local/lib/libopencv_core.so:/usr/local/lib/libopencv_imgproc.so" python3 test.py
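
When comparing runs with and without LD_PRELOAD, it can help to launch the failing call in a child interpreter so that a crash does not take down the test driver. The following is a sketch under assumptions: the helper names `run_in_subprocess` and `died_from_segfault` are hypothetical, and the signal check relies on the POSIX behaviour of Python's `subprocess` module, which reports death by signal N as return code -N.

```python
import os
import signal
import subprocess
import sys


def run_in_subprocess(code, preload=None):
    """Run a Python snippet in a child interpreter and return its exit status.

    `preload` is an optional list of shared-library paths to put in LD_PRELOAD.
    """
    env = dict(os.environ)
    if preload:
        env["LD_PRELOAD"] = ":".join(preload)
    result = subprocess.run([sys.executable, "-c", code], env=env)
    return result.returncode


def died_from_segfault(returncode):
    """True if the child was killed by SIGSEGV (POSIX only)."""
    return returncode == -signal.SIGSEGV
```

For example, `run_in_subprocess("import numpy, cv2; cv2.dft(numpy.zeros((10, 10)))")` could be run once without `preload` and once with the two library paths from the LD_PRELOAD line above, passing each result to `died_from_segfault`.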

Investigations

I stepped through the OpenCV code. The segfault occurs on line 3112 of /opencv/modules/core/src/dxt.cpp:

if( getSizeFunc(opt.n, ipp_norm_flag, ippAlgHintNone, &specsize, &initsize, &worksize) >= 0 )

with

getSizeFunc = ippsDFTGetSize_R_64f;

I checked that the inputs to getSizeFunc are the same with and without the hack. I don't have the source code for ippsDFTGetSize_R_64f, so I could only step through the assembly. Without the hack, the segfault occurs in "icv_l9_mkl_dft_avx2_z_ipp_real_get_size"; with the hack, that function is never invoked.
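
One way to gather more evidence about what differs between the two runs is to list which MKL and OpenCV shared objects are mapped into the process in each case. The sketch below is an assumption-laden aid, not something from the original investigation: it parses the Linux /proc/<pid>/maps format, the function names are hypothetical, and the keyword filter ("mkl", "opencv") is only a guess at which file names are relevant.

```python
def mapped_libraries(maps_text, keywords=("mkl", "opencv")):
    """Return the sorted, de-duplicated file paths in a /proc/<pid>/maps dump
    whose file name contains one of the keywords (case-insensitive)."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1].lower()
        if any(keyword in name for keyword in keywords):
            found.add(path)
    return sorted(found)


def mapped_libraries_of_current_process(keywords=("mkl", "opencv")):
    """Linux only: inspect the running interpreter via /proc/self/maps."""
    with open("/proc/self/maps") as maps_file:
        return mapped_libraries(maps_file.read(), keywords)
```

Calling `mapped_libraries_of_current_process()` right after the imports, once with and once without the preload, shows whether the same set of libraries ends up in the process. It does not show the order in which symbols were resolved.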

Labels: Hackathon (https://opencv.org/opencv-hackathon-starts-next-week/), bug, category: core, confirmed (there is a stable reproducer / investigation complete)
