
imgproc: Passing sigma1=0.2 or less to GaussianBlur will end up in almost empty image #20792

@tomoaki0705

System information (version)
  • OpenCV => Recent 3.4 ( 4d587c3 ) master ( 9b093c9 )
  • Operating System / Platform => Ubuntu 18.04 Aarch64, Windows x86_64
  • Compiler => MSVC 2017 + CUDA 10.0, GCC 7.5, CUDA 10.0
Detailed description
  1. Test failure
    opencv_test_cudafilters was failing.
    The test combines many parameters, but every failure came from CV_16U.
    The story eventually leads to imgproc, but it started with opencv_test_cudafilters.
[  FAILED  ] 73 tests, listed below:
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/388, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(9x9), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/393, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(11x11), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/396, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(11x11), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/397, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(11x11), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/413, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(15x15), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/417, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(17x17), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/434, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(21x21), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/446, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(23x23), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/453, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(25x25), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/456, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(27x27), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/458, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(27x27), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/476, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(1), KSize(31x31), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/505, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(9x9), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/511, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(9x9), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/537, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(17x17), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/538, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(17x17), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/566, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(23x23), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/569, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(25x25), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/575, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(25x25), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/591, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(29x29), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/593, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(31x31), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/594, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(3), KSize(31x31), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/639, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(11x11), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/648, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(15x15), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/649, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(15x15), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/652, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(15x15), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/658, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(17x17), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/664, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(19x19), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/667, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(19x19), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/703, where GetParam() = (NVIDIA Tegra X1, 128x128, CV_16U, Channels(4), KSize(27x27), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1817, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(7x7), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1819, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(7x7), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1826, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(9x9), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1827, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(9x9), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1837, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(11x11), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1843, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(13x13), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1864, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(19x19), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1872, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(21x21), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1874, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(21x21), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1878, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(21x21), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1884, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(23x23), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1892, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(25x25), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1898, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(27x27), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1902, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(27x27), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1905, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(29x29), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1916, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(1), KSize(31x31), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1945, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(9x9), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1952, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(11x11), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1957, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(11x11), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1960, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(13x13), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1965, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(13x13), BORDER_CONSTANT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1969, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(15x15), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1977, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(17x17), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1978, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(17x17), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/1982, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(17x17), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2008, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(25x25), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2023, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(27x27), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2024, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(29x29), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2032, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(31x31), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2038, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(3), KSize(31x31), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2073, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(11x11), BORDER_REFLECT101, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2086, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(13x13), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2088, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(15x15), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2096, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(17x17), BORDER_REFLECT101, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2108, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(19x19), BORDER_CONSTANT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2111, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(19x19), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2114, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(21x21), BORDER_REPLICATE, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2118, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(21x21), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2123, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(23x23), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2131, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(25x25), BORDER_REPLICATE, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2135, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(25x25), BORDER_REFLECT, sub matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2142, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(27x27), BORDER_REFLECT, whole matrix)
[  FAILED  ] CUDA_Filters/GaussianBlur.Accuracy/2153, where GetParam() = (NVIDIA Tegra X1, 113x113, CV_16U, Channels(4), KSize(31x31), BORDER_REFLECT101, sub matrix)

73 FAILED TESTS
  2. Determinism
    The test failed consistently across separate runs, so the failure looked deterministic.
$ for i in `seq 1 3 ` ; do ./bin/opencv_test_cudafilters --gtest_filter=*Gaussian* --gtest_param_filter=*CV_16U* | grep "FAILED TESTS" ; done
67 FAILED TESTS
67 FAILED TESTS
67 FAILED TESTS

However, when repeating the tests within a single run using the --gtest_repeat option, the set of failing tests varies.
This shows that the failure depends on the random seed, so it may happen in any situation.

$ ./bin/opencv_test_cudafilters --gtest_filter=*Gaussian* --gtest_param_filter=*CV_16U* --gtest_repeat=3 | grep "FAILED TESTS"
67 FAILED TESTS
64 FAILED TESTS
77 FAILED TESTS

Still, only CV_16U was failing.

  3. Digging into the source code
    When I dug into the test, I looked at src, dst and dst_gold.
    src was this,
    the GPU result was this, and
    the CPU result was this.
    Clearly, something is going wrong in the "CPU" version.
    So the test failure shows up in opencv_test_cudafilters, but the actual cause is in the CPU version of GaussianBlur.

  4. Digging into GaussianBlur
    Looking in there, I noticed that the kernel was sometimes very peaky.

width(_width), height(_height), cn(_cn), kx(_kx), ky(_ky), kxlen(_kxlen), kylen(_kylen), borderType(_borderType)

The GaussianBlur function is now bit-exact, and the kernel is implemented with softfloat.
For k=7, the float version of the kernel was as follows:

0.00000000
3.38778060e-21
7.62910940e-06
0.999984741
7.62910940e-06
3.38778060e-21
0.00000000

And this is the softfloat version, where 0x00010000 represents 1.0:

0x00000000
0x00000000
0x00000000
0x00010000
0x00000000
0x00000000
0x00000000

So only the peak has the value "1", which means no smoothing happens.
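
The hex values above are consistent with a fixed-point encoding that keeps 16 fractional bits, so that 0x00010000 stands for 1.0. As a rough sketch (the helper `to_fixed16` and its round-to-nearest rule are assumptions for illustration, not OpenCV's actual conversion routine), scaling the off-centre float taps by 2^16 shows why they vanish:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: encode a non-negative value with 16 fractional bits,
// rounding to nearest. OpenCV's own conversion may differ in detail.
inline uint32_t to_fixed16(double v)
{
    return static_cast<uint32_t>(std::llround(v * 65536.0));
}

// Self-check using the off-centre float taps quoted in the issue.
inline void check_offcentre_taps()
{
    assert(to_fixed16(7.62910940e-06) == 0u);   // about 0.49998 of one step, rounds to 0
    assert(to_fixed16(3.38778060e-21) == 0u);
    assert(to_fixed16(1.0) == 0x00010000u);     // 1.0 in this encoding
}
```

This sketch only covers the off-centre taps; how the centre tap ends up as exactly 0x00010000 is not modelled here.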

  5. Actual cause
    The actual cause is this line:

v_mul_expand(vx_load(src + pre_shift * cn), vx_setall_u16((uint16_t) *((uint32_t*)(m + pre_shift))), v_res0, v_res1);

Here, m is the kernel array implemented in softfloat, and each element is cast to uint16_t right after the load:
(uint16_t) *((uint32_t*)(m + pre_shift))
This cast truncates 0x10000 to unsigned short; the value does not fit in 16 bits, so the result becomes 0.

The other kernel slots are also 0, so every pixel is convolved with 0.
This results in the empty image from the CPU version.
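
The effect of that cast can be reproduced with a scalar model. This is only a sketch: `narrow_coeff` and `hline_pixel` are hypothetical names, and the real code works on SIMD vectors rather than on single pixels.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for the cast in the horizontal pass:
// keep only the low 16 bits of a 32-bit fixed-point coefficient.
inline uint16_t narrow_coeff(uint32_t coeff)
{
    return static_cast<uint16_t>(coeff);
}

// Hypothetical scalar model of one output pixel of the horizontal pass:
// every tap is narrowed before the multiply, as in the line quoted above.
inline uint32_t hline_pixel(const uint16_t* src, const uint32_t* m, int len)
{
    uint32_t acc = 0;
    for (int i = 0; i < len; ++i)
        acc += static_cast<uint32_t>(src[i]) * narrow_coeff(m[i]);
    return acc;
}

inline void check_truncation()
{
    const uint32_t m[7] = {0, 0, 0, 0x00010000u, 0, 0, 0};  // kernel from the issue
    const uint16_t src[7] = {255, 255, 255, 255, 255, 255, 255};
    assert(narrow_coeff(0x00010000u) == 0);   // 0x10000 does not fit in 16 bits
    assert(hline_pixel(src, m, 7) == 0u);     // so the output pixel is 0
}
```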

  6. Other aspects of the bug
    6.1. Two sigmas
    GaussianBlur takes two sigmas as parameters, but only sigma1 causes this issue.
    sigma2 controls the vertical kernel.
    Even when a peaky kernel is specified there, the corresponding vertical filter code does not cast to uint16_t:

v_uint32 v_mul = vx_setall_u32(*((uint32_t*)(m + pre_shift)));
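
For comparison, a scalar sketch of a full-width multiply keeps the peak coefficient intact. The name `vline_term` is hypothetical, the 16-bit input is a simplification, and the final shift by 16 assumes the 16-fractional-bit encoding implied by 0x00010000 meaning 1.0; none of this is taken from the OpenCV source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one tap when the coefficient is used at its
// full 32-bit width, so 0x00010000 is not truncated.
inline uint32_t vline_term(uint16_t pixel, uint32_t coeff)
{
    // Widen before multiplying, then drop the 16 fractional bits.
    return static_cast<uint32_t>((static_cast<uint64_t>(pixel) * coeff) >> 16);
}

inline void check_full_width()
{
    assert(vline_term(255, 0x00010000u) == 255u);      // pixel passes through unchanged
    assert(vline_term(65535, 0x00010000u) == 65535u);  // also at the top of the range
}
```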

6.2. Value
Technically, when sigma1 is lower than approximately 0.2, this "peaky" kernel is generated and the overflow happens.
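
A back-of-the-envelope estimate supports the 0.2 figure. Assuming an unnormalised Gaussian tap exp(-x^2 / (2 * sigma^2)), 16 fractional bits and round-to-nearest (all assumptions made for this sketch, not taken from the OpenCV source), the tap next to the centre rounds to zero once it drops below 2^-17:

```cpp
#include <cassert>
#include <cmath>

// Upper bound on sigma (under the assumptions above) for which the tap at
// distance 1 from the centre is below half of one fixed-point step:
//   exp(-1 / (2 * sigma^2)) < 2^-17  =>  sigma < sqrt(1 / (2 * 17 * ln 2))
inline double peaky_sigma_threshold()
{
    return std::sqrt(1.0 / (2.0 * 17.0 * std::log(2.0)));
}

inline void check_threshold()
{
    const double t = peaky_sigma_threshold();
    assert(t > 0.205 && t < 0.207);  // roughly 0.206
}
```

The hypothetical `peaky_sigma_threshold()` evaluates to roughly 0.206, which is in line with the "approximately 0.2" observation.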

6.3. Other implementations?
The CUDA implementation was consistent.
I haven't checked the OpenCL implementation.
At least the SSE and NEON implementations were both failing.
The AVX implementation may fail too, but I haven't checked yet.

Steps to reproduce

Here's my test code

{
    using namespace cv;
    // Uniform 16-bit input: a correct blur leaves almost every pixel non-zero.
    Mat src(128, 128, CV_16UC1, Scalar(255));
    Mat dst;
    double sigma = theRNG().uniform(0.0, 0.2);        // a peaky kernel
    GaussianBlur(src, dst, Size(7, 7), sigma, 0.9);
    int count = countNonZero(dst);
    int ninetyPercent = (int)(src.rows*src.cols * 0.9);
    // With the bug, dst is almost entirely zero and this check fails.
    EXPECT_GT(count, ninetyPercent);
}

This produces the failing, almost empty image.
I'll send a patch with this test code later.

Issue submission checklist
  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    forum.opencv.org, Stack Overflow, etc and have not found solution
  • I updated to latest OpenCV version and the issue is still there
  • There is reproducer code and related data files: videos, images, onnx, etc
