cudafilters: Median_Accuracy fails with CUDA 9.0 and after #12721

@tomoaki0705

System information (version)
  • OpenCV => 3.4 branch (a1ffc00)
  • Operating System / Platform => Windows 10 64bit
  • Compiler => Visual Studio 2015
  • CUDA => 10.0
  • GPU => GeForce GTX 1060
Summary

opencv_test_cudafilters fails on every Median.Accuracy test case:

opencv_test_cudafilters --gtest_filter=*Median.Accuracy*

[  PASSED  ] 0 tests.
[  FAILED  ] 28 tests, listed below:
[  FAILED  ] CUDA_Filters/Median.Accuracy/0, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(3), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/1, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(3), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/2, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(5), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/3, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(5), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/4, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(7), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/5, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(7), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/6, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(9), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/7, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(9), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/8, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(11), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/9, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(11), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/10, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(13), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/11, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(13), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/12, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(15), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/13, where GetParam() = (GeForce GTX 1060, 128x128, CV_8U, KernelSize(15), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/14, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(3), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/15, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(3), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/16, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(5), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/17, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(5), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/18, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(7), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/19, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(7), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/20, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(9), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/21, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(9), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/22, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(11), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/23, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(11), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/24, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(13), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/25, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(13), sub matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/26, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(15), whole matrix)
[  FAILED  ] CUDA_Filters/Median.Accuracy/27, where GetParam() = (GeForce GTX 1060, 113x113, CV_8U, KernelSize(15), sub matrix)

28 FAILED TESTS
Detailed description
  • I downloaded the GpuMat and stored it in a file.

Comparing the reference (CPU version) with the result (GPU version), this is definitely not a minor error. Something is wrong.
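To put a number on "not a minor error", the two dumped images can be compared pixel by pixel. The following is a minimal sketch, assuming both images have already been loaded into flat 8-bit buffers of the same size. `DiffStats` and `compareImages` are hypothetical names for illustration and are not part of OpenCV or its test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Summary of how two 8-bit images differ (hypothetical helper type).
struct DiffStats {
    std::size_t mismatches;  // number of pixels whose values differ
    int maxAbsDiff;          // largest absolute per-pixel difference
};

// Compare a CPU reference against a GPU result, pixel by pixel.
// Both buffers are assumed to hold the same number of CV_8U pixels.
DiffStats compareImages(const std::vector<std::uint8_t>& reference,
                        const std::vector<std::uint8_t>& result) {
    assert(reference.size() == result.size());
    DiffStats stats{0, 0};
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const int diff = std::abs(static_cast<int>(reference[i]) -
                                  static_cast<int>(result[i]));
        if (diff != 0) ++stats.mismatches;
        if (diff > stats.maxAbsDiff) stats.maxAbsDiff = diff;
    }
    return stats;
}
```

A handful of off-by-one pixels would point to a rounding difference, while many pixels with a large `maxAbsDiff` point to a logic or synchronization problem.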

  • I also went back through my personal build farm logs and found that in July, with CUDA 8.0 on a Jetson TX2, opencv_test_cudafilters passed without problems:
Device 0: "GP10B"
  CUDA Driver Version / Runtime Version          8.50 / 8.0
  CUDA Capability Major/Minor version number:    6.2

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 8.50, CUDA Runtime Version = 8.0, NumDevs = 1

CTEST_FULL_OUTPUT
OpenCV version: 3.4.2-dev
OpenCV VCS version: 3.4.2-119-g8c75555

[----------] 28 tests from CUDA_Filters/Median
[ RUN      ] CUDA_Filters/Median.Accuracy/0, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(3), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/0 (22 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/1, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(3), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/1 (23 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/2, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(5), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/2 (25 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/3, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(5), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/3 (25 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/4, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(7), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/4 (32 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/5, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(7), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/5 (26 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/6, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(9), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/6 (28 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/7, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(9), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/7 (27 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/8, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(11), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/8 (30 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/9, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(11), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/9 (28 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/10, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(13), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/10 (30 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/11, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(13), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/11 (30 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/12, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(15), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/12 (32 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/13, where GetParam() = (GP10B, 128x128, CV_8U, KernelSize(15), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/13 (33 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/14, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(3), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/14 (18 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/15, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(3), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/15 (17 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/16, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(5), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/16 (18 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/17, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(5), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/17 (19 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/18, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(7), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/18 (20 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/19, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(7), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/19 (19 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/20, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(9), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/20 (21 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/21, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(9), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/21 (20 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/22, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(11), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/22 (22 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/23, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(11), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/23 (20 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/24, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(13), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/24 (19 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/25, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(13), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/25 (20 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/26, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(15), whole matrix)
[       OK ] CUDA_Filters/Median.Accuracy/26 (21 ms)
[ RUN      ] CUDA_Filters/Median.Accuracy/27, where GetParam() = (GP10B, 113x113, CV_8U, KernelSize(15), sub matrix)
[       OK ] CUDA_Filters/Median.Accuracy/27 (21 ms)
[----------] 28 tests from CUDA_Filters/Median (669 ms total)
  • I rolled back OpenCV source code on Windows to 8c75555 and confirmed that the test fails. (CUDA 10.0, Visual Studio 2015)
  • To narrow down the cause, I installed CUDA 8.0 on Windows, too.
  • With OpenCV built against CUDA 8.0 the test passes, but with CUDA 10.0 it fails. (It also failed with CUDA 9.0 on my platform.)
  • To narrow down the cause further, I dug into the CUDA code.
  • Luckily, the output is written at only one location:
    dest.ptr(i)[j]=(firstBin<<5) + retval;
  • Comparing the CUDA 8.0 build with the CUDA 10.0 build, the values of both firstBin and retval differed. I checked by writing each of them to the output in turn:
dest.ptr(i)[j]=(firstBin<<5) + retval; // differs
dest.ptr(i)[j]=firstBin;               // differs
dest.ptr(i)[j]=retval;                 // differs
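The expression `(firstBin<<5) + retval` suggests a two-level histogram: a coarse bin index in the top 3 bits and an offset inside that bin in the low 5 bits. The following host-side sketch illustrates that idea only. It assumes `firstBin` and `retval` have these meanings, which the issue does not state, and `medianTwoLevel` is a hypothetical name, not the kernel's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Median of a non-empty window of 8-bit values via a two-level histogram.
// Assumption: coarse bins are 32 values wide, so bin index = value >> 5.
std::uint8_t medianTwoLevel(const std::vector<std::uint8_t>& window) {
    assert(!window.empty());

    // Level 1: 8 coarse bins covering 0..255.
    std::array<int, 8> coarse{};
    for (std::uint8_t v : window) ++coarse[v >> 5];

    // 0-based rank of the median (lower median for even-sized windows).
    const int target = static_cast<int>((window.size() - 1) / 2);

    // Walk the coarse bins until the one containing the target rank.
    int firstBin = 0;
    int below = 0;  // number of values in bins before firstBin
    while (below + coarse[firstBin] <= target) {
        below += coarse[firstBin];
        ++firstBin;
    }

    // Level 2: 32 fine bins, filled only from values inside firstBin.
    std::array<int, 32> fine{};
    for (std::uint8_t v : window)
        if ((v >> 5) == firstBin) ++fine[v & 31];

    // Walk the fine bins to find the offset of the target rank.
    int retval = 0;
    int seen = below;
    while (seen + fine[retval] <= target) {
        seen += fine[retval];
        ++retval;
    }

    // Same combination as the single output statement in the kernel.
    return static_cast<std::uint8_t>((firstBin << 5) + retval);
}
```

In a scheme like this, a wrong `firstBin` moves the output by a multiple of 32 and a wrong `retval` moves it within a bin, so both being wrong is consistent with the histograms themselves being read before they are complete.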
opencv_test_cudafilters --gtest_filter=*Median.Accuracy*

[==========] Running 28 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 28 tests from CUDA_Filters/Median

[----------] 28 tests from CUDA_Filters/Median (1491 ms total)

[----------] Global test environment tear-down
[==========] 28 tests from 1 test case ran. (1521 ms total)
[  PASSED  ] 28 tests.
Press any key to continue . . .
  • The cause explains the situation: the bug existed before, but it just didn't surface.
  • Upgrading CUDA was a big enough change to make the race condition visible.
  • I'll send a PR later
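The issue does not show where the race is, so the following is only a generic host-side analogy of this class of bug, written with `std::thread` rather than CUDA. Each worker writes its own slot of a shared buffer and then reads a neighbour's slot. Without a barrier between the two phases the read can see a stale value depending on timing, which is why such a bug can stay hidden until the toolchain changes. `Barrier` and `neighbourExchange` are hypothetical names. In a CUDA kernel the corresponding block-wide barrier is `__syncthreads()`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Minimal reusable barrier (C++11): blocks until `count` threads arrive.
class Barrier {
public:
    explicit Barrier(std::size_t count)
        : threshold_(count), waiting_(0), generation_(0) {}

    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t gen = generation_;
        if (++waiting_ == threshold_) {
            // Last thread to arrive releases everyone.
            ++generation_;
            waiting_ = 0;
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return gen != generation_; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const std::size_t threshold_;
    std::size_t waiting_;
    std::size_t generation_;
};

// Each of n workers writes shared[t], then reads its neighbour's slot.
std::vector<int> neighbourExchange(std::size_t n) {
    assert(n > 0);
    std::vector<int> shared(n, 0), out(n, 0);
    Barrier barrier(n);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n; ++t) {
        workers.emplace_back([&, t] {
            shared[t] = static_cast<int>(t) + 1;  // phase 1: write own slot
            barrier.arrive_and_wait();            // all writes finish first
            out[t] = shared[(t + 1) % n];         // phase 2: read neighbour
        });
    }
    for (auto& w : workers) w.join();
    return out;
}
```

With the barrier in place the result is deterministic. Removing the `arrive_and_wait()` call turns phase 2 into a data race whose outcome depends on scheduling.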
Steps to reproduce
  • Build OpenCV with CUDA 9.0 or later.
  • Run opencv_test_cudafilters.
