
Gaussian blur GPU tests not providing correct clBuildProgram flags to pocl #16214

@aarongreig

Description

  • OpenCV => commit 545f8a8 (tip at time of writing)
  • Operating System / Platform => Ubuntu18, pocl version string (from clinfo) OpenCL 1.2 pocl 1.1 None+Asserts, LLVM 6.0.0, SPIR, SLEEF, DISTRO, POCL_DEBUG
  • Compiler => gcc 7.4.0
Detailed description

relates #15855

A number of Gaussian blur GPU tests fail when using the pocl OpenCL 1.2 implementation. Digging into the code showed that, as of this commit, the Gaussian blur kernel generation code (the various helper functions in smooth.dispatch.cpp) produces kernels containing one set of values, while the build options passed to clBuildProgram in ocl.cpp when building the row_filter/col_filter kernels suggest that a different set of values is handed to OpenCL. For example, in the first test, GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance/0, adding a print along these lines

        printf("kernel:\n");
        for(auto& fixedPointVal : fkx){
          double floatVal = static_cast<double>(fixedPointVal);
          printf("%f, ", floatVal);
        }
        printf("\n");

at line 663 of smooth.dispatch.cpp to print the final kernel gives the following output:

kernel:
0.332031, 0.335938, 0.332031,

With fixedpoint16's fixedshift value of 8 (a scale factor of 2^8 = 256), you can deduce that the raw 16-bit values for this kernel are 85, 86, 85. Adding this line

printf("build options: %s\n", buildflags.c_str());

at ocl.cpp:3888 reveals that the build options passed for both row_filter and col_filter contain the following: -D COEFF=DIG(85)DIG(85)DIG(85). Admittedly, I'm not familiar with how these values translate into what the OpenCL kernels do (they're pretty dense), but assuming that these definitions are the kernel values, I hard-coded the mat_kernel definition in filterSepCol.cl and filterSepRow.cl to be { 85, 86, 85 } instead of { COEFF }. This made the test pass, which I think confirms that something in the OpenCL backend is jumbling the options, so that { 85, 85, 85 } ends up in the OpenCL kernel instead of the correct { 85, 86, 85 }. Interestingly, this isn't the case for all of these tests: only nine of the thirty fail in this manner. Note also that the Intel OpenCL 2.1 drivers don't fail these tests; they get an entirely different set of build options.
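The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions, not code from OpenCV: `toRaw`, `kFromBuildOptions` and `matchesBuildOptions` are hypothetical names, the 8 fractional bits come from the fixedshift value quoted above, and the `DIG` definition is an assumption about how the .cl sources turn each coefficient into an array element.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: 8 fractional bits, matching the fixedshift value of 8 quoted above.
constexpr int kFixedShift = 8;

// Hypothetical helper: recover the raw integer behind a printed fixed-point value.
inline int toRaw(double printed) {
    return static_cast<int>(std::lround(printed * (1 << kFixedShift)));
}

// Assumption: the .cl sources define DIG so that each coefficient expands to "value,".
#define DIG(a) a,

// Coefficients exactly as they appear in the observed build options.
static const int kFromBuildOptions[] = { DIG(85)DIG(85)DIG(85) };

// Hypothetical helper: true when every raw coefficient recovered from the
// printed kernel equals the corresponding build-option coefficient.
inline bool matchesBuildOptions(const std::vector<double>& printedKernel) {
    const std::size_t n = sizeof(kFromBuildOptions) / sizeof(kFromBuildOptions[0]);
    if (printedKernel.size() != n) return false;
    for (std::size_t i = 0; i < n; ++i) {
        if (toRaw(printedKernel[i]) != kFromBuildOptions[i]) return false;
    }
    return true;
}
```

Calling `matchesBuildOptions` with the printed values 0.332031, 0.335938, 0.332031 returns false, because the middle coefficient comes out as 86 rather than 85. Note that 85 + 86 + 85 = 256, which is exactly 1.0 at 8 fractional bits, whereas 85 + 85 + 85 = 255.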

Steps to reproduce
  1. Make sure pocl is installed.
  2. Clone and build OpenCV. Once in the repo, I did exactly this:
mkdir build && cd build
cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DWITH_CUDA=OFF \
  -DWITH_OPENCL=ON \
  -DWITH_FFMPEG=OFF \
  -DBUILD_TESTS=ON \
  -DBUILD_PERF_TESTS=ON \
  -DBUILD_EXAMPLES=ON \
  -DBUILD_DOCS=OFF \
  -DWITH_IPP=OFF \
  -DOPENCL_LIBRARY="/path/to/OpenCLICDLoader/libOpenCL.so" \
  -DOPENCL_INCLUDE_DIR="/path/to/cl/headers/include" \
  -GNinja ..
ninja
git clone https://github.com/opencv/opencv_extra
export OPENCV_TEST_DATA_PATH=/path/to/build/opencv_extra/testdata/

Notes:

  • The OpenCL library can optionally be the pocl .so instead of the loader; I'm just dealing with multiple drivers.
  • Again, depending on your driver situation, you might have to export OPENCV_OPENCL_DEVICE to select the right device.
  3. Once everything has built, run bin/opencv_perf_gapi --gtest_filter=GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance*. A number of tests will fail. The output from failing tests will be variations on this theme:
[ RUN      ] GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance/2, where GetParam() = (compare_f, 8UC1, 3, 1920x1080, { gapi.kernel_package })
AbsSimilarPoints error: err_points=113590  max_err_points=103680 (total=2073600)  diff_tolerance=1
../modules/gapi/perf/common/gapi_imgproc_perf_tests_inl.hpp:273: Failure
Value of: cmpF(out_mat_gapi, out_mat_ocv)
  Actual: false
Expected: true
params    = (compare_f, 8UC1, 3, 1920x1080, { gapi.kernel_package })
termination reason:  unknown
bytesIn   =          0
bytesOut  =          0
samples   =         10 of 100
outliers  =          0
frequency = 1000000000
min       =   11097065 = 11.10ms
median    =   11203118 = 11.20ms
gmean     =   11344524 = 11.34ms
gstddev   = 0.02755336 = 1.88ms for 97% dispersion interval
mean      =   11348445 = 11.35ms
stddev    =     318181 = 0.32ms
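For readers unfamiliar with the failure message, the check it reports can be sketched roughly as follows. This is a guess at the comparator's behaviour based only on the log line, not the G-API test code: `countErrPoints`, `maxErrPointsFor` and `withinBudget` are hypothetical names, and the assumption is that err_points counts pixels whose absolute difference exceeds diff_tolerance. The log's max_err_points of 103680 is exactly 5% of the 2073600 total pixels.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the comparison reported as "AbsSimilarPoints".
// Assumption: a pixel is an error point when |a - b| exceeds the tolerance.
inline std::size_t countErrPoints(const std::vector<std::uint8_t>& a,
                                  const std::vector<std::uint8_t>& b,
                                  int tolerance) {
    std::size_t errPoints = 0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const int diff = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (diff > tolerance) ++errPoints;
    }
    return errPoints;
}

// Error budget as a whole-number percentage of the total pixel count.
inline std::size_t maxErrPointsFor(std::size_t total, std::size_t percent) {
    return total * percent / 100;
}

// The comparison passes only when the error count stays within the budget.
inline bool withinBudget(std::size_t errPoints, std::size_t maxErrPoints) {
    return errPoints <= maxErrPoints;
}
```

With the numbers from the log, `maxErrPointsFor(2073600, 5)` gives 103680, and `withinBudget(113590, 103680)` is false, which matches the reported failure.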

I apologise that I don't have the time to investigate this bug any further myself; the best I can do is give a heads-up and dump my notes on the issue.
