- OpenCV => commit 545f8a8 (tip at time of writing)
- Operating System / Platform => Ubuntu 18, pocl version string (from clinfo): OpenCL 1.2 pocl 1.1 None+Asserts, LLVM 6.0.0, SPIR, SLEEF, DISTRO, POCL_DEBUG
- Compiler => gcc 7.4.0
Detailed description
Relates to #15855.
A number of Gaussian blur GPU tests fail when using the pocl OpenCL 1.2 implementation. Some digging showed that, as of this commit, the Gaussian blur kernel generation code (the various helper functions in smooth.dispatch.cpp) produces one set of kernel values, while the build options passed to clBuildProgram in ocl.cpp when building the row_filter/col_filter kernels contain a different set. Take the first test, GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance/0, as an example.
Adding a print to the effect of
printf("kernel:\n");
for(auto& fixedPointVal : fkx){
double floatVal = static_cast<double>(fixedPointVal);
printf("%f, ", floatVal);
}
printf("\n");
at line 663 of smooth.dispatch.cpp to print the final kernel gives the following output:
kernel:
0.332031, 0.335938, 0.332031,
With fixedpoint16's fixedshift value of 8 (a scale factor of 256), the raw 16-bit values for this kernel work out to 85, 86, 85. Adding this line
printf("build options: %s\n", buildflags.c_str());
at ocl.cpp:3888 reveals that the build options passed for both row_filter and col_filter contain the following: -D COEFF=DIG(85)DIG(85)DIG(85). Admittedly I'm not familiar with how these values translate into what the OpenCL kernels do (they're pretty dense), but on the assumption that these definitions are the kernel values, I hard-coded the mat_kernel definition in filterSepCol.cl and filterSepRow.cl to be { 85, 86, 85 } instead of { COEFF }. This made the test pass, which I think confirms that something in the OpenCL backend is jumbling the options, so that { 85, 85, 85 } ends up in the OpenCL kernel instead of the correct { 85, 86, 85 }. Interestingly, this isn't the case for all of these tests: only nine of the thirty fail in this manner. Note also that the Intel OpenCL 2.1 drivers don't fail these tests; they get an entirely different set of build options.
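The comparison described above can be sketched in a few lines of standalone C++. This is only an illustration, not OpenCV code: the helper names (toRaw, coeffFromHost, coeffFromBuildOptions, firstMismatch) are hypothetical, round-to-nearest is assumed when recovering raw values from the printed floats, and the DIG(a) macro is assumed to expand to `a,` so that { COEFF } becomes a plain initializer list.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: fixedpoint16 stores value * 2^fixedshift, with fixedshift = 8
// as stated in the report.
static const int kFixedShift = 8;

// Recover the raw fixed-point integer from a printed floating-point value.
int toRaw(double value, int shift = kFixedShift) {
    assert(shift >= 0 && shift < 31);
    return static_cast<int>(std::lround(value * (1 << shift)));
}

// Assumption: the .cl sources define DIG(a) as "a," so COEFF expands to a
// comma-separated list (a trailing comma is legal in a braced initializer).
#define DIG(a) a,
#define COEFF DIG(85)DIG(85)DIG(85)  // value observed in the build options

// Coefficients as the OpenCL kernel would see them via -D COEFF=...
std::vector<int> coeffFromBuildOptions() {
    return std::vector<int>{ COEFF };
}

// Coefficients as printed on the host side at smooth.dispatch.cpp:663.
std::vector<int> coeffFromHost() {
    const double printed[] = {0.332031, 0.335938, 0.332031};
    std::vector<int> raw;
    for (double v : printed) raw.push_back(toRaw(v));
    return raw;
}

// Index of the first differing coefficient, or -1 if the lists are identical.
int firstMismatch(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] != b[i]) return static_cast<int>(i);
    return a.size() == b.size() ? -1 : static_cast<int>(n);
}
```

Calling firstMismatch(coeffFromHost(), coeffFromBuildOptions()) points at the centre tap. As a side observation, the host values sum to 256 (exactly 1.0 at a shift of 8), whereas the build-option values sum to 255; whether that is related to the root cause is not established here.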
Steps to reproduce
- Make sure pocl is installed.
- Clone and build OpenCV. Once in the repo, I did exactly this:
mkdir build && cd build
cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DWITH_CUDA=OFF \
  -DWITH_OPENCL=ON \
  -DWITH_FFMPEG=OFF \
  -DBUILD_TESTS=ON \
  -DBUILD_PERF_TESTS=ON \
  -DBUILD_EXAMPLES=ON \
  -DBUILD_DOCS=OFF \
  -DWITH_IPP=OFF \
  -DOPENCL_LIBRARY="/path/to/OpenCLICDLoader/libOpenCL.so" \
  -DOPENCL_INCLUDE_DIR="/path/to/cl/headers/include" \
  -GNinja ..
ninja
git clone https://github.com/opencv/opencv_extra
export OPENCV_TEST_DATA_PATH=/path/to/build/opencv_extra/testdata/
note:
- The OpenCL library can optionally be the pocl .so instead of the loader; I'm just dealing with multiple drivers.
- Again, depending on your driver situation, you might have to export OPENCV_OPENCL_DEVICE to get the right one.
- Once everything has built, run
bin/opencv_perf_gapi --gtest_filter=GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance*
A number of tests will fail. The output from failing tests will be variations on this theme:
[ RUN ] GaussianBlurPerfTestGPU/GaussianBlurPerfTest.TestPerformance/2, where GetParam() = (compare_f, 8UC1, 3, 1920x1080, { gapi.kernel_package })
AbsSimilarPoints error: err_points=113590 max_err_points=103680 (total=2073600) diff_tolerance=1
../modules/gapi/perf/common/gapi_imgproc_perf_tests_inl.hpp:273: Failure
Value of: cmpF(out_mat_gapi, out_mat_ocv)
Actual: false
Expected: true
params = (compare_f, 8UC1, 3, 1920x1080, { gapi.kernel_package })
termination reason: unknown
bytesIn = 0
bytesOut = 0
samples = 10 of 100
outliers = 0
frequency = 1000000000
min = 11097065 = 11.10ms
median = 11203118 = 11.20ms
gmean = 11344524 = 11.34ms
gstddev = 0.02755336 = 1.88ms for 97% dispersion interval
mean = 11348445 = 11.35ms
stddev = 318181 = 0.32ms
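To put the AbsSimilarPoints line in the log above into proportion, here is a small arithmetic sketch. The struct and function names are hypothetical, and the pass condition (err_points must not exceed max_err_points) is an assumption about how the comparator works, inferred only from the field names in the log.

```cpp
#include <cassert>

// Counts copied from the AbsSimilarPoints line of the failing test output.
struct SimilarityCounts {
    long errPoints;     // err_points
    long maxErrPoints;  // max_err_points
    long total;         // total (1920 * 1080 pixels)
};

// Fraction of the image that a given point count represents.
double fractionOf(long part, long total) {
    assert(total > 0);
    return static_cast<double>(part) / static_cast<double>(total);
}

// Assumed pass condition: the mismatching points must fit within the budget.
bool withinBudget(const SimilarityCounts& c) {
    return c.errPoints <= c.maxErrPoints;
}
```

With the logged values { 113590, 103680, 2073600 }, the budget works out to exactly 5% of the pixels, while roughly 5.5% of them differ by more than the tolerance, so the run is only slightly over the limit.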
I apologise that I don't have the time to investigate this bug any further myself; the best I can do is give a heads-up and dump my notes on the issue.