System information (version)
- OpenCV => 4.5
- Operating System / Platform => Ubuntu 18.04
- Compiler => gcc 7.5
Detailed description
While comparing the performance of cv::Mat and cv::UMat (OpenCL), I noticed that the OpenCL path was taking a lot longer (8x) when performing color conversions from YUV to BGR or RGB.
I compared my benchmarking code to the perf tests and found that I was not passing the optional dstCn parameter in my calls to cv::cvtColor(). When I did pass that parameter, performance improved greatly!
I think I tracked it down to the function dstChannels() in color.hpp (opencv/modules/imgproc/src/color.hpp, line 109 in 049b50d), where the relevant case line reads:
case COLOR_YUV2BGR_NV21: case COLOR_YUV2RGB_NV21: case COLOR_YUV2BGR_NV12: case COLOR_YUV2RGB_NV12:
The cv::COLOR_YUV2RGB and cv::COLOR_YUV2BGR codes are not in the switch-case list, so the default value of 0 is returned. This then gets passed to oclCvtColorYUV2BGR().
I don't fully understand why passing 0 as the value for dcn to the OpenCL kernel results in such poor performance, but it seems like a simple fix.
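To make the suspected gap concrete, here is a minimal standalone sketch. It is not OpenCV's actual source: the enum SketchCode and both function names are hypothetical stand-ins, the numeric values are arbitrary, and it assumes that adding the two packed YUV codes to the 3-channel case is all the fix needs.

```cpp
// Hypothetical stand-ins for the cv::ColorConversionCodes discussed above.
// The values are arbitrary and do NOT match OpenCV's enum.
enum SketchCode {
    SK_YUV2BGR, SK_YUV2RGB,
    SK_YUV2BGR_NV12, SK_YUV2RGB_NV12,
    SK_YUV2BGR_NV21, SK_YUV2RGB_NV21,
    SK_OTHER
};

// Models the behaviour described in this report: only the NV12/NV21
// codes are listed, so the packed YUV codes fall through to 0.
inline int dstChannelsCurrent(int code)
{
    switch (code)
    {
    case SK_YUV2BGR_NV21: case SK_YUV2RGB_NV21:
    case SK_YUV2BGR_NV12: case SK_YUV2RGB_NV12:
        return 3;
    default:
        return 0;
    }
}

// One possible fix (an assumption, not a reviewed patch): list the
// packed YUV codes alongside the NV12/NV21 ones.
inline int dstChannelsProposed(int code)
{
    switch (code)
    {
    case SK_YUV2BGR: case SK_YUV2RGB:
    case SK_YUV2BGR_NV21: case SK_YUV2RGB_NV21:
    case SK_YUV2BGR_NV12: case SK_YUV2RGB_NV12:
        return 3;
    default:
        return 0;
    }
}
```

In this model, dstChannelsCurrent(SK_YUV2RGB) yields 0 while dstChannelsProposed(SK_YUV2RGB) yields 3, which would match the fast path obtained by passing dstCn = 3 explicitly.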
Steps to reproduce
```.cpp
cv::UMat yuv = cv::UMat::zeros(1080, 1920, CV_8UC3); // zeros(rows, cols, type): a 1920x1080 frame
cv::UMat rgb;
cv::cvtColor(yuv, rgb, cv::COLOR_YUV2RGB);    // dstCn omitted (defaults to 0): this is slow :-(
cv::cvtColor(yuv, rgb, cv::COLOR_YUV2RGB, 3); // dstCn = 3: this is fast!
```
I can whip up a complete test case/perf test if that's helpful.
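As a starting point for such a perf test, the timing part could look roughly like the sketch below. The helper medianMillis and its parameters are hypothetical names introduced here, and it uses only the C++ standard library, so the OpenCV calls still have to be supplied by the caller.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <vector>

// Runs `work` `warmup` times untimed (so one-off costs such as OpenCL
// kernel compilation are excluded), then times `iterations` runs and
// returns the middle sample in milliseconds (the upper median when the
// count is even). Returns 0.0 if `iterations` is not positive.
inline double medianMillis(const std::function<void()>& work,
                           int iterations = 20, int warmup = 3)
{
    for (int i = 0; i < warmup; ++i)
        work();

    if (iterations <= 0)
        return 0.0;

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(iterations));
    for (int i = 0; i < iterations; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }

    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

With OpenCV available, `work` would wrap one of the two cv::cvtColor() calls above, followed by cv::ocl::finish() (from opencv2/core/ocl.hpp) so that queued OpenCL work has completed before the clock stops. Comparing the two medians should then show whether the gap reproduces on a given device.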
Issue submission checklist
- I checked answers.opencv.org, Stack Overflow, etc. and have not found a solution