fixed several problems when running tests on Mac by vpisarev · Pull Request #16608 · opencv/opencv

vpisarev · 2020-02-17T20:48:09Z

failures in ocl_pyrUp
failures in ocl_flip when processing CV_16UC3 or CV_16SC3 data
some basic UMat tests; Mat => UMat => Mat chain did not work properly on Discrete video cards.
One of histogram badarg tests tried to access the 2nd element of 1-element input array. Most of the time it worked fine, because the function reported the expected error because of some other bad arg, but sometimes it crashed on histogram allocation when the size was valid but huge.

force_builders=Custom Mac,Custom
test_opencl:Custom Mac=ON

build_image:Custom=ubuntu:18.04
buildworker:Custom=linux-5
test_opencl:Custom=ON

* OCL_pyrUp * OCL_flip * some basic UMat tests * histogram badarg test (out of range access)

vpisarev · 2020-02-17T20:49:30Z

@alalek, could you please add the magic comment to enable OpenCL on Mac?

alalek · 2020-02-18T15:46:05Z

modules/core/src/ocl.cpp


+            bool copyOnMap = (flags0 & UMatData::COPY_ON_MAP) != 0;
+            if( copyOnMap )
+                accessFlags &= ~ACCESS_FAST;


Do you remember which test triggers that?

I have forced hostUnifiedMemory = false, but nothing is failed on Macs.

@alalek, the test is "UMat.SyncTemp". When you put some condition and nothing is failed, it does not mean that there is no bug in the logic. Before the patch the code attempts to do Map/Unmap even on machines without unified (read "shared") GPU-CPU memory. Somehow modern drivers on Windows/Linux implement this map unmap by doing implicit copying. On Mac it does not work well. On your macs everything works well probably because there is just iGPU there, no dGPU. I have 2 macs with dGPU, and they just freeze on "UMat.SyncTemp" test for infinite amount of time. After the patch everything works well.

On Mac it does not work well.

So, where should we put #ifdef __APPLE__?

Dropping ACCESS_FAST flag here does not look like a good idea.
AFAIK, it was designed to fail fast if any data copying is required.

Please change in ocl.cpp:

-#define CV_OPENCL_TRACE_CHECK 0 +#define CV_OPENCL_TRACE_CHECK 1

and collect stdout dump on problematic test (e.g, which requires this fix).

BTW, I can't find any place where ACCESS_FAST is enabled (including tests).
GitHub results: 1 declaration + 2 checks.

With the tracing turned on sometimes it hangs on unmap:

OpenCV(OpenCL:0): clCreateBuffer(CL_MEM_USE_HOST_PTR|createFlags, sz=100, origdata=0x7fa0d5278680) => 0x7fa0d5277ea0 OpenCV(OpenCL:0): clCreateBuffer() => 0x7fa0d5277ea0 OpenCV(OpenCL:0): handle = clCreateCommandQueue_pfn(ch, dh, props, &retval) OpenCV(OpenCL:0): clEnqueueReadBuffer(q, handle=0x7fa0d5277ea0, CL_TRUE, 0, sz=100, data=0x7fa0d5278680, 0, 0, 0) OpenCV(OpenCL:0): clEnqueueWriteBuffer(q, handle=0x7fa0d5277ea0, CL_TRUE, 0, sz=100, data=0x7fa0d5278680, 0, 0, 0) [ INFO:0] Successfully initialized OpenCL cache directory: /var/folders/ll/ddsp7fq10lvb2lf9v91x6ks40000gn/T/opencv/3.4.x-dev/opencl_cache/ [ INFO:0] Preparing OpenCL cache configuration for context: 32-bit--AMD--AMD_Radeon_HD_-_FirePro_D700_Compute_Engine--1_2__Jan_13_2020_20_12_40_ OpenCV(OpenCL:0): clBuildProgram(binary: core/copyset) OpenCV(OpenCL:0): result = clGetProgramBuildInfo_pfn(handle, devices[0], 0x1181, sizeof(build_status), &build_status, &retsz) OpenCV(OpenCL:0): clCreateKernel('set') OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=0, cl_mem=0x7fa0d5277ea0) OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=1, step_value=10) OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=2, offset_value=0) OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=3, rows_value=10) OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=4, cols_value=5) OpenCV(OpenCL:0): clSetKernelArg('set', arg_index=5, size=2, obj=0x7ffeea546ea0) OpenCV(OpenCL:0): clFinish_pfn(qq) OpenCV(OpenCL:0): clReleaseKernel_pfn(handle) OpenCV(OpenCL:0): clEnqueueReadBuffer(q, handle=0x7fa0d5277ea0, CL_TRUE, 0, sz=100, data=0x7fa0d5278680, 0, 0, 0) OpenCV(OpenCL:0): clEnqueueWriteBuffer(q, handle=0x7fa0d5277ea0, CL_TRUE, 0, sz=100, data=0x7fa0d5278680, 0, 0, 0) OpenCV(OpenCL:0): clEnqueueMapBuffer(handle=0x7fa0d5277ea0, sz=100) => 0x7fa0d5278680 OpenCV(OpenCL:0): clEnqueueUnmapMemObject(handle=0x7fa0d5277ea0, data=0x7fa0d5278680, [sz=100])

sometimes it crashes at the OpenCL kernel (setTo) release,
sometimes it crashes at some other random places.

if the code that removes ACCESS_FAST flag is commented off, no any error/exception happens. The code returns UMat with OpenCL handle=NULL. Then, obviously, OpenCL kernel refuses to run and then it automatically falls back to CPU implementation. But I'm almost sure that this is much worse and error-prone behaviour than the real data copy. I moved ACCESS_FAST remove code into non-SVM branch, because ACCESS_FAST is really for SVM.

modules/core/src/opencl/flip.cl

…e the OpenCL compiler on Mac generates incorrect code

…ared virtual memory (in OpenCL 2.x), not support vector machine)

…ines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)

alalek

Looks good to me 👍

* fixed several problems when running tests on Mac: * OCL_pyrUp * OCL_flip * some basic UMat tests * histogram badarg test (out of range access) * retained the storepix fix in ocl_flip only for 16U/16S datatype, where the OpenCL compiler on Mac generates incorrect code * moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is shared virtual memory (in OpenCL 2.x), not support vector machine) * force OpenCL to use read/write for GPU<=>CPU memory transfers on machines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)

…m added dnn: don't require setInput in .dump() dnn: auto network dump through parameter Improved GStreamer documentation. improved documentation for imread() videoio/MSMF: refactored format handling and selection, property reading and writing dnn(test): skip failed ngraph tests add cuda 10 support 3.4 docs for 3.4 branch Resolve opencv#14566 Merge pull request opencv#16445 from atinfinity:fixed-typo * fixed typo * add compatibility code to handle migration Enable Mask R-CNN with Inference Engine. Full coverage with nGraph intrin: fixed int64->double conversion for AVX-512 Merge pull request opencv#16608 from vpisarev:fix_mac_ocl_tests * fixed several problems when running tests on Mac: * OCL_pyrUp * OCL_flip * some basic UMat tests * histogram badarg test (out of range access) * retained the storepix fix in ocl_flip only for 16U/16S datatype, where the OpenCL compiler on Mac generates incorrect code * moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is shared virtual memory (in OpenCL 2.x), not support vector machine) * force OpenCL to use read/write for GPU<=>CPU memory transfers on machines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write) Merge pull request opencv#16594 from vpisarev:hull_ordering_fix fixed the ordering of contour convex hull points * partially fixed the issue opencv#4539 * fixed warnings and test failures * fixed integer overflow (issue opencv#14521) * added comment to force buildbot to re-run * extended the test for the issue 4539. Check the expected behaviour on the original contour as well * added comment; fixed typo, renamed another variable for a little better clarity * added yet another part to the test for issue opencv#4539, where we run convexHull and convexityDetects on the original contour, without any manipulations. the rest of the test stays the same

fixed several problems when running tests on Mac:

4957a2b

* OCL_pyrUp * OCL_flip * some basic UMat tests * histogram badarg test (out of range access)

alalek reviewed Feb 18, 2020

View reviewed changes

modules/core/src/opencl/flip.cl Show resolved Hide resolved

vpisarev added 3 commits February 19, 2020 15:56

retained the storepix fix in ocl_flip only for 16U/16S datatype, wher…

a317fc3

…e the OpenCL compiler on Mac generates incorrect code

moved deletion of ACCESS_FAST flag to non-SVM branch (where SVM is sh…

eaa5692

…ared virtual memory (in OpenCL 2.x), not support vector machine)

force OpenCL to use read/write for GPU<=>CPU memory transfers on mach…

b2d91d7

…ines with discrete video only on Macs. On Windows/Linux the drivers are seemingly smart enough to implement map/unmap properly (and maybe more efficiently than explicit read/write)

alalek reviewed Feb 21, 2020

View reviewed changes

alalek merged commit 07b4750 into opencv:3.4 Feb 21, 2020

alalek mentioned this pull request Feb 26, 2020

Merge 3.4 #16674

Merged

vpisarev deleted the fix_mac_ocl_tests branch May 28, 2020 18:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fixed several problems when running tests on Mac#16608

fixed several problems when running tests on Mac#16608
alalek merged 4 commits intoopencv:3.4from
vpisarev:fix_mac_ocl_tests

vpisarev commented Feb 17, 2020 •

edited by alalek

Loading

Uh oh!

vpisarev commented Feb 17, 2020

Uh oh!

alalek Feb 18, 2020

Uh oh!

vpisarev Feb 19, 2020

Uh oh!

alalek Feb 19, 2020

Uh oh!

vpisarev Feb 19, 2020 •

edited

Loading

Uh oh!

Uh oh!

alalek left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

vpisarev commented Feb 17, 2020 • edited by alalek Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vpisarev commented Feb 17, 2020

Uh oh!

alalek Feb 18, 2020

Choose a reason for hiding this comment

Uh oh!

vpisarev Feb 19, 2020

Choose a reason for hiding this comment

Uh oh!

alalek Feb 19, 2020

Choose a reason for hiding this comment

Uh oh!

vpisarev Feb 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vpisarev commented Feb 17, 2020 •

edited by alalek

Loading

vpisarev Feb 19, 2020 •

edited

Loading