OpenCL error CL_INVALID_CONTEXT when ~UMat() and default OpenCL context has changed #20486

@diablodale

Description

When a UMat is created in an opencv opencl context (e.g. Intel gpu), then the opencv opencl context is changed to another device (e.g. NVidia gpu), and then the UMat is destructed... it leads to errors of CL_INVALID_CONTEXT.

I think this is due to OpenCLAllocator::unmap() wrongly using the command queue of the current "default" opencv opencl context for the unmap. It should instead use the command queue of the opencl context that OpenCLAllocator::map() itself used.

This issue probably also exists in OpenCLAllocator::deallocate_()

I would appreciate some feedback on this from the core OpenCV team. I'll investigate myself in parallel.

System information (version)

  • OpenCV 4.5.3
  • Microsoft Windows [Version 10.0.19043.1110]
  • VS2019 v16.10.4
  • Laptop with Intel integrated GPU, and NVidia discrete GPU

Reproduction

  1. Create opencv app that uses opencl and enables the user to change opencl contexts via cv::ocl::Context::create(config), cv::ocl::OpenCLExecutionContext::create, and exec_context.bind()
  2. Run app
  3. Set to Intel GPU
  4. Create UMats
  5. Change context to NVidia GPU, but don't proactively ~UMat(). Instead let some UMats exist from both contexts.
  6. Now destroy the older Intel gpu UMats.

Result

For each old intel UMat that is destructed...

OpenCV(4.5.3) Error: Unknown error code -220 (OpenCL error CL_INVALID_CONTEXT (-34) during call: clEnqueueUnmapMemObject(handle=00000000FDD6BE30, data=00000000E84A0000, [sz=99532800])) in cv::ocl::OpenCLAllocator::unmap, file ..\modules\core\src\ocl.cpp, line 5926
Exception thrown at 0x00007FFDE3EA4ED9 in Max.exe: Microsoft C++ exception: cv::Exception at memory location 0x00000000007DECF0.
exception in outputmatrix() OpenCV(4.5.3) ..\modules\core\src\ocl.cpp:5926: error: (-220:Unknown error code -220) OpenCL error CL_INVALID_CONTEXT (-34) during call: clEnqueueUnmapMemObject(handle=00000000FDD6BE30, data=00000000E84A0000, [sz=99532800]) in function 'cv::ocl::OpenCLAllocator::unmap'

Do the reverse (old NVidia UMats, new Intel context) and the same function fails, but on a different line because it takes a different map/copy approach:

OpenCV(4.5.3) Error: Unknown error code -220 (OpenCL error CL_INVALID_MEM_OBJECT (-38) during call: clEnqueueWriteBuffer(q, handle=000000006EFD34C0, CL_TRUE, 0, sz=99532800, data=000000011071B080, 0, 0, 0)) in cv::ocl::OpenCLAllocator::unmap, file ..\modules\core\src\ocl.cpp, line 5947
Exception thrown at 0x00007FFDE3EA4ED9 in Max.exe: Microsoft C++ exception: cv::Exception at memory location 0x00000000007DECF0.
exception in outputmatrix() OpenCV(4.5.3) ..\modules\core\src\ocl.cpp:5947: error: (-220:Unknown error code -220) OpenCL error CL_INVALID_MEM_OBJECT (-38) during call: clEnqueueWriteBuffer(q, handle=000000006EFD34C0, CL_TRUE, 0, sz=99532800, data=000000011071B080, 0, 0, 0) in function 'cv::ocl::OpenCLAllocator::unmap'

Expected

No errors. And the old Intel UMats have their resources released.

Notes

In unmap(), all three methods (SVM, map, copy) use a command queue gotten via

cl_command_queue q = (cl_command_queue)Queue::getDefault().ptr();

This is an errant approach. When releasing opencl resources for a UMat, it needs to use the opencl context that allocated those resources for that specific UMat. Otherwise, it is unsafe and will lead to errors like the ones above when contexts are changed.
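
To make the failure mode concrete, here is a minimal sketch that models it with plain C++ only. Everything in it is hypothetical: MockQueue, MockBuffer, mockEnqueueUnmap and g_defaultQueue are stand-ins I made up, not OpenCV or OpenCL types. The only facts borrowed from OpenCL are the numeric values of CL_SUCCESS (0) and CL_INVALID_CONTEXT (-34) and the rule that a queue and a memory object passed to clEnqueueUnmapMemObject must belong to the same context.

```cpp
#include <cassert>

constexpr int kSuccess = 0;           // same value as CL_SUCCESS
constexpr int kInvalidContext = -34;  // same value as CL_INVALID_CONTEXT

// Hypothetical stand-ins; each object only remembers which context made it.
struct MockQueue  { int contextId; };
struct MockBuffer { int contextId; };

// Mimics the clEnqueueUnmapMemObject rule: the queue and the memory object
// must come from the same context.
int mockEnqueueUnmap(const MockQueue& q, const MockBuffer& buf) {
    return q.contextId == buf.contextId ? kSuccess : kInvalidContext;
}

// Process-wide "default" queue, standing in for Queue::getDefault().
MockQueue g_defaultQueue{1};

// Models the current unmap(): it always looks up the default queue.
int unmapViaDefaultQueue(const MockBuffer& buf) {
    return mockEnqueueUnmap(g_defaultQueue, buf);
}

// Walks through the reproduction steps and returns the final unmap status.
int reproduceSwitchThenUnmap() {
    g_defaultQueue = MockQueue{1};   // "Intel" context is bound
    MockBuffer intelBuffer{1};       // UMat allocated while Intel is bound
    // Before the switch, unmapping through the default queue is fine.
    assert(unmapViaDefaultQueue(intelBuffer) == kSuccess);
    g_defaultQueue = MockQueue{2};   // user switches to the "NVidia" context
    return unmapViaDefaultQueue(intelBuffer);  // ~UMat() of the old buffer
}
```

In this model reproduceSwitchThenUnmap() returns kInvalidContext, which matches the first error log above. It says nothing about the real driver behaviour beyond that one rule.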

That means the UMatData needs to know its own opencl context or command queue.
Does this knowledge already exist on the UMatData?
Is it available at UMatData::allocatorContext, prevAllocator, or currAllocator?
If the knowledge is already on UMatData (or can be easily added), then this function's code can be updated to use that command queue.
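
As a rough illustration of what I mean by "UMatData knows its own command queue", here is a sketch in plain C++. It is not OpenCV code: MockUMatData, its owningQueue field, allocate() and unmapViaOwningQueue() are all hypothetical names, and whether the real UMatData already carries this information is exactly the open question above. The std::shared_ptr is only an assumption standing in for whatever retain/release mechanism the real queue handle would need so that it outlives a context switch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

constexpr int kSuccess = 0;           // same value as CL_SUCCESS
constexpr int kInvalidContext = -34;  // same value as CL_INVALID_CONTEXT

struct MockQueue { int contextId; };  // hypothetical stand-in for a queue

// Hypothetical stand-in for UMatData with one extra field: the queue that
// was current when the buffer was allocated.
struct MockUMatData {
    int contextId;
    std::shared_ptr<MockQueue> owningQueue;  // kept alive with the buffer
};

// Same-context rule as clEnqueueUnmapMemObject, modelled on mock types.
int mockEnqueueUnmap(const MockQueue& q, const MockUMatData& u) {
    return q.contextId == u.contextId ? kSuccess : kInvalidContext;
}

// Allocation records the queue instead of relying on a later lookup.
MockUMatData allocate(std::shared_ptr<MockQueue> current) {
    assert(current != nullptr);
    const int ctx = current->contextId;
    return MockUMatData{ctx, std::move(current)};
}

// Proposed unmap(): uses the recorded queue, never the default one.
int unmapViaOwningQueue(const MockUMatData& u) {
    return mockEnqueueUnmap(*u.owningQueue, u);
}

// Same steps as the reproduction, but with the recorded queue.
int switchThenUnmapWithOwningQueue() {
    auto defaultQueue = std::make_shared<MockQueue>(MockQueue{1});  // "Intel"
    MockUMatData intelData = allocate(defaultQueue);
    defaultQueue = std::make_shared<MockQueue>(MockQueue{2});  // switch to "NVidia"
    return unmapViaOwningQueue(intelData);  // still targets the Intel queue
}
```

With the queue stored per buffer, the context switch no longer affects the old UMat in this model. The cost is that every UMatData keeps a queue (and therefore its context) alive until the UMat is destroyed, which may or may not be acceptable for the real allocator.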

I think OpenCLAllocator::deallocate_() has a similar errant approach, but its code is different. For example

  • The SVM approach uses Context::getDefault() and Queue::getDefault().ptr(). I think it needs to use the context/queue from the original allocation context.
  • The non-SVM approach uses cl_command_queue q = (cl_command_queue)Queue::getDefault().ptr(). I think it needs to use the context/queue from the original allocation context.
  • etc.

Related issues

There are a number of related issues about "switching opencl devices" but none are so exact as to isolate this single line of code. If we can address this, then it might contribute to those other issues. I'm subscribed to several of them myself.

Resolving this issue will resolve issue #18919
Related to issue #9593, issue #6926, issue #8035

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    forum.opencv.org, Stack Overflow, etc and have not found solution
  • I updated to latest OpenCV version and the issue is still there
