Refcounts for ocl::Context::Impl not decremented when UMats out of scope #18919

@diablodale

Description

When UMat objects are created, they increment the refcount on the ocl::Context that was bound via bind() at the time of their creation. Unfortunately, they:

  1. do not decrement that refcount when the UMats go out of scope
  2. do not decrement when a different Context is bound via OpenCLExecutionContext.bind()

As a result, the original Context::Impl (the one bound before the new bind()) keeps many refcounts and is never destroyed, even though nothing logically needs it anymore: everything is out of scope and a new Context has been bound.

Also, all the UMats (or their underlying pool of GPU-based data objects) that were created before the bind() are still associated with the original GPU. They retain resources there and could lead to faults if those UMat data objects are used against the newly bound GPU.

I found all this while debug tracing for #18906.

System information (version)
  • OpenCV => 4.5.0
  • Operating System / Platform => Windows 10 64-Bit
  • Compiler => Visual Studio Community 2019 v16.8.2

Steps to reproduce
  1. Set up and compile the master branch (the 4.5.0 tag is fine) as a Debug build
  2. Set up your debugger to run opencv_test_cored.exe --gtest_filter=OCL_OpenCL*
  3. Set breakpoint at the following line

OpenCLExecutionContext ctx = OpenCLExecutionContext::getCurrent();

  4. Start debugging
  5. When it stops at the above line (line 113), set a new breakpoint at the following line (or step the debugger several functions deep until you reach this line)

if (configuration.empty() && !container.empty())

  6. Look at the variable container, its size, and the Context::Impl objects stored within it.

Result

Container has 1 item. It is the default ExecutionContext created on line 29 of the test case file.
That one Impl has 5 ref counts.

Expected

Container has 1 item. And that 1 item has 1 ref count.
Review the test case code itself between that line and line 113. All objects are out of scope. The only thing still in scope that needs this Context::Impl is the global cv::ocl system, because it is the currently bound context.

Notes

I suspect these additional 4 refcounts are due to the UMat. The ocl::Context variable that is created in some scopes correctly decrements the refcount when it is destroyed at the scope's end. I also suspect it is the UMat because when I remove calls to the executeUMatCall(), I have the refcount I would expect.
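
To make the suspected mechanism concrete, here is a minimal toy model in plain C++. It is not OpenCV code, and every name in it (ToyContextImpl, ToyBuffer, ToyBufferPool, ToyUMat, refs_after_scope) is hypothetical. It assumes, without confirmation from the OpenCV sources, that each UMat's underlying buffer holds a reference to its creating context and that released buffers are cached in a pool rather than freed. Under that assumption the context's refcount stays elevated after every matrix has gone out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for ocl::Context::Impl. NOT OpenCV code.
struct ToyContextImpl {};

// Hypothetical device buffer that keeps its creating context alive.
struct ToyBuffer {
    std::shared_ptr<ToyContextImpl> ctx;
};

// Hypothetical pool: released buffers are cached for reuse, not freed.
struct ToyBufferPool {
    std::vector<ToyBuffer> cached;
    void release(ToyBuffer b) { cached.push_back(std::move(b)); }
    void purge() { cached.clear(); }
};

// Toy stand-in for UMat: hands its buffer back to the pool on destruction.
class ToyUMat {
public:
    ToyUMat(std::shared_ptr<ToyContextImpl> ctx, ToyBufferPool& pool)
        : buf_{std::move(ctx)}, pool_(pool) {}
    ~ToyUMat() { pool_.release(std::move(buf_)); }
    ToyUMat(const ToyUMat&) = delete;
    ToyUMat& operator=(const ToyUMat&) = delete;

private:
    ToyBuffer buf_;
    ToyBufferPool& pool_;
};

// Returns the context's use count after n ToyUMats have gone out of scope.
long refs_after_scope(std::size_t n, bool purge_pool) {
    auto bound = std::make_shared<ToyContextImpl>();  // the "bound" context
    assert(bound.use_count() == 1);                   // only the bound handle so far
    ToyBufferPool pool;
    {
        std::vector<std::unique_ptr<ToyUMat>> mats;
        for (std::size_t i = 0; i < n; ++i)
            mats.push_back(std::make_unique<ToyUMat>(bound, pool));
    }  // every ToyUMat is destroyed here; its buffer moves into the pool
    if (purge_pool)
        pool.purge();  // dropping the cached buffers releases their references
    return bound.use_count();
}
```

In this model, refs_after_scope(4, false) returns 5 (the bound handle plus four cached buffers), which mirrors the count in the Result section, while refs_after_scope(4, true) returns 1, the count given under Expected. Whether OpenCV's buffer pool actually behaves this way is exactly what this issue asks to have checked.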

If this behavior is desired, then I caution against all use of UMat and OpenCLExecutionContext in test cases. Why? Because this behavior lets earlier test cases change the state of both individual Contexts and the global Context collection, and these changes persist across test cases. Therefore, it may surface (or hide) the issues that the test cases are trying to test.

I am also concerned about the integrity of UMats before/after a bind() that changes the GPU device. UMats created before the bind() will still be associated with the old GPU. New UMats and all OCL-enabled functions will expect only the new GPU. So what happens when:

  1. Create an OpenCLExecutionContext on a default GPU and bind() it, or let this happen automatically
  2. Create some UMats
  3. Create a new OpenCLExecutionContext on a different GPU and bind() it
  4. Call a function like cv::remap() with old UMats, or a mix of old and new UMats
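
The four steps can be walked through with a small bookkeeping sketch in plain C++. It does not call OpenCV and does not show what OpenCV does in this situation. It only shows how a caller could tag each matrix with the context it was created under and detect the old/new mix before issuing a call. All names (ContextId, ToyRuntime, TaggedMat, make_mat, all_from_current, run_scenario) are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical identifier for an execution context (for example, one per GPU).
using ContextId = int;

// Toy record of the currently bound context; stands in for bind().
class ToyRuntime {
public:
    void bind(ContextId id) { current_ = id; }
    ContextId current() const { return current_; }

private:
    ContextId current_ = 0;
};

// Toy matrix tagged with the context that was bound when it was created.
struct TaggedMat {
    ContextId created_under;
};

TaggedMat make_mat(const ToyRuntime& rt) { return TaggedMat{rt.current()}; }

// True only if every input was created under the currently bound context.
bool all_from_current(const ToyRuntime& rt, const std::vector<TaggedMat>& inputs) {
    for (const TaggedMat& m : inputs)
        if (m.created_under != rt.current())
            return false;
    return true;
}

// Outcome of the check for each kind of call in step 4.
struct ScenarioResult {
    bool old_only_ok;
    bool mixed_ok;
    bool new_only_ok;
};

ScenarioResult run_scenario() {
    ToyRuntime rt;
    rt.bind(1);                        // step 1: bind a context on the first GPU
    TaggedMat old_mat = make_mat(rt);  // step 2: create a matrix under it
    rt.bind(2);                        // step 3: bind a context on a different GPU
    assert(rt.current() == 2);
    TaggedMat new_mat = make_mat(rt);  // a matrix created after the re-bind
    // step 4: check the inputs of a would-be call before issuing it
    return ScenarioResult{all_from_current(rt, {old_mat}),
                          all_from_current(rt, {old_mat, new_mat}),
                          all_from_current(rt, {new_mat})};
}
```

Here run_scenario() flags both the old-only and the mixed call as unsafe and accepts only the call that uses matrices created after the second bind(). A guard like this would live entirely in user code; it is offered as a defensive workaround, not as a description of library behavior.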

This area of concern is perhaps supported by something @alalek wrote:

Unfortunately UMat doesn't support context sharing/migration at all. It is assumed that UMat is created, used and destroyed with the same active OpenCL context. UMat requires redesign to support multiple OpenCL contexts.

Issue submission checklist
  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    answers.opencv.org, Stack Overflow, etc and have not found solution
  • I updated to latest OpenCV version and the issue is still there
  • There is reproducer code and related data files: videos, images, onnx, etc
