Skip to content

fix missing addref() in ocl::Context::create(str)#18907

Merged
opencv-pushbot merged 1 commit intoopencv:masterfrom
diablodale:exec_context_create_addref
Nov 25, 2020
Merged

fix missing addref() in ocl::Context::create(str)#18907
opencv-pushbot merged 1 commit intoopencv:masterfrom
diablodale:exec_context_create_addref

Conversation

@diablodale
Copy link
Copy Markdown
Contributor

fix #18906
add two test cases

I ran opencv_test_cored.exe --gtest_filter=OCL* and all passed successfully including the two new test cases

[----------] Global test environment tear-down
[==========] 6837 tests from 60 test cases ran. (267680 ms total)
[  PASSED  ] 6837 tests.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contribution 👍

if (impl)
{
CV_LOG_INFO(NULL, "OpenCL: reuse context@" << impl->contextId << " for configuration: " << configuration)
impl->addref();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch 👍

Similar to implementation below:

static Impl* findOrCreateContext(cl_context h)

{
void *p1, *p2, *p3;
{
ocl::Context context1 = ocl::Context::create(":GPU:1");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:GPU:1

Please use :GPU:0, so this can be tested on CI with one GPU too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do. However...in this same test_opencl.cpp file...I copied verbatim that code. So if you are concerned about this for my two test cases, then your concern should apply to the other two test cases that do the same thing.

Would you like me to fix those other two test cases while I'm in this PR?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:GPU:1 requires extra unnecessary requirements to run this test (to have dual GPU configurations at least).

other two test cases

I will take a look on them once again a bit later (not in the scope of this PR)

p2 = context2.ptr();
}

ASSERT_NE(reinterpret_cast<uintptr_t>(p1), reinterpret_cast<uintptr_t>(p2));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check may fail (with both :GPU:0 or :GPU:1 in case of multiple OpenCL devices).

The same handle value can be reused by OpenCL runtime. There is not guarantee which forces of new handle.
It is similar to malloc()/free()/malloc() chain where both malloc() calls returns the same pointer.

So both ASSERT_NE and ASSERT_EQ checks are not valid.


destroyUnusedExecContext

BTW, during implementation there was an idea to keep all created contexts till the program termination.

Copy link
Copy Markdown
Contributor Author

@diablodale diablodale Nov 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is a problem. Using a memory address to check for "sameness" is unsafe. In my testing, I saw the runtime happily allocate new Context::Impl in the same memory location as the prematurely destroyed one. Efficient. 😆

My challenge is that users of the SDK have almost no visibility into the Context::Impl. It is a private class and the public can't get to contextId within it. I believe that contextId is the best identifier with which to check sameness.

Does this team have an approach you prefer when functionality is needed for test cases but less value for the general public?

I have suggestions, in order of my preference

  1. Create a public API Context::getId() which returns an int. I think this is the simplest.
  2. Make a typedef contextId_t. Do getId() and return it. Update other code in ocl.* files to cascade the use of this typedef. However, I don't see a reasonable benefit of this approach because the codebase is using CV_XADD() to create/increment these context id's so there is already friction to use something other than a int-like type.
  3. Create a custom Context::operator==(). However, this will cascade substantially more work with little benefit other than test cases.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to disable / remove this test for now.
To unblock fix merging.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I have to remove both new test cases. Two reasons:

  1. I can't reliably use the ptr() to check for same or not-same. There is no mechanism for any test case to verify refcounting or sameness on OpenCLExecutionContext.
  2. There is an issue (or side-affect) that is exposed by the already existing test cases; discovered today with deeper tracing in a debugger...

The ":GPU:1" config in the already existing test cases is definitely using the 2nd GPU device in the system. Another PR needs to change the code to match your desire to run tests on systems that have only 1 GPU. I wanted to pre-emptively verify my test cases so I changed them to use the 0 device and deeply traced in the debugger. I found a side-affect. The existing test cases are affecting the state of the overall OpenCV system. This is undesired and readily invalidates other test cases (like mine).

When the existing test cases run, they call executeUMatCall() and it creates two UMat objects. These UMat objects increment the refcount on the ocl::Context that was bind() at the time of their creation. Unfortunately, they do not decrement that refcount when the two UMat go out of scope. The UMat also do not decrement when another OpenCLExecutionContext.bind().

This UMat behavior leads to earlier test cases changing the state of both individual Context's and the global Context collection -- and these changes persist across test cases. For example, when the test OCL_OpenCLExecutionContext, createGPU runs, the default Context "" already has 5 refcounts on it that were created by the earlier test cases. This will also affect other Context that were created like ":GPU:0" or ":GPU:1". The refcounts made by UMat's will persist across test cases.

This means that if another PR creates a Context::getId(), then all test cases that need to check for Context sameness need to be before all tests cases that will persist Context refcounts across test cases.

I'm starting to understand the general OpenCL approach. However, I lack similar knowledge with UMat and therefore I am unclear if the above UMat refcount side-affects are intentional or a issue themselves.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately UMat doesn't support context sharing/migration at all.
It is assumed that UMat is created, used and destroyed with the same active OpenCL context. UMat requires redesign to support multiple OpenCL contexts.


The UMat also do not decrement when another OpenCLExecutionContext.bind()

I will take a look on this week.


keep all created contexts till the program termination

One more reason for that is handling of such non-binded UMat buffers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll open to separate issues to cover the topics outside this direct PR. I've also updated this PR to only have the single line change addref() and remove the two test cases.

- fix #18906
- unable to add related test cases as there is
  no public access to Context:Impl refcounts
@opencv-pushbot opencv-pushbot merged commit d65c6af into opencv:master Nov 25, 2020
@alalek alalek mentioned this pull request Nov 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ocl::Context::create(string) reference count wrong when reusing contexts

3 participants