Skip to content

changed UMataData allocatorContext to OpenCLExecutionContext#24326

Open
kallaballa wants to merge 18 commits intoopencv:4.xfrom
kallaballa:fix_diverging_cl_contexts
Open

changed UMataData allocatorContext to OpenCLExecutionContext#24326
kallaballa wants to merge 18 commits intoopencv:4.xfrom
kallaballa:fix_diverging_cl_contexts

Conversation

@kallaballa
Copy link
Copy Markdown
Contributor

@kallaballa kallaballa commented Sep 27, 2023

… and bind it via scope everywhere needed.

Changing it from Context to OpenCLExecutionContext poses no problem, because everywhere I need the Context I can get it from the OpenCLExecutionContext.

The pertaining issue: #20486

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=Win64 OpenCL,Linux OpenCL,Linux AVX2

@kallaballa
Copy link
Copy Markdown
Contributor Author

Fixed a problem that was preventing UMat to UMat copy always

@asmorkalov
Copy link
Copy Markdown
Contributor

@opencv-alalek could you review?

@kallaballa
Copy link
Copy Markdown
Contributor Author

@opencv-alalek
Copy link
Copy Markdown
Contributor

To reproduce problem build locally with cmake ... -DWITH_OPENCL=OFF or run with OPENCV_OPENCL_RUNTIME=disabled

@kallaballa
Copy link
Copy Markdown
Contributor Author

kallaballa commented Oct 4, 2023

Oh my! Of course. please hold your horses. I'll fix that.

@asmorkalov asmorkalov added this to the 4.9.0 milestone Oct 5, 2023
@kallaballa
Copy link
Copy Markdown
Contributor Author

Doc build complains about trailing white space. gonna push a fix once this build has finished

@kallaballa
Copy link
Copy Markdown
Contributor Author

Seeing the fails and will fix.

@kallaballa
Copy link
Copy Markdown
Contributor Author

What is the best way to reproduce those cases? is the build configuration available?

@opencv-alalek
Copy link
Copy Markdown
Contributor

Failed builds are with OpenCL (default). With / without OpenCL device available (both fails).

@asmorkalov
Copy link
Copy Markdown
Contributor

/home/ci/opencv/modules/core/src/ocl.cpp:5826:62: warning: declaration of 'pExecCtx' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5827:50: warning: declaration of 'scope' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5835:62: warning: declaration of 'pExecCtx' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5836:50: warning: declaration of 'scope' shadows a previous local [-Wshadow]

@kallaballa
Copy link
Copy Markdown
Contributor Author

/home/ci/opencv/modules/core/src/ocl.cpp:5826:62: warning: declaration of 'pExecCtx' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5827:50: warning: declaration of 'scope' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5835:62: warning: declaration of 'pExecCtx' shadows a previous local [-Wshadow]

/home/ci/opencv/modules/core/src/ocl.cpp:5836:50: warning: declaration of 'scope' shadows a previous local [-Wshadow]

Thanks! On it right now.

@kallaballa
Copy link
Copy Markdown
Contributor Author

fixed: 784d86a

@kallaballa
Copy link
Copy Markdown
Contributor Author

[ FAILED ] 16 tests, listed below:
[ FAILED ] Test_Caffe_layers.MVN/0, where GetParam() = OCV/OCL
[ FAILED ] Test_Caffe_layers.MVN/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_Caffe_layers.BatchNorm/0, where GetParam() = OCV/OCL
[ FAILED ] Test_Caffe_layers.BatchNorm/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_ONNX_layers.InstanceNorm/0, where GetParam() = OCV/OCL
[ FAILED ] Test_ONNX_layers.InstanceNorm/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_TensorFlow_layers.batch_norm_10/0, where GetParam() = OCV/OCL
[ FAILED ] Test_TensorFlow_layers.batch_norm_10/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_TensorFlow_layers.batch_norm_11/0, where GetParam() = OCV/OCL
[ FAILED ] Test_TensorFlow_layers.batch_norm_11/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_TensorFlow_layers.batch_norm_13/0, where GetParam() = OCV/OCL
[ FAILED ] Test_TensorFlow_layers.batch_norm_13/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_Torch_layers.run_batch_norm/0, where GetParam() = OCV/OCL
[ FAILED ] Test_Torch_layers.run_batch_norm/1, where GetParam() = OCV/OCL_FP16
[ FAILED ] Test_Torch_nets.FastNeuralStyle_accuracy/0, where GetParam() = OCV/OCL
[ FAILED ] Test_Torch_nets.FastNeuralStyle_accuracy/1, where GetParam() = OCV/OCL_FP16

That sounds serious. Best course of action is a Window 10 VM and building with tests?

@kallaballa
Copy link
Copy Markdown
Contributor Author

kallaballa commented Oct 16, 2023

But there seems something wrong with the tests:

test_dnn-ippicv-opencl test_dnn-ippicv-opencl ; cases (tests): 85 (8069) ; passed: 8053 ; skipped: 23 ; failed

It's the same issue.

Copy link
Copy Markdown
Contributor

@opencv-alalek opencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if( arg.m )
{
std::shared_ptr<ocl::OpenCLExecutionContext> pExecCtx = std::static_pointer_cast<ocl::OpenCLExecutionContext>(arg.m->u->allocatorContext);
OpenCLExecutionContextScope scope(pExecCtx ? *pExecCtx.get() : ocl::OpenCLExecutionContext::getCurrent());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kernel::

OpenCLExecutionContextScope

We should not change the current context selected by caller.

We need to ensure that:

  • kernel is from the current (default) OpenCLExecutionContext (or assume that)
  • and all passed buffers are compatible with the current (default) OpenCLExecutionContext (the same "OpenCL context")

If it is not true, then raise an error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to properly compare to OpenCLExecutionContexts? I'd write the operator, but I don't know what to compare.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"OpenCL context" means handle of type cl_context.

E.g. compare these values related to OpenCLExecutionContexts: execCtx.getContext().ptr()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

#ifdef HAVE_OPENCL
std::shared_ptr<ocl::OpenCLExecutionContext> pExecCtxSrc = std::static_pointer_cast<ocl::OpenCLExecutionContext>(u->allocatorContext);
std::shared_ptr<ocl::OpenCLExecutionContext> pExecCtxDst = std::static_pointer_cast<ocl::OpenCLExecutionContext>(dst.u->allocatorContext);
if ((pExecCtxSrc && pExecCtxDst && pExecCtxSrc->getContext().ptr() == pExecCtxDst->getContext().ptr()) && u->currAllocator == dst.u->currAllocator)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dst should belong to the current context.

if src is not related to the current context then we should force OpenCLExecutionContextScope below

Mat dst = _dst.getMat();
{
    OpenCLExecutionContextScope ...
    u->currAllocator->download(u, dst.ptr(), dims, sz, srcofs, step.p, dst.step.p);
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

haveDstUninit ? " -D HAVE_DST_UNINIT" : "");

std::shared_ptr<ocl::OpenCLExecutionContext> pExecCtx = std::static_pointer_cast<ocl::OpenCLExecutionContext>(u->allocatorContext);
ocl::OpenCLExecutionContextScope scope(pExecCtx ? *pExecCtx.get() : ocl::OpenCLExecutionContext::getCurrent());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenCLExecutionContextScope

We should not change execution scope here.

If src's scope is not the same (as dst/thread), then .clone() src UMat first.
If mask's scope is not the same (as as dst/thread), then .clone() mask UMat first.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and i used the same solution for convertTo

@asmorkalov
Copy link
Copy Markdown
Contributor

@kallaballa Friendly reminder.

1 similar comment
@asmorkalov
Copy link
Copy Markdown
Contributor

@kallaballa Friendly reminder.

…convertTo clones UMats if they don't match the current context.
@kallaballa
Copy link
Copy Markdown
Contributor Author

Waiting for the tests. Apart from that i think the PR is ready.

@kallaballa
Copy link
Copy Markdown
Contributor Author

kallaballa commented Nov 21, 2023

Local tests failed so i made a new patch. now local tests succeed.

@kallaballa
Copy link
Copy Markdown
Contributor Author

kallaballa commented Nov 21, 2023

found more cases that slipped by me. local tests succeed. waiting for the CI.

@kallaballa
Copy link
Copy Markdown
Contributor Author

The failing tests look suspicious and i will look into it. anyway it did some testing using my demos of V4D (on a unpublished branch) and I found several problems that prevented me from using UMat with multiple OpenCL contexts.

I will add another commit that was necessary to fix the issues i had but I am overall not sure I got this right in the first place. Maybe I misunderstood? @opencv-alalek could you have a second look on my approach to this?

Also. I currently do the following to copy cross cl-context:

CV_EXPORTS void copy_cross(const cv::UMat& src, cv::UMat& dst) {
	if(dst.empty())
		dst.create(src.size(), src.type());
	src.copyTo(dst.getMat(cv::ACCESS_READ));
}

Is that the right/recommended way? I tried several other variants but to no avail.

@asmorkalov
Copy link
Copy Markdown
Contributor

@kallaballa @opencv-alalek What is the PR status?

@asmorkalov asmorkalov modified the milestones: 4.9.0, 4.10.0 Dec 20, 2023
@opencv-alalek opencv-alalek added the pr: needs test New functionality requires minimal tests set label Dec 20, 2023
Copy link
Copy Markdown
Contributor

@opencv-alalek opencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update!

Could you prepare test cases which cover new code paths?

E.g, here: https://github.com/opencv/opencv/blob/4.x/modules/core/test/test_opencl.cpp (contains tests for OpenCLExecutionContext)

@@ -1,4 +1,4 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
/*M///////////////////////////////////////////////////////////////////////////////////////
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change

(
(currentExecCtx.empty() && !pExecCtxDst->empty())
|| (!currentExecCtx.empty() && pExecCtxDst->empty())
|| (pExecCtxDst->getContext().ptr() != currentExecCtx.getContext().ptr())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to add method + test (for such method) instead of writing complex conditions multiple times.

// class OpenCLExecutionContext in ocl.hpp
bool isCompatibleContext(const std::shared_ptr<ocl::OpenCLExecutionContext>& other)
{
    ...
}

Usage:

if (pExecCtxDst && !pExecCtxDst->isCompatibleContext(pExecCtxSrc))
    CV_Error(...)

@asmorkalov asmorkalov modified the milestones: 4.10.0, 4.11.0 May 16, 2024
@asmorkalov asmorkalov modified the milestones: 4.11.0, 4.12.0 Dec 20, 2024
@kallaballa
Copy link
Copy Markdown
Contributor Author

kallaballa commented Jan 3, 2025

This is a prototype that needs rework mostly to improve code-reuse and clarity. That said it does fix the problem and considerably improves performance by avoiding errors and copies (whenever possible) on then many simple and complex tested workloads.

@kallaballa
Copy link
Copy Markdown
Contributor Author

Also when switching between 4.x and this branch i can't find unwanted side effects.

@asmorkalov asmorkalov modified the milestones: 4.12.0, 4.13.0 May 5, 2025
@asmorkalov asmorkalov self-requested a review November 17, 2025 10:51
@kallaballa
Copy link
Copy Markdown
Contributor Author

should i pursue this one? i know it looks very ugly but the gains are real.
}

@asmorkalov asmorkalov modified the milestones: 4.13.0, 4.14.0 Dec 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: ocl feature pr: needs test New functionality requires minimal tests set

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants