dnn(OpenCL): fix automatic globalsize adjusting by alalek · Pull Request #20655 · opencv/opencv

alalek · 2021-09-06T03:16:41Z

if kernel code doesn't support that

relates #20615

force_builders=Custom,Linux AVX2,Linux OpenCL
build_image:Custom=ubuntu:18.04
buildworker:Custom=linux-5
test_opencl:Custom=ON

build_image:Linux AVX2=ubuntu:18.04
buildworker:Linux AVX2=linux-3
test_opencl:Linux AVX2=ON

buildworker:Custom Win=windows-3
build_image:Custom Win=msvs2019
test_opencl:Custom Win=ON
test_filter:Custom Win=*:-Test_Caffe_nets.FasterRCNN_vgg16/1

alalek · 2021-09-06T17:19:58Z

👍

diablodale · 2021-09-07T13:17:13Z

modules/core/include/opencv2/core/ocl.hpp

+     * @param sync specify whether to wait for OpenCL computation to finish before return.
+     * @param q command queue
+     */
+    bool run_(int dims, size_t globalsize[], size_t localsize[], bool sync, const Queue& q=Queue());


Can I be in the loop on these quick changes you are making? I don't see anyone reviewing them before you merge.

A strange and difficult to understand API like run vs run_ isn't readable or maintainable.
Instead, if you want a new api, then something like the following is clean and readable. And on the next ABI change can be condensed into a single API

bool run(int dims, size_t globalsize[], size_t localsize[], bool sync, const Queue& q=Queue());
bool run(int dims, size_t globalsize[], size_t localsize[], bool sync, const Queue& q=Queue(), const bool tuneGlobalSize = true);
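One detail worth checking in the pair above: if both overloads keep defaults on every trailing parameter, a four- or five-argument call matches both and the compiler rejects it as ambiguous. The following is only a sketch of one way around that, not the change made in this PR. `MockKernel` and `MockQueue` are hypothetical stand-ins for `cv::ocl::Kernel` and `cv::ocl::Queue`, and the new parameter is given no default so that existing call sites keep resolving to the legacy overload.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; only the overload shapes matter in this sketch.
struct MockQueue {};

struct MockKernel {
    bool lastCallTuned = false;

    // Legacy signature, left untouched so existing callers are unaffected.
    bool run(int dims, std::size_t globalsize[], std::size_t localsize[], bool sync,
             const MockQueue& q = MockQueue()) {
        return run(dims, globalsize, localsize, sync, q, /*tuneGlobalSize=*/true);
    }

    // New overload: tuneGlobalSize has NO default, so four- and five-argument
    // calls can only match the legacy overload above.
    bool run(int dims, std::size_t globalsize[], std::size_t localsize[], bool sync,
             const MockQueue& q, bool tuneGlobalSize) {
        (void)dims; (void)globalsize; (void)localsize; (void)sync; (void)q;
        lastCallTuned = tuneGlobalSize;  // a real kernel would adjust or skip adjusting here
        return true;
    }
};
```

With this shape, `k.run(1, &globalsize, &wgs, true)` keeps its current meaning, and opting out of tuning requires spelling out the queue and the flag.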

cc @mshabunin

mshabunin · 2021-09-07

@diablodale, IMO this name is fine: it exposes a raw private method, and underscores are often used to mark such private-public functions (e.g. in Python). I believe this patch is not meant to be final; either the dnn kernels should be updated, or the globalsize tuning mechanism should be improved (like you suggested).

This series of patches is needed to enable a new CI machine, which is blocked by hanging and crashing dnn and ocl tests. We will take into account any comments and patches from the community even after this PR is merged.

diablodale · 2021-09-07

This is a public method. Not private! It introduces an ABI promise across 3.x, 4.x, and ongoing maintenance. The ocl.cpp file is quite hectic, and the patchwork of changes made over the years continues to grow with short-term hacks like this.

DNN is not the core of OpenCV, meaning it is built on top of the core parts. In OpenCV terms, it is built on top of the core module. The core module can present a set of public services to modules like DNN.
The correction to this API is trivial to make, and it provides consistency, legacy support, and support for a future ABI change. There is no known downside compared to this run_ thing.

I found this bug and design flaw. It has existed for ~8 years. I recommend that those invested in a solution (like us) not move too quickly. That CI machine can wait a week.

alalek · 2021-09-07

I thought about an extra bool parameter.
However, it is better to avoid multiple bool parameters in any function or method: a call ending in (true, false) is a confusing, unreadable API.
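To make the readability concern concrete, here is a small sketch of the usual alternative to adjacent bool parameters: scoped enums. All names below (`Sync`, `GlobalSizeTuning`, `launch`, `LaunchOptions`) are hypothetical and do not exist in OpenCV; the mock only records which options it received.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical option types; none of these names are part of OpenCV.
enum class Sync { Wait, NoWait };
enum class GlobalSizeTuning { Enabled, Disabled };

struct LaunchOptions {
    bool waited;
    bool tuned;
};

// Mock launcher: it does no OpenCL work and only reports the chosen options.
inline LaunchOptions launch(std::size_t globalsize, std::size_t localsize,
                            Sync sync, GlobalSizeTuning tuning) {
    (void)globalsize; (void)localsize;
    return { sync == Sync::Wait, tuning == GlobalSizeTuning::Enabled };
}
```

A call such as `launch(1024, 64, Sync::Wait, GlobalSizeTuning::Disabled)` states its intent at the call site, whereas `(1024, 64, true, false)` requires looking up the declaration. Swapping the two enum arguments is also a compile error rather than a silent bug.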

run() vs run_() usually means that the second one does "less" work and the first one is an "extended" variant.
In general, for properly designed OpenCL kernels we should not use run_() at all. But there are several external contributions which don't follow the spirit of OpenCV OpenCL computations, and so we have such bugs.
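The thread does not spell out what the automatic adjustment does, so the following CPU-only sketch assumes it amounts to rounding the global size up to a multiple of the local size. Under that assumption, the launch contains extra work-items, and a kernel is only safe if it checks its index against the real element count. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed adjustment: round the global size up to a multiple of the local size.
inline std::size_t roundUpToMultiple(std::size_t globalsize, std::size_t localsize) {
    return (globalsize + localsize - 1) / localsize * localsize;
}

// Number of padded work-items that an unguarded kernel would send past the buffer end.
inline std::size_t outOfRangeItems(std::size_t n, std::size_t launched) {
    return launched > n ? launched - n : 0;
}

// CPU stand-in for a kernel that tolerates padding.
// Returns how many work-items actually wrote to the buffer.
inline std::size_t runGuarded(std::vector<int>& data, std::size_t launched) {
    std::size_t writes = 0;
    for (std::size_t gid = 0; gid < launched; ++gid) {
        if (gid >= data.size()) continue;  // bounds check that an adjusted launch relies on
        data[gid] = 1;
        ++writes;
    }
    return writes;
}
```

For 100 elements and a local size of 64, the rounded launch covers 128 work-items, so 28 of them fall outside the buffer. A kernel written without the guard needs the exact, unadjusted global size, which is the case run_() is meant to serve.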

Suggestions are welcome for the name of the new method.

Alternatives are:

- enqueue (a reference to clEnqueueNDRangeKernel, but there is a "sync" parameter, so this name is not accurate). There would also be questions about the difference between run() and enqueue(), and they may be used improperly.
- runNoTuneGlobalSize (a long name, but it does less work). It would be better to name the existing behavior of run() runWithTuneGlobalSize(), but it is too late for that.
- something else?

BTW, the existing "adjusting" heuristics are buggy too. They give weird results in some cases.

diablodale · 2021-09-08

I follow your thinking on multiple booleans. Yet that is far less of a concern for me than overall readability.
https://google.github.io/styleguide/cppguide.html#General_Naming_Rules

Optimize for readability using names that would be clear even to people on a different team. Use names that describe the purpose or intent of the object. Do not worry about saving horizontal space as it is far more important to make your code immediately understandable by a new reader. Minimize the use of abbreviations that would likely be unknown to someone outside your project (especially acronyms and initialisms). Do not abbreviate by deleting letters within a word.

When a new API is required, it needs to be readable. A developer should be able to read the code and know what it does.

The existing run apis are fine. I do not see value in changing the existing "run" family of APIs. Kernel::run, Kernel::runTask, Kernel::runProfiling() are all readable and provide distinction between each.

Kernel::run() is great because it is the root, default, basic functionality. It is short, clear, and (I think) good. It reads in English exactly as what it does: "kernel run", aka "run a kernel".

I recommend following the same pattern for a new API. It should start with "run" and then provide the distinction.

Your second idea, Kernel::runNoTuneGlobalSize, works for me. So does the shorter Kernel::runNoTune. I think the shorter name works because no other tuning is occurring.

Adding the 4th Kernel "run" api demonstrates API proliferation. This is concerning but also a choice already made two times before with runProfiling and runTask.

    // unreadable example
    //...
    else
    {
        if (haveSrc2)
            k.args(srcarg, src.cols, (int)src.total(), ngroups, dbarg, src2arg);
        else
            k.args(srcarg, src.cols, (int)src.total(), ngroups, dbarg);
    }

    size_t globalsize = ngroups * wgs;
    if (k.run_(1, &globalsize, &wgs, true))
    {
        typedef Scalar (*part_sum)(Mat m);
        part_sum funcs[3] = { ocl_part_sum<int>, ocl_part_sum<float>, ocl_part_sum<double> },
                func = funcs[ddepth - CV_32S];

        Mat mres = db.getMat(ACCESS_READ);
        if (calc2)
            const_cast<Scalar &>(res2) = func(mres.colRange(ngroups, dbsize));

        res = func(mres.colRange(0, ngroups));
        return true;
    }

    // good readable example
    // ...
    else
    {
        if (haveSrc2)
            k.args(srcarg, src.cols, (int)src.total(), ngroups, dbarg, src2arg);
        else
            k.args(srcarg, src.cols, (int)src.total(), ngroups, dbarg);
    }

    size_t globalsize = ngroups * wgs;
    if (k.runNoTuneGlobalSize(1, &globalsize, &wgs, true))
    {
        typedef Scalar (*part_sum)(Mat m);
        part_sum funcs[3] = { ocl_part_sum<int>, ocl_part_sum<float>, ocl_part_sum<double> },
                func = funcs[ddepth - CV_32S];

        Mat mres = db.getMat(ACCESS_READ);
        if (calc2)
            const_cast<Scalar &>(res2) = func(mres.colRange(ngroups, dbsize));

        res = func(mres.colRange(0, ngroups));
        return true;
    }

dnn(ocl): fix automatic globalsize adjusting

5578ad5

- if kernel code doesn't support that

opencv-pushbot merged commit 1e0d290 into opencv:3.4 Sep 6, 2021

diablodale reviewed Sep 7, 2021

alalek mentioned this pull request Sep 11, 2021

(4.x) Merge 3.4 #20691

Merged

alalek mentioned this pull request Sep 29, 2021

dnn(OpenCL): fix conv BASIC workgroup #20774

Merged

alalek mentioned this pull request Oct 15, 2021

(5.x) Merge 4.x #20886

Merged

opencv-pushbot merged 1 commit into opencv:3.4 from alalek:dnn_ocl_fix_globalsize
