Conversation
a7220f4 to
2e9cb58
Compare
|
@anton-potapov Friendly reminder. Please take a look on CI failures. |
- opt in for now (default off)
- enable it by default to otest on CI
- fix warnings - fixed some typos in comments
- try to fix warnings
- use GAPI/OCV debugging facilities
- wait on condition variable only when there active async tasks - use current arena instead of explicit one
- fix compilation error on TBB 4.4. U1
- Added custom GAPI_EXPECT_THROW and machinery to workaround missing support for exact exception propogation in custom build tbb binaries
5c9be4a to
e029b24
Compare
| #define ITT_AUTO_TRACE_GUARD_IMPL(LINE, h) ITT_AUTO_TRACE_GUARD_IMPL_(LINE, h) | ||
| #define ITT_AUTO_TRACE_GUARD(h) ITT_AUTO_TRACE_GUARD_IMPL(__LINE__ , h) | ||
| }} //gimpl::parallel | ||
| } //namespace cv |
There was a problem hiding this comment.
So what happens if ITT is not available? It is not clear from all these #ifdefs what these declarations unfold to. Can you please somehow restructure it?
| } | ||
| o<<"]"; | ||
| return o; | ||
| } |
There was a problem hiding this comment.
It reminds me something...
|
|
||
| # Executor | ||
| src/executor/gexecutor.cpp | ||
| src/executor/gapi_tbb_executor.cpp |
There was a problem hiding this comment.
not sure if gapi_ needs to be here
| target_link_libraries(opencv_test_gapi PRIVATE ade) | ||
| endif() | ||
|
|
||
| if(USE_GAPI_TBB_EXECUTOR) |
There was a problem hiding this comment.
I think it shouldn't be CMake option. Let it compile always if there is TBB available.
To use or not to use it could be decided via our compile_args
|
|
||
|
|
||
|
|
||
| #endif /* OPENCV_GAPI_GAPI_ITT_HPP */ |
There was a problem hiding this comment.
also I'd recommend to drop gapi_ thing from here as well. Everything in modules/gapi is about G-API, but we don't have this prefix everywhere
| //place in totally ordered queue of tasks to execute. Inverse to priority, i.e. lower index means higher priority | ||
| size_t total_order_index = 0; | ||
| std::function<void()> task_body; | ||
| std::function<void(std::function<void()> callback, size_t total_order_index)> async_task_body; |
There was a problem hiding this comment.
I'd give these types names like we do in the rest of our code.
| run_op(op, m_res); | ||
| } | ||
|
|
||
| #endif |
There was a problem hiding this comment.
Instead of modifying the existing class, can we just put a new one on side? Then there's no changes, just additions to.
| cv::GComputation comp(cv::GIn(in), cv::GOut(out)); | ||
|
|
||
| EXPECT_THROW(comp.apply(in_mat, out_mat, cv::compile_args(pkg)), std::logic_error); | ||
| GAPI_EXPECT_THROW(comp.apply(in_mat, out_mat, cv::compile_args(pkg)), std::logic_error); |
| EXPECT_THROW(expr, expected_exception); \ | ||
| } else { \ | ||
| EXPECT_ANY_THROW(expr); \ | ||
| } \ |
There was a problem hiding this comment.
This shouldn't be in precomp, I believe
| // ); | ||
|
|
||
| LOG_INFO(NULL, "Done. Executed " <<executed<<" tasks"); | ||
| } |
There was a problem hiding this comment.
200-LOC functions shouldn't exist.
|
@anton-potapov Friendly reminder. How does the PR correlates with #17851 ? |
|
@anton-potapov probably we should close this one? |
|
@anton-potapov Friendly reminder. |
1 similar comment
|
@anton-potapov Friendly reminder. |
This patch was split into parts to speed up integration. #17851 is the first part - the executor itself. The actual usage it the next part. So this pull request can be closed. |
This Merge Request adds and opt in TBB based executor to GAPI
TBB executor description
TBB executor engine operate with graph of tasks represented via
cv::gimpl::parallel::tile_nodeinstances.Dependencies
For every task (
tile_node) there might be a number of input dependencies (i.e. tasks that should be executed before). This number is represented twice in thetile_nodestructure. They aredependenciesanddependency_count. First one describes total number (i.e. input edges in the graph before execution) and does not change during execution, while second one describes current number of unresolved input dependencies. Oncedependency_countdrops to zero, task is ready for execution as all it's dependencies are satisfiedDependents
Every task that directly depends on the current one is listed in the
tile_node::dependeesmember. The engine decreesdependency_countof these tasks right after the completion of the current task.Sync Vs. Async payload
Task's body in enclosed in either
task_bodyorasync_task_bodymembers depending on task's nature (andasyncflag). While in both cases the engine calls passed in function there are two differences:Execution and Priorities
Once all input dependencies for the task are satisfied (i.e. it's
dependency_countdrops to zero) it is treated as ready for execution and is put into the "ready" queue. The queue (tbb::concurrent_priority_queue) sorts all task according to their priorities.Mapping to TBB
TBB uses
tbb::taskderivatives to describe a piece of work. They can be linked together into a tree describing tasks dependencies. Every TBB task has arefcountfield describing unsatisfied dependencies. Once it drops to zero task is ready for execution (i.e. can bespawned). Dependency tasks are called child tasks (i.e. parent depends on it's childs). Every child task on creation increases it's parent'srefcount. After child task is executed TBB task scheduler automatically decrease parent'srefcount. Root of tasks tree is used only to call it'stbb::task::wait_for_allmethod to wait for the CPU work to complete.wait_for_allwaits for rootrefcountvalue to become1as a sign that all work is done.The executor engine creates a
roottask (derived fromtbb::task) - a root of the task's tree. Once newtile_nodeitem is pushed to the ready queue, newroot's child task is created. Body of every child task does the followingtile_nodeinstance from the priority queuedependency_countfor every dependenttile_nodeinstance (viatile_node::dependeesmember)tile_nodewithdependency_countdropped to zero, extra (tbb) child task is created.Asynchronicity work and TBB active waiting
While waiting for the root task, TBB does not put to sleep calling thread. Instead it keep searching for
spawned (orenqueued) TBB tasks to do, assuming it somehow related with the root task.How ether, in case of async work (like GPU or non TBB CPU work) there may be no such. In this case calling thread constantly spinning, wasting CPU time. To overcome this more elaborated scheme then plain calling to
tbb::task::wait_for_allis needed.tile_node graph creation and re-usage
dependency_countis reset todependenciesvalue for easier reuse oftile_nodePull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.