core(parallel): plugins support #19470

Merged
opencv-pushbot merged 1 commit into opencv:master from alalek:core_parallel_plugins
Feb 16, 2021

@alalek (Member) commented Feb 6, 2021

relates #19365

Usage examples are on GitHub PR page.

  • TBB plugin build example:
cd plugin_tbb
cmake <opencv>/modules/core/misc/plugins/parallel_tbb
cmake --build . --config Release
  • OpenMP plugin build example:
cd plugin_openmp
cmake <opencv>/modules/core/misc/plugins/parallel_openmp
cmake --build . --config Release
  • oneTBB plugin build example (reuse TBB plugin code under different binary name):
cd plugin_onetbb
cmake <opencv>/modules/core/misc/plugins/parallel_tbb -DOPENCV_PLUGIN_OUTPUT_NAME=opencv_core_parallel_onetbb
cmake --build . --config Release
  • Extra steps for Windows:
cmake -G "Visual Studio 16 2019" -A x64 -DOpenCV_DIR:PATH=...
  • Install target (no .pdb):
cmake "-DCMAKE_INSTALL_PREFIX:PATH=<opencv_build>/bin/Release" ...
cmake --build . --config Release --target INSTALL

Hardening option:

cmake -DENABLE_BUILD_HARDENING=ON ...

Control environment variables:

  • OPENCV_PARALLEL_PRIORITY_TBB=0 - disable
  • OPENCV_PARALLEL_PRIORITY_LIST=TBB - preferred backends names
  • OPENCV_PARALLEL_BACKEND=TBB - forced backend name
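
As a rough illustration of how the three controls above could combine, here is a toy C++ model. It is not OpenCV's registry code: the `Backend` and `Controls` types, the `resolveOrder` function, and the exact order in which the controls are applied are all assumptions made for this sketch. Only the default priorities (ONETBB 1000, TBB 990, OPENMP 980) come from the debug log later in this thread.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not OpenCV's internal registry.
struct Backend {
    std::string name;
    int priority;  // higher is tried first; 0 means disabled
};

struct Controls {
    std::map<std::string, int> priorityOverride;  // models OPENCV_PARALLEL_PRIORITY_<NAME>=<value>
    std::vector<std::string> priorityList;        // models OPENCV_PARALLEL_PRIORITY_LIST
    std::string forcedBackend;                    // models OPENCV_PARALLEL_BACKEND
};

// Returns backend names in the order they would be tried (assumed semantics).
std::vector<std::string> resolveOrder(std::vector<Backend> backends, const Controls& c)
{
    // 1. Apply per-backend overrides, then drop disabled backends.
    for (Backend& b : backends) {
        auto it = c.priorityOverride.find(b.name);
        if (it != c.priorityOverride.end())
            b.priority = it->second;
    }
    backends.erase(std::remove_if(backends.begin(), backends.end(),
                                  [](const Backend& b) { return b.priority <= 0; }),
                   backends.end());

    // 2. Sort the remaining backends by priority, highest first.
    std::stable_sort(backends.begin(), backends.end(),
                     [](const Backend& a, const Backend& b) { return a.priority > b.priority; });

    std::vector<std::string> order;
    auto contains = [&order](const std::string& name) {
        return std::find(order.begin(), order.end(), name) != order.end();
    };

    // 3. Preferred names go first, in the order they were listed.
    for (const std::string& name : c.priorityList)
        for (const Backend& b : backends)
            if (b.name == name && !contains(name))
                order.push_back(name);
    for (const Backend& b : backends)
        if (!contains(b.name))
            order.push_back(b.name);

    // 4. A forced backend narrows the candidates to that single name.
    if (!c.forcedBackend.empty()) {
        const bool present = contains(c.forcedBackend);
        order.clear();
        if (present)
            order.push_back(c.forcedBackend);
    }
    return order;
}
```

Under these assumptions, an override of 0 for TBB removes it from the candidates, a preferred list containing only TBB moves it ahead of ONETBB, and a forced backend leaves exactly one candidate.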

Tasks:

  • (skip) extract videoio changes into dedicated PR
  • API to select backend by name (setParallelForBackend(const std::string& backendName, ...))
    • Python works
  • compiler options, hardening
  • reduce number of .cpp files
  • (postpone) drop OpenMP / TBB builtin code from src/parallel.cpp
  • (postpone) cmake: support plugins build with OpenCV
    • problem with cvconfig.h / gapi (link with tbb binaries)
  • (postpone) refactor builtin parallel code as ParallelForAPI
    • fix HAVE_OPENMP (create define from CMake)
  • documentation
  • (postpone) adding new plugins
  • ? needed OPENCV_PLUGIN_MODULE_PREFIX
  • Unify plugin_standalone.cmake
    • validated videoio/misc/build_plugins.sh
  • Workaround crashes in worker threads after plugin unloading (do not unload parallel plugins)
force_builders=ARMv7,Custom
build_image:Custom=centos:7
buildworker:Custom=linux-1,linux-4,linux-6

@mshabunin mshabunin self-assigned this Feb 9, 2021
@mshabunin (Contributor) commented:

I have a problem with the OpenMP backend built as a plugin (Ubuntu 18):

$ ./bin/opencv_test_core --gtest_filter=*parallel*
<...>
CTEST_FULL_OUTPUT
OpenCV version: 4.5.1-dev
OpenCV VCS version: 4.5.1-143-geb10e518cc-dirty
Build type: Release
Compiler: /usr/bin/c++  (ver 7.5.0)
[DEBUG:0] global /opencv/modules/core/src/parallel/parallel.cpp (103) createDefaultParallelForAPI core(parallel): Initializing parallel backend...
[DEBUG:0] global /opencv/modules/core/src/parallel/registry_parallel.impl.hpp (63) ParallelBackendRegistry core(parallel): Builtin backends(3): ONETBB(1000); TBB(990); OPENMP(980)
[DEBUG:0] global /opencv/modules/core/src/parallel/registry_parallel.impl.hpp (88) ParallelBackendRegistry core(parallel): Available backends(3): ONETBB(1000); TBB(990); OPENMP(980)
[ INFO:0] global /opencv/modules/core/src/parallel/registry_parallel.impl.hpp (90) ParallelBackendRegistry core(parallel): Enabled backends(3, sorted by priority): ONETBB(1000); TBB(990); OPENMP(980)
[ INFO:0] global /opencv/modules/core/src/parallel/parallel.cpp (50) createParallelForAPI core(parallel): requested backend name: OPENMP
[DEBUG:0] global /opencv/modules/core/src/parallel/parallel.cpp (65) createParallelForAPI core(parallel): trying backend: OPENMP (priority=980)
[DEBUG:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (219) getPluginCandidates core(parallel): OPENMP plugin's glob is 'libopencv_core_parallel_openmp*.so', 1 location(s)
[DEBUG:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (226) getPluginCandidates     - /build/lib: 1
[DEBUG:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (230) getPluginCandidates Found 1 plugin(s) for OPENMP
[ INFO:0] global /opencv/modules/core/src/utils/plugin_loader.impl.hpp (43) libraryLoad load /build/lib/libopencv_core_parallel_openmp.so => OK
[DEBUG:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (31) initPluginAPI Found entry: 'opencv_core_parallel_plugin_init_v0'
[DEBUG:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (80) checkCompatibility core(parallel): initialized 'OpenMP (201511) OpenCV parallel plugin': built with OpenCV 4.5 (ABI/API = 0/0), current OpenCV version is '4.5.1-dev' (ABI/API = 0/0)
[ INFO:0] global /opencv/modules/core/src/parallel/plugin_parallel_wrapper.impl.hpp (48) initPluginAPI core(parallel): plugin is ready to use 'OpenMP (201511) OpenCV parallel plugin'
[ INFO:0] global /opencv/modules/core/src/parallel/parallel.cpp (73) createParallelForAPI core(parallel): using backend: OPENMP (priority=980)
Parallel framework: openmp (nthreads=8)
CPU features: SSE SSE2 SSE3 *SSE4.1 *SSE4.2 *FP16 *AVX *AVX2 *AVX512-SKX?
Intel(R) IPP version: ippIP AVX2 (l9) 2020.0.0 Gold (-) Oct 19 2019
Intel(R) IPP features code: 0x8000
OpenCL Platforms:
<...>
TEST: Skip tests with tags: 'mem_6gb', 'verylong'
Note: Google Test filter = *parallel*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Core_Rand
[ RUN      ] Core_Rand.parallel_for_stable_results
[       OK ] Core_Rand.parallel_for_stable_results (3 ms)
[----------] 1 test from Core_Rand (3 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3 ms total)
[  PASSED  ] 1 test.
[ INFO:0] global /opencv/modules/core/src/utils/plugin_loader.impl.hpp (50) libraryRelease unload /build/lib/libopencv_core_parallel_openmp.so
Segmentation fault

With gdb:

[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Core_Rand
[ RUN      ] Core_Rand.parallel_for_stable_results
[New Thread 0x7fffdbc45700 (LWP 27795)]
[New Thread 0x7fffdb444700 (LWP 27796)]
[New Thread 0x7fffdac43700 (LWP 27797)]
[New Thread 0x7fffda442700 (LWP 27798)]
[New Thread 0x7fffd9c41700 (LWP 27799)]
[New Thread 0x7fffd9440700 (LWP 27800)]
[New Thread 0x7fffd8c3f700 (LWP 27801)]
[       OK ] Core_Rand.parallel_for_stable_results (1 ms)
[----------] 1 test from Core_Rand (1 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3 ms total)
[  PASSED  ] 1 test.
[ INFO:0] global /opencv/modules/core/src/utils/plugin_loader.impl.hpp (50) libraryRelease unload /build/lib/libopencv_core_parallel_openmp.so

Thread 9 "opencv_test_cor" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd8c3f700 (LWP 27801)]
0x00007fffdbc5ef22 in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
(gdb) bt
#0  0x00007fffdbc5ef22 in  () at /usr/lib/x86_64-linux-gnu/libgomp.so.1
#1  0x00007fffdbc5c948 in  () at /usr/lib/x86_64-linux-gnu/libgomp.so.1
#2  0x00007ffff79a66db in start_thread (arg=0x7fffd8c3f700) at pthread_create.c:463
#3  0x00007ffff3c0488f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@@ -0,0 +1,60 @@
function(ocv_create_builtin_core_parallel_plugin name target)
Review comment (Contributor):

This file and plugin_standalone.cmake are similar to videoio/cmake/plugin*.cmake. Can they be unified somehow? Maybe moved to the root cmake/ directory?

@alalek (Member, Author) left a comment:

@mshabunin Thanks!
Added a task to find a workaround for the crashes on plugin unloading.

message(STATUS "OpenMP detection requires CMake 3.9+") # OpenMP::OpenMP_CXX target
endif()

find_package(OpenMP)
Review comment (Member, Author):

libraryRelease unload /build/lib/libopencv_core_parallel_openmp.so
Segmentation fault

LD_DEBUG=libs dump is:

[  PASSED  ] 1 test.
[ INFO:0] global /home/alalek/projects/opencv/dev/modules/core/src/utils/plugin_loader.impl.hpp (50) libraryRelease unload /home/alalek/projects/opencv/build/opencv/lib/libopencv_core_parallel_openmp.so
    232271:	
    232271:	calling fini: /home/alalek/projects/opencv/build/opencv/lib/libopencv_core_parallel_openmp.so [0]
    232271:	
    232271:	
    232271:	calling fini: /lib64/libgomp.so.1 [0]
    232271:	
Segmentation fault (core dumped)

libgomp.so.1 is de-initialized (its fini is called), but its worker threads are still alive (gdb shows that).

So:

  • this is a bug in libgomp.so.1 (no idea why its worker threads are not terminated at the unload stage)
  • we should not force termination of the OpenMP thread pool from the OpenCV side, because the application may use OpenMP directly
  • some workaround is required (probably: don't unload parallel plugins)

A similar issue can be observed with TBB.

With TBB the issue is sporadic (not 100% reproducible).

GDB:

[ RUN      ] Core_Rand.parallel_for_stable_results
[New Thread 0x7fffd46b7640 (LWP 232758)]
[New Thread 0x7fffa8ffa640 (LWP 232760)]
[New Thread 0x7fffb07f8640 (LWP 232759)]
[       OK ] Core_Rand.parallel_for_stable_results (174 ms)
...
[ INFO:0] global /home/alalek/projects/opencv/dev/modules/core/src/utils/plugin_loader.impl.hpp (50) libraryRelease unload /home/alalek/projects/opencv/build/opencv/lib/libopencv_core_parallel_onetbb.so

Thread 34 "opencv_test_cor" received signal SIGSEGV, Segmentation fault.
(gdb) info threads 
...
  33   Thread 0x7fffd46b7640 (LWP 232758) "opencv_test_cor" 0x00007ffff47ba55d in syscall () from /lib64/libc.so.6
* 34   Thread 0x7fffa8ffa640 (LWP 232760) "opencv_test_cor" 0x00007fffe749ba32 in ?? ()
  35   Thread 0x7fffb07f8640 (LWP 232759) "opencv_test_cor" 0x00007ffff47ba55d in syscall () from /lib64/libc.so.6

(LWP 232760 is created during the test run)

LD_DEBUG dump:

[ INFO:0] global /home/alalek/projects/opencv/dev/modules/core/src/utils/plugin_loader.impl.hpp (50) libraryRelease unload /home/alalek/projects/opencv/build/opencv/lib/libopencv_core_parallel_onetbb.so
    232863:	
    232863:	calling fini: /home/alalek/projects/opencv/build/opencv/lib/libopencv_core_parallel_onetbb.so [0]
    232863:	
    232863:	
    232863:	calling fini: /lib64/libtbb.so.2 [0]
    232863:	
    232863:	
    232863:	calling fini: /lib64/libirml.so.1 [0]
    232863:	
Segmentation fault (core dumped)

TODO: try out some workarounds.

Review comment (Member, Author):

Added code to prevent automatic unloading of the parallel backend library (including its underlying libraries).
The impact is small, because plugins stay loaded until opencv_core itself is unloaded anyway.
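
The "do not unload" idea can be sketched with plain POSIX `dlopen`/`dlclose`. This is a minimal illustration and not the code added in this PR: the `DynamicLib` class and its `keepLoaded` flag are hypothetical names, and the sketch assumes a POSIX system (older glibc versions may need linking with `-ldl`).

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical RAII wrapper for illustration; not OpenCV's plugin loader.
class DynamicLib {
public:
    DynamicLib(const std::string& path, bool keepLoaded)
        : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)), keepLoaded_(keepLoaded) {}

    ~DynamicLib()
    {
        // dlclose() may run the finalizers of the library and of its
        // dependencies. If a dependency still has live worker threads,
        // those threads can crash afterwards. With keepLoaded_ set, the
        // handle is leaked on purpose and released at process exit.
        if (handle_ && !keepLoaded_)
            dlclose(handle_);
    }

    DynamicLib(const DynamicLib&) = delete;
    DynamicLib& operator=(const DynamicLib&) = delete;

    bool isLoaded() const { return handle_ != nullptr; }

    // Returns nullptr when the library is not loaded or the symbol is missing.
    void* symbol(const char* name) const { return handle_ ? dlsym(handle_, name) : nullptr; }

private:
    void* handle_;
    bool keepLoaded_;
};
```

A caller would construct `DynamicLib lib(path, /*keepLoaded=*/true)` for a threading backend, so function pointers obtained through `symbol()` stay valid after the wrapper is destroyed. On glibc, passing `RTLD_NODELETE` to `dlopen` is another way to get a similar effect.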

Comment on lines +3 to +16
# FIXIT: stop using PARENT_SCOPE in dependencies
if(PROJECT_NAME STREQUAL "OpenCV")
  macro(add_backend backend_id cond_var)
    if(${cond_var})
      include("${CMAKE_CURRENT_LIST_DIR}/detect_${backend_id}.cmake")
    endif()
  endmacro()
else()
  function(add_backend backend_id cond_var)
    if(${cond_var})
      include("${CMAKE_CURRENT_LIST_DIR}/detect_${backend_id}.cmake")
    endif()
  endfunction()
endif()
Review comment (Member, Author):

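The scope of dependencies is inconsistent: inside the OpenCV build add_backend is a macro, so detect_*.cmake runs in the caller's scope, while in a standalone build it is a function, which adds one more scope level. Any set(... PARENT_SCOPE) in those scripts therefore lands in a different scope in each case.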

Review comment (Contributor):

Perhaps we could switch to target checks instead of HAVE_ variables. Additional information can be stored in custom target properties.
