(5.x) Merge 4.x by asmorkalov · Pull Request #24338 · opencv/opencv

asmorkalov · 2023-09-28T13:46:40Z

OpenCV Contrib: opencv/opencv_contrib#3566
OpenCV Extra: opencv/opencv_extra#1102

#23897 from fengyuentau:refactor_fc
#24037 from Abdurrahheem:ash/dev_einsum
#24058 from hanliutong:rewrite-imgporc
#24074 from Kumataro/fix24057
#24126 from AleksandrPanov:fix_charuco_checkBoard
#24131 from cudawarped:cuda_add_default_ptx
#24132 from hanliutong:rewrite-imgproc2
#24166 from hanliutong:rewrite-remaining
#24201 from lpylpy0514:4.x
#24239 from asmorkalov:as/msmf_returned_fourcc
#24247 from AleksandrPanov:fix_drawDetectedCornersCharuco_type_error
#24250 from dkurt:ts_fixture_constructor_skip_2
#24260 from vrabaud:ubsan
#24263 from georgthegreat:msan-include
#24266 from alexlyulkov:al/tf-argmax-default-dim
#24269 from FlyinTeller:patch-1
#24270 from dkurt:fix_24256
#24274 from vrabaud:webp_1.3.2
#24275 from alexlyulkov:al/fix-tf-graph-simplifier
#24278 from georgthegreat:compat-fixes
#24280 from casualwind:parallel_opt
#24283 from fengyuentau:halide_tests
#24286 from ashadrina:intel_icx_compiler_support
#24288 from tailsu:sd/emscripten-3.1.45-fixes
#24291 from visitorckw:fix-memory-leak
#24295 from fengyuentau:add_onnx_expand
#24301 from hanliutong:rewrite-stereo-sift
#24302 from dkurt:ts_setup_skip
#24303 from asmorkalov:as/vittack_warning_fix
#24305 from hanliutong:toolchain
#24309 from dkurt:gemm_ov_hotfix
#24316 from alexlyulkov:al/fix-caffe-read-segfault
#24329 from asmorkalov/as/openvino_ci
#24334 from fengyuentau:fix_24319

Temporarily disabled the new test Charuco.testSeveralBoardsWithCustomIds. PR #23473 changed the output shape for detector corners. A more detailed investigation is required.

Previous "Merge 4.x": #24254

…CH_PTX to be passed in isolation

More strict test for MSMF FOURCC (camera)

…onds added tests

…ornersCharuco_type_error fix type cast in drawDetectedMarkers, drawDetectedCornersCharuco, drawDetectedDiamonds

Add missing sanitizer interface include

openBLAS windows release calls their library libopenblas which was not recognized before. see opencv#24268

…-dim Added default dimension value to tensorflow ArgMax and ArgMin layers opencv#24266

Added a default dimension value to the TensorFlow ArgMax and ArgMin layers. Added an exception when accessing a layer's input with an out-of-range index.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=48452

Update OpenCVFindOpenBLAS.cmake to accomodate alternative lib name

`cuda`: update default PTX behaviour when `CUDA_ARCH_BIN` is unset

Fix undefined behavior arithmetic in copyMakeBorder and adjustROI. opencv#24260

The undefined behavior comes from a negative int being multiplied by a size_t when computing a pointer increment.

To test, compile with:
```
mkdir build
cd build
cmake ../ -DCMAKE_C_FLAGS_INIT="-fsanitize=undefined" -DCMAKE_CXX_FLAGS_INIT="-fsanitize=undefined" -DCMAKE_C_COMPILER="/usr/bin/clang" -DCMAKE_CXX_COMPILER="/usr/bin/clang++" -DCMAKE_SHARED_LINKER_FLAGS="-fsanitize=undefined -lubsan"
```
And run:
```
make -j opencv_test_core && ./bin/opencv_test_core --gtest_filter=*UndefinedBehavior*
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
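The hazard described for opencv#24260 can be illustrated outside OpenCV. In an expression such as `data + rows * step`, with `rows` an `int` and `step` a `size_t`, the usual arithmetic conversions turn a negative `rows` into a very large unsigned value before the multiplication, and the resulting pointer offset is what UBSan reports. The sketch below is not the actual patch; `offsetRows` is a hypothetical helper that only shows one common way to keep the product signed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not OpenCV code): move a pointer by `rows` rows,
// where `rows` may be negative and `step` is the row stride in bytes.
inline unsigned char* offsetRows(unsigned char* data, int rows, std::size_t step)
{
    assert(data != nullptr);
    // Convert both operands to a signed type first, so the product is a
    // signed byte offset instead of a wrapped-around size_t value.
    const std::ptrdiff_t delta =
        static_cast<std::ptrdiff_t>(rows) * static_cast<std::ptrdiff_t>(step);
    return data + delta;
}
```

The caller remains responsible for keeping the result inside the underlying allocation; the cast only removes the unsigned wrap-around from the offset computation.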

Higher threshold for FasterRCNN_vgg16

Rewrite Universal Intrinsic code by using new API: ImgProc module. opencv#24058

The goal of this series of PRs is to modify the SIMD code blocks guarded by the CV_SIMD macro in the `opencv/modules/imgproc` folder: rewrite them by using the new Universal Intrinsic API.

For easier review, this PR includes a part of the rewritten code, and another part will be brought in the next PR (coming soon). I tested this patch on RVV (QEMU) and AVX devices; `opencv_test_imgproc` passes.

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter), related PR opencv#23885 and opencv#23980.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

Merge pull request opencv#24274 from vrabaud:webp_1.3.2

Bump libwebp to 1.3.2 opencv#24274

This is version [c1ffd9a](https://chromium.googlesource.com/webm/libwebp/+/c1ffd9ac7593894c40a1de99d03f0b7af8af2577)
It is 1.3.2 with a few patches that were made right after to help compilation.
No need for patches on the OpenCV side!

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch

Skip test on SkipTestException at fixture's constructor (version 2) opencv#24250

### Pull Request Readiness Checklist

Another version of opencv#24186 (reverted by opencv#24223). The current implementation cannot handle a skip exception at `static void SetUpTestCase` but works on `virtual void SetUp`.

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

More fixes for iterators-are-pointers case

…lifier Fixed removePhaseSwitches in tf_graph_simplifier

In the previous code, there was a memory leak issue where the previously allocated memory was not freed upon a failed realloc operation. This commit addresses the problem by releasing the old memory before setting the pointer to NULL in case of a realloc failure. This ensures that memory is properly managed and avoids potential memory leaks.
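The failure mode described above is the classic `p = realloc(p, n)` pattern: when `realloc` fails it returns a null pointer and leaves the old block allocated, so overwriting `p` loses the only reference to it. The following is a minimal sketch of the handling the commit message describes (release the old block, then null the pointer); `resizeOrRelease` is a hypothetical name and not the function touched by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resize *buf to newSize bytes.
// On failure the old block is freed and *buf is set to nullptr, so the
// caller never ends up with a leaked or dangling pointer.
inline bool resizeOrRelease(void** buf, std::size_t newSize)
{
    assert(buf != nullptr);
    assert(newSize > 0);  // realloc with size 0 is implementation-defined
    void* resized = std::realloc(*buf, newSize);
    if (resized == nullptr)
    {
        std::free(*buf);  // realloc left the old block untouched: release it
        *buf = nullptr;
        return false;
    }
    *buf = resized;       // the old pointer must not be used after success
    return true;
}
```

Whether to free the old block or keep it for the caller on failure is a design choice; the commit message states that this change frees it.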

build fixes for emscripten 3.1.45

Rewrite Universal Intrinsic code by using new API: ImgProc module Part 2 opencv#24132

The goal of this series of PRs is to modify the SIMD code blocks guarded by the CV_SIMD macro in the opencv/modules/imgproc folder: rewrite them by using the new Universal Intrinsic API.

This is the second part of the modification to the Imgproc module (Part 1: opencv#24058). I tested this patch on RVV (QEMU) and AVX devices; `opencv_test_imgproc` passes.

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

Python: support tuple src for cv::add()/subtract()/... opencv#24074

fix opencv#24057

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

Rewrite Universal Intrinsic code: ImgProc (CV_SIMD_WIDTH related Part) opencv#24166

Related PR: opencv#24058, opencv#24132.

The goal of this series of PRs is to modify the SIMD code blocks in the opencv/modules/imgproc folder by using the new Universal Intrinsic API.

The modification of this PR mainly focuses on the code that uses the `CV_SIMD_WIDTH` macro. This macro is sometimes used for loop tail processing, such as in `box_filter.simd.hpp` and `morph.simd.hpp`.

```cpp
#if CV_SIMD
    int i = 0;
    for (; i < n - v_uint16::nlanes; i += v_uint16::nlanes)
    {
        // some universal intrinsic code
        // e.g. v_uint16...
    }
#if CV_SIMD_WIDTH > 16
    for (; i < n - v_uint16x8::nlanes; i += v_uint16x8::nlanes)
    {
        // handle loop tail by 128 bit SIMD
        // e.g. v_uint16x8
    }
#endif //CV_SIMD_WIDTH
#endif// CV_SIMD
```

The main conflict is that the variable-length Universal Intrinsic backend cannot use 128-bit fixed-length data structures. Therefore, this PR uses a scalar loop to handle the loop tail.

This PR is marked as draft because the modification of the `box_filter.simd.hpp` file caused a compilation error. The error is initially believed to be caused by an internal error in the GCC compiler.

```bash
box_filter.simd.hpp:1162:5: internal compiler error: Segmentation fault
 1162 | }
      | ^
0xe03883 crash_signal
        /wafer/share/gcc/gcc/toplev.cc:314
0x7ff261c4251f ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x6bde48 hash_set<rtl_ssa::set_info*, false, default_hash_traits<rtl_ssa::set_info*> >::iterator::operator*()
        /wafer/share/gcc/gcc/hash-set.h:125
0x6bde48 extract_single_source
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1184
0x6bde48 extract_single_source
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:1174
0x119ad9e pass_vsetvl::propagate_avl() const
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4087
0x119ceaf pass_vsetvl::execute(function*)
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4344
0x119ceaf pass_vsetvl::execute(function*)
        /wafer/share/gcc/gcc/config/riscv/riscv-vsetvl.cc:4325
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
```

This PR can be compiled with Clang 16, and `opencv_test_imgproc` passes on QEMU.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
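The scalar-tail approach described for opencv#24166 can be sketched without OpenCV. In the example below, `lanes` is a run-time value standing in for a vector length that is not known at compile time, and the inner `k` loop stands in for a single vector operation; `satAdd` and `addRows` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Saturating 16-bit add shared by the block loop and the scalar tail.
inline std::uint16_t satAdd(std::uint16_t a, std::uint16_t b)
{
    const std::uint32_t s = static_cast<std::uint32_t>(a) + b;
    return static_cast<std::uint16_t>(s > 0xFFFFu ? 0xFFFFu : s);
}

// Hypothetical kernel: dst[i] = saturate(a[i] + b[i]) for i in [0, n).
// Returns how many elements were handled by the scalar tail.
inline int addRows(const std::uint16_t* a, const std::uint16_t* b,
                   std::uint16_t* dst, int n, int lanes)
{
    assert(n >= 0 && lanes > 0);
    int i = 0;
    // Main loop: whole blocks of `lanes` elements.
    for (; i <= n - lanes; i += lanes)
        for (int k = 0; k < lanes; ++k)  // stands in for one vector op
            dst[i + k] = satAdd(a[i + k], b[i + k]);
    // Scalar tail: the remaining n % lanes elements, with no fixed-width
    // 128-bit fallback in between.
    const int tailStart = i;
    for (; i < n; ++i)
        dst[i] = satAdd(a[i], b[i]);
    return n - tailStart;
}
```

The trade-off is that the tail costs up to `lanes - 1` scalar iterations per row, in exchange for code that does not depend on a fixed-length vector type.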

VIT track(gsoc realtime object tracking model) opencv#24201

Vit tracker (vision transformer tracker) is a much better model for real-time object tracking. Vit tracker can achieve speeds exceeding nanotrack by 20% in single-threaded mode on an ARM chip, and the advantage becomes even more pronounced in multi-threaded mode. In addition, on the LaSOT dataset, vit tracker demonstrates better performance compared to nanotrack. Moreover, vit tracker provides confidence values during the tracking process, which can be used to determine if the tracking is currently lost.

opencv_zoo: opencv/opencv_zoo#194
opencv_extra: [https://github.com/opencv/opencv_extra/pull/1088](https://github.com/opencv/opencv_extra/pull/1088)

# Performance comparison is as follows:

NOTE: The speed below is tested by **onnxruntime** because opencv has poor support for the transformer architecture for now.

ONNX speed test on ARM platform (Apple M2) (ms):

| thread nums | 1 | 2 | 3 | 4 |
|--------|--------|--------|--------|--------|
| nanotrack | 5.25 | 4.86 | 4.72 | 4.49 |
| vit tracker | 4.18 | 2.41 | 1.97 | **1.46 (3X)** |

ONNX speed test on x86 platform (Intel i3 10105) (ms):

| thread nums | 1 | 2 | 3 | 4 |
|--------|--------|--------|--------|--------|
| nanotrack | 3.20 | 2.75 | 2.46 | 2.55 |
| vit tracker | 3.84 | 2.37 | 2.10 | 2.01 |

opencv speed test on x86 platform (Intel i3 10105) (ms):

| thread nums | 1 | 2 | 3 | 4 |
|--------|--------|--------|--------|--------|
| vit tracker | 31.3 | 31.4 | 31.4 | 31.4 |

Performance test on the LaSOT dataset (AUC is the most important metric; a higher AUC means a better tracker):

| LASOT | AUC | P | Pnorm |
|--------|--------|--------|--------|
| nanotrack | 46.8 | 45.0 | 43.3 |
| vit tracker | 48.6 | 44.8 | 54.7 |

[https://youtu.be/MJiPnu1ZQRI](https://youtu.be/MJiPnu1ZQRI)

In target tracking tasks, the score is an important indicator of whether the current target is lost. In the video, vit tracker tracks the target and displays the current score in the upper left corner of the video. When the target is lost, the score drops significantly. Nanotrack, in contrast, returns a score of 0.9 in any situation, so we cannot determine whether the target is lost.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

…opencv#23897)

* first commit
* turned C from input to constant; force C constant in impl; better handling 0d/1d cases
* integrate with gemm from ficus nn
* fix const inputs
* adjust threshold for int8 tryQuantize
* adjust threshold for int8 quantized 2
* support batched gemm and matmul; tune threshold for rcnn_ilsvrc13; update googlenet
* add gemm perf against innerproduct
* add perf tests for innerproduct with bias
* fix perf
* add memset
* renamings for next step
* add dedicated perf gemm
* add innerproduct in perf_gemm
* remove gemm and innerproduct perf tests from perf_layer
* add perf cases for vit sizes; prepack constants
* remove batched gemm; fix wrong trans; optimize KC
* remove prepacking for const A; several fixes for const B prepacking
* add todos and gemm expression
* add optimized branch for avx/avx2
* trigger build
* update macros and signature
* update signature
* fix macro
* fix bugs for neon aarch64 & x64
* add backends: cuda, cann, inf_ngraph and vkcom
* fix cuda backend
* test commit for cuda
* test cuda backend
* remove debug message from cuda backend
* use cpu dispatcher
* fix neon macro undef in dispatcher
* fix dispatcher
* fix inner kernel for neon aarch64
* fix compiling issue on armv7; try fixing accuracy issue on other platforms
* broadcast C with beta multiplied; improve func namings
* fix bug for avx and avx2
* put all platform-specific kernels in dispatcher
* fix typos
* attempt to fix compile issues on x64
* run old gemm when neon, avx, avx2 are all not available; add kernel for armv7 neon
* fix typo
* quick fix: add macros for pack4
* quick fix: use vmlaq_f32 for armv7
* quick fix for missing macro of fast gemm pack f32 4
* disable conformance tests when optimized branches are not supported
* disable perf tests when optimized branches are not supported
* decouple cv_try_neon and cv_neon_aarch64
* drop googlenet_2023; add fastGemmBatched
* fix step in fastGemmBatched
* cpu: fix initialization ofb; gpu: support batch
* quick followup fix for cuda
* add default kernels
* quick followup fix to avoid macro redef
* optmized kernels for lasx
* resolve mis-alignment; remove comments
* tune performance for x64 platform
* tune performance for neon aarch64
* tune for armv7
* comment time consuming tests
* quick follow-up fix

Fix memory leak and handle realloc failure

…port Add Intel® oneAPI DPC++/C++ Compiler (icx) opencv#24286

Intel® C++ Compiler Classic (icc) is deprecated and will be removed in a oneAPI release in the second half of 2023 ([deprecation notice](https://community.intel.com/t5/Intel-oneAPI-IoT-Toolkit/DEPRECATION-NOTICE-Intel-C-Compiler-Classic/m-p/1412267#:~:text=Intel%C2%AE%20C%2B%2B%20Compiler%20Classic%20(icc)%20is%20deprecated%20and%20will,the%20second%20half%20of%202023.)). This commit adds support for the next-generation compiler, Intel® oneAPI DPC++/C++ Compiler (icx) (the compiler documentation is available at this [link](https://www.intel.com/content/www/us/en/docs/dpcpp-cpp-compiler/developer-guide-reference/2023-2/overview.html)).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

Rewrite Universal Intrinsic code: features2d and calib3d module. opencv#24301

The goal of this series of PRs is to modify the SIMD code blocks guarded by the CV_SIMD macro: rewrite them by using the new Universal Intrinsic API.

This is the modification to the features2d module and calib3d module.

Tested with clang 16 and QEMU v7.0.0. `AP3P.ctheta1p_nan_23607` failed because of a small calculation error. However, this patch does not touch the relevant code, and the error always reproduces on QEMU, regardless of whether the patch is applied or not. I think we can ignore it.

```
[ RUN ] AP3P.ctheta1p_nan_23607
/home/hanliutong/project/opencv/modules/calib3d/test/test_solvepnp_ransac.cpp:2319: Failure
Expected: (cvtest::norm(res.colRange(0, 2), expected, NORM_INF)) <= (3e-16), actual: 3.33067e-16 vs 3e-16
[ FAILED ] AP3P.ctheta1p_nan_23607 (26 ms)
...
[==========] 148 tests from 64 test cases ran. (1147114 ms total)
[ PASSED ] 147 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] AP3P.ctheta1p_nan_23607
```

Note: Two test cases fail with GCC 13.2.1 without this patch; it seems there is something wrong with the RVV part on GCC.

```
[----------] Global test environment tear-down
[==========] 148 tests from 64 test cases ran. (1511399 ms total)
[ PASSED ] 146 tests.
[ FAILED ] 2 tests, listed below:
[ FAILED ] Calib3d_StereoSGBM.regression
[ FAILED ] Calib3d_StereoSGBM_HH4.regression
```

The patch is partially auto-generated by using the [rewriter](https://github.com/hanliutong/rewriter).

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

cmake: Fix riscv-gnu toolchain file. opencv#24305

CMake (3.22.1) failed without the `PATHS` keyword on my device when I manually set `TOOLCHAIN_COMPILER_LOCATION_HINT` on the command line. This patch fixes the issue.

[CMake Doc](https://cmake.org/cmake/help/latest/command/find_program.html):

> find_program (
>           <VAR>
>           name | NAMES name1 [name2 ...] [NAMES_PER_DIR]
>           [HINTS [path | ENV var]... ]
>           [PATHS [path | ENV var]... ]

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

…gfault Fixed segfault when reading Caffe model

* add expand impl with cv::broadcast
* remove expandMid
* deduce shape from -1
* add constant folding
* handle input constant; handle input constant 1d
* add expand conformance tests; add checks to disallow shape of neg values; add early copy for unchanged total elements
* fix ExpandSubgraph
* dummy commit to trigger build
* dummy commit to trigger build 1
* remove conformance from test names

Update OpenVINO init of new GEMM layer opencv#24309

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

CI validation:

- [x] 2022.1.0: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100368
- [ ] 2021.4.2: https://pullrequest.opencv.org/buildbot/builders/precommit_custom_linux/builds/100373

Checklist:

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

dnn: merge tests from test_halide_layers to test_backends opencv#24283

Context: opencv#24231 (review)

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

Added CI with OpenVINO for DNN and G-API.

Optimization for parallelization when large core number opencv#24280

**Problem description:**
When the number of cores is large, OpenCV's thread library may reduce performance when processing parallel jobs.

**The reason for this problem:**
When the number of cores is large (the thread pool initializes as many threads as there are cores), the main thread spends too much time waking up unnecessary threads.

When a parallel job needs to be executed, the main thread wakes up all threads in sequence and then waits for the job-completion signal. When the number of threads is larger than the number of parallel slices of a job, the threads woken first may already have completed the job while the main thread is still waking up the others. The threads woken after that point have nothing to do, and the broadcasts made by the waking threads take a lot of time, which reduces performance.

**Solution:**
Reduce the time the main thread spends waking up worker threads through the following two methods:

- The number of threads woken by the main thread is adjusted according to the number of parallel slices of a job. If the number of threads is greater than the number of parallel job slices, the total number of threads woken is reduced.
- While waking up threads in sequence, if the main thread finds that all parallel job slices have been allocated, it breaks out of the loop immediately and waits for the job-completion signal.

**Performance Test:**
The tests were run in the manner described by https://github.com/opencv/opencv/wiki/HowToUsePerfTests.

With 160 cores, there are big performance gains in some cases. Take the following cases in the video module as examples:

- OpticalFlowPyrLK_self::Path_Idx_Cn_NPoints_WSize_Deriv::("cv/optflow/frames/VGA_%02d.png", 2, 1, (9, 9), 11, true)
  Performance improves 191%: 0.185405ms -> 0.0636496ms
- perf::DenseOpticalFlow_VariationalRefinement::(320x240, 10, 10)
  Performance improves 112%: 23.88938ms -> 11.2562ms

Among all the modules, the performance improvement is greatest in the video module, and there are also certain improvements in other modules.

With 160 cores, the times below are the geometric mean of the average time of all cases for one module. The optimization is available in each module.

Overall time (ms):

module name | gapi | dnn | features2d | objdetect | core | imgproc | stitching | video
-- | -- | -- | -- | -- | -- | -- | -- | --
original | 0.185 | 1.586 | 9.998 | 11.846 | 0.205 | 0.215 | 164.409 | 0.803
optimized | 0.174 | 1.353 | 9.535 | 11.105 | 0.199 | 0.185 | 153.972 | 0.489
Performance improves | 6% | 17% | 5% | 7% | 3% | 16% | 7% | 64%

Meanwhile, we found that changing the order of test cases has an impact on some of them. For example, when we ran opencv_perf_gapi with the option --gtest_shuffle, the performance of the TestPerformance::CmpWithScalarPerfTestFluid/CmpWithScalarPerfTest::(compare_f, CMP_GE, 1920x1080, 32FC1, { gapi.kernel_package }) case changed by 30% compared to the run without shuffle. I would like to ask whether you have also encountered such a situation, and whether you could share your experience.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
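The two rules proposed in opencv#24280 (cap the number of woken threads at the number of job slices, and stop waking once every slice has been claimed) can be modelled in a few lines. This is a simplified sketch and not OpenCV's thread pool: `Job`, `nextSlice`, and `wakeWorkers` are hypothetical names, and the actual signalling of a worker is reduced to a counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>

// Hypothetical job descriptor: workers claim slices by advancing nextSlice.
struct Job
{
    explicit Job(int n) : slices(n), nextSlice(0) {}
    int slices;                  // number of parallel slices in the job
    std::atomic<int> nextSlice;  // index of the next unclaimed slice
};

// Returns how many workers the main thread would wake for this job.
inline int wakeWorkers(int poolSize, const Job& job)
{
    assert(poolSize >= 0 && job.slices >= 0);
    // Rule 1: never wake more workers than there are slices.
    const int limit = std::min(poolSize, job.slices);
    int woken = 0;
    for (int i = 0; i < limit; ++i)
    {
        // Rule 2: stop early once every slice has already been claimed.
        if (job.nextSlice.load() >= job.slices)
            break;
        ++woken;  // a real pool would signal worker i here
    }
    return woken;
}
```

In this model a 160-thread pool running a 4-slice job wakes at most 4 workers instead of 160, which is the saving the PR description attributes the speed-up to.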

dnn onnx: fix not-found constant indices for Gather if shared

opencv-alalek

Why are you not using the latest 4.x commits?

Missing commit: c7ec0d5
Merged at 1:24 PM UTC.

This PR is opened at 1:46 PM UTC (merge commit is created at 1:42 PM UTC).

Both events are handled by you.
Messing with date order is not good practice for synchronization points such as branch merges.

Memory corruption hotfixes for #24126 should land on 4.x first (instead of trying to provide fixes during the merge and losing half of the patch).

Refer to email with subject fix charuco checkBoard (60ae973) from @vrabaud

/cc @AleksandrPanov

opencv-alalek · 2023-09-28T18:42:04Z

There is another fix which should be landed on 4.x before merging to 5.x: #24315
(to avoid propagation of test crashes due to memory corruption)

fengyuentau · 2023-09-29T08:30:36Z

I will try to finalize #24315 asap.

asmorkalov · 2023-10-02T07:22:02Z

@opencv-alalek I disabled failed test for now and we will debug the issue this week. Let's merge, if you do not have objections. I'll repeat the merge procedure, when #24315 is done.

opencv-alalek · 2023-10-02T12:37:29Z

> I disabled failed test

Single test is not enough.
Again, there is memory corruption in checkBoard() code of Charuco.
To properly avoid sporadic failures you need to disable almost all tests of Charuco.

Did you read comment from Vincent?

opencv-alalek

Fixes are postponed to the next iteration (more time is required)

cudawarped and others added 30 commits August 12, 2023 11:09

cuda: add default ptx when CUDA_ARCH_BIN is missing and allow CUDA_AR…

358e306

…CH_PTX to be passed in isolation

More strict test for MSMF FOURCC (camera).

5c9f58e

Merge pull request opencv#24239 from asmorkalov:as/msmf_returned_fourcc

6694d87

More strict test for MSMF FOURCC (camera)

fix drawDetectedCornersCharuco, drawDetectedMarkers, drawDetectedDiam…

ae1d1b6

…onds added tests

Merge pull request opencv#24247 from AleksandrPanov:fix_drawDetectedC…

9761492

…ornersCharuco_type_error fix type cast in drawDetectedMarkers, drawDetectedCornersCharuco, drawDetectedDiamonds

Add missing sanitizer interface include

eb20bb3

Merge pull request opencv#24263 from georgthegreat:msan-include

4790a37

Add missing sanitizer interface include

Update OpenCVFindOpenBLAS.cmake to accomodate alternative lib name

347a1e2

openBLAS windows release calls their library libopenblas which was not recognized before. see opencv#24268

Higher threshold for FasterRCNN_vgg16

c5edd20

Merge pull request opencv#24269 from FlyinTeller:patch-1

fa81936

Update OpenCVFindOpenBLAS.cmake to accomodate alternative lib name

Merge pull request opencv#24131 from cudawarped:cuda_add_default_ptx

ec1c060

`cuda`: update default PTX behaviour when `CUDA_ARCH_BIN` is unset

Merge pull request opencv#24270 from dkurt:fix_24256

515f119

Higher threshold for FasterRCNN_vgg16

Fixed removePhaseSwitches in tf_graph_simplifier

d4cb564

More fixes for iterators-are-pointers case

638c575

Merge pull request opencv#24278 from georgthegreat:compat-fixes

0a53afe

More fixes for iterators-are-pointers case

Merge pull request opencv#24275 from alexlyulkov:al/fix-tf-graph-simp…

157b0e7

…lifier Fixed removePhaseSwitches in tf_graph_simplifier

build fixes for emscripten 3.1.45

9b5a719

Merge pull request opencv#24288 from tailsu:sd/emscripten-3.1.45-fixes

8f2e664

build fixes for emscripten 3.1.45

Merge pull request opencv#24291 from visitorckw:fix-memory-leak

799bb0c

Fix memory leak and handle realloc failure

ashadrina and others added 13 commits September 22, 2023 17:09

Fixed segfault when reading Caffe model

72e7672

Merge pull request opencv#24316 from alexlyulkov:al/fix-caffe-read-se…

9942757

…gfault Fixed segfault when reading Caffe model

Added CI with OpenVINO for DNN and G-API.

43036e0

Merge pull request opencv#24329 from asmorkalov/as/openvino_ci

1baaac2

Added CI with OpenVINO for DNN and G-API.

init commit

7fa0493

Merge pull request opencv#24334 from fengyuentau:fix_24319

b8d4ac5

dnn onnx: fix not-found constant indices for Gather if shared

This was referenced Sep 28, 2023

(5.x) Merge 4.x opencv/opencv_contrib#3566

Merged

(5.x) Merge 4.x opencv/opencv_extra#1102

Merged

asmorkalov changed the title ~~(5.x) Merge 4.x~~ WIP: (5.x) Merge 4.x Sep 28, 2023

asmorkalov force-pushed the 5.x-merge-4.x branch from 41a40b5 to 1767f2d Compare September 28, 2023 16:35

opencv-alalek reviewed Sep 28, 2023


Merge branch 4.x

163d544

asmorkalov force-pushed the 5.x-merge-4.x branch from 1767f2d to 163d544 Compare October 2, 2023 07:18

asmorkalov changed the title ~~WIP: (5.x) Merge 4.x~~ (5.x) Merge 4.x Oct 2, 2023

asmorkalov assigned opencv-alalek Oct 4, 2023

asmorkalov added this to the 5.0 milestone Oct 4, 2023

opencv-alalek approved these changes Oct 4, 2023


asmorkalov merged commit 163d544 into opencv:5.x Oct 5, 2023

asmorkalov mentioned this pull request Oct 17, 2023

(5.x) Merge 4.x #24416

Merged


asmorkalov merged 51 commits into opencv:5.x from asmorkalov:5.x-merge-4.x


17 participants
