Hough many circles #10232

Merged
alalek merged 37 commits into opencv:master from TomBecker-BD:hough-many-circles
Dec 28, 2017

@TomBecker-BD
Contributor

@TomBecker-BD TomBecker-BD commented Dec 5, 2017

Merge with extra: opencv/opencv_extra#418

resolves #10227

This pull request changes:

  1. Optimize for finding many small circles in an image. Uses a matrix of flags to hold the non-zero edge positions. If the number of non-zero edges is small it switches to a vector.

  2. Improve the performance of checking the minimum distance between circles. Move the code out of the EstimateRadius parallel loop body so there is no lock contention.

  3. Sort the circles so the circles with higher accumulator value are first. The API documentation says that it did this, but the code was not there.

  4. The maxCircles limit is applied only when copying the circles to the output array. Otherwise circles with higher accumulator values could be excluded.

  5. Set maxRadius < 0 to return the center points without the radius search. Preserves the existing behavior where setting maxRadius = 0 defaults maxRadius to the image size.

  6. Add unit and performance tests.
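
Items 2 to 4 can be sketched outside OpenCV as a single-threaded post-processing step. This is a minimal illustration under assumptions, not the code in this PR: `Candidate`, `strongerFirst`, and `selectCircles` are hypothetical names, and the distance check is a plain quadratic scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record: centre, radius and accumulator votes.
struct Candidate
{
    float x, y, r;
    int accum;
};

// Larger accumulator value goes first.
static bool strongerFirst(const Candidate& a, const Candidate& b)
{
    return a.accum > b.accum;
}

// Runs after the parallel radius estimation has finished, so no lock is needed:
// sort by strength, drop centres closer than minDist to an accepted circle,
// and stop once maxCircles circles have been accepted.
static std::vector<Candidate> selectCircles(std::vector<Candidate> candidates,
                                            float minDist, std::size_t maxCircles)
{
    assert(minDist >= 0.0f);
    std::sort(candidates.begin(), candidates.end(), strongerFirst);

    std::vector<Candidate> kept;
    for (std::size_t i = 0; i < candidates.size() && kept.size() < maxCircles; ++i)
    {
        bool tooClose = false;
        for (std::size_t k = 0; k < kept.size() && !tooClose; ++k)
        {
            const float dx = candidates[i].x - kept[k].x;
            const float dy = candidates[i].y - kept[k].y;
            tooClose = dx * dx + dy * dy < minDist * minDist;
        }
        if (!tooClose)
            kept.push_back(candidates[i]);
    }
    return kept;
}
```

Because the limit is applied only after sorting and filtering, a weak circle that happens to be found early can never push out a stronger one.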

Thanks to Hui Liu for the original performance optimization.

docker_image:Custom=powerpc64le
docker_image:Docs=docs-js

Member

@alalek alalek left a comment

OpenCV should compile with a C++98 compiler.
Please eliminate the use of nullptr (replace it with NULL).


// Estimate best radius
for(; j < nzSz; ++j)
for(int iz = 0; iz < nnz; ++iz)
Member

This loop is the "tail processing" of the SIMD-optimized code. It can't start from zero again.

The SIMD loop above is broken too. Also, the SIMD code is entirely optional and can be excluded from the sources.

Contributor Author

Correct me if I'm wrong. I had to study the code to figure out how it worked. The previous code was loading data directly from the nz vector. It looped in increments of 4 for SIMD processing. This would leave a remainder of 0 to 3 points in nz that still needed to be processed. This is conveniently done by falling through to the non-SIMD loop.

The new code can't do that, because nz is a container that is accessed only through a forward iterator. If the container is matrix-based, it contains flags, not points. The first loop fills the nzx and nzy arrays. When nnz == 4 the arrays are loaded into vector registers and processed. When the iterator reaches the end, there will be 0 to 3 remaining points in nzx and nzy. The second loop is there to process them. The loop starts from 0 because it is an index into the nzx and nzy arrays.

I also added an else to prevent the SIMD code from falling through into the non-SIMD loop.

There still is a non-SIMD loop for the case where SIMD is disabled either at compile time or run time. To be honest, I have only been testing the SIMD code. The non-SIMD code should work, but it needs to be tested.

I don't know why you are saying the SIMD loop is broken. The design was changed to work with a forward iterator instead of a random-addressable vector. It still produces the same results. Maybe there is something I did wrong or could do better.
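
To make the batching scheme described above concrete, here is a minimal standalone sketch. It is not the PR's code: `Pt`, `processBatch`, and `squaredNorms` are hypothetical names, a scalar loop stands in for the SIMD kernel, and the per-point work (a squared norm) is only a placeholder. It shows why the tail loop restarts at index 0: the index refers to the staging arrays, not to the original container.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

struct Pt { int x, y; };

// Stand-in for the vectorised kernel: consumes exactly `count` staged points.
static void processBatch(const int* xs, const int* ys, int count, std::vector<int>& out)
{
    assert(count > 0);
    for (int k = 0; k < count; ++k)
        out.push_back(xs[k] * xs[k] + ys[k] * ys[k]);
}

// Works with any forward iterator; no random access into the container is needed.
template <typename ForwardIt>
static std::vector<int> squaredNorms(ForwardIt first, ForwardIt last)
{
    const int BATCH = 4;
    int nzx[BATCH], nzy[BATCH];   // staging arrays
    int nnz = 0;                  // number of points currently staged
    std::vector<int> out;

    for (; first != last; ++first)
    {
        nzx[nnz] = first->x;
        nzy[nnz] = first->y;
        if (++nnz == BATCH)
        {
            processBatch(nzx, nzy, BATCH, out);   // full batch: the "SIMD" path
            nnz = 0;
        }
    }

    // Tail: 0 to 3 points are still staged. The loop starts at 0 because it
    // indexes nzx/nzy, which were refilled from the beginning.
    for (int iz = 0; iz < nnz; ++iz)
        out.push_back(nzx[iz] * nzx[iz] + nzy[iz] * nzy[iz]);
    return out;
}
```

With six input points, the first four go through `processBatch` and the remaining two through the tail loop, in the original order.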

Contributor

@savuor savuor Dec 14, 2017

The SIMD code is correct (although it has some non-evident logic) but it's arguable whether it gives us additional performance. I think you can freely exclude it from source.

Contributor Author

Here are some performance test results:

Geometric mean

Test                                   NoSIMD,            SIMD,            SIMD,
                                       getNumberOfCPUs    getNumThreads    getNumberOfCPUs
Basic::PerfHoughCircles                8.401 ms           8.754 ms         8.417 ms
ManySmallCircles::PerfHoughCircles2    138.989 ms         148.279 ms       139.745 ms

SIMD is basically a wash. Here it is 1% slower. In some other tests I ran it was 1-2% faster.

Using getNumberOfCPUs() to get numThreads is about 5% faster than using getNumThreads(). This is on a quad-core system where getNumberOfCPUs() == 8 and getNumThreads() == 9.

@savuor savuor self-assigned this Dec 6, 2017
@savuor
Contributor

savuor commented Dec 6, 2017

As I said in the issue description, the typo was in the documentation: it is the maxCircles argument that signals that the user is not asking for radii.
Therefore the condition if(maxCircles == 0) is correct and should be kept, with one more call to RemoveOverlaps().

using namespace cv;
using namespace perf;

PERF_TEST(Basic, PerfHoughCircles)
Contributor

PerfHoughCircles should be the first argument and Basic the second, like this:
PERF_TEST(test_case, specific_test)

Contributor Author

I thought that was the right way to do it, but I got this error when I tried to run the tests:

Failed
All tests in the same test case must use the same test fixture
class. However, in test case PerfHoughCircles,
you defined test Basic and test ManySmallCircles
using two different test fixture classes. This can happen if
the two classes are from different namespaces or translation
units and have the same name. You should probably rename one
of the classes to put the tests into different test cases.

Maybe the problem is because I am using the PERF_TEST macro. I will look into it further.

Contributor Author

I was assuming PERF_TEST works like the googletest TEST macro. Unfortunately it only allows one test per test case. I tried PERF_TEST_F and it had the same limitation. I worked around it by changing the test case names to "PerfHoughCircles" and "PerfHoughCircles2".

SANITY_CHECK_NOTHING();
}

PERF_TEST(ManySmallCircles, PerfHoughCircles)
Contributor

The same as above

@savuor
Contributor

savuor commented Dec 6, 2017

I made performance measurements and found (as was easy to predict) that using NZPointSet is faster when searching for small circles but slower for a large range of radii (for example, like that).
My suggestions are the following:

  1. When iterating inside NZPointSet, use both minRadius and maxRadius so that the inner radii are skipped too. This will also make the calculation faster when minRadius is quite large. In that case the inner rectangle will be sqrt(2) times smaller than minRadius.
  2. We can decide at runtime whether to use NZPointSet or NZPointList. Since we reduce the number of iterations per circle center, we just need to evaluate the following comparison:
    (maxRadius*maxRadius - minRadius*minRadius/2) < nz.size()
    (the division by 2 comes from the sqrt(2) scale mentioned above)
    Based on which side is smaller, we decide which method to use. To be able to do that we need to keep both structures, which means that while building the accumulator we should fill both the Set and the List with the same points.
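
The runtime decision in suggestion 2 could look roughly like the sketch below. It implements the comparison exactly as written in the comment and nothing more. The `NZStorage` enum and `chooseStorage` function are hypothetical names, and the formula is only a rough cost proxy, not a measured threshold.

```cpp
#include <cassert>
#include <cstddef>

// Which non-zero point structure to iterate for the radius estimation.
enum NZStorage { USE_POINT_SET, USE_POINT_LIST };

static NZStorage chooseStorage(int minRadius, int maxRadius, std::size_t nzCount)
{
    assert(0 <= minRadius && minRadius <= maxRadius);
    // Cost proxy for scanning the flag matrix around one centre:
    // maxRadius^2 - minRadius^2 / 2 (the /2 reflects the sqrt(2) scale
    // of the inner rectangle that can be skipped).
    const double areaCost = double(maxRadius) * maxRadius
                          - double(minRadius) * minRadius / 2.0;
    // Cost proxy for the list: every non-zero point is visited once per centre.
    return areaCost < double(nzCount) ? USE_POINT_SET : USE_POINT_LIST;
}
```

For example, with 1000 non-zero points a radius range of 5 to 10 selects the set (87.5 < 1000), while a range of 10 to 200 selects the list.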

OpenCV should compile with C++98 compiler.
@TomBecker-BD
Contributor Author

Unfortunately, there does not seem to be a way the user can use the maxCircles argument to ask for no radius search. The maxCircles argument is not in the external API.

The exported cv::HoughCircles function does not have a maxCircles argument. It calls the internal HoughCircles function with a hard-coded -1 for maxCircles.

The cvHoughCircles wrapper does not have a maxCircles argument. It calls the internal HoughCircles function with either INT_MAX or circles->total for maxCircles.

The only way it might be possible to get a maxCircles value of 0 is by calling the cvHoughCircles wrapper with a zero-length circle_storage. But then there is no room to return any center coordinates.

If the functionality is important, it could be implemented by adding a new function argument where it can be set by the user. I looked for any issues requesting that it be fixed, but I could not find any.

@TomBecker-BD
Contributor Author

Good idea about skipping the pixels inside minRadius / sqrt(2).

If you want to choose between NZPointList or NZPointSet dynamically, it would be easy to copy from one data structure to the other. I would lean towards using NZPointSet for the accumulator build because the memory is pre-allocated, and because it automatically eliminates duplicates. I would move the code that gets int nzSz = int(nz.size()); into the main function. That way there is no added overhead if the code decides to stay with the same data structure, and the cost of a single pass over the matrix if it decides to switch. The HoughCircleEstimateRadiusInvoker can get the NZPoints type as a template parameter.
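
The "single pass over the matrix" mentioned above can be sketched as follows. This is only an illustration with hypothetical names (`PointXY`, `flagsToPointList`) and a flat row-major `std::vector` standing in for the flag matrix. Because a flag can only be set once per position, the resulting list contains no duplicates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct PointXY { int x, y; };

// Converts a row-major matrix of 0/1 flags into a list of non-zero positions.
static std::vector<PointXY> flagsToPointList(const std::vector<unsigned char>& flags, int cols)
{
    assert(cols > 0 && flags.size() % std::size_t(cols) == 0);
    std::vector<PointXY> points;
    for (std::size_t i = 0; i < flags.size(); ++i)
    {
        if (flags[i])
        {
            PointXY p = { int(i % cols), int(i / cols) };   // column, row
            points.push_back(p);
        }
    }
    return points;
}
```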

Use different test suite names for each test to avoid an error from the test runner.
@savuor
Contributor

savuor commented Dec 7, 2017

Sorry, I forgot that maxCircles is not part of the API.
There should be another way to turn off the radius search through the API.
Remember that maxRadius=0 is the default argument, with which the algorithm shouldn't require any post-processing to get circles.
So let's use negative values of maxRadius to turn off the radius search. When building the accumulator we need to use the image size instead of maxRadius.
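
A minimal sketch of that parameter handling, assuming "image size" means the larger image dimension (the thread does not spell this out). `RadiusSearch` and `resolveMaxRadius` are hypothetical names, not part of the OpenCV API.

```cpp
#include <algorithm>
#include <cassert>

struct RadiusSearch
{
    bool centersOnly;   // true: skip the radius search and report centres only
    int maxRadius;      // effective bound used when building the accumulator
};

static RadiusSearch resolveMaxRadius(int maxRadius, int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    RadiusSearch cfg;
    cfg.centersOnly = maxRadius < 0;   // negative value: no radius search
    // Zero (the default) and negative values both fall back to the image size.
    cfg.maxRadius = maxRadius > 0 ? maxRadius : std::max(rows, cols);
    return cfg;
}
```

With a 480x640 image, `maxRadius = -1` gives centres only with an effective bound of 640, `maxRadius = 0` keeps the radius search with the same bound, and `maxRadius = 30` is used as given.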

@TomBecker-BD
Contributor Author

Thanks for the feedback. I am going to work on:

  1. Fixing build errors and warnings.
  2. Skip radius search if maxRadius < 0.
  3. Dynamically switch to NZPointList if it will be faster than NZPointSet.

EXPECT_GT(circles.size(), size_t(0)) << "Should find at least some circles";
for (size_t i = 0; i < circles.size(); ++i)
{
EXPECT_EQ(circles[0][2], 0.0f) << "Did not ask for radius";
Contributor

did you forget i index?

for (size_t i = 0; i < circles.size(); ++i)
{
EXPECT_GE(circles[0][2], minRadius) << "Radius should be >= minRadius";
EXPECT_LE(circles[0][2], maxDimension) << "Radius should be <= max image dimension";
Contributor

i index seems to be forgotten here

Contributor Author

Thanks for catching that. I just pushed a fix.

@savuor
Contributor

savuor commented Dec 14, 2017

And by the way, have you found out what was wrong with the Mac build?
Maybe removing that SIMD part will work.

@TomBecker-BD
Contributor Author

I don't know why it was failing on the Mac. It was failing only in the "ManySmallCircles" test. I'm pretty sure the SIMD path is executed on the passing tests. It could be a compilation issue with NZPointSet, but that code was well exercised on other platforms. It could have been a glitch in pulling "Beads.png" or in loading it, due to the size of the image. But that's just guessing - I don't know. I removed the failing test as a workaround. I'm not happy about that, but it was the only way I could think of to stop the build from failing.

I have a Mac at home I could use to debug the failure. If I can find the problem and fix it, I will submit a new PR with the "ManySmallCircles" test added back. But it's going to be a while before I have the time to work on it at home.

@TomBecker-BD
Contributor Author

TomBecker-BD commented Dec 14, 2017

Should have looked at the build results first. The Mac test timed out in ImgProc/HoughCirclesTestFixture.regression/0, where GetParam() = ("imgproc/stuff.jpg", 20, 20, 30, 20, 200). It's not clear why. Something is flaky. I will try to debug it.

@savuor
Contributor

savuor commented Dec 14, 2017

There could be build system problems with the Mac build; I just re-ran the tests and everything works fine (except the patch size for extra, which we've fixed once, and this will be no problem).
Sure, you can add the test for ManySmallCircles later as a separate PR.

@TomBecker-BD
Contributor Author

OK, thanks.

@@ -0,0 +1,233 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
Member

Please avoid using capital letters in file names: test_houghCircles.cpp

Step 1: remove the old file names
Step 2: add them back with the new names
@savuor
Contributor

savuor commented Dec 18, 2017

👍

@alalek
Member

alalek commented Dec 18, 2017

Mac build with an error -11

This means that the code tries to read or write memory outside a buffer. This usually leads to crashes like this one, or to unpredictable results from this algorithm or other algorithms because their memory was corrupted (the worst case, because these things are very hard to debug).

Has this problem really been fixed (or were the failing tests just disabled)?

@TomBecker-BD
Contributor Author

I tried converting all the pointer references to go through at() so that access is range-checked in the debug build, and added range-checking code around other arrays. So far I have not found any buffer errors.

I am doing a Mac build now. I want to try running with AddressSanitizer.

@TomBecker-BD
Contributor Author

I found a bug that was causing an EXC_BAD_ACCESS exception on the Mac. The cmpAccum function did not define a strict weak ordering, which caused std::sort to go out of bounds. It should be safe to add back the ManySmallCircles tests, unless the "beads.jpg" file is too big. It's 668 KB.
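
The faulty comparator is not shown in this thread, so the following is only a sketch of what a comparator that satisfies the std::sort requirements can look like. `ScoredCircle`, `strongerCircleFirst`, and `sortByStrength` are hypothetical names, and the tie-break order is an assumption. A comparator passed to std::sort must return false for equal elements; using `>=` instead of `>`, for instance, breaks that rule and can send the sort out of bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical circle record; the field names are assumptions.
struct ScoredCircle
{
    float x, y, r;
    int accum;
};

// Strict weak ordering (assuming no NaN values): cmp(a, a) is false, and
// cmp(a, b) and cmp(b, a) are never both true. Every field takes part in the
// comparison, so the resulting order is fully deterministic.
static bool strongerCircleFirst(const ScoredCircle& a, const ScoredCircle& b)
{
    if (a.accum != b.accum) return a.accum > b.accum;   // larger accumulator first
    if (a.r != b.r) return a.r < b.r;
    if (a.x != b.x) return a.x < b.x;
    return a.y < b.y;
}

static void sortByStrength(std::vector<ScoredCircle>& circles)
{
    std::sort(circles.begin(), circles.end(), strongerCircleFirst);
    // Debug check: no element may be ordered before its predecessor.
    for (std::size_t i = 1; i < circles.size(); ++i)
        assert(!strongerCircleFirst(circles[i], circles[i - 1]));
}
```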

@TomBecker-BD
Contributor Author

I ran the Hough Circles tests under Address Sanitizer, Thread Sanitizer, Undefined Behavior Sanitizer, Malloc Scribble, Malloc Guard Edges, and Guard Malloc. Thread Sanitizer reported a data race in ipp_gaussianBlurParallel. Undefined Behavior Sanitizer reported warnings about bit shifting in libjpeg. Otherwise no issues.

@savuor
Copy link
Copy Markdown
Contributor

savuor commented Dec 20, 2017

@TomBecker-BD Thank you for a huge amount of your work!
I think now we can merge it

@savuor
Contributor

savuor commented Dec 20, 2017

👍

- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)
@alalek
Member

alalek commented Dec 21, 2017

@TomBecker-BD I pushed additional patches for this PR (simplify code, SIMD intrinsics).
Please take a look.

@TomBecker-BD
Contributor Author

The new code looks good. It's faster. The results are not identical, but they look okay.

@alalek
Member

alalek commented Dec 21, 2017

The results are not identical

There was an issue in the NZPointSet iterator. It lost synchronization between "current" and "mi" and sometimes returned points with a "zero" value:

++current.x;
if ((current.y >= yInner.start && current.y < yInner.end) &&
    (current.x >= xInner.start && current.x < xInner.end))
{
    current.x = xInner.end;
    // FIXIT lost proper "mi" update like
    // mi = positions.ptr<uchar>(current.y, current.x - 1) // -1 see below: "++mi;"
}
if (current.x < xOuter.end)
{
    ++mi;
}
else
...
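
For comparison, a direct area scan sidesteps this class of bug because the read position is derived from x and y on every step, so there is no separate pointer to keep in sync. The following is a minimal sketch under assumptions, not the patch pushed to this PR: `Range1D` and `countNonZeroInRing` are hypothetical names, the flag matrix is a flat row-major `std::vector`, and the scan merely counts the set flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Range1D { int start, end; };   // half-open interval [start, end)

// Counts set flags inside the outer rectangle but outside the inner one.
static int countNonZeroInRing(const std::vector<unsigned char>& flags, int cols,
                              Range1D xOuter, Range1D yOuter,
                              Range1D xInner, Range1D yInner)
{
    assert(cols > 0 && flags.size() % std::size_t(cols) == 0);
    const int rows = int(flags.size() / std::size_t(cols));
    assert(0 <= xOuter.start && xOuter.end <= cols);
    assert(0 <= yOuter.start && yOuter.end <= rows);

    int count = 0;
    for (int y = yOuter.start; y < yOuter.end; ++y)
    {
        const unsigned char* row = &flags[std::size_t(y) * cols];   // re-derived per row
        const bool rowCrossesInner = y >= yInner.start && y < yInner.end;
        for (int x = xOuter.start; x < xOuter.end; ++x)
        {
            if (rowCrossesInner && x >= xInner.start && x < xInner.end)
            {
                x = xInner.end - 1;   // skip the inner span; ++x lands on xInner.end
                continue;
            }
            if (row[x])   // always read at the current x, so position and data cannot drift
                ++count;
        }
    }
    return count;
}
```

On a 5x5 matrix with every flag set, an outer range of [0, 5) and an inner range of [1, 4) in both directions, the scan visits the 16 border cells and skips the 9 inner ones.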

@TomBecker-BD
Contributor Author

Good. That explains it. In retrospect excluding the inner range was too much complexity. The iterator needed its own suite of unit tests.

@alalek alalek merged commit 592f8d8 into opencv:master Dec 28, 2017
carol-mohemian pushed a commit to mohemian/mohemian-opencv that referenced this pull request Jan 29, 2018
* cmake: fix -fPIC/-fPIE handling in precompiled headers (PCH)

* Added fallback to generic linear resize in case bit-exact resize of provided matrix isn't supported

* samples: check for valid input in gpu/super_resolution.cpp

* Replaced incorrect CV_Assert calls with CV_Error

* ml: simplify interfaces of SimulatedAnnealingSolver

* add one more convolution kernel tuning candidate

Signed-off-by: Li Peng <peng.li@intel.com>

* Reverted calls to linear resize back to generic version for floating point matrices

* TensorFlow weights dequantization

* ml(ANN_MLP): ensure that train() call is always successful

* OpenCV version++

3.4.0

* build: fix MSVS2010 build error

* Optimize OpenCL BackgroundSubstractionMOG2

* Add basic plumbing for AVX512 support

The opencv infrastructure mostly has the basics for supporting avx512 math functions,
but it wasn't hooked up (likely due to lack of users)

In order to compile the DNN functions for AVX512, a few things need to be hooked up
and this patch does that

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

* JS examples - FFT didn't work for non-square images because rows/cols were switched, Histogram example misspelled point

* opencl/cvtclr_dx: fix not compile-time constants issue.

fix the "initializing global variables with values that are not
compile-time constants" issue in Intel SDK for OpenCL. The root cause
is when initializing global variables with value, the variable need is
compile-time constants.

Thanks Zheng, Yang <yang.zheng@intel.com>,
Chodor, Jaroslaw <jaroslaw.chodor@intel.com> give a help.

Signed-off-by: Liu,Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Jun Zhao <jun.zhao@intel.com>

* dnn(ocl4dnn): update pre-tuned kernel config

Signed-off-by: Li Peng <peng.li@intel.com>

* Limit Concat layer optimization

* Provide a few AVX512 optimized functions for the DNN module

This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.

AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

* Fix functions‘ class attribution error

* Use class' method to set attribute value

* Parallelization of BackgroundSubtractorKNN

* Fixed opencv#10433

* ts(feature): add "--test_threads=<N>" command-line option

* dnn(test): avoid calling of cv::setNumThreads() in tests directly

It is not necessary by default.
Also it breaks test system command-line parameters: --perf_threads / --test_threads

* imgcodecs(png): resolve ASAN issue with vars scope and setjmp() call

* Changed VA device in MediaSDK session initialization

* Add ocl version FasterRCNN accuracy test

Signed-off-by: Li Peng <peng.li@intel.com>

* Allocate new memory for optimized concat to prevent collisions.
Add a flag to disable memory reusing in dnn module.

* Merge pull request opencv#10232 from TomBecker-BD:hough-many-circles

Hough many circles (opencv#10232)

* Add Hui's optimization. Merge with latest changes in OpenCV.

* Use conditional compilation instead of a runtime flag.

* Whitespace.

* Create the sequence for the nonzero edge pixels only if using that approach.

* Improve performance for finding very large numbers of circles

* Return the circles with the larger accumulator values first, as per API documentation.
Use a separate step to check distance between circles. Allows circles to be sorted by strength first. Avoids locking in EstimateRadius which was slowing it down.
Return centers only if maxRadius == 0 as per API documentation.

* Sort the circles so results are deterministic. Otherwise the order of circles with the same strength depends on parallel processing completion order.

* Add test for HoughCircles.

* Add beads test.

* Wrap the non-zero points structure in a common interface so the code can use either a vector or a matrix.

* Remove the special case for skipping the radius search if maxRadius==0.

* Add performance tests.

* Use NULL instead of nullptr.
OpenCV should compile with C++98 compiler.

* Put test suite name first.
Use different test suite names for each test to avoid an error from the test runner.

* Address build bot errors and warnings.

* Skip radius search if maxRadius < 0.

* Dynamically switch to NZPointList when it will be faster than NZPointSet.

* Fix compile error: missing 'typename' prior to dependent type name.

* Fix compile error: missing 'typename' prior to dependent type name.
This time fix it the non C++ 11 way.

* Fix compile error: no type named 'const_reference' in 'class cv::NZPointList'

* Disable ManySmallCircles tests. Failing on Mac.

* Change beads image to JPEG for smaller file size.
Try enabling the ManySmallCircles tests again.

* Remove ManySmallCircles tests. They are failing on the Mac build.

* Fix expectations to check all circles.

* Changing case on a case-insensitive file system
Step 1: remove the old file names

* Changing case on a case-insensitive file system
Step 2: add them back with the new names

* Fix cmpAccum function to be strictly weak ordered.

* Add tests for many small circles.

* imgproc(perf): fix HoughCircles tests

* imgproc(houghCircles): refactor code

- simplify NZPointList
- drop broken (de-synchronization of 'current'/'mi' fields) NZPointSet iterator
- NZPointSet iterator is replaced to direct area scan
- use SIMD intrinsics
- avoid std exceptions (build for embedded systems)

* Merge pull request opencv#10352 from vinay0410:write_pbm

* added write as pbm

* add tests for pbm

* imgcodecs: PBM support

- drop additional PBM parameters
- write: fix P1/P4 mode (no maxval 255 value after width/height)
- write: invert values for P1/P4
- write: P1: compact ASCII mode (no spaces)
- simplify pbm test
- drop .pxm extension (http://netpbm.sourceforge.net/doc/ doesn't know such extension)

* cmake: AVX512 -> AVX_512F

* cmake(opt): AVX512_SKX

* Updating rotcalipers.cpp to resolve issue opencv#10096

Updating the documentation of the rotcalipers.cpp to resolve issue opencv#10096

* Merge pull request opencv#10469 from victor-ludorum:stichingbranch

Updating stiching.cpp to resolve new line issue opencv#10461 (opencv#10469)

* cmake: avoid timestamp change of version_string.inc file

* video: clean up bg subtraction tutorial

* imgproc(hdr): fix bounds check in HdrDecoder::checkSignature()

* python: filter modules headers (from <module>/include directory)

* Update C++ MobileNet-SSD object detection sample

* Merge pull request opencv#10446 from cabelo:style-dnn-yolo

* add style draw in yolo

* fix sintaxe whitespace

* fix conversion from float to int

* sample refactoring

- minor code style fixes
- avoid confusing "bottom" names
- use cv::format
- always draw object detection/roi

* cmake: avoid unnecessary files creation in ocv_cmake_configure()

* add ocl implementation of proposal layer

Signed-off-by: Li Peng <peng.li@intel.com>

* add opencl option for resnet_ssd_face sample

Signed-off-by: Li Peng <peng.li@intel.com>

* Merge pull request opencv#10493 from RachexCoralie:tiff32FC1Codec

* Load and save tiff images in CV_32FC1 format (1 channel of floats).

* Add test

* Fix error handling and resources leak. Improve test.

* build: eliminate warnings

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

* build: eliminate warning

warning: 'layout.channel_layout::gchan' may be used uninitialized in this function [-Wmaybe-uninitialized]

* cmake: eliminate ninja generator warning (CMake 3.10), refactor code

* Allow compilation with unified include directory

This makes it possible to compile OpenCV with for instance ndk-r16

* android(toolchain): detailed error message

* android(toolchain): fix find_path() behavior (zlib search problem)

* 3rdparty(protobuf): fix build with Android NDK 16

* android: IPPICV static libraries workarounds for NDK 16

* 3rdparty(libtiff): fix build with Android NDK 16

Compiler message:
tif_unix.c:170:12: error: call to 'mmap' declared with attribute error:
mmap is not available with _FILE_OFFSET_BITS=64 when using GCC until android-21.
Either raise your minSdkVersion, disable _FILE_OFFSET_BITS=64, or switch to Clang.

* cmake(android): fix ccache detection (native Android toolchain with NDK_CCACHE)

To prevent this result:
    /usr/bin/ccache ccache <android-ndk-r16.1>/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc ...
with:
    ccache: error: Recursive invocation (the name of the ccache binary must be "ccache")

* cmake(android): update zlib alias condition

To use 'z' in Android.mk instead of absolute path from the build machine.

* cmake(android): update CMAKE_FIND_ROOT_PATH_MODE_* handling

Avoids this error message with CMake 3.10:
CMake Error: CMake was unable to find a build program corresponding to "Ninja".
CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.

Related changes is that pkg-config tool is found (/usr/bin/pkg-config).

* video(perf): add Mog2/KNN tests, fixed bug in prepareData()

prepareData() bug feeds the same image (the latest)
OCL perf test doesn't pass accuracy(!) check now, so this check is disabled.

* add eltwise layer ocl implementation

Signed-off-by: Li Peng <peng.li@intel.com>

* add normalize_bbox layer ocl implementation

Signed-off-by: Li Peng <peng.li@intel.com>

* Fixed missing #include "../precomp.hpp"

* cmake(android): fix non-idempotent INSTALL scripts

* cmake(android): refactor copying of Android samples project files

* Improve the documentation for cv::Affine3.

* Merge pull request opencv#10529 from LaurentBerger:ExampleGoogleNet

* Add a parameter labels to command line

* default value

* samples: caffe_googlenet.cpp minor refactoring

* Fix typo in video_input_psnr_ssim

* BLOB - Support RGBA

* Bitwise "and false"

Bitwise "and false" is always false.

* dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option

replaces REUSE_DNN_MEMORY compile-time option

* resolves opencv#10548 - `FLANN::knnSearch` garbage bug (when kNN is larger than the dataset size)

* Replace Caffe's psroi_pooling_param tag from 10001 to 10002

* cmake: allow BUILD_FAT_JAVA_LIB for non-Android targets too

* Merge pull request opencv#10502 from cabelo/save-dnn-yolo

Save video file (opencv#10502)

* Merge pull request opencv#10522 from tobycollins:issue10519

* batch_norm and blank layer ocl implementation

Signed-off-by: Li Peng <peng.li@intel.com>

* imgcodecs: remove assert() usage

* imgcodecs: add overflow checks

* imgcodecs: add more Jasper checks for supported and tested cases

* Fixed exception when ROI for generated sample is evaluated out of image borders

* core: fix unresolved symbols from utils::fs

* Merge pull request opencv#10497 from aaron-bray:msvc2017-findcuda

* Update to properly find the compiler tools for MSVC 2017

* FindCUDA: Fix the MSVC 2017 compiler tool locations

* Merge pull request opencv#10466 from dkurt:reduce_umat_try_2

* UMat blobs are wrapped

* Replace getUMat and getMat at OpenCLBackendWrapper

* Added getFontScaleFromHeight()

* modules/videoio: fix PS3Eye camera property window

v2: fix stray trailing whitespace

v3: only allow for up to one property window at the time

Opening multiple windows in the same process will just confuse
the camera filter or outright crash.

Suggested-by: @alalek

Also return whether a dialog was opened at the time.

* Merge pull request opencv#10489 from SarenT:offset-mat_put

Adding capability to parse subsections of a byte array in Java bindings (opencv#10489)

* Adding capability to parse subsections of a byte array in Java bindings. (Because Java lacks pointers. Therefore, reading images within a subsection of a byte array is impossible by Java's nature and limitations. Because of this, many IO functions in Java require additional parameters offset and length to define, which section of an array to be read.)

* Corrected according to the review. Previous interfaces were restored, instead internal interfaces were modified to provide subsampling of java byte arrays.

* Adding tests and test related files.

* Adding missing files for the test.

* Simplified the test

* Check was corrected according to discussion. An OutOfRangeException will be thrown instead of returning.

* java: update MatOfByte implementation checks / tests

* Fixed concurrent OpenCL cache folder name generation

* java(test): fix test names

* java: files rename

intermediate commit (to simplify code review)

* Update documentation

* cmake: Java/Android SDK refactoring

* fixes for old CMake (2.8.12.2)

* java: disable highgui wrapped code

* java: fix bindings generator

- fix imports override.
  Problem is observed with BoostDesc.

- add Ptr<> handling (constructor is protected from other packages).
  Observed in ximgproc:
      Ptr<StereoMatcher> createRightMatcher(Ptr<StereoMatcher> matcher_left)"
  where, "StereoMather" is from another package (calib3d)

* java: fix MacOS Java problem

* android: fix SDK build

- fix Javadoc:
  - generate Javadoc after gather step to process all Java files (including Android 21)
  - generate into 'OpenCV-android-sdk' directly without additional copy step
- use smart copy/move utility functions ('shutil' doesn't well with existed destination)
- by default move files to reduce pressure on storage I/O (> 800Mb)

* cmake: ocv_target_include_directories() handle SYSTEM directories

* Untrainable version of Scale layer from Caffe

* dnn::blobFromImage with OutputArray

* android: update build_sdk.py

- configuration files for ABIs configuration
- use builtin Android NDK's toolchain by default (force flag to use OpenCV's toolchain)
- default values for 'work_dir' and 'opencv_dir'

* Parallelize initUndistortRectifyMap

* Updated protobuf version to 3.5.1

* dnn: Updated protobuf files (3.5.1)

* dnn: protobuf build warnings

* protobuf: drop unused files

* copyright: 2018

* batch_norm layer ocl update

use a batch_norm ocl kernel to do the work

Signed-off-by: Li Peng <peng.li@intel.com>

* Fixed several warnings produced by clang 6 and static analyzers

* Add ThinLTO support for clang

* Propagate calculated Gaussian kernel size(ref). Otherwise, ipp_GaussianBlur will fail if user doesn't specify a kernel size. (opencv#10579)

* core(ocl): fix deadlock in UMatDataAutoLock

UMatData locks are not mapped on real locks (they are mapped to some "pre-initialized" pool).

Concurrent execution of these statements may lead to deadlock:
- a.copyTo(b) from thread 1
- c.copyTo(d) from thread 2
where:
- 'a' and 'd' are mapped to single lock "A".
- 'b' and 'c' are mapped to single lock "B".

Workaround is to process locks with strict order.

* fix: use CXX_STANDARD when extracting compiler flags for PCH with GNUCXX

When compiling with cmake using -DCMAKE_CXX_STANDARD=11 use `-std=gnu++11`
for PCH compiler flags, otherwise it triggers an error:

   opencv_core_Release.gch: not used because `__cplusplus' defined as ` 201103L' not ` 199711L' [-Winvalid-pch]

Use CXX_EXTENSIONS property to select `gnu++11` or `c++11`.
Trying to mimic cmake logic here: https://gitlab.kitware.com/cmake/cmake/blob/master/Source/cmLocalGenerator.cxx#L1527-1557

* fix issue opencv#10612.

* Improve the documentation for cv::completeSymm and cv::RANSACUpdateNumIters.

* Merge pull request opencv#10574 from razerhell:patch-1

* Newton's method can be more efficient

When distortPoint is applied to the point (0, 0) and undistortPoint is then applied to the result, the output is not (0, 0). We also found that the old method sometimes fails to converge. Newton's method gives the correct values.

* modify by advice  Newton's method...opencv#10574

* calib3d(fisheye): fix codestyle, update theta before exit EPS check
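The Newton iteration mentioned in the fisheye commits above can be sketched as below. It assumes the usual fisheye model theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8); the function names, iteration limit, and tolerance are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cmath>

// Forward model: distorted angle as a function of the undistorted angle.
double distortTheta(double theta, const double k[4])
{
    const double t2 = theta * theta;
    return theta * (1.0 + t2 * (k[0] + t2 * (k[1] + t2 * (k[2] + t2 * k[3]))));
}

// Inverse model: solve distortTheta(theta) = thetaD for theta with Newton's method.
double undistortTheta(double thetaD, const double k[4],
                      int maxIter = 10, double eps = 1e-12)
{
    assert(maxIter > 0);
    double theta = thetaD; // initial guess: no distortion
    for (int i = 0; i < maxIter; ++i)
    {
        const double t2 = theta * theta;
        const double f = distortTheta(theta, k) - thetaD;
        // Derivative of the forward model with respect to theta.
        const double df = 1.0 + t2 * (3.0 * k[0] + t2 * (5.0 * k[1]
                        + t2 * (7.0 * k[2] + t2 * 9.0 * k[3])));
        const double step = f / df;
        theta -= step;               // update theta before the exit check
        if (std::fabs(step) < eps)
            break;
    }
    return theta;
}
```

For thetaD = 0 the first step is exactly zero, so the point (0, 0) maps back to itself. The sketch does not guard against a vanishing derivative, which extreme coefficients could produce.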

* Power, Tanh and Channels ReLU layer ocl support

Signed-off-by: Li Peng <peng.li@intel.com>

* MVN layer ocl implementation

Signed-off-by: Li Peng <peng.li@intel.com>

* Make DNN Crop layer match Caffe default offset behavior
and add parametric unit test for crop layer.

* Re-apply protobuf fix for JavaScript builds

original commit: f503515
JavaScript bindings for dnn module

* Update samples

* Merge pull request opencv#10412 from GregoryMorse:patch-2

Update to add window position and size retrieval to HighGUI (opencv#10412)

* Update highgui.hpp

Add read only property retrieval for enhanced rendering capabilities and more sophisticated research tools

* Update window.cpp

* Update window_w32.cpp

* Update window_QT.cpp

* Update window_QT.h

* Update window_QT.h

* Update window_gtk.cpp

* Update precomp.hpp

* Update highgui_c.h

* Update highgui_c.h

* Update window_w32.cpp

* Update precomp.hpp

* Update window_QT.cpp

* Update window_QT.h

* Update window_gtk.cpp

* Update window_gtk.cpp

* Update window_w32.cpp

* Update window_QT.cpp

* Update window_carbon.cpp

* Update window_cocoa.mm

* Update precomp.hpp

* Update window_cocoa.mm

* Update window_w32.cpp

* Update window_gtk.cpp

* Update window_QT.cpp

* Update window_gtk.cpp

* Update window_QT.cpp

* Update window_cocoa.mm

* Update window_carbon.cpp

* Update window_w32.cpp

* Update window_cocoa.mm

* Update window_gtk.cpp

* Update window_cocoa.mm

* Update window_gtk.cpp

* Update window_cocoa.mm

* Update window_cocoa.mm

* Update window.cpp

* Update test_gui.cpp

* Update test_gui.cpp

* Update test_gui.cpp

* Update highgui_c.h

* Update highgui.hpp

* Update window.cpp

* Update highgui_c.h

* Update test_gui.cpp

* Update highgui.hpp

* Update window.cpp

* Update window.cpp

* Update window.cpp

* Update window.cpp

* Update window.cpp

* Fix opencv#10525

* Merge pull request opencv#10621 from mshabunin:disable-docs

Documentation generation refactoring (opencv#10621)

* Documentation build updates:

- disable documentation by default, do not add to ALL target
- combine Doxygen and Javadoc
- optimize Doxygen html

* javadoc: fix path in build directory

* cmake: fix "Documentation" status line

* ocl support for Deconvolution layer

Signed-off-by: Li Peng <peng.li@intel.com>

* Improve the doc for cv::Mat::checkVector.

* Improve the documentation for affine transform estimation.

* Fix perf build with CUDA 9.

* cv::cuda::cvtColor bug fix (opencv#10640)

* cuda::cvtColor bug fix

Fixed bug in conversion formula between RGB space and LUV space.
Testing with opencv_test_cudaimgproc.exe, this commit reduces the number
of failed tests from 191 to 95. (96 more tests pass)

* Rename variables

* Improve the doc for fundamental matrix.

* run.py: simplified scripts, fixed most of PEP8 warnings

* more update on MVN layer ocl implementation

cut one ocl kernel if normVariance is disabled,
also use native_powr for performance reasons.

Signed-off-by: Li Peng <peng.li@intel.com>

* Merge pull request opencv#10649 from GregoryMorse:patch-3

* Fix for QT image window rectangle

* Update window_QT.h

* Update window_QT.cpp

* trailing whitespace

* highgui: fix QT getWindowImageRect()

* ocl: Avoid unnecessary initializing when non-UMat parameters are used

* bitexact gaussianblur implementation (opencv#10345)

* Bit-exact implementation of GaussianBlur smoothing

* Added universal intrinsics based implementation for bit-exact CV_8U GaussianBlur smoothing.

* Added parallel_for to evaluation of bit-exact GaussianBlur

* Added custom implementations for 3x3 and 5x5 bit-exact GaussianBlur

* core(perf): refactor kmeans test

- don't use RNG for "task size" parameters (N, K, dims)
- add "good" kmeans test data (without singularities: K > unique points)

* core: fix kmeans multi-threaded performance

* core: kmeans refactoring

- reduce scope of i,k,j variables
- use cv::AutoBuffer
- template<bool onlyDistance> class KMeansDistanceComputer
- eliminate manual unrolling: CV_ENABLE_UNROLLED
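The `template<bool onlyDistance>` item in the list above can be illustrated with a compile-time flag that selects between two loop bodies. This is only a sketch under assumptions: the function `computeDistances`, the flattened `std::vector` layout, and the exact meaning given to `onlyDistance` here (measure the distance to the already assigned center instead of searching for the nearest one) are illustrative and not copied from the kmeans source.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean distance between two dims-long rows.
inline double squaredDistance(const double* a, const double* b, std::size_t dims)
{
    double sum = 0.0;
    for (std::size_t d = 0; d < dims; ++d)
        sum += (a[d] - b[d]) * (a[d] - b[d]);
    return sum;
}

// points and centers are row-major arrays with 'dims' values per row.
// onlyDistance == true : keep labels, compute distance to the assigned center.
// onlyDistance == false: find the nearest center and rewrite labels.
template <bool onlyDistance>
void computeDistances(const std::vector<double>& points,
                      const std::vector<double>& centers,
                      std::size_t dims,
                      std::vector<int>& labels,
                      std::vector<double>& distances)
{
    assert(dims > 0);
    const std::size_t n = points.size() / dims;
    const std::size_t k = centers.size() / dims;
    distances.assign(n, 0.0);
    if (!onlyDistance)
        labels.assign(n, 0);
    for (std::size_t i = 0; i < n; ++i)
    {
        const double* p = &points[i * dims];
        if (onlyDistance) // constant per instantiation, so the branch folds away
        {
            distances[i] = squaredDistance(p, &centers[labels[i] * dims], dims);
            continue;
        }
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < k; ++c)
        {
            const double d = squaredDistance(p, &centers[c * dims], dims);
            if (d < best)
            {
                best = d;
                labels[i] = static_cast<int>(c);
            }
        }
        distances[i] = best;
    }
}
```

Making the flag a template parameter lets the compiler generate two specialized loops rather than testing a runtime flag on every sample.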

* OpenCV face detection network test

* persistence: replace arbitrary limit of cn to 4 by CV_CN_MAX (opencv#10636)

* persistence: replace arbitrary limit of cn to 4 by CV_CN_MAX

* python: added persistence test, remove temp files

* fixup! python: added persistence test, remove temp files

* fixup! python: added persistence test, remove temp files

* add HAL for FAST (opencv#10362)

* add HAL for FAST

* add new interface

* bad image file

* PriorBox layer with explicit normalized sizes

* core(lapack): fix build issues related to 'extern "C"'

* Merge pull request opencv#10663 from jmlich:master

* hogsvm compatibility with python3

* VS with hardening: added guard flag, moved dynamicbase and safeseh to linker flags

* Exported a high level stitcher to the DLL

Allows Stitcher to be used for scans from within Python.
I had to use a very strange notation because I couldn't export the `enum`
`Mode`, which left the generated CPython code unable to compile.

```c++
class Stitcher {
public:
enum Mode
    {
        PANORAMA = 0,
        SCANS = 1,
    };
...
```

Also removed duplicate code from the `createStitcher` function by making
use of the `Stitcher::create` function.

* convolution and tanh layer fusion

Signed-off-by: Li Peng <peng.li@intel.com>

* mvn, batch_norm and relu layer fusion

Signed-off-by: Li Peng <peng.li@intel.com>

* IntrinsicParams operator+ fix

* solve issue opencv#10687

* Merge pull request opencv#10700 from alalek:cpu_dispatch_axv512

* cmake: enable CPU dispatching for AVX512 (SKX)

* cmake: update handling of unsupported flags/modes

* Merge pull request opencv#10667 from paroj:stereo_calib_ex

 calib3d: add stereoCalibrateExtended (opencv#10667)

* cvCalibrateCamera2Internal: simplify per view error computation

* calib3d: add stereoCalibrateExtended

- allow CALIB_USE_EXTRINSIC_GUESS
- returns per view errors

* calib3d: add stereoCalibrateExtended test

* Merge pull request opencv#10697 from woodychow:tbb_task_arena

* Use Intel TBB's task arena if possible

* dnn(ocl): fix build options for Apple OpenCL
