Improvement of sequential connected components Wu's algorithm and pro…#7705
Improvement of sequential connected components Wu's algorithm and pro…#7705MicheleCancilla wants to merge 6 commits intoopencv:masterfrom
Conversation
…vide parallel version of both Wu's and Grana's algorithms (using TBB library)
|
Please, restore text indentation on lines 54-3957: it was 4 spaces, but you've changed it to 8 spaces. |
|
Is it possible to change TBB to |
|
Hi @mshabunin, "Please, restore text indentation on lines 54-3957: it was 4 spaces, but you've changed it to 8 spaces." "Is it possible to change TBB to cv::parallel_for_? It has different backends (including TBB) and will work on more platforms." |
Extended parallel version to all frameworks supported by OpenCV; Added some documentation notes in modules/imgproc/include/opencv2/imgproc.hpp;
|
Hi @mshabunin. You were right: we reverted the parallel code to |
To avoid conflicts with reserved variable names (started with underscore). |
…h getNumberOfCPUs
|
I tested your last commit a bit. I noticed, that when I use CV_16U label type, connectedComponents returns only the labels from first thread and the label image is broken. This seems to occure for both, GRANA and WU as well as for 4 and 8 connected. |
|
Hello @mschoeneck, The CV_16U label type can be used only when the number of labels in the image can be fit a 16bit data type (i.e 65536). The problem with the parallel version of the algorithms (which divide the image to label in stripes and compute labeling on each stripe in parallel) is that each thread must start labeling from the maximum possible value reachable by the previous thread. This is necessary in order to avoid collision accesses to equivalence label array. For example (considering an image with 8 rows and 2 thread): the first thread works with rows from 0 to 3 [0;3] and starts from label 1 (background is 0). The second one works with rows from 4 to 7 [4;7] and starts from maximum label reachable (at least in theory) by thread one incremented by one. So if the maximum label reachable from thread one is 5 thread two starts labeling from label 6. Then, if you run connectedComponents function with CV_16U label type on an image that actually have only two labels, but the maximum number of different labels that image could contain is greater than 65536, parallel labeling will fail because of its nature. For this reason we decided to call the parallel connectedComponents only with CV_32S label type. @mshabunin @alalek The latest push failed under linux with strange errors (error closing /tmp/ccXGh37G.s: No space left on device), it's my fault or was there a problem with the buildbot? |
These builds are restarted. |
|
hello, do you have any updates? 😄 |
|
trying to fix doc builder complains, see #8789 |

This pullrequest changes
With this Pull Request we provide the following improvements on “cv::connectedComponents” and on “cv::connectedComponentsWithStats” functions:
- We improve the speed of Wu’s algorithm (sequential version) providing a better implementation, both for 4 and 8-way connectivity.
- We provide a parallel CPU implementation of both Wu and Grana’s algorithms that are the ones currently included in the OpenCV library. These parallel versions are automatically chose by the functions when the TBB variable is defined. The performance are shown in the charts below where acronyms are used to distinguish the different implementation/version of the algorithms. An explanation of the datasets used and how the tests were performed can be found here.
To conclude, this pull request reply to RFC #7270 and take into account the issue #7607 providing the same solution proposed by @sokian with pull request #7608 that is open yet.SAUF_OLD: is the old implementation of the sequential version of the Wu’s algorithm;
SAUF_NEW: is the implementation of the sequential version of the Wu’s algorithm proposed with this pull request;
SAUF_PAR: is the parallel version of the Wu’s algorithm introduced by this pull request;
BBDT_SEQ: is the current sequential version of the Grana’s algorithm (currently included in the OpenCV);
BBDT_PAR: is the parallel version of the Grana’s algorithm introduced by this pull request;
Comparison of Wu’s algorithms with 8-way connectivity (without computing statistics)
Comparison of Wu’s algorithms with 4-way connectivity (without computing statistics)
Comparison of Grana’s algorithms without computing statistics
Comparison of Grana’s algorithms with statistics computation
Contributors:
Michele Cancilla @MicheleCancilla (cancilla.michele[at]gmail.com)
Federico Bolelli @prittt (federico.bolelli[at]unimore.it)
Costantino Grana @CostantinoGrana (costantino.grana[at]unimore.it)