Merge 3.4 #16135

Merged
alalek merged 45 commits into opencv:master from alalek:merge-3.4
Dec 12, 2019

alalek (Member) commented Dec 11, 2019

#15257 from pmur:resize
#15988 from l-bat:custom_layer
#16070 from dkurt:backport_15611
#16071 from alalek:update_version_3.4.9-pre
#16076 from l-bat:prior_ngraph
#16079 from alalek:imgproc_color_clarify_error_message
#16085 from alalek:imgproc_threshold_to_zero_ipp_bug
#16088 from alalek:dnn_eltwise_layer_different_src_channels
#16089 from dkurt:dnn_ie_fix_fpga
#16093 from alalek:core_itt_thread_name_16072
#16094 from saskatchewancatch:issue-16053
#16098 from alalek:dnn_clarify_error_getMemoryShapes
#16101 from dkurt:dnn_ie_ngraph_detection_output
#16102 from asmorkalov:as/xperience_c
#16106 from dkurt:dnn_ie_ngraph_weights_fusion
#16107 from dkurt:dnn_ie_ngraph_v1_conv
#16109 from pixelb:gcc-9-pch
#16117 from mshabunin:fix-hist-args-34
#16120 from alalek:python3.8
#16121 from shimat:fix_voronoi_typo
#16123 from alalek:opencv_include_port_file
#16124 from alalek:issue_13354
#16125 from alalek:core_safe_xadd
#16138 from pmur:reg_16137

Previous "Merge 3.4": #16068

buildworker:Win64 OpenCL=windows-2
buildworker:Custom=linux-1,linux-2,linux-4
build_image:Docs=docs-js
build_image:Custom=javascript
#build_image:Custom=powerpc64le
#build_image:Custom=ubuntu-openvino-2019r3.0:16.04
#buildworker:Custom=linux-2
#build_image:Custom=ubuntu-vulkan:16.04
#buildworker:Custom=linux-4
#build_image:Custom=fedora:28
#build_image:Custom=ubuntu-cuda:16.04
#build_image:Custom=ubuntu-clang:18.04
#buildworker:Custom=linux-1
#build_image:Custom=javascript-simd
#build_image:Custom=mips64el
build_image:Custom Mac=openvino-2019r3.0
build_image:Custom Win=openvino-2019r3.0
test_opencl:Custom Win=OFF
#build_image:Custom Win=msvs2017
#build_image:Custom Win=msvs2019
test_modules:Custom Mac=dnn,java,python3

dkurt and others added 30 commits December 5, 2019 19:25
Test create custom layer in python

* check is contiguous

* Add custom layer test

* Fix test

* Remove assert

* Move assert to pyopencv dnn

* remove assert

* Add unregister

* Fix python2

* proto to bytearray

* Fix data type
- don't override current application thread names
- set name for own threads only
…_ipp_bug

* imgproc(IPP): wrong result from threshold(THRESH_TOZERO)

* imgproc(IPP): disable IPP code to pass THRESH_TOZERO test
* resize: HResizeLinear reduce duplicate work

There appears to be a 2x unroll of the HResizeLinear against k,
however the k value is only incremented by 1 during the unroll. This
results in k - 1 duplicate passes when k > 1.

Likewise, the final pass may not respect the work done by the vector
loop. Start it with the offset returned by the vector op if
implemented. Note, no vector ops are implemented today.

The performance is most noticeable on a linear downscale. A set of
performance tests is added to characterize this. The performance
improvement is 10-50% depending on the scaling.
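
The duplicate work described above can be illustrated with a minimal sketch. It is not OpenCV's HResizeLinear: `processRow` and `resizeRows` are hypothetical names, and the row operation is a placeholder. The only point is to count row passes when a 2x-unrolled loop advances k by 1 instead of by 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one horizontal pass over a row (doubles each sample).
// It exists only so that the number of passes per row can be counted.
static void processRow(const std::vector<int>& src, std::vector<int>& dst, int& passes)
{
    for (std::size_t x = 0; x < src.size(); ++x)
        dst[x] = 2 * src[x];
    ++passes;
}

// Runs a 2x-unrolled loop over the rows, advancing k by `step` per iteration,
// followed by a scalar tail loop. Returns the total number of row passes.
int resizeRows(const std::vector<std::vector<int>>& src,
               std::vector<std::vector<int>>& dst, int step)
{
    assert(src.size() == dst.size() && step >= 1);
    const int count = static_cast<int>(src.size());
    int passes = 0;
    int k = 0;
    for (; k <= count - 2; k += step)
    {
        processRow(src[k], dst[k], passes);          // row k
        processRow(src[k + 1], dst[k + 1], passes);  // row k + 1
    }
    for (; k < count; ++k)                           // remaining rows, one at a time
        processRow(src[k], dst[k], passes);
    return passes;
}
```

With 4 rows, `step == 1` performs 7 passes (3 of them duplicates), while `step == 2` performs exactly 4; the output is identical in both cases.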

* imgproc: vectorize HResizeLinear

Performance is mostly gated by the gather operations
for x inputs.

Likewise, provide a 2x unroll against k; this halves the
number of alpha gathers for larger k.

While not a 4x improvement, it still performs substantially
better under P9 for a 1.4x improvement. P8 baseline is
1.05-1.10x due to reduced VSX instruction set.

For float types, this results in a more modest
1.2x improvement.
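
As a scalar sketch of why unrolling against k halves the alpha loads: the same pair of interpolation weights applies to every row, so handling two rows per pass lets each weight pair be loaded once and used twice. The function name `hresizeTwoRows` and its parameter layout are assumptions for illustration; this is neither the vectorized code nor OpenCV's actual signature.

```cpp
#include <cassert>

// Horizontal linear interpolation of two source rows (S0, S1) into two
// destination rows (D0, D1). For each output column dx:
//   xofs[dx]                    is the left source index,
//   alpha[2*dx], alpha[2*dx+1]  are the weights of the left and right samples,
//   cn                          is the channel count (distance to the right sample).
void hresizeTwoRows(const float* S0, const float* S1, float* D0, float* D1,
                    const int* xofs, const float* alpha, int dwidth, int cn)
{
    assert(dwidth >= 0 && cn > 0);
    for (int dx = 0; dx < dwidth; ++dx)
    {
        const int sx = xofs[dx];
        // The weights are fetched once and shared by both rows.
        const float a0 = alpha[2 * dx];
        const float a1 = alpha[2 * dx + 1];
        D0[dx] = S0[sx] * a0 + S0[sx + cn] * a1;
        D1[dx] = S1[sx] * a0 + S1[sx + cn] * a1;
    }
}
```

In a vectorized version the per-column `xofs` lookups become gathers, which is where the commit message says most of the time goes.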

* Update U8 processing for non-bitexact linear resize

* core: hal: vsx: improve v_load_expand_q

With a little help, we can do this quickly without gprs on
all VSX enabled targets.

* resize: Fix cn == 3 step per feedback

Per feedback, ensure we don't overrun. This was caught via the
failure observed in Test_TensorFlow.inception_accuracy.
* Add epsilon error checking to approxPolyDP so that only sensible values
are accepted for the epsilon of the Douglas-Peucker algorithm.

* Review changes for PR
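
A minimal sketch of what such an epsilon check might look like, written without any OpenCV dependency. The helper names and the upper bound `kMaxEpsilon` are assumptions made for illustration; the commit message does not state the exact condition used.

```cpp
#include <cassert>
#include <stdexcept>

// Returns true when epsilon is usable as a distance tolerance:
// non-negative and below an assumed upper bound. NaN fails both
// comparisons and is therefore rejected as well.
bool isSensibleEpsilon(double epsilon)
{
    const double kMaxEpsilon = 1e30;  // assumed bound, chosen for illustration
    return epsilon >= 0.0 && epsilon < kMaxEpsilon;
}

// Hypothetical guard to call before running a polygon approximation.
void requireSensibleEpsilon(double epsilon)
{
    if (!isSensibleEpsilon(epsilon))
        throw std::out_of_range("epsilon must be non-negative and finite");
}
```

Rejecting bad values up front turns a silently meaningless approximation into an explicit error at the call site.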
-c is required to avoid linking (and the associated missing "main" message)
when linker flags like "-Wl,-z,relro" are passed to GCC
pmur and others added 2 commits December 12, 2019 13:00
* imgproc: Prevent 1B overrun of 8C3 SIMD optimization

The fourth value read via v_load_q is essentially ignored,
but can cause trouble if it happens to cross page boundaries.

The final few iterations may attempt to read the most extreme
elements of S, which will read 1B beyond the array in most
alignment cases. Dynamically compute the stop. This could be
hoisted from the loop, but will require a more extensive change.

Likewise, cleanup the iteration increment statements to make
it more obvious they do channel count (3) elements per pass.

This should resolve opencv#16137

* imgproc(resize): extra check
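
The overrun and its fix can be sketched in plain C++. Here `load4` is a hypothetical stand-in for a 4-lane load such as v_load_q, and `sumChannels` is an invented example operation; the real resize code addresses S through offset tables rather than walking it linearly. The sketch only shows how bounding the wide loop keeps the ignored fourth byte inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for a 4-lane load: always reads exactly 4 bytes starting at p.
static inline void load4(const std::uint8_t* p, std::uint8_t out[4])
{
    std::memcpy(out, p, 4);
}

// Sums each channel of an interleaved 3-channel 8-bit buffer.
void sumChannels(const std::uint8_t* src, std::size_t pixels, unsigned sums[3])
{
    assert(src != nullptr || pixels == 0);
    const std::size_t cn = 3;            // channel count, also the step per pass
    const std::size_t total = pixels * cn;
    sums[0] = sums[1] = sums[2] = 0;

    std::size_t x = 0;
    // Wide path: only taken while the 4-byte read ends inside the buffer.
    for (; x + 4 <= total; x += cn)
    {
        std::uint8_t lanes[4];
        load4(src + x, lanes);
        sums[0] += lanes[0];
        sums[1] += lanes[1];
        sums[2] += lanes[2];             // lanes[3] is read but ignored
    }
    // Scalar tail: the final pixel, where a 4-byte read would overrun by 1B.
    for (; x < total; x += cn)
    {
        sums[0] += src[x];
        sums[1] += src[x + 1];
        sums[2] += src[x + 2];
    }
}
```

Without the `x + 4 <= total` bound, the last pixel's load would touch one byte past the array, which is harmless most of the time but can fault when that byte lies on an unmapped page.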
alalek (Member, Author) commented Dec 12, 2019

👍

alalek merged commit 92b9888 into opencv:master Dec 12, 2019
alalek mentioned this pull request Dec 15, 2019
9 participants