Skip to content

3.4-tengine#16724

Merged
alalek merged 26 commits intoopencv:3.4from
liqi-c:3.4-tengine
Mar 9, 2020
Merged

3.4-tengine#16724
alalek merged 26 commits intoopencv:3.4from
liqi-c:3.4-tengine

Conversation

@liqi-c
Copy link
Copy Markdown
Contributor

@liqi-c liqi-c commented Mar 3, 2020

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under OpenCV (BSD) License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=ARMv7,ARMv8
Xbuild_image:Android armeabi-v7a=android-gradle
Xbuildworker:Android armeabi-v7a=linux-4

#build_image:Android pack=android-gradle
#android_pack_config:Android pack=ndk-16.config.py

@vpisarev vpisarev requested a review from alalek March 3, 2020 16:34
@vpisarev
Copy link
Copy Markdown
Contributor

vpisarev commented Mar 3, 2020

@alalek, I wonder if it's possible to merge this patch into 3.4 (and then master) before the code freeze?

@alalek
Copy link
Copy Markdown
Member

alalek commented Mar 3, 2020

As this is optimization patch, it would be useful to see performance benefits from this library.

Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contribution!

Please measure performance gain on some public ARM platforms against current OpenCV code (use opencv_perf_dnn).

CMakeLists.txt Outdated
include(cmake/OpenCVFindLAPACK.cmake)
include(cmake/OpenCVFindProtobuf.cmake)
if(WITH_TENGINE)
if (X86_64 OR X86 OR ARM OR AARCH64 OR CMAKE_CROSSCOMPILING)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this condition to corresponding OCV_OPTION.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#

# Default tengine source store directory .
SET(DEFAULT_OPENCV_TENGINE_SOURCE_PATH ${OpenCV_BINARY_DIR}/3rdparty/libtengine/Tengine-1.12.0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.12.0

Avoid hardcoded versions in strings. Especially if used in multiple places.
Version should be dumped in CMake build summary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

OCV_OPTION(WITH_QUIRC "Include library QR-code decoding" ON
VISIBLE_IF TRUE
VERIFY HAVE_QUIRC)
OCV_OPTION(WITH_TENGINE "Include Arm Inference Tengine support" OFF
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OFF

Please add temporary "test" commit with enabling this for ARMv7/ARMv8 builders (building is checked only, dnn tests are not configured for ARM).

@BUG1989
Copy link
Copy Markdown

BUG1989 commented Mar 4, 2020

The performance improvement is obvious.
image

}

int nstripes = std::max(getNumThreads(), 1);
#ifdef HAVE_TENGINE
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's better to have something like DNN_BACKEND_TENGINE and separate forward? Current layers implementation is not perfect on terms of backends but if TEngine uses host Mat data and there is no need for wrappers - it must be easy to manage.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkurt, that was the original idea, but implementing a dedicated backend takes much more efforts. To get some immediate performance improvement, only convolution layer is accelerated now. Later on, the backend will be created, I think

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. So maybe let's create an internal forward_tengine inside the layer?

void tengine_set_Winograd(bool flag)
{
// Winograd on/off .
setenv("NO_WINO", "1", (int)!flag);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid changing environment from code?

Copy link
Copy Markdown
Contributor

@vpisarev vpisarev Mar 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's what I told too. For now, there is no appropriate function for it in the Tengine itself, and without this setting the unit tests fail. I think, in some next version of Tengine the function to configure Winograd branch will be added and that function will be called instead of setting environment variable. Since "setenv()" affects only the current process, it's not a showstopper, just a question of style and extra overhead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've made some changes and removed setenv() call.

}

postrun_graph(graph);
destroy_graph(graph);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that optimal to create and destroy graph every run?

Copy link
Copy Markdown
Contributor

@vpisarev vpisarev Mar 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's not very optimal. I guess, each graph consumes some memory and there is no way currently to make it to use external buffers. If the graph is constructed once, the Tengine-accelerated OpenCV DNN will consume much more memory. For now the performance increase because of using Tengine just for convolution outweights the overhead of the graph construction. I believe, some better solution will be developed later

Comment on lines +28 to +29
IF( WITH_CUDA OR WITH_OPENCL OR WITH_CUDNN OR WITH_VULKAN OR X86 OR X86_64)
MESSAGE(STATUS "TENGINE:-- Not support . Turning Tengine_FOUND off.")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition doesn't look logical (irrational).
Also use whitelist principle instead of blacklist - OpenCV is cross-platform library.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ocv_add_module(dnn opencv_core opencv_imgproc WRAP python java js)

ocv_option(OPENCV_DNN_OPENCL "Build with OpenCL support" HAVE_OPENCL AND NOT APPLE)
ocv_option(OPENCV_DNN_TENGINE "Build with Tengine support" HAVE_TENGINE)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPENCV_DNN_TENGINE

Probably not needed at all.
OpenCL is OpenCV-wide stuff (so we want to control this through OPENCV_DNN_OPENCL), but tengine is not.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* Author: qtang@openailab.com
*/

#include <iostream>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include "../../precomp.hpp" should be first.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@liqi-c
Copy link
Copy Markdown
Contributor Author

liqi-c commented Mar 5, 2020

@alalek hello , Can we specify NDK version in OpenCV CI system? We only support versions above ndk14 . So how do we get through CI ?

@alalek
Copy link
Copy Markdown
Member

alalek commented Mar 5, 2020

above ndk14

It would be nice to add this check into CMake scripts somewhere.


I have switched on gradle-based Android build environment (but it is not well supported on 3.4 branch).

  • Please ignore warnings from cmake_checks.cmake
  • also please adjust the "test" commit and disable BUILD_ANDROID_PROJECTS variable.


if(HAVE_TENGINE)
list(APPEND include_dirs ${TENGINE_INCLUDE_DIRS})
list(APPEND libs ${TENGINE_LIBRARIES})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No rule to make target lib/libtengine.a

In case of "build" mode CMake should get here the target name instead of library path. It is necessary for creating dependencies.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@vpisarev
Copy link
Copy Markdown
Contributor

vpisarev commented Mar 6, 2020

@liqi-c, maybe it makes sense to temporarily disable Tengine on Android, in order to make all the builds pass? Then we can merge it in before the code freeze. Later on, you can re-enable it.

VISIBLE_IF TRUE
VERIFY HAVE_QUIRC)
OCV_OPTION(WITH_TENGINE "Include Arm Inference Tengine support" ON
VISIBLE_IF (ARM OR AARCH64) AND UNIX AND NOT ANDROID AND NOT IOS
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ON

NOT ANDROID

Copy link
Copy Markdown
Contributor

@vpisarev vpisarev Mar 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalek, can you please decrypt this comment? :) the decision is to temporarily disable Tengine on Android, because the builds on Android fail now. Let's repair the Android case later

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VISIBLE_IF is for "complete" filtering of platforms. This filter "hides" the option.

Android can be use used, but should be configured for proper version.
So right place for "NOT ANDROID" "by default" is ON/OFF condition.

@liqi-c
Copy link
Copy Markdown
Contributor Author

liqi-c commented Mar 7, 2020

above ndk14

It would be nice to add this check into CMake scripts somewhere.

I have switched on gradle-based Android build environment (but it is not well supported on 3.4 branch).

  • Please ignore warnings from cmake_checks.cmake
  • also please adjust the "test" commit and disable BUILD_ANDROID_PROJECTS variable.

@alalek How to add NDK version check ? we add like this " if(${ANDROID_NDK_REVISION} LESS 14) " but it's CI test failed when ndk version is r10 .

@alalek
Copy link
Copy Markdown
Member

alalek commented Mar 7, 2020

ANDROID_NDK_REVISION

It is modern stuff only. For example, it is not available in current NDK10e builders for OpenCV 3.4.
But ANDROID_NDK_REVISION value is available on master branch.
Add DEFINED check for that.

LESS

Wrong. You should use VERSION_LESS instead.

if(${ANDROID_NDK_REVISION} ...

Value unpacking is wrong too in modern CMake.

@vpisarev
Copy link
Copy Markdown
Contributor

vpisarev commented Mar 7, 2020

I think, it's ready to merge 👍 Android support could be added later

@alalek alalek merged commit 0bcdf7d into opencv:3.4 Mar 9, 2020
@alalek alalek mentioned this pull request Mar 9, 2020
@fengyuentau fengyuentau mentioned this pull request Aug 8, 2023
6 tasks
asmorkalov pushed a commit that referenced this pull request Aug 9, 2023
dnn: cleanup of tengine backend #24122

🚀 Cleanup for OpenCV 5.0. Tengine backend is added for convolution layer speedup on ARM CPUs, but it is not maintained and the convolution layer on our default backend has reached similar performance to that of Tengine.

Tengine backend related PRs:
- #16724
- #18323

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
dnn: cleanup of tengine backend opencv#24122

🚀 Cleanup for OpenCV 5.0. Tengine backend is added for convolution layer speedup on ARM CPUs, but it is not maintained and the convolution layer on our default backend has reached similar performance to that of Tengine.

Tengine backend related PRs:
- opencv#16724
- opencv#18323

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
dnn: cleanup of tengine backend opencv#24122

🚀 Cleanup for OpenCV 5.0. Tengine backend is added for convolution layer speedup on ARM CPUs, but it is not maintained and the convolution layer on our default backend has reached similar performance to that of Tengine.

Tengine backend related PRs:
- opencv#16724
- opencv#18323

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: dnn optimization platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants