Skip to content

[cv::transform] Enable CV_SIMD for the 16U case on AArch64.#19164

Merged
opencv-pushbot merged 1 commit intoopencv:3.4from
fpetrogalli:tranform_16u
Dec 20, 2020
Merged

[cv::transform] Enable CV_SIMD for the 16U case on AArch64.#19164
opencv-pushbot merged 1 commit intoopencv:3.4from
fpetrogalli:tranform_16u

Conversation

@fpetrogalli
Copy link
Copy Markdown
Contributor

@fpetrogalli fpetrogalli commented Dec 18, 2020

Performance uplift (x-factor) ranges from 2.20 to 2.50, on the following perf tests of the core module:

  1. Mat_Transform::Size_MatType
  2. Transform::OCL_TransformFixture

The associated issue is #19163.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [ X ] I agree to contribute to the project under Apache 2 License.
  • [ X ] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • [ X ] The PR is proposed to proper branch
  • [ X ] There is reference to original bug report and related work

@asmorkalov asmorkalov self-requested a review December 19, 2020 09:33
Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch should go into 3.4 branch first.
We will merge changes from 3.4 into master regularly (weekly/bi-weekly).

Please:

  • change "base" branch of this PR: master => 3.4 (use "Edit" button near PR title)
  • rebase your commits from master onto 3.4 branch. For example:
    git rebase -i --onto upstream/3.4 upstream/master
    (check list of your commits, save and quit (Esc + "wq" + Enter)
    where upstream is configured by following this GitHub guide and fetched (git fetch upstream).
  • push rebased commits into source branch of your fork (with --force option)

Note: no needs to re-open PR, apply changes "inplace".

Comment on lines +1607 to +1609
transform_32f( const float* src, float* dst, const float* m, int len, int scn, int dcn )
{
#if CV_SIMD && !defined(__aarch64__) && !defined(_M_ARM64)
#if CV_SIMD
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transform_32f

Should be transform_16u as stated here: #19163 (comment)

Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contribution 👍

@opencv-pushbot opencv-pushbot merged commit 663bd73 into opencv:3.4 Dec 20, 2020
@alalek alalek mentioned this pull request Dec 20, 2020
@fpetrogalli
Copy link
Copy Markdown
Contributor Author

Hi - thank you for looking after my patch over the weekend. Sorry for not coming back to you earlier, but I have a strict rule that forbids me to enter my home office over the weekend :).

First of all, thank you for fixing the commit. My local changes were covering both 32F and 16U when I ran the perf executables. I saw the results and decided to give it a go at upstreaming the 16U changes. The upstreaming for the 32F case was unintentional.

One curiosity: how comes that the patch needed to be merged into 3.4 and not master?

BTW, what would be the preferred communication channel of the OpenCV community to discuss development topic that are not bugs/perf issues? Is there a mailing list or a public discussion forum for developers?

Kind regards!

Francesco

@tomoaki0705
Copy link
Copy Markdown
Contributor

You need to read.
Please, please please please read and read and read the link from @alalek

If your pull request is also applicable to the 3.4 branch, you should choose that branch as "base"
If you've already created pull request based on the master branch, but it is also applicable to 3.4, you will be asked to rebase it to 3.4, see the instruction in the following section.

For the forum, there is exactly a place for you to discuss.

It has been created recently, and the old one will be deprecated, and new one will be used.

@fpetrogalli fpetrogalli deleted the tranform_16u branch December 21, 2020 15:05
@alalek alalek mentioned this pull request Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: core optimization platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants