norm.cpp(normL2Sqr_): improve performance of pipeline#18831

Merged
opencv-pushbot merged 1 commit into opencv:3.4 from rjiejie:master-opt@pipeline
Dec 2, 2020

@rjiejie
Contributor

@rjiejie rjiejie commented Nov 17, 2020

Most target machines dedicate one type of CPU execution unit to each
class of instruction. For example, all vx_load calls use the load/store
unit, while v_muladd calls use the multiply/multiply-accumulate unit.
This patch interleaves vx_load and v_muladd so that both units can stay
busy, which improves performance on most targets such as RISC-V or ARM.
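
As a rough illustration of the interleaving idea described above, here is a minimal scalar sketch. It is not the code from this patch: the function names `normL2Sqr_simple` and `normL2Sqr_interleaved` are hypothetical, plain array reads stand in for `vx_load`, and `sum += d * d` stands in for `v_muladd`. The assumption is only that issuing the next loads before the current multiply-accumulate gives an in-order core a chance to overlap the two; an optimizing compiler or an out-of-order core may already schedule the simple form this way.

```cpp
#include <cassert>
#include <cstddef>

// Straightforward form: each iteration loads its operands and then
// immediately multiply-accumulates them.
float normL2Sqr_simple(const float* a, const float* b, std::size_t n)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < n; ++i)
    {
        float d = a[i] - b[i];  // load + subtract
        sum += d * d;           // multiply-accumulate
    }
    return sum;
}

// Interleaved form (hypothetical sketch): the loads for element i are
// issued before the multiply-accumulate for element i-1, so the
// load/store unit and the multiply unit are not forced to alternate.
float normL2Sqr_interleaved(const float* a, const float* b, std::size_t n)
{
    assert(n == 0 || (a != nullptr && b != nullptr));
    if (n == 0)
        return 0.f;

    float sum = 0.f;
    float va = a[0], vb = b[0];      // prologue: first loads
    for (std::size_t i = 1; i < n; ++i)
    {
        float d = va - vb;           // consume the previously loaded values
        va = a[i];                   // issue the next loads ...
        vb = b[i];
        sum += d * d;                // ... before this multiply-accumulate
    }
    float d = va - vb;               // epilogue: last element
    sum += d * d;
    return sum;
}
```

Both functions accumulate the squared differences in the same order, so they return identical results; only the position of the loads relative to the multiply-accumulate differs.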

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • [Y] I agree to contribute to the project under Apache 2 License.
  • [Y] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • [Y] The PR is proposed to proper branch
  • [N] There is reference to original bug report and related work
  • [N] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • [Y] The feature is well documented and sample code can be built with the project CMake

@asmorkalov asmorkalov added optimization platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc labels Nov 17, 2020
@asmorkalov
Contributor

@rjiejie Thanks for the contribution. Do you have performance numbers that show the improvement?

@asmorkalov asmorkalov added the pr: needs rebase Rebase patch (and squash fixup commits) on the top of target branch label Nov 17, 2020
@asmorkalov
Contributor

This patch should go into 3.4 branch first.
We will merge changes from 3.4 into master regularly (weekly/bi-weekly).

So, please:

  • change "base" branch of this PR: master => 3.4 (use "Edit" button near PR title)
  • rebase your commits from master onto 3.4 branch. For example:
    git rebase -i --onto upstream/3.4 upstream/master
    (check the list of your commits, then save and quit: Esc + ":wq" + Enter)
    where upstream is configured by following this GitHub guide and fetched (git fetch upstream).
  • push rebased commits into source branch of your fork (with --force option)

Note: there is no need to re-open the PR; apply the changes "in place".

@rjiejie rjiejie changed the base branch from master to 3.4 November 19, 2020 01:43
@rjiejie rjiejie force-pushed the master-opt@pipeline branch from 3f912f6 to 12b8d54 Compare November 19, 2020 01:51
@asmorkalov
Contributor

@rjiejie Do you have progress with performance testing?

@rjiejie
Contributor Author

rjiejie commented Nov 19, 2020

@rjiejie Do you have progress with performance testing?

Yes. I need to collect all the performance results into a summary, and I will submit it next week. Thanks.

Member

@alalek alalek left a comment

Looks good to me 👍

@asmorkalov
Contributor

@rjiejie Do you have any progress with performance testing?

@opencv-pushbot opencv-pushbot merged commit 484251c into opencv:3.4 Dec 2, 2020
@alalek alalek mentioned this pull request Dec 4, 2020
@alalek alalek mentioned this pull request Apr 9, 2021

4 participants