GAPI Fluid: Resize Linear U8C3 - reworking horizontal pass. by anna-khakimova · Pull Request #21144 · opencv/opencv

anna-khakimova · 2021-11-28T19:55:24Z

Reworks the horizontal pass of Resize Linear U8C3. The previous version handled 16 pixels per loop iteration; the new version handles 5 pixels per iteration.
Fix for valgrind issue
Enabling SSE41 SIMD Resize U8C3
Valgrind run ( ✖ failed due to GAPI_Streaming_Desync.Python_Pull_Overload hang/timeout - reproducible on weekly builds)

Performance report:

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due to high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2

sivanov-work

I'm sorry, it looks like I'm not the right person to review intrinsics code this deeply.
@anna-khakimova, you had better add a more qualified reviewer than me.

I expected to find differences in the buffer tail processing conditions, but there seem to be many more modifications here. Sorry again, I'm not qualified enough to review such heavy SSE code.

sivanov-work · 2022-01-14T08:11:18Z

modules/gapi/src/backends/fluid/gfluidcore_simd_sse41.hpp

    bool yRatioEq = inSz.height == outSz.height;
-    constexpr int nlanes = 16;
-    constexpr int half_nlanes = 16 / 2;
+    constexpr int nlanes = 16; // number of 8-bit integers that fit into a 128-bit SIMD vector.


I suggest making the code self-documenting (maybe in the future), for example:

struct sse_traits { constexpr int instruction_size = 128; constexpr int lanes_count = 128 / 8bit; ... }

and reusing these constants.
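A minimal sketch of what such a traits type could look like in compilable C++. The name `sse_traits` comes from the suggestion above; the members `register_bits` and `lanes()` are hypothetical and are not part of OpenCV or of this PR.

```cpp
#include <cstdint>

// Hypothetical traits type illustrating the reviewer's suggestion.
// It is not part of OpenCV; the member names are assumptions.
struct sse_traits {
    // Width of one SSE register in bits.
    static constexpr int register_bits = 128;

    // Number of elements of type T that fit into one register.
    template <typename T>
    static constexpr int lanes() {
        return register_bits / (8 * static_cast<int>(sizeof(T)));
    }
};

// 16 eight-bit lanes, matching `constexpr int nlanes = 16` in the diff.
static_assert(sse_traits::lanes<std::uint8_t>() == 16, "u8 lanes");
// 8 sixteen-bit lanes, matching the removed `half_nlanes = 16 / 2`.
static_assert(sse_traits::lanes<std::int16_t>() == 8, "s16 lanes");
```

With something like this, `nlanes` could be written as `sse_traits::lanes<uint8_t>()` instead of a literal with an explanatory comment.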

sivanov-work · 2022-01-14T08:12:27Z

modules/gapi/src/backends/fluid/gfluidcore_simd_sse41.hpp

-
-            for (int x = 0; outSz.width >= nlanes; )
+            __m128i horizontal_shuf_mask1 = _mm_setr_epi8(0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14, 3, 7, 11, 15);
+            constexpr int nproc_pixels = 5;


What is nproc_pixels: does it mean non_proc or the number of processed pixels, and how was 5 obtained?

UPDATE: discussed offline.
I still think that processing_pixel_number reads better.
5 = 128 / 24 (rounded down), where 24 = 3 RGB channels * 8 bits. Could you please put that into a comment?
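The arithmetic behind the constant, and the byte reordering that the mask in the diff encodes, can be sketched as follows. This is a scalar model for illustration only: `shuffle_bytes` is a hypothetical helper that imitates `_mm_shuffle_epi8` for mask bytes in the range 0..15, and the constant names are assumptions. It does not reproduce how the PR uses the mask inside the resize loop.

```cpp
#include <array>
#include <cstdint>

constexpr int kRegisterBits   = 128;  // SSE register width
constexpr int kChannels       = 3;    // U8C3: three channels per pixel
constexpr int kBitsPerChannel = 8;    // U8: eight bits per channel

// Whole 24-bit pixels that fit into one 128-bit register: 128 / 24 = 5.
constexpr int kPixelsPerRegister =
    kRegisterBits / (kChannels * kBitsPerChannel);
static_assert(kPixelsPerRegister == 5, "five whole pixels per register");

// Scalar model of _mm_shuffle_epi8 for mask bytes whose high bit is clear:
// output byte i is the source byte selected by mask[i].
std::array<std::uint8_t, 16> shuffle_bytes(
    const std::array<std::uint8_t, 16>& src,
    const std::array<std::uint8_t, 16>& mask) {
    std::array<std::uint8_t, 16> dst{};
    for (int i = 0; i < 16; ++i) {
        dst[i] = src[mask[i] & 0x0F];
    }
    return dst;
}

// Same byte order as horizontal_shuf_mask1 in the diff: the first three
// bytes of each 4-byte group are packed together, and every fourth byte
// (3, 7, 11, 15) is moved to the end of the register.
const std::array<std::uint8_t, 16> kHorizontalShufMask1 = {
    {0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14, 3, 7, 11, 15}};
```

Applied to the bytes 0..15, the model places bytes 0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14 in the first twelve positions and bytes 3, 7, 11, 15 in the last four.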

GAPI Fluid: Resize Linear U8C3 - reworking horizontal pass. * Reworked horizontal pass * Fixed valgrind issue and removed unnesesary snippet

anna-khakimova requested a review from dmatveev November 28, 2021 19:56

anna-khakimova added the optimization label Nov 28, 2021

Reworked horizontal pass

bdba2c5

anna-khakimova force-pushed the ak/resize_simd_v2 branch from 6bf90aa to 000588b Compare January 12, 2022 13:31

anna-khakimova requested review from AsyaPronina and sivanov-work January 12, 2022 13:37

anna-khakimova added the category: g-api / gapi label Jan 12, 2022

Fixed valgrind issue and removed unnesesary snippet

52f38b6

anna-khakimova force-pushed the ak/resize_simd_v2 branch from 000588b to 52f38b6 Compare January 12, 2022 13:50

anna-khakimova added the bug label Jan 13, 2022

sivanov-work reviewed Jan 14, 2022

sivanov-work approved these changes Jan 14, 2022

alalek merged commit 60228d3 into opencv:4.x Jan 14, 2022

anna-khakimova mentioned this pull request Jan 17, 2022

Fix Fluid Resize Valgrind issue #21358

Closed

6 tasks

alalek mentioned this pull request Feb 11, 2022

opencv 4.5.5 32 bits build fails #21597

Closed

alalek mentioned this pull request Feb 22, 2022

(5.x) Merge 4.x #21651

Merged

alalek merged 2 commits into opencv:4.x from anna-khakimova:ak/resize_simd_v2
