Skip to content

GAPI Fluid: Enable dynamic dispatching for AbsDiffC kernel.#21204

Merged
opencv-pushbot merged 1 commit intoopencv:4.xfrom
anna-khakimova:ak/move_simd_absdiffc
Dec 13, 2021
Merged

GAPI Fluid: Enable dynamic dispatching for AbsDiffC kernel.#21204
opencv-pushbot merged 1 commit intoopencv:4.xfrom
anna-khakimova:ak/move_simd_absdiffc

Conversation

@anna-khakimova
Copy link
Copy Markdown
Member

@anna-khakimova anna-khakimova commented Dec 6, 2021

Moved SIMD for GAPI Fluid AbsDiffC kernel to enable dynamic dispatching.

Performance report:

FluidAbsDiffCSIMDEnableDynamicDispatch.xlsx

force_builders=Linux AVX2,Custom,Custom Win,Custom Mac
build_gapi_standalone:Linux x64=ade-0.1.1f
build_gapi_standalone:Win64=ade-0.1.1f
Xbuild_gapi_standalone:Mac=ade-0.1.1f
build_gapi_standalone:Linux x64 Debug=ade-0.1.1f

build_image:Custom=centos:7
buildworker:Custom=linux-1
build_gapi_standalone:Custom=ade-0.1.1f

Xbuild_image:Custom=ubuntu-openvino-2021.3.0:20.04
build_image:Custom Win=openvino-2021.4.1
build_image:Custom Mac=openvino-2021.2.0

buildworker:Custom Win=windows-3

test_modules:Custom=gapi,python2,python3,java
test_modules:Custom Win=gapi,python2,python3,java
test_modules:Custom Mac=gapi,python2,python3,java

buildworker:Custom=linux-1
# disabled due high memory usage: test_opencl:Custom=ON
Xtest_opencl:Custom=OFF
Xtest_bigdata:Custom=1
Xtest_filter:Custom=*

CPU_BASELINE:Custom Win=AVX512_SKX
CPU_BASELINE:Custom=SSE4_2

int w = 0;
#if CV_SIMD
w = absdiffc_simd(in, scalar, out, width, chan);
w = absdiffc_simd(in, scalar, out, length, chan);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was a bug before?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it wasn't a bug. I just took the length calculation to a higher level.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, it was width then became width * chan here - so data amount as increased in chan times. Looks like we didn't use by whole data range before (was bug) or processing more data (possible out-of-bound ) for now

Copy link
Copy Markdown
Member Author

@anna-khakimova anna-khakimova Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it wasn't so. Earlier we calculate length inside absdiffc_simd_c3() and absdiffc_simd_c1c2c4() functions and then used length for a loop. You can see it here for example. Now length is calculated 2 level up of call stack (i.e. here).


int width = dst.length();
int chan = dst.meta().chan;
const int length = width * chan;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for interest: why all types is signed? is it by historical reasons?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is for history :-)

#undef MULC_SIMD

#define ABSDIFFC_SIMD(T) \
int absdiffc_simd(const T in[], const float scalar[], T out[], \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no difference. This structure was organized for dynamic dispatching support.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe relationship by short schema between them please?
Because i see two functions with the similar names, with the similar agruments and in the same namespace .
May be one of fucntions should have *_impl suffic, may be i wrong

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am i right in my assumption (it's definitely clear for me by now) that here we have function declaration
but in file before we have function definition?

If yes, then i suggest to rename MACRO as ABSDIFFC_SIMD_DECLARATION & ABSDIFFC_SIMD_DEFINITION or something like that

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gfluidcore_func.hpp is header file for forward declaration of wrappers in the gfluidcore_func.dispatch.cpp. It's necessary for C++ compiler.

Copy link
Copy Markdown
Member Author

@anna-khakimova anna-khakimova Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suggest to rename MACRO as ABSDIFFC_SIMD_DECLARATION & ABSDIFFC_SIMD_DEFINITION or something like that

Ok. I would prefer to do this in the next PR together with MACRO for SIMD of the other kernels. All together.

//-------------------------

#define ABSDIFFC_SIMD(SRC) \
int absdiffc_simd(const SRC in[], const float scalar[], SRC out[], \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always_inline ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of dynamic dispatching support, any key words before return type break compilation.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any key words before return type break compilation.

could you please provide observed error message?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is implementation of external interface declared on the line 154 and used in other compilation modules, e.g. .dispatch.cpp. So it can't be inlined.

@dmatveev dmatveev self-assigned this Dec 8, 2021
Copy link
Copy Markdown
Contributor

@sivanov-work sivanov-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of questions

class MaxPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffCPerfTest : public TestPerfParams<tuple<cv::Size, MatType, cv::GCompileArgs>> {};
class AbsDiffCPerfTest : public TestPerfParams<tuple<compare_f, cv::Size, MatType, cv::GCompileArgs>> {};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be exact? why do we need a comparison function here?

Copy link
Copy Markdown
Member Author

@anna-khakimova anna-khakimova Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the meaning of the question. Why does Alexey Trutnev add comparison functions to all GAPI performance tests in the PR? Probably to understand whether the kernels works correctly in the future.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the question was why this check is not exact any more but we discussed this double/float change on call. Thanks

Comment on lines +973 to +975
compare_f cmpF = get<0>(GetParam());
cv::Size sz_in = get<1>(GetParam());
MatType type = get<2>(GetParam());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tie()

Copy link
Copy Markdown
Contributor

@dmatveev dmatveev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@opencv-pushbot opencv-pushbot merged commit f1053d4 into opencv:4.x Dec 13, 2021
@alalek alalek mentioned this pull request Dec 30, 2021
@alalek alalek mentioned this pull request Feb 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants