DNN: fix bug for X86 winograd #23763

Merged
asmorkalov merged 3 commits into opencv:4.x from zihaomu:add_runtime_check
Jun 9, 2023

@zihaomu (Member) commented Jun 8, 2023

Addresses #23760.
This patch adds a runtime check for x86 platforms without AVX or AVX2 support.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@zihaomu zihaomu requested a review from asmorkalov June 8, 2023 00:35
@zihaomu zihaomu linked an issue Jun 8, 2023 that may be closed by this pull request
@zihaomu zihaomu added this to the 4.8.0 milestone Jun 8, 2023
@zihaomu zihaomu force-pushed the add_runtime_check branch from ea0432e to f7d349b Compare June 8, 2023 00:47
float16_t* weightsWinoBufPtr_FP16;
#endif

Ptr<WinoParams> winoParams = makePtr<WinoParams>();
Contributor

A Ptr to a structure of constants seems like overkill to me. I propose the following:

  • keep const int for NEON and other non-Intel platforms.
  • for the x86 branch, use something like static const int CONV_WINO_IBLOCK = (checkHardwareSupport(CPU_AVX) || checkHardwareSupport(CPU_AVX2)) ? 6 : 3;

This means:

  • no code changes in the compute part
  • no performance side effects for other platforms
  • a single instance of the constant values rather than a copy for each layer

useAVX, useAVX2 and the others could be static const too.
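
A minimal sketch of the proposal above, kept independent of OpenCV so it compiles on its own. `CpuCaps`, `queryCpuCaps`, `selectWinoIblock` and `winoIblock` are hypothetical names used only for illustration; in the real code the query would be `checkHardwareSupport(CPU_AVX)` and `checkHardwareSupport(CPU_AVX2)`, and the stub here simply pretends to run on an old CPU.

```cpp
#include <cassert>

// Hypothetical stand-in for a CPU feature query. A real build would ask the
// CPU; this stub returns a fixed answer so the sketch is self-contained.
struct CpuCaps
{
    bool avx;
    bool avx2;
};

static CpuCaps queryCpuCaps()
{
    return CpuCaps{false, false}; // assumption: pretend to be an x86 CPU without AVX
}

// Selection rule from the review comment: 6 with AVX or AVX2, otherwise 3.
static int selectWinoIblock(const CpuCaps& caps)
{
    return (caps.avx || caps.avx2) ? 6 : 3;
}

// The value is computed once, on first use, and shared by every caller,
// so no layer needs to carry its own copy of the constant.
static int winoIblock()
{
    static const int value = selectWinoIblock(queryCpuCaps());
    assert(value == 3 || value == 6);
    return value;
}
```

Keeping the selection rule in a pure function makes both branches easy to check without needing two different machines, while the function-local static gives the single shared instance the comment asks for.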

Member Author

Got it, I will update the patch later today.

@asmorkalov (Contributor)

The solution passes all tests on old hardware, thanks!

@asmorkalov asmorkalov self-assigned this Jun 8, 2023
@zihaomu zihaomu force-pushed the add_runtime_check branch from f7d349b to 9469f39 Compare June 8, 2023 13:14
@zihaomu zihaomu force-pushed the add_runtime_check branch from 9469f39 to fc0557f Compare June 8, 2023 13:31
float* outptr, int Cg, const int winoIblock, const int winoAtomF32)
{
CV_Assert(CONV_WINO_IBLOCK == 3 && CONV_WINO_KBLOCK == 4 && CONV_WINO_ATOM_F32 == 4);
CV_Assert(winoIblock == 3 && winoAtomF32 == 4);
Contributor

Why do you need function parameters if they should always have the same value? They should be local constants, like const int winoIblock = 3;

Member Author

What I want to do is to align the API.

#if CV_TRY_AVX2
                if (conv->useAVX2)
                    opt_AVX2::winofunc_accum_f32(inwptr, wptr, out_wbuf, Cg, block_id1 - block_id0, CONV_WINO_IBLOCK,
                                       CONV_WINO_KBLOCK, CONV_WINO_ATOM_F32, CONV_WINO_NATOMS_F32);
                else
#endif
#if CV_TRY_AVX
                if (conv->useAVX)
                    opt_AVX::winofunc_accum_f32(inwptr, wptr, out_wbuf, Cg, block_id1 - block_id0, CONV_WINO_IBLOCK,
                                       CONV_WINO_KBLOCK, CONV_WINO_ATOM_F32, CONV_WINO_NATOMS_F32);
                else
#endif
#if CV_NEON && CV_NEON_AARCH64
                if (conv->useNEON)
                    opt_NEON::winofunc_accum_f32(inwptr, wptr, out_wbuf, Cg, block_id1 - block_id0, CONV_WINO_IBLOCK,
                                       CONV_WINO_KBLOCK, CONV_WINO_ATOM_F32, CONV_WINO_NATOMS_F32);
                else
#endif

                winofunc_accum_f32(inwptr, wptr, out_wbuf, Cg, block_id1 - block_id0, CONV_WINO_IBLOCK,
                                       CONV_WINO_KBLOCK, CONV_WINO_ATOM_F32, CONV_WINO_NATOMS_F32);

For the SIMD implementation the parameters are always the same, but for other dispatch targets they may differ.
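
To illustrate the trade-off discussed in this thread, here is a rough sketch, not the OpenCV code. Every kernel shares one signature so the call site stays uniform, and each kernel asserts the only block size it supports. All names (`sumInBlocks`, `accumNarrow`, `accumWide`, `dispatchAccum`) are hypothetical, and pairing block size 6 with the wide path is an assumption taken from the `6 : 3` suggestion earlier in the review.

```cpp
#include <algorithm>
#include <cassert>

// Shared helper: walk the input in chunks of `iblock` elements and sum them.
static float sumInBlocks(const float* data, int len, int iblock)
{
    float sum = 0.f;
    for (int i = 0; i < len; i += iblock)
    {
        const int end = std::min(i + iblock, len);
        for (int j = i; j < end; j++)
            sum += data[j];
    }
    return sum;
}

// Hypothetical narrow kernel: accepts the block size as a parameter to match
// the common signature, but supports exactly one value.
static float accumNarrow(const float* data, int len, int iblock)
{
    assert(iblock == 3);
    return sumInBlocks(data, len, iblock);
}

// Hypothetical wide kernel with the same signature and a different block size.
static float accumWide(const float* data, int len, int iblock)
{
    assert(iblock == 6);
    return sumInBlocks(data, len, iblock);
}

using AccumFn = float (*)(const float*, int, int);

// The call site picks the kernel and the matching block size together, then
// makes one uniform call regardless of which variant was chosen.
static float dispatchAccum(bool useWide, const float* data, int len)
{
    const int iblock = useWide ? 6 : 3;
    const AccumFn fn = useWide ? accumWide : accumNarrow;
    return fn(data, len, iblock);
}
```

The reviewer's alternative is to drop the parameter and declare a local constant inside each kernel. That removes the assert but means the call sites no longer share one argument list.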

@opencv opencv deleted a comment from jm10000 Jun 9, 2023
@asmorkalov (Contributor) left a comment

👍 Thanks a lot for the patch!

@asmorkalov asmorkalov merged commit eec8a20 into opencv:4.x Jun 9, 2023
@asmorkalov asmorkalov mentioned this pull request Jul 12, 2023
thewoz pushed a commit to thewoz/opencv that referenced this pull request Jan 4, 2024
DNN: fix bug for X86 Winograd opencv#23763

thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
DNN: fix bug for X86 Winograd opencv#23763


Development

Successfully merging this pull request may close these issues.

Winograd convolution fails with assertion on CPU without AVX

3 participants