dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled by alalek · Pull Request #20648 · opencv/opencv

alalek · 2021-09-03T02:57:38Z

resolves #20615

diablodale · 2021-09-03T15:21:04Z

modules/dnn/src/ocl4dnn/src/ocl4dnn_conv_spatial.cpp

        if (oclk_copy_weight.empty())
            return false;

+        int total = alignSize(num_output_, swizzled_factor) * channels * kernel_w_ * kernel_h_;


this is suspicious. I don't think this total matches the size of the swizzled_weights_umat size. The formula here and on teh create() above are different. And if that is all you want...then just use the UMat member functions on swizzled_weights_umat to get its 1D size.

Or use the KernelArg that passes the Size...

I don't think this total matches the size of the swizzled_weights_umat size

It should not, because total doesn't try to limit idxOut (offset in destination swizzled_weights_umat).
It limits get_global_id(0) which is blamed in issue's description.

Yes...still suspicious. In the examples I gave in the issue the "out" was 9216 in size yet only saved to a max of 4608. And the "in" was out of bounds.

It is possible the "out" is operating correctly...but it is suspicious. It is suspicious to overallocate "out" by 2x.

The algorithm is invalid for "in", that increases the likelyhood that "out" is also incorrectly calculated. Naturally, this isn't true until providen. But...based on experience, "in" is wrong, seeing 2x overallocation, a wrong release()...it makes me suspicious of the whole function/algorithm.

I would help more if I understood the intention of the function/algorithm. But I don't know DNN and I found no documentation. 😵 So I can only look at the code and memory usage/access and highlight problems with that.

I added limits for input/output buffers and checks for them.
Debug commit is here: https://github.com/alalek/opencv/commits/debug_20615

I see:

out of bounds detections for idxIn <=== this is a bug

no out of bounds detections for idxOut

As I understand "output" should have extra alignment gaps:

there are no inputs for them

they should be filled with zeros.

I ran the original repro steps with your debug branch (before 10 minutes ago changes). So many errors and out of bounds that my buffer couldn't capture everything. Attached is log I could capture. Sorry, github won't let me copy/paste the text directly.
1.txt

Added fix to this PR with output gaps handling in OpenCL kernel.

diablodale · 2021-09-03T19:58:23Z

I continue to get errors and crash that terminates the test. This is with your PR change including 7e83e39

...
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_3_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_2_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_1_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
Exception thrown at 0x0000022C8AC605B5 in opencv_test_dnnd.exe: 0xC0000005: Access violation reading location 0x0000022C95BA000C.

alalek · 2021-09-04T15:49:06Z

Reduced PR scope: handle (filter < outputs) only.
Other issues with adjusted global_work_items values will be handled by other larger PR for many OCL4DNN kernels.

alalek · 2021-09-04T18:21:04Z

👍

diablodale · 2021-09-06T17:32:02Z

Hi. I'm running tests on this PR. I have a question

Does the code need to initialize or write to all entities in the output weight array?
I ask because the current code does not write to all output entities. Therefore, those entities have random data in them and the swizzled random data weights are passed further down the codepath.

alalek mentioned this pull request Sep 3, 2021

out of bounds crashes in DNN swizzleWeight() and copyWeightsSwizzled corrupts OpenCL state #20615

Closed

4 tasks

diablodale reviewed Sep 3, 2021

View reviewed changes

dnn(ocl): avoid out of buffer access in copyWeightsSwizzled

5b2c016

opencv-pushbot merged commit 7967683 into opencv:3.4 Sep 4, 2021

alalek mentioned this pull request Sep 11, 2021

(4.x) Merge 3.4 #20691

Merged

alalek mentioned this pull request Oct 15, 2021

(5.x) Merge 4.x #20886

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648

dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648
opencv-pushbot merged 1 commit intoopencv:3.4from
alalek:issue_20615

alalek commented Sep 3, 2021 •

edited

Loading

Uh oh!

diablodale Sep 3, 2021

Uh oh!

diablodale Sep 3, 2021

Uh oh!

alalek Sep 3, 2021

Uh oh!

diablodale Sep 3, 2021

Uh oh!

alalek Sep 3, 2021 •

edited

Loading

Uh oh!

diablodale Sep 3, 2021

Uh oh!

alalek Sep 3, 2021

Uh oh!

diablodale commented Sep 3, 2021

Uh oh!

alalek commented Sep 4, 2021

Uh oh!

alalek commented Sep 4, 2021

Uh oh!

diablodale commented Sep 6, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

alalek commented Sep 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

diablodale Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

diablodale Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

alalek Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

diablodale Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

alalek Sep 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

diablodale Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

alalek Sep 3, 2021

Choose a reason for hiding this comment

Uh oh!

diablodale commented Sep 3, 2021

Uh oh!

alalek commented Sep 4, 2021

Uh oh!

alalek commented Sep 4, 2021

Uh oh!

diablodale commented Sep 6, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

alalek commented Sep 3, 2021 •

edited

Loading

alalek Sep 3, 2021 •

edited

Loading

diablodale commented Sep 6, 2021 •

edited

Loading