Skip to content

dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648

Merged
opencv-pushbot merged 1 commit intoopencv:3.4from
alalek:issue_20615
Sep 4, 2021
Merged

dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648
opencv-pushbot merged 1 commit intoopencv:3.4from
alalek:issue_20615

Conversation

@alalek
Copy link
Copy Markdown
Member

@alalek alalek commented Sep 3, 2021

resolves #20615

/cc @diablodale

if (oclk_copy_weight.empty())
return false;

int total = alignSize(num_output_, swizzled_factor) * channels * kernel_w_ * kernel_h_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is suspicious. I don't think this total matches the size of the swizzled_weights_umat size. The formula here and on teh create() above are different. And if that is all you want...then just use the UMat member functions on swizzled_weights_umat to get its 1D size.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use the KernelArg that passes the Size...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this total matches the size of the swizzled_weights_umat size

It should not, because total doesn't try to limit idxOut (offset in destination swizzled_weights_umat).
It limits get_global_id(0) which is blamed in issue's description.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes...still suspicious. In the examples I gave in the issue the "out" was 9216 in size yet only saved to a max of 4608. And the "in" was out of bounds.

It is possible the "out" is operating correctly...but it is suspicious. It is suspicious to overallocate "out" by 2x.

The algorithm is invalid for "in", that increases the likelyhood that "out" is also incorrectly calculated. Naturally, this isn't true until providen. But...based on experience, "in" is wrong, seeing 2x overallocation, a wrong release()...it makes me suspicious of the whole function/algorithm.

I would help more if I understood the intention of the function/algorithm. But I don't know DNN and I found no documentation. 😵 So I can only look at the code and memory usage/access and highlight problems with that.

Copy link
Copy Markdown
Member Author

@alalek alalek Sep 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added limits for input/output buffers and checks for them.
Debug commit is here: https://github.com/alalek/opencv/commits/debug_20615

I see:

  • out of bounds detections for idxIn <=== this is a bug
  • no out of bounds detections for idxOut

As I understand "output" should have extra alignment gaps:

  • there are no inputs for them
  • they should be filled with zeros.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the original repro steps with your debug branch (before 10 minutes ago changes). So many errors and out of bounds that my buffer couldn't capture everything. Attached is log I could capture. Sorry, github won't let me copy/paste the text directly.
1.txt

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added fix to this PR with output gaps handling in OpenCL kernel.

@diablodale
Copy link
Copy Markdown
Contributor

I continue to get errors and crash that terminates the test. This is with your PR change including 7e83e39

...
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_3_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_2_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
[ERROR:0] Kernel: IDLF_k3x3_cn576_g1_s1x1_d1x1_b1_in32x32_p1x1_num1_M12_activ0_eltwise0_FP32_2_1_1_1_SIMD16
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 0 (offset: 0) got 0.205725 expected 1.02068
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 1 (offset: 1) got -0.230392 expected -1.56461
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 2 (offset: 2) got 0.20696 expected 1.7121
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 3 (offset: 3) got 0.178028 expected 1.40408
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 4 (offset: 4) got 0.185864 expected -0.989608
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 5 (offset: 5) got 0.245834 expected 1.35702
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 6 (offset: 6) got 0.370515 expected 0.177543
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 7 (offset: 7) got 0.323833 expected -0.0437661
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 8 (offset: 8) got 0.597495 expected 1.49064
[ERROR:0] test verification failed @ image 0 group 0 out_ch 0 h 0 w 9 (offset: 9) got 0.885758 expected 2.05206
Exception thrown at 0x0000022C8AC605B5 in opencv_test_dnnd.exe: 0xC0000005: Access violation reading location 0x0000022C95BA000C.

@alalek
Copy link
Copy Markdown
Member Author

alalek commented Sep 4, 2021

Reduced PR scope: handle (filter < outputs) only.
Other issues with adjusted global_work_items values will be handled by other larger PR for many OCL4DNN kernels.

@alalek
Copy link
Copy Markdown
Member Author

alalek commented Sep 4, 2021

👍

@opencv-pushbot opencv-pushbot merged commit 7967683 into opencv:3.4 Sep 4, 2021
@diablodale
Copy link
Copy Markdown
Contributor

diablodale commented Sep 6, 2021

Hi. I'm running tests on this PR. I have a question

Does the code need to initialize or write to all entities in the output weight array?
I ask because the current code does not write to all output entities. Therefore, those entities have random data in them and the swizzled random data weights are passed further down the codepath.

@alalek alalek mentioned this pull request Sep 11, 2021
@alalek alalek mentioned this pull request Oct 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants