dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648
dnn(OpenCL): avoid out of buffer access in copyWeightsSwizzled#20648opencv-pushbot merged 1 commit intoopencv:3.4from
Conversation
| if (oclk_copy_weight.empty()) | ||
| return false; | ||
|
|
||
| int total = alignSize(num_output_, swizzled_factor) * channels * kernel_w_ * kernel_h_; |
There was a problem hiding this comment.
this is suspicious. I don't think this total matches the size of the swizzled_weights_umat size. The formula here and on teh create() above are different. And if that is all you want...then just use the UMat member functions on swizzled_weights_umat to get its 1D size.
There was a problem hiding this comment.
Or use the KernelArg that passes the Size...
There was a problem hiding this comment.
I don't think this total matches the size of the swizzled_weights_umat size
It should not, because total doesn't try to limit idxOut (offset in destination swizzled_weights_umat).
It limits get_global_id(0) which is blamed in issue's description.
There was a problem hiding this comment.
Yes...still suspicious. In the examples I gave in the issue the "out" was 9216 in size yet only saved to a max of 4608. And the "in" was out of bounds.
It is possible the "out" is operating correctly...but it is suspicious. It is suspicious to overallocate "out" by 2x.
The algorithm is invalid for "in", that increases the likelyhood that "out" is also incorrectly calculated. Naturally, this isn't true until providen. But...based on experience, "in" is wrong, seeing 2x overallocation, a wrong release()...it makes me suspicious of the whole function/algorithm.
I would help more if I understood the intention of the function/algorithm. But I don't know DNN and I found no documentation. 😵 So I can only look at the code and memory usage/access and highlight problems with that.
There was a problem hiding this comment.
I added limits for input/output buffers and checks for them.
Debug commit is here: https://github.com/alalek/opencv/commits/debug_20615
I see:
- out of bounds detections for
idxIn<=== this is a bug - no out of bounds detections for
idxOut
As I understand "output" should have extra alignment gaps:
- there are no inputs for them
- they should be filled with zeros.
There was a problem hiding this comment.
I ran the original repro steps with your debug branch (before 10 minutes ago changes). So many errors and out of bounds that my buffer couldn't capture everything. Attached is log I could capture. Sorry, github won't let me copy/paste the text directly.
1.txt
There was a problem hiding this comment.
Added fix to this PR with output gaps handling in OpenCL kernel.
|
I continue to get errors and crash that terminates the test. This is with your PR change including 7e83e39 |
|
Reduced PR scope: handle |
|
👍 |
|
Hi. I'm running tests on this PR. I have a question Does the code need to initialize or write to all entities in the output weight array? |
resolves #20615
/cc @diablodale