Resolve uncovered CUDA dnn layer #24080
Seems like
@WanliZhong, @fengyuentau, please review the GEMM part. I have switched
@dkurt Thanks a lot for the patch. I experimented with the code. The matmul change is covered by accuracy tests and works well, but it is not covered by performance tests. Could you add a performance test to guard against performance regressions before merging?
@asmorkalov, I added a performance test. To compare, replace
fengyuentau
left a comment
The GEMM part looks good to me. IIRC, is_matmul was added by Wanli and is used for some optimizations. @WanliZhong Could you add some details here?
@fengyuentau, opencv/modules/dnn/src/layers/fully_connected_layer.cpp Lines 534 to 567 in 0245c0c
WanliZhong
left a comment
When the matrix is 2D, the implementation is the same whether or not is_matmul is true. Someday matrix multiplication should be separated from the inner product implementation. :-)
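A minimal sketch of one way to read the 2D remark above, not the code in fully_connected_layer.cpp. It assumes that an inner product flattens everything after the first axis into the reduction dimension, while a matmul reduces over the last axis only and treats all leading axes as rows. The names `Flat`, `flattenInnerProduct` and `flattenMatMul` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// GEMM problem size seen by the kernel: `rows` independent row vectors,
// each reduced over `inner` elements. Hypothetical type, illustration only.
struct Flat { std::size_t rows, inner; };

// Assumed inner-product rule (axis = 1): the first axis is the row count,
// all remaining axes are flattened into the reduction dimension.
Flat flattenInnerProduct(const std::vector<std::size_t>& shape)
{
    assert(shape.size() >= 2);
    std::size_t inner = 1;
    for (std::size_t i = 1; i < shape.size(); ++i)
        inner *= shape[i];
    return {shape[0], inner};
}

// Assumed matmul rule: the last axis is the reduction dimension,
// all leading axes are flattened into the row count.
Flat flattenMatMul(const std::vector<std::size_t>& shape)
{
    assert(shape.size() >= 2);
    std::size_t rows = 1;
    for (std::size_t i = 0; i + 1 < shape.size(); ++i)
        rows *= shape[i];
    return {rows, shape.back()};
}
```

Under these assumptions a 2D shape such as {4, 3} yields rows = 4 and inner = 3 from both helpers, so the same GEMM call serves both cases. A 3D shape such as {2, 4, 3} yields 2 x 12 for the inner product and 8 x 3 for the matmul, which is where the two paths would diverge.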
Resolve uncovered CUDA dnn layer opencv#24080

### Pull Request Readiness Checklist

* Gelu activation layer on CUDA
* Try to relax GEMM from ONNX

resolves opencv#24064

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
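For reference, the Gelu activation mentioned in the commit message is commonly defined as gelu(x) = 0.5 · x · (1 + erf(x / √2)). The following is a host-side C++ sketch of that exact (erf-based) formula, handy as a baseline when checking a GPU kernel. It is an assumption that the CUDA layer in this patch uses the exact form and not the tanh approximation, and `geluReference` is a hypothetical name, not an OpenCV function.

```cpp
#include <cmath>

// Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
// Hypothetical reference helper for comparing against a GPU implementation.
inline float geluReference(float x)
{
    const float invSqrt2 = 0.70710678118654752f; // 1 / sqrt(2)
    return 0.5f * x * (1.0f + std::erf(x * invSqrt2));
}
```

A useful sanity property of this definition is that geluReference(x) - geluReference(-x) equals x, because erf is an odd function.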