OpenCL: add explicit cast for half #18360
Merged
opencv-pushbot merged 1 commit into opencv:3.4 on Sep 18, 2020
Here's a summary of So basically, it doesn't harm the implementation.
I will add a runtime option to bypass this FP16 check in a separate PR. (We tried OpenCL/FP16 on AMD/NVIDIA GPUs some time ago, and the accuracy and performance of the current implementation were very poor on these platforms.)
Actually, there is already such a parameter to bypass these OpenCL checks:
This was referenced Sep 23, 2020
relates #18283
Pull Request Readiness Checklist
OpenCL FP16 inference on Arm Mali is not supported, as claimed in #18283.
There are 2 points: one is the `isIntel()` call and the other is a kernel build failure.
This PR fixes the 2nd point.
The error arises because `clamp` is being passed `half4, float, float` as parameters, when it is supposed to accept either `half4, half, half` or `float4, float, float`.
This PR explicitly casts the latter 2 parameters so that the kernel can be built.
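The type requirement above can be illustrated with a small host-side C++ analogue. This is only a sketch: `Half`, `Half4`, `clamp4` and `clampExample` are hypothetical stand-ins, not OpenCL types or OpenCV code, and the `explicit` constructor merely models the rule that the bounds must already have the vector's element type. The exact diagnostic a real OpenCL compiler emits may differ.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for OpenCL C's half and half4 (assumption, not real
// OpenCL types). The explicit constructor models the requirement that the
// scalar bounds passed to clamp already have the vector's element type.
struct Half {
    float v;
    explicit Half(float f) : v(f) {}
};

struct Half4 {
    Half x, y, z, w;
};

// Shaped like clamp(gentype x, sgentype minval, sgentype maxval):
// a Half4 value requires Half bounds.
Half4 clamp4(const Half4& a, Half lo, Half hi) {
    auto c = [&](Half e) { return Half(std::min(std::max(e.v, lo.v), hi.v)); };
    return Half4{c(a.x), c(a.y), c(a.z), c(a.w)};
}

Half4 clampExample() {
    Half4 v{Half(-1.0f), Half(0.5f), Half(3.0f), Half(9.0f)};
    float lo = 0.0f, hi = 6.0f;

    // clamp4(v, lo, hi);  // rejected in this model: float bounds with a Half4 value
    return clamp4(v, Half(lo), Half(hi));  // explicit cast of the last two arguments
}
```

In the kernel itself the equivalent change would presumably be a cast of the two bounds to the element type of the first argument; the exact spelling used in the patch is not shown here.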
I confirmed this on RK3399, after also disabling the check here: opencv/modules/dnn/src/dnn.cpp, lines 1316 to 1323 in e668cff.
The build passed successfully and `opencv_test_dnn` only raised rounding errors.
The kernel seems to be doing its work.
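For a rough sense of the rounding differences to expect from FP16, here is a minimal sketch that rounds an FP32 value to the 11-bit significand of IEEE 754 binary16. The function names are hypothetical and this is not how `opencv_test_dnn` compares results; subnormals, overflow, NaN and infinity are deliberately ignored, and the default round-to-nearest-even mode is assumed.

```cpp
#include <cassert>
#include <cmath>

// Round a float to the nearest value with an 11-bit significand, which is the
// precision of IEEE 754 binary16 (half). Simplification: subnormals, overflow,
// NaN and infinity are not handled.
float roundToHalfPrecision(float x) {
    if (x == 0.0f) return x;
    int e = 0;
    float m = std::frexp(x, &e);                  // x = m * 2^e, 0.5 <= |m| < 1
    float q = std::nearbyint(std::ldexp(m, 11));  // keep 11 significand bits
    return std::ldexp(q, e - 11);
}

// Relative difference between an FP32 value and its half-precision rounding.
float relativeRoundingError(float x) {
    return std::fabs(roundToHalfPrecision(x) - x) / std::fabs(x);
}
```

Under these assumptions the relative error of a single rounding is at most 2^-11 (about 0.05%), which is the scale of mismatch a test written with FP32 tolerances can report for an FP16 kernel.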
I don't think it's safe enough to comment out the `isIntel()` check given the support situation, but the kernel build failure at least comes from an OpenCL spec violation, so I think it's worth fixing.
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.