Do not accept FP16 target on old, incompatible Nvidia cards#21462
Do not accept FP16 target on old, incompatible Nvidia cards#21462JulienMaille wants to merge 1 commit intoopencv:4.xfrom
Conversation
29c4661 to
05598d4
Compare
|
I think there is a corner case that is missed with this patch (if it's going into 3.4) if OpenCL devices can be changed at runtime (I am not sure if changing devices at runtime is officially supported). See this comment: #21461 (comment) |
|
@alalek @YashasSamaga I have updated my PR to better suit your approach |
modules/dnn/src/dnn.cpp
Outdated
| { | ||
| #ifdef HAVE_CUDA | ||
| if (!cuda4dnn::doesDeviceSupportFP16()) | ||
| impl->preferableTarget = DNN_TARGET_CUDA; |
There was a problem hiding this comment.
I am unable to understand this fix. This check is already done in initCUDABackend and it also sends an error message. This check here will render that check useless. What is this check doing?
There was a problem hiding this comment.
This will auto-correct target when setPreferableTarget(TARGET_CUDA_FP16) is called. A lot of things can happen after initCudaBackend, and there are no guarantees that the preferableTarget has not changed by the time Net::forward() is called. Or I'm missing something?
There was a problem hiding this comment.
@YashasSamaga Ok, I agree this does render your other check useless (and I can see that I don't have the auto-correct warning message when I add my fix). However it does indeed fixes some corner case because it won't complain about missing convolution for half anymore.
There was a problem hiding this comment.
Changing backend/target would reset the network which would force a reinitialization. I think there is a bug elsewhere and probably along the lines you're hinting at. Maybe something does indeed happen before the FP16->FP32 switch in initCUDABackend. This might fix the problem for now though but I think it's better to find the root cause than hide it.
| { | ||
| #ifdef HAVE_CUDA | ||
| if (!cuda4dnn::doesDeviceSupportFP16()) | ||
| impl->preferableTarget = DNN_TARGET_CUDA; |
There was a problem hiding this comment.
I don't think he is, my change patches the bug but is closer to a workaround rather than a proper fix
There was a problem hiding this comment.
The issue here is that this will involve the CUDA API before the diagnostic checks on the version, compatibility, etc. happen in initCUDABackend.
I think the bug, as @JulienMaille pointed out in another comment, is that the target for layers is set even before initCUDABackend is called. So changing the target inside initCUDABackend is useless.
One option would be to completely reinitialize the net object on fallback or redo the diagnostic checks in the setPreferableTarget before the FP16 check.
|
@JulienMaille Do you have progress with the patch? Please pay attention on conflicts. |
|
@YashasSamaga told me he was working on a cleaner fix so I've not reviewed this patch recently. |
|
@YashasSamaga Do you have any update on alternative solution? |
|
I am quite rusty on the internals now but I have a hypothesis. I think that the target for the layers are set before I think re-initializing ( |
|
Hello @alalek and @asmorkalov, Would you mind reviewing an alternative fix for that issue? Thanks in advance! |
|
Closed in favor #25880 |
Fix CUDA for old GPUs without FP16 support #25880 Fixes #21461 ~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~ ~We shouldn't add an invalid target while building with `CUDA_ARCH_BIN` < 53.~ _(please see [this discussion](#25880 (comment) This is a run-time solution that basically reverts [these lines](d0fe6ad#diff-757c5ab6ddf2f99cdd09f851e3cf17abff203aff4107d908c7ad3d0466f39604L245-R245). I've debugged these changes, [coupled with other fixes](gentoo/gentoo#37479), on [Gentoo Linux](https://www.gentoo.org/) and [related tests passed](https://github.com/user-attachments/files/16135391/opencv-4.10.0.20240708-224733.log.gz) on my laptop with `GeForce GTX 960M`. Alternative solution: - #21462 _Best regards!_ ### Pull Request Readiness Checklist - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [ ] `n/a` There is accuracy test, performance test and test data in opencv_extra repository, if applicable - [ ] `n/a` The feature is well documented and sample code can be built with the project CMake
Fix for #21461