
Do not accept FP16 target on old, incompatible Nvidia cards#21462

Closed
JulienMaille wants to merge 1 commit into opencv:4.x from JulienMaille:patch-2

Conversation

@JulienMaille
Contributor

Fix for #21461

@JulienMaille JulienMaille force-pushed the patch-2 branch 2 times, most recently from 29c4661 to 05598d4 Compare January 17, 2022 09:14
@YashasSamaga
Contributor

I think there is a corner case that is missed with this patch (if it's going into 3.4) if OpenCL devices can be changed at runtime (I am not sure if changing devices at runtime is officially supported). See this comment: #21461 (comment)

@JulienMaille JulienMaille changed the base branch from 3.4 to 4.x January 17, 2022 09:17
@JulienMaille JulienMaille changed the title from "Do not report for FP16 compatibility on old Nvidia cards" to "Do not accept FP16 target on old, incompatible Nvidia cards" Jan 17, 2022
@JulienMaille
Contributor Author

JulienMaille commented Jan 17, 2022

@alalek @YashasSamaga I have updated my PR to better suit your approach

@JulienMaille JulienMaille requested a review from alalek January 17, 2022 10:13
```cpp
{
#ifdef HAVE_CUDA
    if (!cuda4dnn::doesDeviceSupportFP16())
        impl->preferableTarget = DNN_TARGET_CUDA;
```
Contributor

@YashasSamaga YashasSamaga Jan 17, 2022


I am unable to understand this fix. This check is already done in initCUDABackend and it also sends an error message. This check here will render that check useless. What is this check doing?

Contributor Author


This will auto-correct the target when setPreferableTarget(TARGET_CUDA_FP16) is called. A lot can happen after initCUDABackend, and there is no guarantee that preferableTarget has not changed by the time Net::forward() is called. Or am I missing something?

Contributor Author


@YashasSamaga Ok, I agree this renders your other check useless (and I can see that I don't get the auto-correct warning message when I add my fix). However, it does fix a corner case: it no longer complains about a missing convolution implementation for half precision.

Contributor


Changing the backend/target resets the network, which forces a reinitialization. I think there is a bug elsewhere, probably along the lines you're hinting at: maybe something does happen before the FP16->FP32 switch in initCUDABackend. This might fix the problem for now, but I think it's better to find the root cause than to hide it.

```cpp
{
#ifdef HAVE_CUDA
    if (!cuda4dnn::doesDeviceSupportFP16())
        impl->preferableTarget = DNN_TARGET_CUDA;
```
Member


@YashasSamaga Are you OK with proposed change?

Contributor Author

@JulienMaille JulienMaille Jan 21, 2022


I don't think he is; my change patches the bug, but it is closer to a workaround than a proper fix.

Contributor

@YashasSamaga YashasSamaga Jan 22, 2022


The issue here is that this will invoke the CUDA API before the diagnostic checks (on the version, compatibility, etc.) happen in initCUDABackend.

I think the bug, as @JulienMaille pointed out in another comment, is that the target for layers is set even before initCUDABackend is called. So changing the target inside initCUDABackend is useless.

One option would be to completely reinitialize the net object on fallback or redo the diagnostic checks in the setPreferableTarget before the FP16 check.

@asmorkalov
Contributor

@JulienMaille Do you have any progress with the patch? Please pay attention to the merge conflicts.

@JulienMaille
Contributor Author

@YashasSamaga told me he was working on a cleaner fix, so I have not revisited this patch recently.

@asmorkalov
Contributor

@YashasSamaga Do you have any update on the alternative solution?

@YashasSamaga
Contributor

I am quite rusty on the internals now, but I have a hypothesis. I think that the targets for the layers are set before initCUDABackend is called, and changing the target in the Net object inside initCUDABackend won't automatically change the target in the layers. Therefore, even though the net target is changed, the initCUDA code, which relies on Layer::preferableTarget, will initialize with the original FP16 target.

I think re-initializing (Net::setPreferableBackend calls clear() on backend change which forces reinit) the network after the switch should fix the problem.

@asmorkalov asmorkalov removed this from the 4.8.0 milestone May 5, 2023
@Jamim
Contributor

Jamim commented Jul 7, 2024

Hello @alalek and @asmorkalov,

Would you mind reviewing an alternative fix for that issue?

Thanks in advance!

@asmorkalov
Contributor

Closed in favor of #25880.

@asmorkalov asmorkalov closed this Jul 9, 2024
@JulienMaille JulienMaille deleted the patch-2 branch July 9, 2024 09:04
asmorkalov pushed a commit that referenced this pull request Jul 10, 2024
Fix CUDA for old GPUs without FP16 support #25880

Fixes #21461

~This is a build-time solution that reflects https://github.com/opencv/opencv/blob/4.10.0/modules/dnn/src/cuda4dnn/init.hpp#L68-L82.~
~We shouldn't add an invalid target while building with `CUDA_ARCH_BIN` < 53.~
~_(please see [this discussion](#25880 (comment)))_~

This is a run-time solution that basically reverts [these lines](d0fe6ad#diff-757c5ab6ddf2f99cdd09f851e3cf17abff203aff4107d908c7ad3d0466f39604L245-R245).

I've debugged these changes, [coupled with other fixes](gentoo/gentoo#37479), on [Gentoo Linux](https://www.gentoo.org/) and [related tests passed](https://github.com/user-attachments/files/16135391/opencv-4.10.0.20240708-224733.log.gz) on my laptop with `GeForce GTX 960M`.

Alternative solution:
  - #21462

_Best regards!_

### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] `n/a` There is accuracy test, performance test and test data in opencv_extra repository, if applicable
- [ ] `n/a` The feature is well documented and sample code can be built with the project CMake

Labels

category: dnn, category: gpu/cuda (contrib)
