Python binding for cuda::GpuMat does not handle float16 properly when downloading to or uploading from a NumPy array #23687

### System Information

OpenCV Python version: 4.7.0.72 with OpenCV 87331ca built with CUDA 11.8
Operating System / Platform: Ubuntu 22.04
Python version: 3.10.8

### Detailed description

Uploading a float16 NumPy array to a GpuMat fails with an `arr data type = 23 is not supported` error, while downloading from a float16 GpuMat returns a uint64 NumPy array with garbage content. The test case below shows that float16 interoperability between CuPy and GpuMat appears to work fine. Here is the full output of the code from "Steps to reproduce":

```
Original CuPy array and pointer
[[2. 3.]
 [2. 3.]] 140353891991552


GpuMat, initialized and downloaded to NumPy; and pointer
[[140356343578624  94720260377824]
 [     1107312640             260]] 140353891991552
Assert GpuMat type is float16: True
NumPy dtype: uint64

Back to CuPy from GpuMat: value and pointer:
[[2. 3.]
 [2. 3.]] 140353891991552

Now try to upload a float16 NumPy array to GpuMat:
Traceback (most recent call last):
  File "***/tst_cupy_to_mat.py", line 75, in <module>
    cv_a.upload(np_a)
cv2.error: OpenCV(4.7.0) :-1: error: (-5:Bad argument) in function 'upload'
> Overload resolution failed:
>  - arr data type = 23 is not supported
>  - Expected Ptr<cv::cuda::GpuMat> for argument 'arr'
>  - Expected Ptr<cv::UMat> for argument 'arr'
>  - cuda_GpuMat.upload() missing required argument 'stream' (pos 2)
>  - cuda_GpuMat.upload() missing required argument 'stream' (pos 2)
>  - cuda_GpuMat.upload() missing required argument 'stream' (pos 2)
```
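As a side note on the `arr data type = 23` message: 23 is NumPy's internal type number for float16 (`NPY_HALF`), so the failing overload appears to be rejecting the array's dtype itself rather than its shape or layout. A minimal check, using only NumPy and no OpenCV calls:

```python
import numpy as np

# NumPy assigns every built-in dtype an integer type number.
# float16 (NPY_HALF in the C API) has the number 23, which matches
# the "arr data type = 23" text in the error above.
half_num = np.dtype(np.float16).num
print(half_num)  # prints 23

# For comparison, a dtype that the upload path does accept.
float32_num = np.dtype(np.float32).num
print(float32_num)
```

This only identifies the dtype named in the message; it does not show where in the binding the conversion is rejected.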

### Steps to reproduce

```python
import cupy as cp
import numpy as np
import cv2


def cv2cp(mat: cv2.cuda.GpuMat) -> cp.ndarray:
    """Wrap a GpuMat as a CuPy array over the same device memory via __cuda_array_interface__."""
    class CudaArrayInterface:
        def __init__(self, gpu_mat: cv2.cuda.GpuMat):
            w, h = gpu_mat.size()
            type_map = {
                cv2.CV_8U: "|u1",
                cv2.CV_8S: "|i1",
                cv2.CV_16U: "<u2", cv2.CV_16S: "<i2",
                cv2.CV_32S: "<i4",
                cv2.CV_32F: "<f4", cv2.CV_64F: "<f8",
                cv2.CV_16F: "<f2"
            }
            self.__cuda_array_interface__ = {
                "version": 3,
                "shape": (h, w, gpu_mat.channels()) if gpu_mat.channels() > 1 else (h, w),
                "typestr": type_map[gpu_mat.depth()],
                "descr": [("", type_map[gpu_mat.depth()])],
                "stream": 1,
                "strides": (gpu_mat.step, gpu_mat.elemSize(), gpu_mat.elemSize1()) if gpu_mat.channels() > 1
                else (gpu_mat.step, gpu_mat.elemSize()),
                "data": (gpu_mat.cudaPtr(), False),
            }
    arr = cp.asarray(CudaArrayInterface(mat))

    return arr


def cp2cv(arr: cp.ndarray) -> cv2.cuda.GpuMat:
    """Wrap a CuPy array as a GpuMat header over the same device memory."""
    assert len(arr.shape) in (2, 3), "CuPy array must have 2 or 3 dimensions to be a valid GpuMat"
    type_map = {
        cp.dtype('uint8'): cv2.CV_8U,
        cp.dtype('int8'): cv2.CV_8S,
        cp.dtype('uint16'): cv2.CV_16U,
        cp.dtype('int16'): cv2.CV_16S,
        cp.dtype('int32'): cv2.CV_32S,
        cp.dtype('float32'): cv2.CV_32F,
        cp.dtype('float64'): cv2.CV_64F,
        cp.dtype('float16'): cv2.CV_16F
    }
    depth = type_map.get(arr.dtype)
    assert depth is not None, "Unsupported CuPy array dtype"
    channels = 1 if len(arr.shape) == 2 else arr.shape[2]
    mat_type = cv2.CV_MAKETYPE(depth, channels)
    mat = cv2.cuda.createGpuMatFromCudaMemory(arr.__cuda_array_interface__['shape'][1::-1],
                                              mat_type,
                                              arr.__cuda_array_interface__['data'][0])
    return mat


cp_a = cp.random.randint(1, 5, (2, 2)).astype(np.float16)
print('Original CuPy array and pointer')
print(cp_a, cp_a.__cuda_array_interface__['data'][0])
print('')
cv_a = cp2cv(cp_a)
np_a = cv_a.download()
print('')
print('GpuMat, initialized and downloaded to NumPy; and pointer')
print(np_a, cv_a.cudaPtr())
print(f'Assert GpuMat type is float16: {cv_a.type() == cv2.CV_16FC1}')
print('NumPy dtype:', np_a.dtype)
print('')
cp_a2 = cv2cp(cv_a)
print('Back to CuPy from GpuMat: value and pointer:')
print(cp_a2, cp_a2.__cuda_array_interface__['data'][0])
print('')

print('Now try to upload a float16 NumPy array to GpuMat:')
np_a = np.random.randint(1, 5, (2, 2)).astype(np.float16)
cv_a = cv2.cuda_GpuMat()
cv_a.upload(np_a)
```
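A possible workaround, sketched here under assumptions and not verified against the binding, is to move the float16 data across the NumPy boundary as its raw 16-bit pattern. The helper names `float16_to_bits` and `bits_to_float16` are hypothetical and use only NumPy. The assumption is that an int16 array can be uploaded as `CV_16S`, and that the device memory could then be reinterpreted as `CV_16F` in the same way `cp2cv` above builds a header over an existing pointer; that GPU part is not shown or tested here.

```python
import numpy as np


def float16_to_bits(arr: np.ndarray) -> np.ndarray:
    """Reinterpret a float16 array as int16 without changing its bytes."""
    if arr.dtype != np.float16:
        raise TypeError("expected a float16 array")
    # view() needs contiguous memory to reinterpret the element type safely.
    return np.ascontiguousarray(arr).view(np.int16)


def bits_to_float16(bits: np.ndarray) -> np.ndarray:
    """Reinterpret an int16 array as float16 without changing its bytes."""
    if bits.dtype != np.int16:
        raise TypeError("expected an int16 array")
    return np.ascontiguousarray(bits).view(np.float16)


np_a = np.array([[2, 3], [2, 3]], dtype=np.float16)
bits = float16_to_bits(np_a)      # same bytes, dtype int16
restored = bits_to_float16(bits)  # back to float16, values unchanged
print(bits.dtype, restored.dtype)
print(np.array_equal(restored, np_a))
```

The reinterpretation keeps shape and byte layout, so no values are converted or rounded. Whether `upload` and `download` behave correctly for the resulting int16 arrays in this build is an assumption to check separately.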

### Issue submission checklist

- [X] I report the issue, it's not a question
- [X] I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
- [X] I updated to the latest OpenCV version and the issue is still there
- [X] There is reproducer code and related data files (videos, images, onnx, etc)
