cuda::flip - Use in-place npp function for inplace arguments#17863
cuda::flip - Use in-place npp function for inplace arguments#17863opencv-pushbot merged 1 commit intoopencv:3.4from
Conversation
3900414 to
bfefe79
Compare
alalek
left a comment
There was a problem hiding this comment.
It is not a CUDA-only issue for this particular function.
It is related to many CPU and OpenCL implementations (almost zero tests for this mode).
So some robust generic approach should be applied.
Possible fixes degrade performance, so we want to forbid such usages (to keep OpenCV library and applications optimized).
We don't want to trying fixing that and adding corresponding tests.
see also #13570 (it is closed without raising a general issue)
bfefe79 to
bb34398
Compare
|
@alalek Consulting this issue comment, should we raise an exception instead? |
bb34398 to
58dcd52
Compare
58dcd52 to
9411cd6
Compare
|
Instead of copying the source matrix, this code calls in-place npp function if given in-place arguments. |
|
Looks good to me. Passed test on Ubuntu 18.04 with CUDA 10.2 and GPU 1080Ti. |
|
@nglee The same issue is reproducible with master branch too, but the code for master is moved to opencv_contrib repo. Could you propose yet another PR to contrib repository too? |
|
@asmorkalov |
@asmorkalov Once in the past, I was pointed from @alalek that this is not necessary, porting one PR from opencv to opencv_contrib.
And alalek also did the same recently Was there any special reason for this PR to be ported to opencv_contrib ? |
|
It's good idea to have CUDA implementations aligned between branches, especially if the patch could be ported as is without serious modifications. |
|
@asmorkalov , my point is not "why", but "who". |
|
General approach is to merge 3.4 to master weekly or so. Alalek does it and you usually hould not worry about it. Cuda changes cannot be just merged to contrib with git and some manual work is required. I usually ask the PR author to do porting as soon as author is in context can can do it faster. |
Resolves #17840
#17863 (comment)
cv::cuda::GpuMat::createdoes not allocate a new GpuMat if required size and type equals existing size and type.opencv/modules/core/src/cuda/gpu_mat.cu
Lines 160 to 161 in bf8136e
This PR implementscv::cuda::detachOutputto allocate a new GpuMat in such cases. Unfortunately,_dstis const so assignment is not possible. Instead, this implementation clonessrc.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.