-
-
Notifications
You must be signed in to change notification settings - Fork 56.5k
UMat::getMat() and more funcs fail to handle exceptions -- leads to refcount errors that cascade #20594
Description
UMat::getMat()` and 9 other functions fail to handle exceptions. Therefore, when an exception occurs, it leads to refcount errors that cascade more failures in later use of the counted object...sometimes much later in code.
Errant code was found during code review/debugging the OpenCLAllocator.
Fixes in many of the functions are reasonably easy.
I can fix all the non-CUDA functions. Would someone else take the 3 CUDA?
System information (version)
- OpenCV => 3.x and 4.x
- Operating System / Platform => all
- Compiler => all
Detailed description
One example...
opencv/modules/core/src/umatrix.cpp
Lines 965 to 976 in 77a5c43
| Mat UMat::getMat(int accessFlags) const | |
| { | |
| if(!u) | |
| return Mat(); | |
| // TODO Support ACCESS_READ (ACCESS_WRITE) without unnecessary data transfers | |
| accessFlags |= ACCESS_RW; | |
| UMatDataAutoLock autolock(u); | |
| if(CV_XADD(&u->refcount, 1) == 0) | |
| u->currAllocator->map(u, accessFlags); | |
| if (u->data != 0) | |
| { | |
| Mat hdr(dims, size.p, type(), u->data + offset, step.p); |
Easy repro. Imagine an exception occurs in the call u->currAllocator->map. Or in type(), or the Mat constructor making hdr.
Lets imagine it is the map() call that throws an exception.
Since there is no try/catch, the exception cascades up to the caller of UMat::getMat() and skips all code in this function after the map() call. Therefore, the refcount/state of the UMat is now incorrect. Why? Because the refcount was incremented by CV_XADD() before the call to map(). And since the exception occurred, the later code in this function CV_XADD(-1) never runs.
Code in OpenCV modules needs the state of Mat/UMat to be consistent after exceptions. For example, if an exception occurs in CV_OCL_RUN(), then calling code usually has fallback non-OCL code afterwards. Therefore, the UMat/Mat used as parameters must still be valid and correct. The OpenCV codebase relies on this.
Fix
Fixes can likely be isolated to the specific errant function. I looked -- CV_XADD() backends are all C apis, therefore they do not throw exceptions. Therefore, a try/catch can start around the CV_XADD and catch the different error paths.
Here is a pseudocode fix for UMat::getMat()
UMatDataAutoLock autolock(u);
try
{
if(CV_XADD(&u->refcount, 1) == 0)
u->currAllocator->map(u, accessFlags);
if (u->data != 0)
{
Mat hdr(dims, size.p, type(), u->data + offset, step.p);
hdr.flags = flags;
hdr.u = u;
hdr.datastart = u->data;
hdr.data = u->data + offset;
hdr.datalimit = hdr.dataend = u->data + u->size;
return hdr;
}
}
catch()
{
// I use this code, rather than nothing here, so the actual exception is thrown rather than the no detail exception below
CV_XADD(&u->refcount, -1);
throw;
}
CV_XADD(&u->refcount, -1);
CV_Assert(u->data != 0 && "Error mapping of UMat to host memory.");
return Mat();Notes
I also see a similiar issue in...
UMat::getMat()as in OPMat::getUMat()UMat::UMat(const UMat& m, const Rect& roi)Mat::Mat(const Mat& m, const Rect& roi)Kernel::Impl::cleanupUMats()
There are also issues in the following funcs. I don't recommend changing them at this time due to their scattered manual refcount code. It is too fragile and I believe safer to overhaul them with a future smart pointer re-write. If someone sees a clear and safe way within the existing code approach, please make a suggestion. 🙂
Mat::release()UMatData::~UMatData()
In the CUDA area, I see same issues with refcounts. But I also see suspicious code in these functions where refcounts are not decremented (undo) their increment when the ROI is empty. 🤔 Since this might be an additional bug itself, I do not recommend changing these until that issue is reviewed (and if truly a bug then first fixed). You can see in these functions a check for an empty ROI which sets rows=cols=0 but doesn't decrement the refcount like I would think needed (and is done in similar ROI constructors for u/mat).
cv::cuda::GpuMat::GpuMat(const GpuMat& m, Range rowRange_, Range colRange_)cv::cuda::GpuMat::GpuMat(const GpuMat& m, Rect roi)
Issue submission checklist
- I report the issue, it's not a question
- I checked the problem with documentation, FAQ, open issues,
forum.opencv.org, Stack Overflow, etc and have not found solution - I updated to latest OpenCV version and the issue is still there