[GSOC] New camera model for stitching pipeline#6933
Conversation
ORB is apparently thread-unsafe when running with OpenCL. I have restarted the build once. Last time it passed 1 OCL stitching test, this time it went through 2 of them. There seems to be a race condition. Is ORB expected to be thread-unsafe with OCL, or should this be fixed? I can disable my parallel feature finding changes with OpenCL, but I don't know whether this is an actual bug in ORB.
I have disabled parallel feature finding when running with OpenCL. There is not much benefit, because it needs to wait for the device. This solves the issues with ORB for me, but I'm still not sure whether this should be reported as a bug or not.
Rebased to catch the latest changes in master (especially #6962).
FYI: the build is not failing; there is only a warning that the patch to opencv_extra is too big (~1 MB). Is that a problem? I'd like to add more images.
I believe this is not a problem in this case.
Jiri, I have some comments/suggestions that might be easier to discuss offline. Can you shoot me an email at dmz@mself.com to connect? I did some work to create variants of findHomography() that estimate constrained transformations that have 3 or 4 degrees of freedom rather than the 8 DOF of a full homography. These correspond to (3D) rotations only, and rotations plus uniform scaling. This is similar to your 4 DOF variant. I'd be interested in adding a 3 DOF version that only allows a (2D) rotation plus translation, for example. --Matthew
modules/calib3d/src/ptsetreg.cpp
Outdated
```cpp
double* Bdata = B.ptr<double>();
A = Scalar::all(0);

for( int i = 0; i < (N/2); i++ )
```
Shouldn't runKernel() include all of the points in _m1 and _m2 in the estimate rather than just the first 3? That's what HomographyEstimatorCallback.runKernel() does. I presume that RANSAC will call this with sets of 3 points to determine the consensus set, but does it also call it at the end to refine the estimate with all of the points in the consensus set?
Or, to reverse the question, why does findHomography() do it the way it does?
I guess this is for improving the initial guess for LM refining. I did some experiments with the estimateAffine2D* functions, and the transformation produced by RANSAC seems to be a good enough initial guess. LM needs only 3 iterations in most cases and the produced transformation is reasonable.
Yes, I think you are right. I did some tests on findHomography() to see what effect the final call to cb->runKernel has (it is run on all of the consensus points before starting the LM refinement). Basically, if you have either that final call to cb->runKernel or the LM step, you get the same results as if you have both. My tests were with fairly stable transforms (adjacent video frames), but perhaps this additional step helps with stability on more extreme transitions. I think your approach is fine.
Hm, this is getting interesting. If I remove cb->runKernel, LM seems to converge just fine, using just about 5 iterations in most cases I tested. From my POV cb->runKernel does not seem to be necessary.
git blame points to @vpisarev; probably he would know more.
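To make the flow under discussion concrete, here is a minimal Python sketch (not OpenCV's implementation; line fitting stands in for transform estimation, and all names are hypothetical): RANSAC fits minimal samples, then optionally re-estimates on the whole consensus set, which is the analog of the final `cb->runKernel` call that LM refinement would start from.

```python
import random

def fit_two(p, q):
    # minimal sample: the exact line through two points
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1

def fit_all(pts):
    # least-squares refit over a consensus set -- the analog of the
    # final cb->runKernel call on all inliers before LM refinement
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n

def ransac(pts, iters=100, thresh=0.1, seed=0):
    rng = random.Random(seed)
    best = []
    for _ in range(iters):
        p, q = rng.sample(pts, 2)
        if p[0] == q[0]:
            continue  # degenerate minimal sample
        m, c = fit_two(p, q)
        inliers = [(x, y) for x, y in pts if abs(m * x + c - y) < thresh]
        if len(inliers) > len(best):
            best = inliers
    # the step under discussion: re-estimate on all consensus points;
    # the LM refinement that follows may make this step redundant
    return fit_all(best)

pts = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(3.0, 40.0)]
m, c = ransac(pts)  # the outlier is rejected; the fit recovers y = 2x + 1
print(m, c)  # -> 2.0 1.0
```

Either the consensus-set refit or a subsequent refinement alone already gives the improved estimate, which matches the observation above that the two steps are largely interchangeable.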
```cpp
M.copyTo(_model);
return 1;
}
};
```
You could also override checkSubset with a version that always returns true, since calling the parent class's version will always return true when count is 2. For the partial affine case, any 2 points will always produce a valid estimate. (Unless perhaps they are too close to each other, but the parent class's function doesn't check for that in any case.)
Nice catch. Thank you.
I also thought about too-close points when I was implementing it (because estimateRigidTransform checks that), but neither estimateAffine3D nor findHomography does that, so I think it's ok to assume that this is in fact not causing too many problems.
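For illustration, a check for near-coincident points like the one estimateRigidTransform performs could look like this (a hypothetical Python helper, not OpenCV code):

```python
def points_too_close(pts, eps=1e-6):
    # reject a minimal sample whose points are (nearly) coincident:
    # such a pair cannot constrain the rotation/scale of the transform
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            dx = pts[i][0] - pts[j][0]
            dy = pts[i][1] - pts[j][1]
            if dx * dx + dy * dy < eps * eps:
                return True
    return False

print(points_too_close([(0.0, 0.0), (1.0, 1.0)]))   # -> False
print(points_too_close([(0.0, 0.0), (1e-9, 0.0)]))  # -> True
```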
ok. I think I have solved the issue with coincident points. I have reworked
I have experimented with solving the system in the affinePartial callback analytically. In my experiments it surprisingly runs slower than the SVD version. SVD: analytic: you can find the code at 57bf2d4
That's surprising! If you're up for more experiments, I realized that you can solve the entire kernel analytically without even a matrix multiply. This should be even faster. I can't see how SVD could be faster than this! The compiler should be able to optimize all of the common subexpressions and there are no function calls.
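The idea is that two point correspondences determine the 4-DOF model in closed form. A hedged sketch in Python (the derivation is my own illustration, not the PR's C++ kernel):

```python
def solve_partial_affine(p1, p2, q1, q2):
    """Closed-form 4-DOF (rotation + uniform scale + translation) estimate
    from two correspondences p1->q1, p2->q2.  Model:
        x' = a*x - b*y + tx
        y' = b*x + a*y + ty
    Returns (a, b, tx, ty), or None for coincident source points."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    du, dv = q2[0] - q1[0], q2[1] - q1[1]
    d2 = dx * dx + dy * dy
    if d2 == 0.0:                       # degenerate minimal sample
        return None
    # differencing the two equations eliminates tx, ty
    a = (dx * du + dy * dv) / d2
    b = (dx * dv - dy * du) / d2
    tx = q1[0] - a * p1[0] + b * p1[1]
    ty = q1[1] - b * p1[0] - a * p1[1]
    return a, b, tx, ty

# recover a known transform (a=1.5, b=0.5, tx=3, ty=-2)
print(solve_partial_affine((0, 0), (2, 1), (3, -2), (5.5, 0.5)))
# -> (1.5, 0.5, 3.0, -2.0)
```

There is indeed no matrix multiply and no decomposition; the only costs are a handful of multiplies and one division, all easy for the compiler to optimize.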
Yep, I was also surprised. I think that the kernel is not a bottleneck for the function. But I will try your version, which seems even better. I will optimize copying inliers, which is currently quite inefficient; that could also speed something up.
modules/calib3d/src/ptsetreg.cpp
Outdated
```cpp
{
    Mat ms1 = _ms1.getMat(), ms2 = _ms2.getMat();
    // check collinearity and also check that points are not too close
    return !(haveCollinearPoints(ms1, count) || haveCollinearPoints(ms2, count));
```
I think you should only check _ms1 for collinear points and not _ms2. The estimator is only unstable when _ms1 has collinear points -- it doesn't matter if _ms2 does or not. Also, consider the case where the true affine transform is degenerate (i.e. maps all input points to a line). This case may not be common in CV, but it is valid and there is no reason not to produce an accurate estimate of the transform in this case. By checking _ms2, this case can't be estimated and will error out instead. Plus it's faster to only check _ms1.
Actually, I think it might be faster to not do these checks at all. It seems like they may be more expensive than the work that they will (occasionally) save. Instead, checkSubset() could always return true and runKernel() can return 0 if the determinant is too small (< FLT_EPSILON). This isn't extra work since runKernel() has to compute the determinant in any case.
I think that for findHomography() the situation is different. In that case runKernel() is more expensive (Eigen decomp of an 8x8 matrix) and the geometric consistency check is quite cheap.
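The suggested arrangement can be sketched as follows (hypothetical Python analogs of checkSubset()/runKernel(), not the PR's C++; for the 4-DOF model [a, -b; b, a] the determinant is a*a + b*b, so the rejection is free):

```python
FLT_EPSILON = 1.19209290e-07

def run_kernel(p1, p2, q1, q2):
    # estimate the 4-DOF model, then reject degenerate results via the
    # determinant a*a + b*b, which is computed anyway -- no extra work
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d2 = dx * dx + dy * dy
    if d2 == 0.0:
        return 0                        # coincident source points
    du, dv = q2[0] - q1[0], q2[1] - q1[1]
    a = (dx * du + dy * dv) / d2
    b = (dx * dv - dy * du) / d2
    if a * a + b * b < FLT_EPSILON:     # degenerate: collapses the plane
        return 0                        # tell RANSAC the sample failed
    return 1                            # model accepted

def check_subset(ms1, ms2, count):
    return True                         # geometry checks moved to run_kernel

print(run_kernel((0, 0), (2, 1), (3, -2), (5.5, 0.5)))  # -> 1
print(run_kernel((0, 0), (2, 1), (7, 7), (7, 7)))       # -> 0
```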
I have updated the perf test to use the new API. It seems that a lot of time is spent in Levenberg-Marquardt refining. RANSAC is only about 1/2 or even 1/3 of the runtime. LMEDS takes much longer; a faster kernel makes more sense there. Here are the current results with SVD-based kernels:
OK, cool. Perhaps the reason that
Now that you've updated the APIs, it would be interesting to compare the performance of the analytic
reestimation step is not needed: estimateAffine2D* functions run their own reestimation on inliers using the Levenberg-Marquardt algorithm, which is better than simply rerunning RANSAC on inliers.
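A reestimation over all inliers can be sketched in closed form (a Python illustration of the general idea, not the PR's LM code; since the 4-DOF model is linear in a, b, tx, ty, a one-shot least-squares solve via centroids suffices here):

```python
def refit_partial_affine(src, dst):
    # linear least-squares over ALL inliers for the 4-DOF model
    #   x' = a*x - b*y + tx,  y' = b*x + a*y + ty
    # solved in closed form after subtracting centroids
    n = len(src)
    mx = sum(p[0] for p in src) / n; my = sum(p[1] for p in src) / n
    mu = sum(q[0] for q in dst) / n; mv = sum(q[1] for q in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        den += xc * xc + yc * yc
    a, b = num_a / den, num_b / den
    return a, b, mu - a * mx + b * my, mv - b * mx - a * my

src = [(0, 0), (2, 1), (1, 3)]
dst = [(3, -2), (5.5, 0.5), (3, 3)]   # src under a=1.5, b=0.5, tx=3, ty=-2
print([round(v, 6) for v in refit_partial_affine(src, dst)])
# -> [1.5, 0.5, 3.0, -2.0]
```

Because the model is linear in its parameters, this refit is exact in one solve, which is consistent with the earlier observation that LM converges in just a few iterations from the RANSAC estimate.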
bundle adjuster that expects an affine transform with 4 DOF; refines parameters for all cameras together. stitching: fix bug in BundleAdjusterAffinePartial * use the inverse properly * use a static buffer for the inverse to speed it up
* add support for using the affine bundle adjuster with 4 DOF * improve logging of initial intrinsics
prevents spurious test failures on mac. values are still pretty fine.
* fix bug with AffineBestOf2NearestMatcher (we want to select affine partial mode) * select right bundle adjuster
* this prevents failure on mac. transformation is still ok.
* implements an affine bundle adjuster that uses the full affine transform * existing test case modified to test both affinePartial and full affine bundle adjusters
* show basic usage of stitching api (Stitcher class)
* added new datasets to existing testcase * removed unused include
* added a comment to note that this also checks for too-close points
* use a common function to check collinearity * this also ensures that points will not be too close to each other
* more similar to `findHomography`, `findFundamentalMat`, `findEssentialMat` and similar * follows the standard recommended semantics INPUTS, OUTPUTS, FLAGS * allows disabling refinement * support the LMEDS robust method (tests yet to come) along with RANSAC * extended docs with some tips
* rewrite in googletest style * parametrize to test both robust methods (RANSAC and LMEDS) * get rid of boilerplate
* rework in googletest style * add testing for LMEDS
* test for LMEDS speed * test with/without Levenberg-Marquardt * remove sanity checking (this is covered by accuracy tests)
* test transformations in loop * improves test by testing more potential transformations
* use the analytical solution instead of SVD * this version is faster, especially for smaller numbers of points
* avoid copying inliers * avoid converting input points if not necessary * check only `from` point for collinearity, as `to` does not affect stability of transform
* add some examples how to run stitcher sample code * mention stitching_detailed.cpp
* do error computation in floats instead of doubles; floats have the required precision and we were storing the result in a float anyway. This makes the code faster and allows auto-vectorization by smart compilers.
* refer to new functions on appropriate places * prefer estimateAffine*2D over estimateRigidTransform
rebased again.
* mention camera models in module documentation to give user a better overview and reduce confusion
I have restarted the build due to GitHub DNS issues.
Merge with extra: opencv/opencv_extra#303
This PR contains all work for New camera model for stitching pipeline GSoC 2016 project.
GSoC Proposal
The stitching pipeline is well-established code in OpenCV. It provides good results for creating panoramas from camera-captured images. The main limitation of the stitching pipeline is its expected camera model (perspective transformation). Although this model is fine for many applications working with camera-captured images, there are applications which aren't covered by the current stitching pipeline.
New camera model
Due to physical constraints, some applications can expect a much simpler transform with fewer degrees of freedom. These are situations where the input data are not subject to a perspective transform; the transformation can be much simpler, such as an affine transformation. Datasets considered here include images captured by special hardware (such as book scanners[0] that try hard to eliminate perspective), maps from laser scanning (produced from different starting points), and preprocessed images (where perspective was compensated by other robust means, taking advantage of the physical situation; e.g. for book scanners we would use data from calibration to compensate the remaining perspective). In all these situations we would like to obtain an image mosaic under an affine transformation.
I'd like to introduce a new camera model based on affine transformation to the stitching pipeline. This would include:
I used an approach based on affine transformation to merge maps produced by multiple robots [1] for my robotics project. It showed good results. However, as mentioned earlier, applications for this model are much broader than that.
Parallelism for FeaturesFinder
To make usage of the stitching pipeline more comfortable and performant for large numbers of images, I'd also like to improve FeaturesFinder to allow finding features in parallel. All camera models and other users of FeaturesFinder may benefit from that. The API could be similar to
FeaturesMatcher::operator ()(features, pairwise_matches, mask). This could be done with TBB in a similar manner to the mentioned method in FeaturesMatcher, which is already being used in the stitching pipeline, so there would be almost no additional overhead in starting new threads in typical scenarios, because these threads are already there for FeaturesMatcher. This change would be fully integrated into the high-level stitching interface.
There might be some changes necessary in the finders to ensure thread-safety. Where thread-safety can't be ensured or does not make sense (GPU finders), parallelization would be disabled and all images would be processed in a serial manner, so this method would always be safe to use regardless of the underlying finder. This approach is also similar to FeaturesMatcher.
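The proposed behavior can be sketched as follows (a Python illustration with hypothetical names, using a thread pool in place of TBB; the real implementation is C++ inside the stitching module):

```python
from concurrent.futures import ThreadPoolExecutor

def find_features(finder, images, finder_is_thread_safe=True):
    # run the finder over all images: in parallel when the finder allows
    # it, serially otherwise (e.g. GPU finders) -- so the call is always
    # safe to use regardless of the underlying finder
    if not finder_is_thread_safe:
        return [finder(img) for img in images]
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, like the serial path
        return list(pool.map(finder, images))

fake_finder = len  # stand-in for a real per-image feature finder
print(find_features(fake_finder, ["ab", "cde", "f"]))  # -> [2, 3, 1]
```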
Benefits to OpenCV
implemented goals (all + extras)
new camera model
parallel feature finding
implemented extras
video
other work
During this GSoC I have also done some related work that is not going to be included (mostly because we chose a different approach or the work has been merged under this PR). It is listed here for completeness.
PRs:
commits: