Fix bug with predictions in RTrees/Boost by danielenricocahall · Pull Request #19884 · opencv/opencv

danielenricocahall · 2021-04-09T10:06:37Z

Addresses #12974, although that issue incorrectly mentions the saving/loading of the model, which is not relevant. The source of the issue is no validation when calling the predict method, which we have in other models i.e;

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
The PR is proposed to proper branch
There is reference to original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

…ees/boost models

mshabunin · 2021-04-09T10:21:30Z

modules/ml/test/test_rtrees.cpp

+    // create a simple 2 feature dataset and train + save the model
+    cv::Ptr<RTrees> model = RTrees::create();
+    Mat samples(2, 2, CV_32FC1);
+    samples.at<float>(0,0) = 0;


Matrices can be initialized in more compact way, check out the previous test.

Good call - funny enough, I was also the one who wrote the previous test :)

mshabunin · 2021-04-09T10:24:08Z

modules/ml/test/test_rtrees.cpp

+    cv::Ptr<TrainData> trainData = TrainData::create(samples, cv::ml::ROW_SAMPLE, responses);
+    model->train(trainData);
+    // load the model and try to predict on a 1 feature sample
+    Mat test(1, 1, CV_32FC1);


Does it make sense to check multiple sizes in a loop (1..n-1 and n+1)?

@mshabunin not sure I understand what you mean?

Ah! You mean assert that the test fails for < n and > n. Good idea.

Done - I increased the number of features to make it more comprehensive too (now we have 5 in training, and test on 0-4 and 6).

tomoaki0705 · 2021-04-12T15:13:41Z

modules/ml/src/boost.cpp


    float predict( InputArray samples, OutputArray results, int flags ) const CV_OVERRIDE
    {
+        CV_Assert( samples.cols() == getVarCount() && samples.type() == CV_32F );


@danielenricocahall would you mind explain why this assert was in here?
The test from text module was relying on this code, and it was passing before this PR, but now it raises this assertion.

Before

[==========] Running 12 tests from 1 test case. [----------] Global test environment set-up. [----------] 12 tests from Text/Detection [ RUN ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false) Image: text/scenetext01.jpg Orientation: horiz Found groups: 25 [ OK ] Text/Detection.sample/0 (4981 ms) [ RUN ] Text/Detection.sample/1, where GetParam() = ("text/scenetext01.jpg", true) [ SKIP ] ERGROUPING_ORIENTATION_ANY mode is not supported [ OK ] Text/Detection.sample/1 (4 ms) : [==========] 12 tests from 1 test case ran. (37679 ms total) [ PASSED ] 12 tests.

After

[==========] Running 12 tests from 1 test case. [----------] Global test environment set-up. [----------] 12 tests from Text/Detection [ RUN ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false) Image: text/scenetext01.jpg Orientation: horiz unknown file: Failure C++ exception with description "OpenCV(3.4.14-dev) /opencv-fork/modules/ml/src/boost.cpp:493: error: (-215:Assertion failed) samples.cols() == getVarCount() && samples.type() == CV_32F in function 'predict' " thrown in the test body. [ FAILED ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false) (68 ms) [ RUN ] Text/Detection.sample/1, where GetParam() = ("text/scenetext01.jpg", true) [ SKIP ] ERGROUPING_ORIENTATION_ANY mode is not supported [ OK ] Text/Detection.sample/1 (4 ms) : [==========] 12 tests from 1 test case ran. (861 ms total) [ PASSED ] 6 tests. [ FAILED ] 6 tests, listed below: [ FAILED ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false) [ FAILED ] Text/Detection.sample/2, where GetParam() = ("text/scenetext02.jpg", false) [ FAILED ] Text/Detection.sample/4, where GetParam() = ("text/scenetext03.jpg", false) [ FAILED ] Text/Detection.sample/6, where GetParam() = ("text/scenetext04.jpg", false) [ FAILED ] Text/Detection.sample/8, where GetParam() = ("text/scenetext05.jpg", false) [ FAILED ] Text/Detection.sample/10, where GetParam() = ("text/scenetext06.jpg", false)

I'm trying on Jetson series and Raspberry Pis but all the test results are pointing that this assertion is too strict.
May be samples.type() == CV_32F could be removed?

I was just using the same assertion that is used in other predict methods:

https://github.com/opencv/opencv/blob/master/modules/ml/src/svm.cpp#L2013

https://github.com/opencv/opencv/blob/master/modules/ml/src/knearest.cpp#L370

https://github.com/opencv/opencv/blob/master/modules/ml/src/nbayes.cpp#L314
would it make sense to remove it, or is there something in the text test that should be updated?

May be samples.type() == CV_32F could be removed?

In general it should be removed. Types should be defined by underlying classifiers in Boost (e.g, see DTreesImpl::predictTrees).

Unfortunately, in opencv_text case error is raised on the first part of condition:

C++ exception with description "OpenCV(3.4.14-dev) /home/alalek/projects/opencv/dev/modules/ml/src/boost.cpp:493: error: (-2:Unspecified error) in function 'virtual float cv::ml::BoostImpl::predict(cv::InputArray, cv::OutputArray, int) const' > (expected: 'samples.cols() == getVarCount()'), where > 'samples.cols()' is 4 > must be equal to > 'getVarCount()' is 5 " thrown in the test body.

So, opencv_text module should be checked / fixed too.

Problem is caused by invalid import in "ml" module (misused "legacy" mode due to bug #5412).
opencv_text code looks valid.

…n-features-bug Fix bug with predictions in RTrees/Boost * address bug where predict functions with invalid feature count in rtrees/boost models * compact matrix rep in tests * check 1..n-1 and n+1 in feature size validation test

address bug where predict functions with invalid feature count in rtr…

88001a3

…ees/boost models

mshabunin reviewed Apr 9, 2021

View reviewed changes

danielenricocahall added 2 commits April 9, 2021 06:25

compact matrix rep in tests

5716d99

check 1..n-1 and n+1 in feature size validation test

fd5e03b

mshabunin approved these changes Apr 9, 2021

View reviewed changes

alalek assigned mshabunin Apr 9, 2021

alalek merged commit a9a6801 into opencv:3.4 Apr 9, 2021

tomoaki0705 reviewed Apr 12, 2021

View reviewed changes

This was referenced Apr 12, 2021

class_labels vector in boosted decision tree classifier is empty when saved and then loaded from disk #5412

Closed

ML: fix legacy import in DTreesImpl #19894

Merged

ML: update checks #19901

Merged

(4.x) Merge 3.4 #19916

Merged

alalek mentioned this pull request Jun 4, 2021

(5.x) Merge 4.x #20216

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix bug with predictions in RTrees/Boost#19884

Fix bug with predictions in RTrees/Boost#19884
alalek merged 3 commits intoopencv:3.4from
danielenricocahall:fix-prediction-features-bug

danielenricocahall commented Apr 9, 2021 •

edited

Loading

Uh oh!

mshabunin Apr 9, 2021

Uh oh!

danielenricocahall Apr 9, 2021

Uh oh!

danielenricocahall Apr 9, 2021

Uh oh!

mshabunin Apr 9, 2021

Uh oh!

danielenricocahall Apr 9, 2021

Uh oh!

danielenricocahall Apr 9, 2021

Uh oh!

danielenricocahall Apr 9, 2021

Uh oh!

tomoaki0705 Apr 12, 2021

Uh oh!

danielenricocahall Apr 12, 2021

Uh oh!

alalek Apr 12, 2021

Uh oh!

alalek Apr 12, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

danielenricocahall commented Apr 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Readiness Checklist

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Before

After

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

danielenricocahall commented Apr 9, 2021 •

edited

Loading