Skip to content

Fix bug with predictions in RTrees/Boost#19884

Merged
alalek merged 3 commits intoopencv:3.4from
danielenricocahall:fix-prediction-features-bug
Apr 9, 2021
Merged

Fix bug with predictions in RTrees/Boost#19884
alalek merged 3 commits intoopencv:3.4from
danielenricocahall:fix-prediction-features-bug

Conversation

@danielenricocahall
Copy link
Copy Markdown
Contributor

@danielenricocahall danielenricocahall commented Apr 9, 2021

Addresses #12974, although that issue incorrectly mentions the saving/loading of the model, which is not relevant. The source of the issue is no validation when calling the predict method, which we have in other models i.e;

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

// create a simple 2 feature dataset and train + save the model
cv::Ptr<RTrees> model = RTrees::create();
Mat samples(2, 2, CV_32FC1);
samples.at<float>(0,0) = 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matrices can be initialized in more compact way, check out the previous test.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call - funny enough, I was also the one who wrote the previous test :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

cv::Ptr<TrainData> trainData = TrainData::create(samples, cv::ml::ROW_SAMPLE, responses);
model->train(trainData);
// load the model and try to predict on a 1 feature sample
Mat test(1, 1, CV_32FC1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to check multiple sizes in a loop (1..n-1 and n+1)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mshabunin not sure I understand what you mean?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! You mean assert that the test fails for < n and > n. Good idea.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - I increased the number of features to make it more comprehensive too (now we have 5 in training, and test on 0-4 and 6).

@alalek alalek merged commit a9a6801 into opencv:3.4 Apr 9, 2021

float predict( InputArray samples, OutputArray results, int flags ) const CV_OVERRIDE
{
CV_Assert( samples.cols() == getVarCount() && samples.type() == CV_32F );
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielenricocahall would you mind explain why this assert was in here?
The test from text module was relying on this code, and it was passing before this PR, but now it raises this assertion.

Before

[==========] Running 12 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 12 tests from Text/Detection
[ RUN      ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false)
Image: text/scenetext01.jpg
Orientation: horiz
Found groups: 25
[       OK ] Text/Detection.sample/0 (4981 ms)
[ RUN      ] Text/Detection.sample/1, where GetParam() = ("text/scenetext01.jpg", true)
[     SKIP ] ERGROUPING_ORIENTATION_ANY mode is not supported
[       OK ] Text/Detection.sample/1 (4 ms)
  :
[==========] 12 tests from 1 test case ran. (37679 ms total)
[  PASSED  ] 12 tests.

After

[==========] Running 12 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 12 tests from Text/Detection
[ RUN      ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false)
Image: text/scenetext01.jpg
Orientation: horiz
unknown file: Failure
C++ exception with description "OpenCV(3.4.14-dev) /opencv-fork/modules/ml/src/boost.cpp:493: error: (-215:Assertion failed) samples.cols() == getVarCount() && samples.type() == CV_32F in function 'predict'
" thrown in the test body.
[  FAILED  ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false) (68 ms)
[ RUN      ] Text/Detection.sample/1, where GetParam() = ("text/scenetext01.jpg", true)
[     SKIP ] ERGROUPING_ORIENTATION_ANY mode is not supported
[       OK ] Text/Detection.sample/1 (4 ms)
  :
[==========] 12 tests from 1 test case ran. (861 ms total)
[  PASSED  ] 6 tests.
[  FAILED  ] 6 tests, listed below:
[  FAILED  ] Text/Detection.sample/0, where GetParam() = ("text/scenetext01.jpg", false)
[  FAILED  ] Text/Detection.sample/2, where GetParam() = ("text/scenetext02.jpg", false)
[  FAILED  ] Text/Detection.sample/4, where GetParam() = ("text/scenetext03.jpg", false)
[  FAILED  ] Text/Detection.sample/6, where GetParam() = ("text/scenetext04.jpg", false)
[  FAILED  ] Text/Detection.sample/8, where GetParam() = ("text/scenetext05.jpg", false)
[  FAILED  ] Text/Detection.sample/10, where GetParam() = ("text/scenetext06.jpg", false)

I'm trying on Jetson series and Raspberry Pis but all the test results are pointing that this assertion is too strict.
May be samples.type() == CV_32F could be removed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just using the same assertion that is used in other predict methods:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be samples.type() == CV_32F could be removed?

In general it should be removed. Types should be defined by underlying classifiers in Boost (e.g, see DTreesImpl::predictTrees).

Unfortunately, in opencv_text case error is raised on the first part of condition:

C++ exception with description "OpenCV(3.4.14-dev) /home/alalek/projects/opencv/dev/modules/ml/src/boost.cpp:493: error: (-2:Unspecified error) in function 'virtual float cv::ml::BoostImpl::predict(cv::InputArray, cv::OutputArray, int) const'
>  (expected: 'samples.cols() == getVarCount()'), where
>     'samples.cols()' is 4
> must be equal to
>     'getVarCount()' is 5
" thrown in the test body.

So, opencv_text module should be checked / fixed too.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is caused by invalid import in "ml" module (misused "legacy" mode due to bug #5412).
opencv_text code looks valid.

danielenricocahall added a commit to danielenricocahall/opencv that referenced this pull request May 27, 2021
…n-features-bug

Fix bug with predictions in RTrees/Boost

* address bug where predict functions with invalid feature count in rtrees/boost models

* compact matrix rep in tests

* check 1..n-1 and n+1 in feature size validation test
@alalek alalek mentioned this pull request Jun 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants