Skip to content

Support for YOLOv7 ONNX (not simplified)#22290

Merged
asmorkalov merged 1 commit intoopencv:4.xfrom
fengyuentau:naive_yolov7
Sep 19, 2022
Merged

Support for YOLOv7 ONNX (not simplified)#22290
asmorkalov merged 1 commit intoopencv:4.xfrom
fengyuentau:naive_yolov7

Conversation

@fengyuentau
Copy link
Copy Markdown
Member

@fengyuentau fengyuentau commented Jul 23, 2022

Fixes #22215 .

Test data: opencv/opencv_extra#998
Link to the model: https://drive.google.com/u/0/uc?id=1rm3mIqjJNu0xPTCjMKnXccspazV1B2zv&export=download

Things have to be done for the support:

  • Identity node: getting input from constBlob.
  • Range node: implementation for Range node (all inputs are constant).
  • fix parseExpand with multiple broadcast axes.
  • fix 1D mat issue comes with Range output.
  • acc test

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
buildworker:Custom=linux-f1
build_image:Custom=ubuntu:18.04

@KishoreElvicto
Copy link
Copy Markdown

kindly fix this issue as soon as possible

@fengyuentau fengyuentau changed the title Supports for YOLOv7 ONNX with dynamic input shape Supports for YOLOv7 ONNX (not simplified) Aug 13, 2022
@fengyuentau fengyuentau added pr: needs test New functionality requires minimal tests set category: dnn (onnx) ONNX suport issues in DNN module labels Aug 22, 2022
@fengyuentau fengyuentau requested review from alalek, rogday and zihaomu and removed request for alalek August 22, 2022 03:25
@fengyuentau fengyuentau marked this pull request as ready for review August 22, 2022 03:26
@fengyuentau
Copy link
Copy Markdown
Member Author

All is ready for reviews. Let me test other linked issues as well.

@fengyuentau fengyuentau removed the pr: needs test New functionality requires minimal tests set label Aug 23, 2022
Copy link
Copy Markdown
Member

@zihaomu zihaomu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!
Please check the comments below, others are fine for me.

Copy link
Copy Markdown
Member

@zihaomu zihaomu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@fengyuentau
Copy link
Copy Markdown
Member Author

For #22361, the user does not provide the model.

For #20324, this patch solved the Range support. But DNN still have problems dealing with other operators in that model.

Copy link
Copy Markdown
Member

@rogday rogday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contribution! Looks good, have just a few questions.

Rect2d(109.99532f, 246.13585f, 259.82266f, 600.7827f),
Rect2d(386.13947f, 83.04132f, 575.3909f, 189.83206f)};

Mat img = imread(imgPath);
Copy link
Copy Markdown
Member

@rogday rogday Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can testDetectModel be used here? (instead of some/all of the code below)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me have a try.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just recalled a reason why I did not use testDetectModel. For YOLOv7, the final confidence for a box is calculated by multiplying the class confidence and the object confidence in line 855 (conf *= obj_conf;). I dont see this in testDetectModel.detect(), so I did not use testDetectModel.

@fengyuentau
Copy link
Copy Markdown
Member Author

I will try manually upload the model to each CI server tomorrow.

Comment on lines +845 to +857
// filter out non objects
float obj_conf = preds.row(i).at<float>(4);
if (obj_conf < conf_threshold)
continue;

// get class id and conf
Mat scores = preds.row(i).colRange(5, preds.cols);
double conf;
Point maxLoc;
minMaxLoc(scores, 0, &conf, 0, &maxLoc);
conf *= obj_conf;
if (conf < conf_threshold)
continue;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vpisarev, could you please take a look if it makes sense to change the code here to be the same? And reduce this to a call to testDetectModel()?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding, recent object detection models (e.g. YOLOX in opencv_zoo) are trained with predicting IoU between anchors and ground truth boxes (related lines in YOLOv7 repo). In this case, the final confidence should be calculated by multiplying the class confidence and the predicted IoU (which is named obj_conf in my code).

In short, models without such a training scheme should not do this multiplication. So I do not think it makes sense if we do this for region/detection_output layer, considering compatibility for old object detection models.

By the way, we should consider support next gen NMS algorithms (CIoU, DIoU ...) since they are quite popular among recent object detection models.

@fengyuentau
Copy link
Copy Markdown
Member Author

I will try manually upload the model to each CI server tomorrow.

Done. GitHub Actions tests are all passed.

@zihaomu
Copy link
Copy Markdown
Member

zihaomu commented Sep 1, 2022

Update:

Model can works fine with this patch.

The following is old comment.
Hi @fengyuentau, please take a look this comment. This patch can work correct on Simple Layer, but it maybe cause some issue in reshape layer.

@zihaomu zihaomu linked an issue Sep 1, 2022 that may be closed by this pull request
@fengyuentau
Copy link
Copy Markdown
Member Author

fengyuentau commented Sep 1, 2022

Hello @alalek , could you kindly update the dnn model in default? Link to the model is attached to the first comment.

@alalek
Copy link
Copy Markdown
Member

alalek commented Sep 2, 2022

There is some problem with script:

Model <yolov7_not_simplified>
  expect fcd0fa401c83bf2b29e18239a9c2c989c9b8669d
  catch [Errno 2] No such file or directory: 'onnx/models/yolov7_not_simplified.onnx'
  hash check failed - downloading
  get https://drive.google.com/u/0/uc?id=1rm3mIqjJNu0xPTCjMKnXccspazV1B2zv&export=download
  200 OK [<unknown> Mb]
  progress > done
 file onnx/models/yolov7_not_simplified.onnx
  expect fcd0fa401c83bf2b29e18239a9c2c989c9b8669d
  actual bc7f7fe19f90edbde1f71f638c49a16bc2ebe7db
  renaming invalid file to onnx/models/yolov7_not_simplified.onnx.invalid
Following models have not been downloaded:
* onnx/models/yolov7_not_simplified.onnx

For Google documents we should use GDrive() (see download_models.py)


Update: Proper way is using downloader=GDrive('1rm3mIqjJNu0xPTCjMKnXccspazV1B2zv'), instead of URL
But script still fails due to "Google Drive - Virus scan warning" confirmation dialog.


Update2: File is downloaded manually.
GDrive() downloader would be improved later (starting from 3.4 branch).

@alalek
Copy link
Copy Markdown
Member

alalek commented Sep 2, 2022

There are 10 commits, some of them are not needed at all in git history

delete useless files

So please squash commits into one (1 PR - 1 commit rule).

@fengyuentau
Copy link
Copy Markdown
Member Author

@alalek Thank you for the feedback and suggestions. Commits are squashed into one and the model is now downloaded with GDrive() in opencv_extra.

@fengyuentau
Copy link
Copy Markdown
Member Author

Update2: File is downloaded manually.
GDrive() downloader would be improved later (starting from 3.4 branch).

Thank you for the information! I thought I use GDrive() in a wrong way and was testing it till now.

@alalek
Copy link
Copy Markdown
Member

alalek commented Sep 2, 2022

There are failures of test accuracy checks on CI (i5-6600 through KVM):

[ RUN      ] Test_Model.YOLOv7/0, where GetParam() = OCV/CPU
[ INFO:0@228.119] global /build/precommit_linux64/4.x/opencv/modules/dnn/src/onnx/onnx_importer.cpp (822) populateNet DNN/ONNX: loading ONNX v7 model produced by 'pytorch':1.12.0. Number of nodes = 576, initializers = 193, inputs = 1, outputs = 4
[ INFO:0@228.119] global /build/precommit_linux64/4.x/opencv/modules/dnn/src/onnx/onnx_importer.cpp (738) parseOperatorSet DNN/ONNX: ONNX opset version = 13
Unmatched prediction: class 1 score 0.962365 box [472.545 x 466.494 from (106.011, 150.323)]
Highest IoU: 0
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:143: Failure
Value of: matched
  Actual: false
Expected: true
Unmatched prediction: class 16 score 0.959694 box [259.837 x 600.703 from (110.007, 246.323)]
Highest IoU: 0
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:143: Failure
Value of: matched
  Actual: false
Expected: true
Unmatched prediction: class 7 score 0.879178 box [575.834 x 189.781 from (386.032, 83.0033)]
Highest IoU: 0
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:143: Failure
Value of: matched
  Actual: false
Expected: true
Unmatched reference: class 1 score 0.962477 box [472.529 x 466.508 from (105.999, 150.2)] IoU diff: 1
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:155: Failure
Expected: (refScores[i]) <= (confThreshold), actual: 0.962477 vs 0
Unmatched reference: class 16 score 0.959407 box [259.823 x 600.783 from (109.995, 246.136)] IoU diff: 1
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:155: Failure
Expected: (refScores[i]) <= (confThreshold), actual: 0.959407 vs 0
Unmatched reference: class 7 score 0.869300 box [575.391 x 189.832 from (386.139, 83.0413)] IoU diff: 1
/build/precommit_linux64/4.x/opencv/modules/dnn/test/test_common.impl.hpp:155: Failure
Expected: (refScores[i]) <= (confThreshold), actual: 0.8693 vs 0
[  FAILED  ] Test_Model.YOLOv7/0, where GetParam() = OCV/CPU (3427 ms)

@fengyuentau
Copy link
Copy Markdown
Member Author

There are failures of test accuracy checks on CI (i5-6600 through KVM):

@alalek I cannot reproduce this issue on my side, and we do not have these failures in GHA checks. I looked into the error message, and the problem was the difference between refScores and testScores (0.000111818, 0.00028795, 0.00987768) exceeding the threshold (1e-5). So these differences break the condition in line 124, and all IoU calculations are skipped, leading to all boxes mismatched:

for (int j = 0; j < refBoxes.size() && !matched; ++j)
{
if (!matchedRefBoxes[j] && testClassId == refClassIds[j] &&
std::abs(testScore - refScores[j]) < scores_diff)
{
double interArea = (testBox & refBoxes[j]).area();
double iou = interArea / (testBox.area() + refBoxes[j].area() - interArea);
topIoU = std::max(topIoU, iou);
refBoxesIoUDiff[j] = std::min(refBoxesIoUDiff[j], 1.0f - iou);
if (1.0 - iou < boxes_iou_diff)
{
matched = true;
matchedRefBoxes[j] = true;
}
}
}

I even tried manually update the base of my branch, re-compile everything, but still the check was passed and the issue cannot be reproduced.

@alalek
Copy link
Copy Markdown
Member

alalek commented Sep 6, 2022

It makes sense to add performance test too.
Here: https://github.com/opencv/opencv/blob/4.6.0/modules/dnn/perf/perf_net.cpp#L221

@fengyuentau
Copy link
Copy Markdown
Member Author

It makes sense to add performance test too. Here: https://github.com/opencv/opencv/blob/4.6.0/modules/dnn/perf/perf_net.cpp#L221

It is indeed a good place for the test, but I think we keep it in test_onnx_importer.onnx since we need to check the result correctness.

@fengyuentau
Copy link
Copy Markdown
Member Author

Hello @asenyaev , I observed that in Windows10-x64, the model that I manually uploaded to $dnn_models/dnn/onnx/models/yolov7-not-simplified.onnx got deleted somehow from time to time. But this issue did not happen on other hosts. Is it because of the scheduled update script or something like that?

@rogday rogday requested a review from vpisarev September 6, 2022 13:01
@alalek
Copy link
Copy Markdown
Member

alalek commented Sep 6, 2022

since we need to check the result correctness.

Adding of performance test doesn't mean to drop the accuracy test.

@asmorkalov asmorkalov changed the title Supports for YOLOv7 ONNX (not simplified) Support for YOLOv7 ONNX (not simplified) Sep 19, 2022
@asenyaev
Copy link
Copy Markdown
Contributor

Hello @fengyuentau! I checked logs in different workflows and want to say it's a strange behavior, the model was copied in the latest pipeline and DNN models weren't updated that day. Moreover, when models update, self-downloaded are not deleted. I see your model right there since September 6.

@fengyuentau
Copy link
Copy Markdown
Member Author

Thank you @asenyaev . I guess I should just store the model somewhere else other than google drive, where our CI servers can download from, to avoid this kind of issue.

Copy link
Copy Markdown
Member

@rogday rogday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@asmorkalov
Copy link
Copy Markdown
Contributor

@fengyuentau Please sqush the commits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: dnn (onnx) ONNX suport issues in DNN module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot load onnx (opset 12) of Resnet + MobileVit When support to YOLO v7?

8 participants