Conversation
|
I have already start one pr #24267 for yolox |
This is a unified API for all yolo detectors |
wrong it's like my pr but for all yolo models |
Sorry, meant API for yolo detectors |
I made this PR because of this |
| CV_DEPRECATED_EXTERNAL // avoid using in C++ code (need to fix bindings first) | ||
| YoloObjectDetector(); | ||
|
|
||
| CV_WRAP void detect(InputArray frame, CV_OUT std::vector<int>& classIds, |
There was a problem hiding this comment.
We should not have this here.
User uses DetectionModel::detect() from the base class instead. Inheritance is designed for that.
There was a problem hiding this comment.
Type of box rectangle is different from detection model's detect method.
There was a problem hiding this comment.
How user and bindings should work with that?
Inheritance is not just a word.
Why DetectionModel doesn't need that method overload?
There was a problem hiding this comment.
I have just checked the bindings. Call to detect method in python returns floating points for boxes as expected.
So, I do not understand, what is the issue?
I have overridden the detect method since I changed box from Rect to Rect2d.
Do you want me to keep it as in the ancestor class? If yes, why?
There was a problem hiding this comment.
Is there any reason for object detection use Rect2d? All subsequent actions with detected object require conversion to Rect2i anyway (ROI subtraction, drawing).
There was a problem hiding this comment.
In that regard you are right, there is no point. I was coming from the point of view "what base model outputs, opencv should return it as well". You suggest to change it to Rect?
There was a problem hiding this comment.
Anyway we apply some postprocessing so output is not an origin. So yes, integer type is enough for such a high level API and I cannot image the case when float/double precision is required.
There was a problem hiding this comment.
Why do we still have this here?
User uses DetectionModel::detect() from the base class instead.
There was a problem hiding this comment.
I overload detect in Imlp class. Without definition in the base class in fails to build. Could can check on your side
| CV_OUT std::vector<float>& confidences, CV_OUT std::vector<Rect2d>& boxes, | ||
| float confThreshold = 0.5f, float nmsThreshold = 0.0f); | ||
|
|
||
| CV_WRAP void postProccess( |
There was a problem hiding this comment.
It is better to move it outside of this class with appropriate names. Like NMSBoxes()
There was a problem hiding this comment.
Made it static
|
@dkurt @akarsakov @opencv-alalek some points on which I need you opinion
|
modules/dnn/src/model.cpp
Outdated
| ImagePaddingMode paddingMode; | ||
| YoloVersion yoloVersion; | ||
| float padValue; | ||
| float Height, Width; |
There was a problem hiding this comment.
| float Height, Width; | |
| Size frameSize; |
There was a problem hiding this comment.
when I use Size instead of floats model predictions start drifting for some unknown reason
There was a problem hiding this comment.
Probably because of integer division. Please rename at least to lower case height and width
|
modules/dnn/src/model.cpp
Outdated
| padValue = padValue_; | ||
| } | ||
|
|
||
| void boxRescale(CV_OUT std::vector<Rect>& boxes){ |
| CV_DEPRECATED_EXTERNAL // avoid using in C++ code (need to fix bindings first) | ||
| YoloObjectDetector(); | ||
|
|
||
| CV_WRAP void detect(InputArray frame, CV_OUT std::vector<int>& classIds, |
There was a problem hiding this comment.
Why do we still have this here?
User uses DetectionModel::detect() from the base class instead.
modules/dnn/src/model.cpp
Outdated
| ImagePaddingMode paddingMode; | ||
| YoloVersion yoloVersion; | ||
| float padValue; | ||
| float Height, Width; |
modules/dnn/src/model.cpp
Outdated
| std::vector<Mat> detections; | ||
| impl.dynamicCast<YOLODetectionModel_Impl>()->processFrame(frame, detections); | ||
|
|
||
| postProccess( |
There was a problem hiding this comment.
No business logic is allowed in PImpl methods.
There was a problem hiding this comment.
Where should this logic exist if not in overridden method of child class?
There was a problem hiding this comment.
Obviously in implementation class.
There was a problem hiding this comment.
then why does DetectionModel have the same "business logic" implemented in the main class? Take a look here
There was a problem hiding this comment.
It should be refactored too (moved to Impl part).
Check TextDetectionModel => TextDetectionModel_EAST / TextDetectionModel_DB
Other minor fixes from PR
| for(auto & detection : detections){ | ||
| cv::transposeND(detection, {0, 2, 1}, detection); | ||
| } | ||
| } |
| } | ||
|
|
||
| void YOLODetectionModel::postProccess( | ||
| std::vector<Mat>& detections, |
There was a problem hiding this comment.
| std::vector<Mat>& detections, | |
| const std::vector<Mat>& detections, |
| } else { | ||
| CV_Error(Error::StsNotImplemented, "Unsupported padding mode"); | ||
| } | ||
| } |
| net.forward(outs, outNames); | ||
| } | ||
|
|
||
| void setPaddingValue(const float padValue_){ |
| YOLODetectionModel::YOLODetectionModel(const String& model, const String& config) | ||
| { | ||
| impl = makePtr<YOLODetectionModel_Impl>(); | ||
| impl->initNet(readNet(model, config)); |
| for (auto preds : detections) | ||
| { | ||
| if (!darknet) | ||
| preds = preds.reshape(1, preds.size[1]); |
There was a problem hiding this comment.
This step is valid for every model. So can we remove the if condition?
There was a problem hiding this comment.
Not valid. For yolov5 and larger model not valid
There was a problem hiding this comment.
Can you please share the shapes for them?
|
|
||
| if (!darknet && detections[0].size[1] < detections[0].size[2]) { | ||
| yolov8 = true; // Set the correct flag based on tensor shape | ||
| } |
There was a problem hiding this comment.
darknet flag can be removed completely considering https://github.com/opencv/opencv/pull/24691/files#r1434763871 and the following change:
if (detections[0].dims == 3 && detections[0].size[1] < detections[0].size[2]) {
yolov8 = true; // Set the correct flag based on tensor shape
}150b264 to
2899387
Compare
Documentation for Yolo usage in Opencv #24898 This PR introduces documentation for the usage of yolo detection model family in open CV. This is not to be merge before #24691, as the sample will need to be changed. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [x] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake
This PR introduces a unified API for yolo object detection family.
Note: This is a preliminary PR, it is not to merged but is proposed to get feedback
Issue to fix:
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.