[G-API] Support postprocessing for not argmaxed outputs #20476
alalek merged 7 commits into opencv:master
* Add static_cast to uint8_t
@dmatveev Could you have a look?
dmatveev left a comment:
LGTM if the existing case is not broken with this change.
void classesToColors(const cv::Mat &out_blob,
                     cv::Mat &mask_img) {
    const int H = out_blob.size[0];
    const int W = out_blob.size[1];

    mask_img.create(H, W, CV_8UC3);
    GAPI_Assert(out_blob.type() == CV_8UC1);
    const uint8_t* const classes = out_blob.ptr<uint8_t>();

    for (int rowId = 0; rowId < H; ++rowId) {
        for (int colId = 0; colId < W; ++colId) {
            uint8_t class_id = classes[rowId * W + colId];
            mask_img.at<cv::Vec3b>(rowId, colId) =
                class_id < colors.size()
                ? colors[class_id]
                : cv::Vec3b{0, 0, 0}; // NB: sample supports 20 classes
        }
    }
}
Can this be expressed with our graph operators? Just wondering
Do you mean calling this function inside the user kernel? Or expressing this algorithm using already existing operations?
cv::resize(mask_img, out, in.size());
const float blending = 0.3f;
out = in * blending + out * (1 - blending);
Can this be moved to the graph level, too? It is not critical to do it right now, but it is worth considering for the future.
At the graph level the cv::Size parameter is unknown, isn't it?
It obviously can be a custom resize operation.
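For reference, the arithmetic of the blending step discussed above can be sketched without OpenCV. This is only an illustration: `blend` is a hypothetical helper that is not part of the PR, it works on flat byte buffers rather than cv::Mat, and the round-to-nearest and clamping behaviour is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): per-element weighted blend
//   out = in * blending + mask * (1 - blending)
// on flat 8-bit buffers of equal size.
std::vector<std::uint8_t> blend(const std::vector<std::uint8_t>& in,
                                const std::vector<std::uint8_t>& mask,
                                float blending) {
    assert(in.size() == mask.size());
    assert(blending >= 0.f && blending <= 1.f);

    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const float v = in[i] * blending + mask[i] * (1.f - blending);
        // Assumption of this sketch: round to nearest, then clamp to [0, 255].
        const float clamped = std::min(255.f, std::max(0.f, std::round(v)));
        out[i] = static_cast<std::uint8_t>(clamped);
    }
    return out;
}
```

With `blending = 0.3f`, as in the snippet under review, the input frame contributes 30% and the color mask 70% of each output value.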
// NB: If output has more than single plane, it contains probabilities
// otherwise class id.
Is this robust enough? Maybe an explicit enum flag is better? I just don't know.
What do you mean by enum flag? In that case you would need to match the model name with a postprocessing enum flag, right?
I don't think that's a great solution; I just tried not to overdesign it.
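The plane-count rule under discussion can be sketched as follows. This is a hedged illustration only: `OutputKind` and `classify_output` are hypothetical names, not code from the PR, and the sketch assumes the output dimensions arrive in 1 x C x H x W order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names, used only for illustration.
enum class OutputKind { ClassIds, Probabilities };

// Decide how to interpret a segmentation output from its shape alone:
// more than one plane means per-class probabilities (argmax still needed),
// a single plane means the class ids are already there.
OutputKind classify_output(const std::vector<int>& dims) {
    assert(dims.size() == 4);  // assumed layout: 1 x C x H x W
    assert(dims[1] >= 1);
    return dims[1] > 1 ? OutputKind::Probabilities : OutputKind::ClassIds;
}
```

Deciding from the shape keeps the sample free of a per-model flag, at the cost of relying on the layout assumption stated above.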
@alalek Can it be merged?
…amvid-0001-segm-sample
[G-API] Support postprocessing for not argmaxed outputs
* Support postprocessing for not argmaxed outputs
* Fix typo
* Add assert
* Remove static cast
* CamelCast to snake_case
* Fix windows warning
* Add static_cast to uint8_t
* Add const to variables
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.
Overview
Some semantic segmentation networks, such as unet-camvid-0001 from OMZ, produce multi-plane output (1 x num_classes x H x W). In that case, an argmax operation must be performed for every pixel across the channel planes to convert the output to a 1 x 1 x H x W representation in which every pixel holds a class id.
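The per-pixel argmax described above can be sketched in plain C++ as follows. This is a minimal illustration, not the code from this PR: `argmax_over_planes` is a hypothetical helper, the scores are assumed to be stored plane by plane (all H x W values of class 0, then class 1, and so on), and class ids are assumed to fit into 8 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the PR): collapse a planar
// num_classes x H x W score buffer into an H x W map of class ids.
std::vector<std::uint8_t> argmax_over_planes(const std::vector<float>& scores,
                                             std::size_t num_classes,
                                             std::size_t height,
                                             std::size_t width) {
    const std::size_t plane = height * width;
    assert(num_classes > 0 && num_classes <= 256);  // ids must fit in uint8_t
    assert(scores.size() == num_classes * plane);

    std::vector<std::uint8_t> class_ids(plane, 0);
    for (std::size_t px = 0; px < plane; ++px) {
        float best = scores[px];  // score of class 0 at this pixel
        for (std::size_t c = 1; c < num_classes; ++c) {
            const float s = scores[c * plane + px];
            if (s > best) {  // ties keep the lowest class id
                best = s;
                class_ids[px] = static_cast<std::uint8_t>(c);
            }
        }
    }
    return class_ids;
}
```

The resulting buffer has one class id per pixel, which is the single-plane form that a class-to-color mapping step expects.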
Build configuration