Fixes for Segment Anything by fengyuentau · Pull Request #23491 · opencv/opencv

fengyuentau · 2023-04-14T07:38:17Z

Resolves #23470.
~~Need to have einsum supported, so link #23134 as well.~~ Einsum is decomposed into several operators to bypass this issue for now. Einsum support will come in another PR.

To build a demo running SAM with dnn, we actually need two models:

- SAM encoder, which is basically a ViT.
  - can be loaded with dnn.
  - can run inference with dnn.
- SAM, which takes the output of the SAM encoder as input.
  - can be loaded with dnn.
  - can run inference with dnn.

Since the current dnn engine does not support dynamic shapes, ONNX SAM needs to be exported carefully, with:

- a fixed input shape,
- post-processing excluded,
- the model simplified with onnxsim.

Code for exporting ONNX SAM with the encoder: https://github.com/fengyuentau/segment-anything

Download the models I exported: https://drive.google.com/drive/folders/110JBApuq0_37C0gTlMzqs9B3var2oCGY

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- I agree to contribute to the project under Apache 2 License.
- To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV.
- The PR is proposed to the proper branch.
- There is a reference to the original bug report and related work.
- There is accuracy test, performance test and test data in opencv_extra repository, if applicable. Patch to opencv_extra has the same branch name.
- The feature is well documented and sample code can be built with the project CMake.

asmorkalov · 2023-04-28T10:36:29Z

@fengyuentau I tried to load one of the models in your shared folder with the OpenCV model diagnostics tool and got a lot of errors. Could you take a look at the attached log?
Command line: ./opencv_model_diagnostics -m=./sam_encoder_vit_b.onnx
OS: Ubuntu 18.04, default GCC.
msa_diagnostics.txt

fengyuentau · 2023-04-28T11:07:08Z

> ./opencv_model_diagnostics -m=./sam_encoder_vit_b.onnx

With this patch, our dnn engine can only parse and run sam_encoder_vit_b.sim.onnx and sam_vit_b.fixed.nopost.sim.onnx from my shared model list. The others require either support for additional operators or dynamic input shapes, which I will add later in separate pull requests.

asmorkalov · 2023-04-28T11:57:26Z

./opencv_model_diagnostics -m=sam_vit_b.fixed.nopost.sim.onnx
[ERROR:0@0.044] global debug_utils.cpp:74 printMissing DNN: Not supported types:
Type='ai.onnx.Not', affected nodes:
['/Not_output_0']

asmorkalov · 2023-05-04T06:49:38Z

@fengyuentau Friendly reminder.

fengyuentau · 2023-05-04T07:04:51Z

> ./opencv_model_diagnostics -m=sam_vit_b.fixed.nopost.sim.onnx
> [ERROR:0@0.044] global debug_utils.cpp:74 printMissing DNN: Not supported types:
> Type='ai.onnx.Not', affected nodes:
> ['/Not_output_0']

To be honest, I do not think my patch affects anything related to this node, and SAM works on my side. You can try the following test code instead:

// Smoke test: load the exported SAM model and run one forward pass on random inputs.
TEST_P(Test_ONNX_layers, FAIR_SAM)
{
    // Replace with the absolute path to the downloaded model.
    std::string model_path = "/absolute_path/to/sam_vit_b.fixed.nopost.sim.onnx";
    Net net = readNet(model_path);

    // Output of the SAM encoder: 1 x 256 x 64 x 64.
    std::vector<int> shape_image_embeddings{1, 256, 64, 64};
    Mat image_embeddings(shape_image_embeddings, CV_32FC1);
    randn(image_embeddings, 0.f, 1.f);
    net.setInput(image_embeddings, std::string("image_embeddings"));

    // Five prompt points, each an (x, y) pair.
    std::vector<int> shape_point_coords{1, 5, 2};
    Mat point_coords(shape_point_coords, CV_32FC1);
    randn(point_coords, 0.f, 1.f);
    net.setInput(point_coords, std::string("point_coords"));

    // One label per prompt point (random values are enough for a smoke test).
    std::vector<int> shape_point_labels{1, 5};
    Mat point_labels(shape_point_labels, CV_32FC1);
    randn(point_labels, 0.f, 1.f);
    net.setInput(point_labels, std::string("point_labels"));

    // Mask prompt: 1 x 1 x 256 x 256.
    std::vector<int> shape_mask_input{1, 1, 256, 256};
    Mat mask_input(shape_mask_input, CV_32FC1);
    randn(mask_input, 0.f, 1.f);
    net.setInput(mask_input, std::string("mask_input"));

    // Flag telling the model that mask_input is provided.
    std::vector<int> shape_has_mask_input{1};
    Mat has_mask_input(shape_has_mask_input, CV_32FC1);
    has_mask_input.at<float>(0) = 1;
    net.setInput(has_mask_input, std::string("has_mask_input"));

    // Run inference and print the shape of the output.
    Mat outs = net.forward();
    std::cout << outs.size << std::endl;
}

Let me know if there are further issues.

support broadcast on axis > 1 for Expand

88cacd3
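
For readers unfamiliar with the operator: ONNX Expand uses NumPy-style multidirectional broadcasting, so a size-1 dimension can be stretched on any axis, for example from 1 x 256 x 1 x 1 to 1 x 256 x 64 x 64. The following is only a minimal sketch of that shape rule, not the code from this commit; `broadcastShape` is a hypothetical helper name introduced for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of OpenCV): compute the output shape of
// broadcasting `input` against the requested `target` shape.
std::vector<int64_t> broadcastShape(const std::vector<int64_t>& input,
                                    const std::vector<int64_t>& target)
{
    const std::size_t rank = std::max(input.size(), target.size());
    std::vector<int64_t> out(rank, 1);
    for (std::size_t i = 0; i < rank; ++i)
    {
        // Align both shapes at the trailing dimension; a missing leading
        // dimension is treated as 1.
        const int64_t a = i < input.size() ? input[input.size() - 1 - i] : 1;
        const int64_t b = i < target.size() ? target[target.size() - 1 - i] : 1;
        // Dimensions are compatible when they are equal or one of them is 1.
        if (a != b && a != 1 && b != 1)
            throw std::invalid_argument("shapes are not broadcastable");
        out[rank - 1 - i] = std::max(a, b);
    }
    return out;
}
```

Under these assumptions, `broadcastShape({1, 256, 1, 1}, {1, 256, 64, 64})` stretches the last two axes, which is the kind of broadcast on axis > 1 the commit title refers to.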

fengyuentau force-pushed the patch_for_segment_anything branch from 2ede67c to 88cacd3 on April 14, 2023 07:53

allow null constant_value in Pad and ignore it when loading

4f99e5a
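
As background, the ONNX specification makes `constant_value` an optional input of Pad and defines its default as 0 when it is omitted. The sketch below illustrates that fallback for a 1-D tensor. It assumes the importer behaves the same way after ignoring a null `constant_value`, which this thread does not spell out, and `padConstant1D` is a hypothetical helper, not an OpenCV function.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): constant-mode padding of a 1-D
// tensor. A null `constantValue` models an absent constant_value input and
// falls back to the ONNX default of 0.
std::vector<float> padConstant1D(const std::vector<float>& data,
                                 std::size_t padBegin, std::size_t padEnd,
                                 const float* constantValue = nullptr)
{
    const float fill = constantValue ? *constantValue : 0.f;
    // Allocate the padded buffer pre-filled with the pad value.
    std::vector<float> out(padBegin + data.size() + padEnd, fill);
    // Copy the original data after the leading padding.
    std::copy(data.begin(), data.end(),
              out.begin() + static_cast<std::ptrdiff_t>(padBegin));
    return out;
}
```

Calling `padConstant1D({1.f, 2.f}, 1, 2)` without a value pads with zeros, while passing a pointer to a float pads with that value instead.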

asmorkalov added the category: dnn label Apr 14, 2023

fengyuentau marked this pull request as ready for review April 26, 2023 07:33

fengyuentau requested a review from rogday April 26, 2023 07:33

fengyuentau mentioned this pull request Apr 26, 2023

dnn: add more CANN operators to support SAM #23550 (Closed)

fengyuentau added this to the 4.8.0 milestone Apr 27, 2023

asmorkalov requested review from vpisarev and zihaomu and removed request for rogday April 28, 2023 07:48

vpisarev approved these changes Apr 28, 2023

asmorkalov merged commit 351589e into opencv:4.x May 4, 2023

asmorkalov mentioned this pull request May 31, 2023

(5.x) Merge 4.x #23718 (Merged)

fengyuentau deleted the patch_for_segment_anything branch June 24, 2023 05:37

asmorkalov merged 2 commits into opencv:4.x from fengyuentau:patch_for_segment_anything
