Fixes for Segment Anything #23491

Merged
asmorkalov merged 2 commits into opencv:4.x from fengyuentau:patch_for_segment_anything
May 4, 2023

Conversation

@fengyuentau
Member

@fengyuentau fengyuentau commented Apr 14, 2023

Resolves #23470.
SAM needs Einsum support, so this PR is linked to #23134 as well. For now, Einsum is broken down into several other operators to work around the issue. Native Einsum support will come in a separate PR.
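
To illustrate what breaking Einsum into other operators can look like, here is a minimal plain C++ sketch. The PR does not say which operators were used, so the subscript pattern `bhwc,hkc->bhwk` and all function names below are assumptions chosen for illustration. The sketch compares a direct evaluation against a transpose + batched MatMul + transpose chain.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Tensor = std::vector<float>;  // row-major storage, shape tracked by the caller

// Direct evaluation of the assumed pattern "bhwc,hkc->bhwk":
// out[b,h,w,k] = sum over c of A[b,h,w,c] * R[h,k,c].
Tensor einsum_bhwc_hkc(const Tensor& A, const Tensor& R, int B, int H, int W, int K, int C)
{
    assert(A.size() == static_cast<std::size_t>(B) * H * W * C);
    assert(R.size() == static_cast<std::size_t>(H) * K * C);
    Tensor out(static_cast<std::size_t>(B) * H * W * K, 0.f);
    for (int b = 0; b < B; ++b)
        for (int h = 0; h < H; ++h)
            for (int w = 0; w < W; ++w)
                for (int k = 0; k < K; ++k) {
                    float acc = 0.f;
                    for (int c = 0; c < C; ++c)
                        acc += A[((b * H + h) * W + w) * C + c] * R[(h * K + k) * C + c];
                    out[((b * H + h) * W + w) * K + k] = acc;
                }
    return out;
}

// Transpose-like step: swap the first two axes, [D0,D1,D2,D3] -> [D1,D0,D2,D3].
Tensor swap_axes01(const Tensor& x, int D0, int D1, int D2, int D3)
{
    Tensor y(x.size());
    const int inner = D2 * D3;
    for (int i = 0; i < D0; ++i)
        for (int j = 0; j < D1; ++j)
            for (int t = 0; t < inner; ++t)
                y[(j * D0 + i) * inner + t] = x[(i * D1 + j) * inner + t];
    return y;
}

// Transpose-like step: swap the last two axes, [N,M,P] -> [N,P,M].
Tensor swap_last2(const Tensor& x, int N, int M, int P)
{
    Tensor y(x.size());
    for (int n = 0; n < N; ++n)
        for (int m = 0; m < M; ++m)
            for (int p = 0; p < P; ++p)
                y[(n * P + p) * M + m] = x[(n * M + m) * P + p];
    return y;
}

// MatMul-like step: batched product, [N,M,P] x [N,P,Q] -> [N,M,Q].
Tensor batched_matmul(const Tensor& x, const Tensor& y, int N, int M, int P, int Q)
{
    Tensor z(static_cast<std::size_t>(N) * M * Q, 0.f);
    for (int n = 0; n < N; ++n)
        for (int m = 0; m < M; ++m)
            for (int q = 0; q < Q; ++q) {
                float acc = 0.f;
                for (int p = 0; p < P; ++p)
                    acc += x[(n * M + m) * P + p] * y[(n * P + p) * Q + q];
                z[(n * M + m) * Q + q] = acc;
            }
    return z;
}

// The same result built only from transposes, reshapes (free in row-major) and one MatMul.
Tensor einsum_decomposed(const Tensor& A, const Tensor& R, int B, int H, int W, int K, int C)
{
    Tensor a = swap_axes01(A, B, H, W, C);              // [H,B,W,C], viewed as [H, B*W, C]
    Tensor r = swap_last2(R, H, K, C);                  // [H,C,K]
    Tensor z = batched_matmul(a, r, H, B * W, C, K);    // [H, B*W, K], viewed as [H,B,W,K]
    return swap_axes01(z, H, B, W, K);                  // [B,H,W,K]
}
```

Calling `einsum_bhwc_hkc` and `einsum_decomposed` on the same inputs should give identical tensors, because both accumulate over `c` in the same order. An exported graph could express the same idea with Transpose, Reshape and MatMul nodes, which is one plausible way to avoid the Einsum operator.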

To build a demo running SAM with dnn, we actually need two models:

  • SAM encoder, which is basically a ViT.
    • can be loaded with dnn.
    • can be inferred with dnn.
  • SAM, which takes the output of SAM encoder as input.
    • can be loaded with dnn.
    • can be inferred with dnn.

Since the current dnn engine does not support dynamic shapes, the ONNX SAM model has to be exported carefully:

  • with a fixed input shape,
  • with post-processing excluded,
  • simplified with onnxsim.

Code for exporting ONNX SAM together with the encoder: https://github.com/fengyuentau/segment-anything

Download the models I exported: https://drive.google.com/drive/folders/110JBApuq0_37C0gTlMzqs9B3var2oCGY

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@fengyuentau fengyuentau force-pushed the patch_for_segment_anything branch from 2ede67c to 88cacd3 Compare April 14, 2023 07:53
@fengyuentau fengyuentau marked this pull request as ready for review April 26, 2023 07:33
@fengyuentau fengyuentau requested a review from rogday April 26, 2023 07:33
@fengyuentau fengyuentau added this to the 4.8.0 milestone Apr 27, 2023
@asmorkalov asmorkalov requested review from vpisarev and zihaomu and removed request for rogday April 28, 2023 07:48
@asmorkalov
Contributor

@fengyuentau I tried to load one of the models in your shared folder with the OpenCV model diagnostics tool and got a lot of errors. Could you take a look at the attached log?
Command line: ./opencv_model_diagnostics -m=./sam_encoder_vit_b.onnx
OS: Ubuntu 18.04, default GCC.
msa_diagnostics.txt

@fengyuentau
Member Author

./opencv_model_diagnostics -m=./sam_encoder_vit_b.onnx

With this patch, our dnn engine can parse and run inference only on sam_encoder_vit_b.sim.onnx and sam_vit_b.fixed.nopost.sim.onnx from my shared model list. The others require either support for additional operators or dynamic input shapes, which I will add later in separate pull requests.

@asmorkalov
Contributor

./opencv_model_diagnostics -m=sam_vit_b.fixed.nopost.sim.onnx
[ERROR:0@0.044] global debug_utils.cpp:74 printMissing DNN: Not supported types:
Type='ai.onnx.Not', affected nodes:
['/Not_output_0']

@asmorkalov
Contributor

@fengyuentau Friendly reminder.

@fengyuentau
Member Author

./opencv_model_diagnostics -m=sam_vit_b.fixed.nopost.sim.onnx
[ERROR:0@0.044] global debug_utils.cpp:74 printMissing DNN: Not supported types:
Type='ai.onnx.Not', affected nodes:
['/Not_output_0']

To be honest, I do not think my patch affects anything related to this node, and SAM works on my side. You can try the following test code instead:

// Smoke test: load the fixed-shape SAM model and run one forward pass on random inputs.
// Meant to be placed in the dnn ONNX test file, where Test_ONNX_layers is defined.
TEST_P(Test_ONNX_layers, FAIR_SAM)
{
    std::string model_path = "/absolute_path/to/sam_vit_b.fixed.nopost.sim.onnx";
    Net net = readNet(model_path);

    // Stand-in for the SAM encoder output; filled with random values here.
    std::vector<int> shape_image_embeddings{1, 256, 64, 64};
    Mat image_embeddings(shape_image_embeddings, CV_32FC1);
    randn(image_embeddings, 0.f, 1.f);
    net.setInput(image_embeddings, std::string("image_embeddings"));

    // Prompt points: 5 points with 2 coordinates each.
    std::vector<int> shape_point_coords{1, 5, 2};
    Mat point_coords(shape_point_coords, CV_32FC1);
    randn(point_coords, 0.f, 1.f);
    net.setInput(point_coords, std::string("point_coords"));

    // One label per prompt point; random values are enough for a smoke test.
    std::vector<int> shape_point_labels{1, 5};
    Mat point_labels(shape_point_labels, CV_32FC1);
    randn(point_labels, 0.f, 1.f);
    net.setInput(point_labels, std::string("point_labels"));

    std::vector<int> shape_mask_input{1, 1, 256, 256};
    Mat mask_input(shape_mask_input, CV_32FC1);
    randn(mask_input, 0.f, 1.f);
    net.setInput(mask_input, std::string("mask_input"));

    // Flag telling the model that mask_input is provided.
    std::vector<int> shape_has_mask_input{1};
    Mat has_mask_input(shape_has_mask_input, CV_32FC1);
    has_mask_input.at<float>(0) = 1;
    net.setInput(has_mask_input, std::string("has_mask_input"));

    // Run inference and print the shape of the output.
    Mat outs = net.forward();
    std::cout << outs.size << std::endl;
}

Let me know if there are further issues.
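
Because the exported model only accepts fixed shapes, it can help to check a buffer's shape before passing it to `setInput`. The following is a minimal sketch in plain C++ with no OpenCV dependency. The input names and shapes are copied from the test above; the helper names (`expected_sam_inputs`, `matches_fixed_shape`, `total_elements`) are hypothetical and not part of OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Shape = std::vector<int>;

// Fixed input shapes, copied from the test snippet in this thread.
const std::map<std::string, Shape>& expected_sam_inputs()
{
    static const std::map<std::string, Shape> shapes = {
        {"image_embeddings", {1, 256, 64, 64}},
        {"point_coords", {1, 5, 2}},
        {"point_labels", {1, 5}},
        {"mask_input", {1, 1, 256, 256}},
        {"has_mask_input", {1}},
    };
    return shapes;
}

// Hypothetical helper: true only if `name` is a known input and `shape` matches exactly.
bool matches_fixed_shape(const std::string& name, const Shape& shape)
{
    const auto& shapes = expected_sam_inputs();
    const auto it = shapes.find(name);
    return it != shapes.end() && it->second == shape;
}

// Number of elements a buffer with the given shape must hold.
std::size_t total_elements(const Shape& shape)
{
    std::size_t n = 1;
    for (int d : shape) {
        assert(d > 0);  // a fixed-shape model has no zero or negative dimensions
        n *= static_cast<std::size_t>(d);
    }
    return n;
}
```

For example, `matches_fixed_shape("point_coords", {1, 3, 2})` returns false, since the exported model expects exactly 5 prompt points. A caller could pad or truncate the prompts to the fixed size before running inference.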

@asmorkalov asmorkalov merged commit 351589e into opencv:4.x May 4, 2023
@asmorkalov asmorkalov mentioned this pull request May 31, 2023
@fengyuentau fengyuentau deleted the patch_for_segment_anything branch June 24, 2023 05:37

Development

Successfully merging this pull request may close these issues.

Meta's Segment Anything with dnn
