fix bug in onnx div operator when input[1] is variable #21203

Closed
hujunyao wants to merge 1 commit into opencv:4.x from hujunyao:4.x

Conversation

@hujunyao

@hujunyao hujunyao commented Dec 6, 2021

Fixes a bug in the ONNX Div operator: when input[1] is a variable (and the other input is a constant), the variable should be raised to the power -1, i.e. the result computed as a scaled X^-1.
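To make the intended semantics concrete, here is a plain NumPy sketch (not OpenCV code) of what Div must compute when the constant is the numerator and the variable is the denominator: an elementwise reciprocal of the variable scaled by the constant, not a reciprocal of their product.

```python
import numpy as np

# Hypothetical illustration: Div(c, x) with constant scalar c and variable
# tensor x equals c * x**-1 (a scaled reciprocal), NOT 1 / (c * x).
c = 2.0
x = np.array([[4.0, 4.0], [4.0, 4.0]], dtype=np.float32)

expected = c / x                  # what ONNX Div computes
as_power = c * np.power(x, -1.0)  # the equivalent "power = -1" form
buggy = 1.0 / (c * x)             # what a wrong coefficient mapping yields

print(expected[0, 0], buggy[0, 0])  # 0.5 vs 0.125
assert np.allclose(expected, as_power)
```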

A simple script to reproduce the bug:

#============================
#to generate onnx graph
#============================
import torch
import torch.nn as nn
import torch.nn.init as init
import torch.onnx
import onnxruntime

class TestNet(nn.Module):

    def __init__(self):
        super(TestNet, self).__init__()

    def forward(self, x):
        x = torch.reciprocal(1+torch.exp(x))
        return x

torch_model = TestNet()
#x = torch.randn(1, 1, 2, 2)
x = torch.ones(2, 2)
torch_out = torch_model(x)

# Export the model
torch.onnx.export(torch_model,               # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "test.onnx",   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input'],   # the model's input names
                  output_names = ['output'], # the model's output names
                  #dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes
                  #              'output' : {0 : 'batch_size'}}
                  )


ort_session = onnxruntime.InferenceSession("test.onnx")

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)
print(f"x= {to_numpy(x)}" )
print(ort_outs[0])
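For reference, the reproducer's forward pass is reciprocal(1 + exp(x)) on an all-ones input, so the correct output can be computed directly in NumPy:

```python
import math
import numpy as np

# Reference computation for the reproducer above: x is a 2x2 tensor of
# ones, so every output element should equal 1 / (1 + e) ~= 0.26894.
x = np.ones((2, 2), dtype=np.float32)
expected = 1.0 / (1.0 + np.exp(x))
print(expected)
assert np.allclose(expected, 1.0 / (1.0 + math.e))
```

onnxruntime reproduces this value; a Div importer bug in OpenCV would make `net.forward` disagree with it.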

C++ reproducer:

//====================================================
//to find bugs
//====================================================
#define _CRT_SECURE_NO_WARNINGS
#include <iostream>
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>

using namespace cv;
using namespace dnn;
using namespace std;

int main()
{
    Net net = readNet("E:/tmp/opencv_onnx_bug/code/onnx_find/python/test.onnx");


    cv::Mat blob(2 , 2 , CV_32F, Scalar(1));


    net.setInput(blob);

    vector<Mat> outs;
    net.forward(outs, net.getUnconnectedOutLayersNames());
    Mat binary = outs[0];
    cv::FileStorage file2("some_name.txt", cv::FileStorage::WRITE);
    file2 << "matName" << binary;
    cout << binary << endl;
    CV_Assert(outs.size() == 1);
}

@asmorkalov asmorkalov requested a review from rogday December 7, 2021 06:54
Member

@rogday rogday left a comment


Thank you for the contribution!
Please consider adding tests (test_onnx_importer.cpp) and test data (opencv_extra/testdata/dnn/onnx/). You can use opencv_extra/testdata/dnn/onnx/generate_onnx_models.py as a reference.

Comment on lines +1665 to +1677
float coeff;
if ( isDiv )
{
    if ( constId == 0 )
    {
        layerParams.set("power", -1);
        coeff = blob_value;
    }
    else
        coeff = 1.0 / blob_value;
}
else
    coeff = blob_value;
Member


We should always initialize variables to avoid certain classes of bugs.
The Power layer computes (x*scale + shift)^power, so in your case (const/tensor) it will produce 1/(x*scale) instead of scale/x. We could use something like:

float coeff = blob_value;
if ( isDiv )
{
    coeff = 1.0 / blob_value;
    if ( constId == 0 )
    {
        layerParams.set("power", -1);
    }                    
}

But I'm not sure it will be accurate enough.
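The two scale choices can be checked numerically under the Power-layer formula (a standalone NumPy sketch, not the importer code; c and x mirror the const/tensor example discussed below):

```python
import numpy as np

# The Power layer computes (x * scale + shift) ** power. For Div(c, x)
# with power = -1 and shift = 0:
#   scale = c     gives (c * x) ** -1 = 1 / (c * x)   (wrong)
#   scale = 1 / c gives (x / c) ** -1 = c / x         (desired)
c = 2.0
x = np.array([[4.0, 4.0], [4.0, 4.0]], dtype=np.float32)

wrong = (x * c) ** -1.0          # scale = c      -> 0.125 everywhere
right = (x * (1.0 / c)) ** -1.0  # scale = 1 / c  -> 0.5 everywhere

print(wrong[0, 0], right[0, 0])
assert np.allclose(right, c / x)
```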

Author


Considering the inputs 2/x and x/2, I think it should be:


float coeff = blob_value;
if ( isDiv )
{
    if ( constId == 0 )
    {
        layerParams.set("power", -1);
    }
    else
        coeff = 1.0 / blob_value;
}

Member


@hujunyao, I just tested your code using the following snippet, which you can use to add tests (it still needs input/output saving; see save_onnx_data_and_model in opencv_extra/testdata/dnn/onnx/generate_onnx_models.py):

import numpy as np
import onnx

node = onnx.helper.make_node(
    'Div',
    inputs=['x', 'y'],
    outputs=['z'],
)

x = np.array([2]).astype(np.float32)
y = np.array([[4, 4], [4, 4]]).astype(np.float32)
z = x / y

X = onnx.helper.make_tensor('x', onnx.TensorProto.FLOAT, x.shape, x)
Y = onnx.helper.make_tensor_value_info('y', onnx.TensorProto.FLOAT, y.shape)
Z = onnx.helper.make_tensor_value_info('z', onnx.TensorProto.FLOAT, z.shape)

graph = onnx.helper.make_graph([node], 'const_div', [Y], [Z], initializer=[X])
model = onnx.helper.make_model(graph, producer_name='const_div')
onnx.save(model, 'const_div.onnx')

Your code outputs a Mat filled with 0.125, which is consistent with my prediction of 1/(2*4), whereas the proposed solution outputs 0.5 as expected.

Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it wouldn't reach this branch, since it sits inside:
'''
if (constId != -1 && haveVariables)
{
..................
}
else if (!haveVariables)
'''
I thought this branch only handles one constant and one variable; when two constants are divided, it goes to the !haveVariables branch. But I'm not sure about that.
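The branch selection under discussion can be sketched as follows (a simplified hypothetical model, not the actual OpenCV importer source; `const_id` marks which input is an initializer, -1 if none, and `have_variables` is true when at least one input is a runtime tensor):

```python
# Hypothetical sketch of the importer's branch selection for binary ops.
def pick_branch(const_id: int, have_variables: bool) -> str:
    if const_id != -1 and have_variables:
        return "const-and-variable"  # one constant, one variable
    elif not have_variables:
        return "constant-folding"    # both inputs are constants
    return "two-variables"           # both inputs are runtime tensors

assert pick_branch(0, True) == "const-and-variable"   # e.g. 2 / x
assert pick_branch(1, True) == "const-and-variable"   # e.g. x / 2
assert pick_branch(0, False) == "constant-folding"    # e.g. 2 / 3
```

Under this model, the two-constants case never reaches the const-and-variable branch, which is the point being made above.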

@hujunyao hujunyao requested a review from rogday December 8, 2021 06:05
@asmorkalov asmorkalov added the category: dnn (onnx) ONNX support issues in DNN module label Dec 9, 2021
@rogday rogday mentioned this pull request Dec 20, 2021
@rogday
Member

rogday commented Dec 20, 2021

Closed in favor of #21297

@rogday rogday closed this Dec 20, 2021

Labels

category: dnn (onnx) ONNX support issues in DNN module
category: dnn
pr: needs test New functionality requires minimal tests set
