MatMul layer crashes with CUDA backend #26021

@Abdurrahheem

Description

System Information

OpenCV version: 5.x
Operating System / Platform: Ubuntu 20.04
Compiler & compiler version: GCC 9.4.0

Detailed description

When the input to MatMul has shape [batch_size, 1, input_dim], the weight matrix has shape [input_dim, hidden_dim], and batch_size is large (128 in this case), inference fails with the error below. If batch_size is swapped with the singleton dimension (1) of the input, so that the input shape is [1, batch_size, input_dim], the multiplication works fine. The number of elements is the same in both cases (so the memory required should, in theory, be the same), but malloc fails in the first case.
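To make the equivalence of the two layouts concrete, here is a minimal NumPy-only sketch (it does not touch OpenCV or CUDA). The value of hidden_dim and the weight contents are assumptions chosen for illustration; the attached model may use different ones.

```python
import numpy as np

batch_size, input_dim = 128, 384
hidden_dim = 16  # assumption: the real hidden_dim of the attached model is not stated here

# Small integer-valued weights so the float32 results are exact (illustrative only).
weight = (np.arange(input_dim * hidden_dim) % 7).astype(np.float32)
weight = weight.reshape(input_dim, hidden_dim)

failing = np.ones((batch_size, 1, input_dim), dtype=np.float32)  # layout that crashes on CUDA
working = failing.reshape(1, batch_size, input_dim)              # layout that works

out_failing = failing @ weight  # shape (batch_size, 1, hidden_dim)
out_working = working @ weight  # shape (1, batch_size, hidden_dim)

# Same number of input elements, and the same values up to a reshape.
same_size = failing.size == working.size
same_values = np.allclose(out_failing.reshape(out_working.shape), out_working)
print(out_failing.shape, out_working.shape, same_size, same_values)
```

On a CPU reference both layouts give the same values, which suggests the crash depends on how the shape is handled rather than on the amount of data.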

I have attached the ONNX model and reproducer below.

[ INFO:0@0.015] global onnx_importer.cpp:821 populateNet DNN/ONNX: loading ONNX v9 model produced by ''. Number of nodes = 2, initializers = 0, inputs = 1, outputs = 1
[ INFO:0@0.015] global onnx_importer.cpp:714 parseOperatorSet DNN/ONNX: ONNX opset version = 19
[ INFO:0@0.016] global onnx_importer.cpp:992 handleNode DNN/ONNX: processing node with 0 inputs and 1 outputs: [Constant]:(onnx_node!n0) from domain='ai.onnx'
[ INFO:0@0.019] global onnx_importer.cpp:992 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node!n1) from domain='ai.onnx'
layer name: return_val
preferableBackend is CUDA
[ INFO:0@0.164] global op_cuda.cpp:81 initCUDABackend CUDA backend will fallback to the CPU implementation for the layer "_input" of type __NetInputLayer__
malloc(): corrupted top size
[1]    1574745 abort (core dumped) 

Steps to reproduce

import cv2 as cv
import numpy as np

if __name__ == "__main__":

    net = cv.dnn.readNet("./matmul_cuda.onnx")

    net.setPreferableBackend(cv.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv.dnn.DNN_TARGET_CUDA)

    batch_size = 128
    input_size = 384

    inp = np.ones((batch_size, 1, input_size), dtype=np.float32)

    net.setInput(inp)
    out = net.forward()
    print(out.shape)

    layerNames = net.getLayerNames()
    for layer in layerNames:
        l = net.getLayer(layer)
        print(l.preferableTarget)
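Until the crash is fixed, a possible workaround based on the observation above is to move the batch into the second axis before inference and move it back afterwards. This is only a sketch: forward_with_swapped_batch and forward_fn are hypothetical names, and the NumPy matmul below is a CPU stand-in for the network, not the CUDA backend.

```python
import numpy as np

def forward_with_swapped_batch(forward_fn, inp):
    """Run forward_fn on a (1, batch, input_dim) view of a (batch, 1, input_dim) input.

    forward_fn is a hypothetical callable mapping (1, batch, input_dim) to
    (1, batch, hidden_dim); the result is returned as (batch, 1, hidden_dim).
    """
    if inp.ndim != 3 or inp.shape[1] != 1:
        raise ValueError("expected input of shape (batch, 1, input_dim)")
    swapped = np.ascontiguousarray(inp.transpose(1, 0, 2))  # (1, batch, input_dim)
    out = forward_fn(swapped)                               # (1, batch, hidden_dim)
    return np.ascontiguousarray(out.transpose(1, 0, 2))     # (batch, 1, hidden_dim)

# CPU stand-in for the network: a plain matmul with an illustrative weight matrix.
weight = (np.arange(384 * 16) % 5).astype(np.float32).reshape(384, 16)
inp = np.ones((128, 1, 384), dtype=np.float32)

result = forward_with_swapped_batch(lambda x: x @ weight, inp)
print(result.shape)
```

With the real model, forward_fn could be a small closure that calls net.setInput and net.forward; whether that avoids the crash in every configuration has not been verified here.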

Issue submission checklist

  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues, forum.opencv.org, Stack Overflow, etc and have not found any solution
  • I updated to the latest OpenCV version and the issue is still there
  • There is reproducer code and related data files (videos, images, onnx, etc)
