Siamese network, with shared MatMul kernel weights, triggers !field.empty() assertion #18479
Description
System information (version)
- OpenCV => 3.4.11
- Operating System / Platform => Windows 10 64 Bit
- Compiler => Visual Studio 2010
Detailed description
When loading a Siamese network trained with TensorFlow/Keras, cv::dnn::readNetFromTensorflow("model.pb") fails with the assertion "!field.empty()" in getTensorContent.
Tracing this further, the issue appears to be in retrieving the shared kernel const blob for the MatMul operation of the Dense layer at the end of each shared sub-network. Removing the line "releaseTensor(const_cast<tensorflow::TensorProto*>(&kernelTensor));" (to match the 3.4.1 release, before that line was added) allows the model to load and function.
I suspect that calling releaseTensor causes the second call to getConstBlob to return what is effectively an empty tensor, which then triggers the assertion. Operations such as Conv2D appear to avoid this by storing the const data in a sharedWeights container.
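To make the suspected mechanism concrete, here is a toy Python model of it. It is not OpenCV's importer code: ToyTensor, get_tensor_content, release_tensor and both load_* helpers are hypothetical names that only mimic the behaviour described above, under the assumption that releasing a tensor empties its content in place.

```python
class ToyTensor:
    """Hypothetical stand-in for a TensorProto holding raw weight data."""

    def __init__(self, content):
        self.content = list(content)


def get_tensor_content(tensor):
    # Mimics the "!field.empty()" check: reading an emptied tensor fails.
    if not tensor.content:
        raise ValueError("!field.empty()")
    return list(tensor.content)


def release_tensor(tensor):
    # Assumption: releasing frees the data in place, leaving an empty tensor.
    tensor.content = []


def load_matmul_releasing(const_tensors, kernel_name):
    # Suspected failing pattern: read the kernel, then release it every time.
    weights = get_tensor_content(const_tensors[kernel_name])
    release_tensor(const_tensors[kernel_name])
    return weights


def load_matmul_cached(const_tensors, kernel_name, shared_weights):
    # Pattern resembling a sharedWeights container: read once, reuse afterwards.
    if kernel_name not in shared_weights:
        shared_weights[kernel_name] = get_tensor_content(const_tensors[kernel_name])
        release_tensor(const_tensors[kernel_name])
    return shared_weights[kernel_name]
```

In this sketch, two calls to load_matmul_releasing with the same kernel name raise on the second call, while two calls to load_matmul_cached with a shared dict both return the same weights.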
Steps to reproduce (using Python and TensorFlow 1.15.3)
#set up simplified model for test case
import tensorflow as tf
tf.keras.backend.set_learning_phase(0)
input_shape = (5, 5, 1)
side_input = tf.keras.layers.Input(input_shape)
side = tf.reshape(side_input, [-1, 25], name="flatten")
side = tf.keras.layers.Dense(5, activation='sigmoid')(side)
half = tf.keras.Model(inputs=side_input, outputs=side)
left_input = tf.keras.layers.Input(input_shape, name="left_input")
encoded_l = half(left_input)
right_input = tf.keras.layers.Input(input_shape, name="right_input")
encoded_r = half(right_input)
L1_distance = tf.keras.layers.Lambda(lambda x: tf.keras.backend.abs(x[0] - x[1]))([encoded_l, encoded_r])
prediction = tf.keras.layers.Dense(1, activation='sigmoid')(L1_distance)
model = tf.keras.Model(inputs=[left_input, right_input], outputs=prediction)
#write frozen graph
from tensorflow.python.framework import graph_util, graph_io
tensorFlowSession = tf.keras.backend.get_session()
input_graph_def = tensorFlowSession.graph.as_graph_def()
constant_graph = graph_util.convert_variables_to_constants(
    tensorFlowSession,
    input_graph_def,
    [node.name[:-2] for node in model.outputs])
graph_io.write_graph(constant_graph, "", "model.pb", as_text=False)
#attempt to load with Python; the same failure occurs with C++
import cv2
model_ocv = cv2.dnn.readNetFromTensorflow('model.pb')
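To check whether the frozen graph really contains a kernel shared by several MatMul nodes, a small diagnostic like the following may help. It is a sketch, not part of OpenCV or TensorFlow: find_shared_matmul_kernels is a hypothetical helper, and it assumes the kernel is the second MatMul input and is either a Const directly or a Const reached through a chain of Identity nodes.

```python
from collections import defaultdict


def find_shared_matmul_kernels(nodes):
    """Map each Const kernel name to the MatMul nodes using it (2+ users only).

    `nodes` is any iterable of objects with `name`, `op` and `input`
    attributes, such as the `node` field of a GraphDef.
    """
    nodes = list(nodes)
    by_name = {n.name: n for n in nodes}

    def clean(raw):
        # Drop a control-dependency marker ("^") and an output index (":0").
        return raw.lstrip("^").split(":")[0]

    def resolve(name):
        # Follow Identity nodes back to the node that produces the data.
        node = by_name.get(name)
        while node is not None and node.op == "Identity" and node.input:
            node = by_name.get(clean(node.input[0]))
        return node

    users = defaultdict(list)
    for n in nodes:
        if n.op == "MatMul" and len(n.input) >= 2:
            source = resolve(clean(n.input[1]))
            if source is not None and source.op == "Const":
                users[source.name].append(n.name)
    return {name: mats for name, mats in users.items() if len(mats) > 1}
```

With the script above, calling find_shared_matmul_kernels(constant_graph.node) before loading the model should list any kernel Const that feeds more than one MatMul; an empty result would suggest the sharing takes a form this sketch does not cover.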