TF-Lite converter bug in 2.17 for TransposeConv with quantized weights

Host: Apple M1 Pro with macOS Sonoma 14.1.2 (23B92)
TensorFlow version 2.17.0 (not an issue with TF 2.16.1)
I ran this on a Samsung Galaxy S24 Ultra.

Sources and assets:
[TransposeConv_bug.zip](https://github.com/user-attachments/files/17158711/TransposeConv_bug.zip)

The zip contains a TF saved model and a convert.py script that converts it to a TF-Lite model.
The zip file also contains two TF-Lite models, one created with TF 2.16.1 and the other with 2.17.0.
convert.py also runs the TF-Lite model, but it needs to run on a target with GPU and/or NPU delegates to actually show the issue.

When running the TF 2.16.1 TF-Lite model, the model runs on the NPU (Qualcomm QNN delegate).
When running the TF 2.17.0 TF-Lite model, the model runs on the CPU, using the XNNPACK/TFLite delegates.

We see this log from TF-Lite on the device:
[tflite.log](https://github.com/user-attachments/files/17158819/tflite.log)
The interesting lines are:
`[27/Sept/2024:11:32:40 +08:00: profiler/warning] [job_id: j87gj31pd] [model.tflite] [tflite] tensorflow/lite/kernels/transpose_conv.cc:487 affine_quantization->scale->size != weights->dims->data[affine_quantization->quantized_dimension] (1 != 4)`
and
`[27/Sept/2024:11:32:40 +08:00: profiler/warning] [job_id: j87gj31pd] [model.tflite] [tflite] tensorflow/lite/kernels/transpose_conv.cc:487 affine_quantization->scale->size != weights->dims->data[affine_quantization->quantized_dimension] (1 != 4)`
The first one fails the NPU delegate check and the second fails the GPU delegate check.

The check that fails is:
`https://github.com/tensorflow/tensorflow/blob/9e4fc3d09c298ca56bd11f72d2f1beb622a0f76b/tensorflow/lite/kernels/transpose_conv.cc#L485-L487`

This check is performed only when the TransposeConv layer has float input but integral weights. This was not the case in TF 2.16, where the weights were first dequantized.
The check assumes that the weights are per-channel quantized (one scale per channel along the quantized dimension), which is clearly not the case here: the weights carry a single per-tensor scale, hence `1 != 4`.
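
To make the `(1 != 4)` in the log concrete, here is a minimal Python sketch of the size comparison as I read it from the log message. It is an approximation for illustration, not the TF-Lite source. The function name, the weight shape, and the scale values are hypothetical; the dictionary layout is assumed to follow the `quantization_parameters` entry returned by `tf.lite.Interpreter.get_tensor_details()`.

```python
def passes_per_channel_check(scales, shape, quantized_dimension):
    """Approximate the comparison reported in the log:
    the number of scales must equal the tensor size along the quantized dimension."""
    return len(scales) == shape[quantized_dimension]


# Hypothetical per-tensor quantized weights: a single scale for the whole tensor.
per_tensor_weights = {
    "shape": [4, 3, 3, 8],  # assumed shape, 4 entries along dimension 0
    "quantization_parameters": {
        "scales": [0.02],
        "zero_points": [0],
        "quantized_dimension": 0,
    },
}

# Hypothetical per-channel quantized weights: one scale per entry along dimension 0.
per_channel_weights = {
    "shape": [4, 3, 3, 8],
    "quantization_parameters": {
        "scales": [0.02, 0.03, 0.01, 0.05],
        "zero_points": [0, 0, 0, 0],
        "quantized_dimension": 0,
    },
}

for label, tensor in [("per-tensor", per_tensor_weights),
                      ("per-channel", per_channel_weights)]:
    params = tensor["quantization_parameters"]
    ok = passes_per_channel_check(
        params["scales"], tensor["shape"], params["quantized_dimension"]
    )
    print(label, "passes" if ok else "fails")
```

The same function should work on the entries of `get_tensor_details()` for the two models in the zip, since `len()` and indexing behave the same on the NumPy arrays found there; that would allow comparing the TransposeConv weight tensors on the host without a device.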

So the 2.17.0 converter generates a model that the runtime forces back onto the slower CPU delegates.

TF-Lite converter bug in 2.17 for TransposeConv with quantized weights #76624
