Describe the issue
An ONNX model exported from PyTorch with nn.Conv2d and then converted to FP16 does not give correct results during inference.
The issue is not observed on the original exported FP32 ONNX model.
The issue is also not observed on onnxruntime 1.13 or 1.14; I first observed it on onnxruntime >= 1.15.0.
The issue is only observed on arm64 Linux (I actually observe it in a Docker container running on an M1 macOS host).
It works fine on macOS with an M1 CPU, and on Linux with an Intel CPU.
To reproduce
On arm64 Linux (or using the `python:3.10-bullseye` Docker image), run the following code with onnxruntime >= 1.15.0:
```python
import torch
from torch import nn
import onnx
from onnxconverter_common import float16
import onnxruntime as ort
import numpy as np


class ModelUnderTest(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Conv2d(1, 1, 1)
        nn.init.constant_(self.model.weight.data, 0.5)
        if self.model.bias is not None:
            # It works fine for this test case if bias is initialised to 0
            nn.init.constant_(self.model.bias.data, 0.5)

    def forward(self, x):
        return self.model(x)


if __name__ == "__main__":
    m = ModelUnderTest()
    x = torch.ones(1, 1, 1)
    torch.onnx.export(m, x, "m1.onnx", export_params=True)
    model = onnx.load("m1.onnx")
    m_16 = float16.convert_float_to_float16(
        model,
        keep_io_types=True,
        # It works fine if we block the Conv op:
        # op_block_list=float16.DEFAULT_OP_BLOCK_LIST + ["Conv"],
    )
    onnx.save(m_16, "m1_fp16.onnx")
    # ---
    session_option = ort.SessionOptions()
    session_option.log_severity_level = 3
    session_option.enable_cpu_mem_arena = False
    session_option.enable_mem_pattern = False
    session_option.enable_mem_reuse = False
    x = np.ones((1, 1, 1))
    session_fp32 = ort.InferenceSession("m1.onnx", session_option)
    y1 = session_fp32.run(None, {"input": x.astype(np.float32)})[0]
    print("fp32 output")
    print(y1)
    session_fp16 = ort.InferenceSession("m1_fp16.onnx", session_option)
    y2 = session_fp16.run(None, {"input": x.astype(np.float32)})[0]
    print("fp16 output")
    print(y2)
    y_diff = y1 - y2
    y_diff_2 = y_diff * y_diff
    print("SSD")
    print(np.sum(y_diff_2))
```
It prints:

```
fp32 output
[[[1.]]]
fp16 output
[[[0.5]]]
SSD
0.25
```
However, the expected output should be:

```
fp32 output
[[[1.]]]
fp16 output
[[[1.]]]
SSD
0.0
```
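For reference, the expected value is exactly representable in FP16, so the discrepancy cannot be a rounding artifact of the FP16 conversion. A quick NumPy check (not part of the original repro) illustrates this:

```python
import numpy as np

# 0.5, 1.0, and their sum are all exactly representable in float16,
# so the Conv computation (0.5 * 1.0 + 0.5) should yield exactly 1.0
# even when performed entirely in fp16.
w = np.float16(0.5)  # Conv weight
x = np.float16(1.0)  # input
b = np.float16(0.5)  # bias
y = w * x + b
print(y)  # 1.0
```

Note also that the observed FP16 output (0.5) matches the result without the bias term, consistent with the comment in the repro that initialising the bias to 0 avoids the issue.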
It gives the correct output when downgrading onnxruntime to 1.14.1.
Urgency
This appears to be a regression in onnxruntime, as it works before 1.15.0.
I can work around the issue by adding `Conv` to `op_block_list` when converting the model to FP16.
Platform
Linux
OS Version
Debian Bullseye
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
>= 1.15.0
ONNX Runtime API
Python
Architecture
ARM64
Execution Provider
Default CPU
Execution Provider Library Version
No response