
Incorrect result for converted FP16 model with Conv Op when run on arm64 Linux with onnxruntime >= 1.15.0 #18992

@jasonkit

Description


Describe the issue

An ONNX model exported from PyTorch with nn.Conv2d and then converted to FP16 does not give correct results during inference.

This issue is not observed with the original exported FP32 ONNX model.
This issue is also not observed on onnxruntime 1.13 or 1.14; I first observed it on onnxruntime >= 1.15.0.
This issue is only observed on arm64 Linux (specifically, in a Docker container running on M1 macOS).
It works fine on macOS with an M1 CPU, or on Linux with an Intel CPU.
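For anyone trying to reproduce: the failure only shows up when Python reports an arm64 Linux environment, which can be checked with the standard library:

```python
import platform

# The incorrect results are only observed when this prints
# ("Linux", "aarch64"); x86_64 Linux and native arm64 macOS are fine.
print(platform.system(), platform.machine())
```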

To reproduce

On arm64 Linux (or using the python:3.10-bullseye docker image),
run the following code with onnxruntime >= 1.15.0:

import torch
from torch import nn

import onnx
from onnxconverter_common import float16
import onnxruntime as ort
import numpy as np


class ModelUnderTest(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Conv2d(1, 1, 1)
        nn.init.constant_(self.model.weight.data, 0.5)
        if self.model.bias is not None:
            # It works fine for this test case if bias is initialised to 0
            nn.init.constant_(self.model.bias.data, 0.5)

    def forward(self, x):
        return self.model(x)


if __name__ == "__main__":
    m = ModelUnderTest()
    x = torch.ones(1, 1, 1)
    torch.onnx.export(m, x, "m1.onnx", export_params=True)

    model = onnx.load("m1.onnx")
    m_16 = float16.convert_float_to_float16(
        model,
        keep_io_types=True,
        # It works fine if we block Conv Op
        # op_block_list=float16.DEFAULT_OP_BLOCK_LIST + ["Conv"],
    )
    onnx.save(m_16, "m1_fp16.onnx")

    # ---

    session_option = ort.SessionOptions()
    session_option.log_severity_level = 3
    session_option.enable_cpu_mem_arena = False
    session_option.enable_mem_pattern = False
    session_option.enable_mem_reuse = False

    x = np.ones((1, 1, 1))
    session_fp32 = ort.InferenceSession("m1.onnx", session_option)
    y1 = session_fp32.run(None, {"input": x.astype(np.float32)})[0]
    print("fp32 output")
    print(y1)
    session_fp16 = ort.InferenceSession("m1_fp16.onnx", session_option)
    y2 = session_fp16.run(None, {"input": x.astype(np.float32)})[0]
    print("fp16 output")
    print(y2)

    y_diff = y1 - y2
    y_diff_2 = y_diff * y_diff
    print("SSD")
    print(np.sum(y_diff_2))

It prints

fp32 output
[[[1.]]]
fp16 output
[[[0.5]]]
SSD
0.25

However, the expected output should be

fp32 output
[[[1.]]]
fp16 output
[[[1.]]]
SSD
0.0
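Note that the expected result is exactly representable in half precision: the convolution computes 0.5 * 1.0 + 0.5 = 1.0 per element, and the observed output of 0.5 is exactly what dropping the bias term would produce (consistent with the test case passing when the bias is initialised to 0). A minimal NumPy check (no onnxruntime needed) confirms that FP16 arithmetic itself is exact here, so this is not a rounding artifact:

```python
import numpy as np

# Conv2d(1, 1, 1) with weight=0.5 and bias=0.5 on an all-ones input
# computes 0.5 * 1.0 + 0.5 per element. All of these values are exactly
# representable in float16, so a correct fp16 kernel must return 1.0.
w = np.float16(0.5)
b = np.float16(0.5)
x = np.float16(1.0)
y = w * x + b
print(y)  # 1.0 -- the failing arm64 build returns 0.5 instead
```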

It gives the correct output after downgrading onnxruntime to 1.14.1.

Urgency

This seems to be a regression in onnxruntime, as it works before 1.15.0.
I can work around the issue by adding Conv to op_block_list when converting the model to FP16.
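For reference, the workaround is the commented-out line in the repro script above, i.e. blocking Conv from FP16 conversion so it stays in FP32 (this sketch assumes m1.onnx exists as produced by the repro; the output filename is arbitrary):

```python
import onnx
from onnxconverter_common import float16

model = onnx.load("m1.onnx")
m_16 = float16.convert_float_to_float16(
    model,
    keep_io_types=True,
    # Keeping Conv in FP32 avoids the wrong results on arm64 Linux
    op_block_list=float16.DEFAULT_OP_BLOCK_LIST + ["Conv"],
)
onnx.save(m_16, "m1_fp16_conv_blocked.onnx")
```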

Platform

Linux

OS Version

Debian Bullseye

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

>= 1.15.0

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response


Labels

ep:ArmNN (issues related to Arm NN execution provider)
