Describe the issue
The `Min` and `Max` operators should return NaN if any input is NaN, but the CPU execution provider incorrectly returns the value from the second input when that input is a single-element tensor and the first input is NaN. The GPU provider behaves correctly (since #19984).
This issue was discovered when exporting a PyTorch model to ONNX that used the `torch.Tensor.clip` method.
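For reference, here is a quick NumPy sketch (my own illustration, not part of the repro) of the NaN-propagating semantics that `Min`/`Max` are expected to follow:

```python
import numpy as np

nan = np.float32("nan")
bound = np.float32(10.0)

# Elementwise min/max with IEEE-style NaN propagation: if either
# operand is NaN, the result is NaN. This is the behaviour expected
# from the ONNX Min/Max operators here.
print(np.minimum(nan, bound))  # -> nan
print(np.maximum(nan, bound))  # -> nan
```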
To reproduce
```python
import torch  # Not used, but initializing the CUDA execution provider fails without it
import numpy as np
import onnx
from onnx.onnx_pb import TensorProto
import onnxruntime

num_cols = 1
num_rows = 3
operator = "Min"  # or "Max"

input = onnx.helper.make_tensor_value_info("input", TensorProto.FLOAT, ["N", num_cols])
output = onnx.helper.make_tensor_value_info("output", TensorProto.FLOAT, ["N", num_cols])
bound_const = onnx.helper.make_node(
    "Constant",
    inputs=[],
    outputs=["bound_const"],
    value=onnx.numpy_helper.from_array(np.array([10.0] * num_cols, dtype=np.float32)),
)
operator_node = onnx.helper.make_node(
    operator,
    inputs=["input", "bound_const"],
    outputs=["output"],
)
graph_def = onnx.helper.make_graph(
    nodes=[bound_const, operator_node],
    name="test-model",
    inputs=[input],
    outputs=[output],
)
opset_import = onnx.helper.make_opsetid("", 17)
model_def = onnx.helper.make_model(
    graph_def,
    opset_imports=[opset_import],
    producer_name="test",
)
onnx.checker.check_model(model_def, full_check=True)

model_path = "test_operator.onnx"
onnx.save(model_def, model_path)

input = np.random.randn(num_rows, num_cols).astype(np.float32)
input[0, :] = np.nan

cpu_session = onnxruntime.InferenceSession(model_path, providers=["CPUExecutionProvider"])
output = cpu_session.run(["output"], {"input": input})
print("CPU session output:")
print(output[0])

gpu_session = onnxruntime.InferenceSession(model_path, providers=["CUDAExecutionProvider"])
output = gpu_session.run(["output"], {"input": input})
print("GPU session output:")
print(output[0])
```
This outputs something like:
```
CPU session output:
[[10.       ]
 [-0.2347993]
 [ 2.0502098]]
GPU session output:
[[       nan]
 [-0.2347993]
 [ 2.0502098]]
```
The first row of the CPU session output should be `nan`.
If I change `num_cols` to anything > 1, the behaviour is correct and the first row is all NaNs for both the CPU and GPU providers. However, I can also reproduce the issue when `num_cols` > 1 if the number of rows is 1 and I reverse the order of the inputs to the operator.
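For what it's worth, the incorrect CPU output matches NaN-ignoring min semantics (as in NumPy's `np.fmin`) rather than the NaN-propagating semantics of `np.minimum`; a small sketch of the difference (my illustration, not from the repro):

```python
import numpy as np

nan = np.float32("nan")
bound = np.float32(10.0)

# np.fmin ignores NaN and returns the other operand -- this is what the
# buggy CPU provider output looks like (10.0 in the first row).
print(np.fmin(nan, bound))     # -> 10.0

# np.minimum propagates NaN -- this matches the correct GPU output.
print(np.minimum(nan, bound))  # -> nan
```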
Urgency
No response
Platform
Linux
OS Version
Fedora 40
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response