Incorrect NaN handling for Min and Max operators on CPU with a single element input #21455

@adamreeve

Description

Describe the issue

The Min and Max operators should return NaN if any input is NaN, but the CPU provider will incorrectly return the value from the second input when it is a single-element tensor and the first input is NaN. The GPU provider behaves correctly (since #19984).

This issue was discovered when exporting a Torch model to ONNX that used the torch.Tensor.clip method.
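For reference, NumPy exposes both semantics side by side: `np.minimum` propagates NaN, which matches what the Min operator should do here, while `np.fmin` ignores NaN and returns the other operand, which is the incorrect behaviour the CPU provider produces in the single-element case. A small illustration:

```python
import numpy as np

a = np.array([np.nan, 1.0, 5.0], dtype=np.float32)
b = np.array([10.0, 10.0, 10.0], dtype=np.float32)

# np.minimum propagates NaN: the first element of the result is NaN
print(np.minimum(a, b))

# np.fmin ignores NaN and returns the non-NaN operand instead
print(np.fmin(a, b))
```

The CPU provider's output above looks like `fmin`-style behaviour rather than the NaN-propagating behaviour the spec calls for.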

To reproduce

import torch  # Not used but initializing the CUDA execution provider fails without it

import numpy as np
import onnx
from onnx.onnx_pb import TensorProto
import onnxruntime


num_cols = 1
num_rows = 3
operator = "Min"  # or "Max"

input = onnx.helper.make_tensor_value_info("input", TensorProto.FLOAT, ["N", num_cols])
output = onnx.helper.make_tensor_value_info("output", TensorProto.FLOAT, ["N", num_cols])

bound_const = onnx.helper.make_node(
        "Constant",
        inputs=[],
        outputs=["bound_const"],
        value=onnx.numpy_helper.from_array(np.array([10.0] * num_cols, dtype=np.float32)))

operator_node = onnx.helper.make_node(
        operator,
        inputs=["input", "bound_const"],
        outputs=["output"],
        )

graph_def = onnx.helper.make_graph(
        nodes=[bound_const, operator_node],
        name="test-model",
        inputs=[input],
        outputs=[output])

opset_import = onnx.helper.make_opsetid("", 17)

model_def = onnx.helper.make_model(
        graph_def,
        opset_imports=[opset_import],
        producer_name="test")

onnx.checker.check_model(model_def, full_check=True)

model_path = 'test_operator.onnx'
onnx.save(model_def, model_path)

input = np.random.randn(num_rows, num_cols).astype(np.float32)
input[0, :] = np.nan

cpu_session = onnxruntime.InferenceSession(model_path, providers=["CPUExecutionProvider"])
output = cpu_session.run(["output"], {"input": input})
print("CPU session output:")
print(output[0])

gpu_session = onnxruntime.InferenceSession(model_path, providers=["CUDAExecutionProvider"])
output = gpu_session.run(["output"], {"input": input})
print("GPU session output:")
print(output[0])

This outputs something like:

CPU session output:
[[10.       ]
 [-0.2347993]
 [ 2.0502098]]
GPU session output:
[[       nan]
 [-0.2347993]
 [ 2.0502098]]

The first row of the CPU session output should be nan.

If I change num_cols to anything greater than 1, the behaviour is correct and the first row is all NaNs for both the CPU and GPU providers. However, I can also reproduce the issue when num_cols is greater than 1 but the number of rows is 1 and the order of the inputs to the operator is reversed.
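Until the CPU kernel is fixed, one possible workaround (a sketch, not tested against this exact model) is to restore NaN propagation explicitly by wrapping the Min with ONNX IsNaN and Where nodes, i.e. `output = Where(IsNaN(input), input, Min(input, bound))`. The equivalent computation in NumPy terms:

```python
import numpy as np

x = np.array([[np.nan], [-0.5], [2.0]], dtype=np.float32)
bound = np.float32(10.0)

# Emulates Where(IsNaN(input), input, Min(input, bound)):
# pass NaN rows through untouched, clip everything else.
clipped = np.where(np.isnan(x), x, np.minimum(x, bound))
print(clipped)
```

Since the Where node selects the original input wherever it is NaN, the result no longer depends on which code path the Min kernel takes for single-element inputs.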

Urgency

No response

Platform

Linux

OS Version

Fedora 40

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response
