🔎 Search before asking
🐛 Bug (description)
Inference with an ONNX model on the GPU behaves strangely. Running the model repeatedly on the CPU works fine, but on the GPU the first inference is correct while the second and every subsequent run produce extreme output values, on the order of ±several quadrillion.

The first and second raw model outputs are shown above and below for comparison; the third, fourth, and fifth runs all produce exactly the same absurd results as the second.
🏃♂️ Environment
OS: Windows 11
Environment: conda
python: 3.10.14
paddlepaddle-gpu: 3.0.0rc1
onnxruntime-gpu: 1.21.1
nvidia-cuda-runtime-cu12: 12.3.101
nvidia-cudnn-cu12: 9.1.1.17
🌰 Minimal Reproducible Example
Model loading and inference class:
```python
import os

import numpy as np
import onnxruntime as ort


class OrtBase:
    def __init__(self, onnx_path):
        sess_options = ort.SessionOptions()
        sess_options.intra_op_num_threads = int(os.environ.get("CPU_CORE_NUM", 2))
        sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED
        providers = [
            ("CUDAExecutionProvider", {
                "device_id": 0,
                "arena_extend_strategy": "kSameAsRequested",
                "gpu_mem_limit": 5 * 1024 * 1024 * 1024,
                "cudnn_conv_algo_search": "HEURISTIC",
                "do_copy_in_default_stream": True,
            }),
            "CPUExecutionProvider",
        ]
        # Variants already tried, with the same result:
        # providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        # self.session = ort.InferenceSession(path_or_bytes=onnx_path)
        # self.session = ort.InferenceSession(path_or_bytes=onnx_path, providers=providers)
        # self.session = ort.InferenceSession(path_or_bytes=onnx_path, sess_options=sess_options)
        self.session = ort.InferenceSession(path_or_bytes=onnx_path, sess_options=sess_options, providers=providers)
        self.input_names = [node.name for node in self.session.get_inputs()]
        self.output_names = [node.name for node in self.session.get_outputs()]
        print("input_names: {}".format(self.input_names))
        print("output_names: {}".format(self.output_names))

    async def forward(self, inputs, *args, **kwargs):
        """Run the session on `inputs`.

        Expected preprocessing (done by the caller)::

            image_tensor = image.transpose(2, 0, 1)     # HWC -> CHW
            image_tensor = image_tensor[np.newaxis, :]  # add batch dim

        :param inputs: a single array, or a sequence of arrays matching the model inputs
        :return: list of output arrays
        """
        if len(self.input_names) == 1:
            input_feed = {self.input_names[0]: inputs}
        else:
            assert len(self.input_names) == len(inputs)
            input_feed = dict(zip(self.input_names, inputs))
        return self.session.run(self.output_names, input_feed=input_feed)
```
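One thing worth ruling out on the CUDA provider side: `transpose(2, 0, 1)` in the preprocessing above returns a strided view, not a C-contiguous copy, and forcing inputs contiguous before `session.run` is a cheap defensive step when GPU results go wrong. This is only a hypothesis for this bug, not a confirmed cause; the sketch below (NumPy only, no model needed) shows how to check and fix contiguity:

```python
import numpy as np

# A dummy HWC image, standing in for the real preprocessed input.
image = np.random.rand(224, 224, 3).astype(np.float32)

chw = image.transpose(2, 0, 1)                    # HWC -> CHW: a strided view, not a copy
print(chw.flags["C_CONTIGUOUS"])                  # False

batch = np.ascontiguousarray(chw[np.newaxis, :])  # force a contiguous copy
print(batch.flags["C_CONTIGUOUS"])                # True
print(batch.shape)                                # (1, 3, 224, 224)
```

If the GPU outputs become stable after adding `np.ascontiguousarray`, the problem is in how the input buffer is copied to the device rather than in the model itself.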
Inference:
```python
    async def single_predict(self, img, *args, **kwargs):
        data = {'image': img[0]}
        inputs = transform(data, self.preprocess_op)
        image = np.array(inputs)
        outputs = await self.model.forward(image)
        post_result = self.postprocess_op(outputs[0])
        return post_result
```
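To narrow the problem down, it helps to confirm mechanically that the divergence really starts at the second run on an identical input. The helper below is a hypothetical diagnostic (the name `check_run_stability` and its interface are mine, not part of onnxruntime); it works with any callable that wraps `session.run`:

```python
import numpy as np


def check_run_stability(run_fn, inputs, n_runs=5, atol=1e-4):
    """Call run_fn(inputs) n_runs times with the same input; return the index
    of the first run whose output diverges from run 0, or None if all agree."""
    reference = np.asarray(run_fn(inputs))
    for i in range(1, n_runs):
        out = np.asarray(run_fn(inputs))
        if not np.allclose(reference, out, atol=atol):
            return i
    return None
```

For the session above the call would look roughly like `check_run_stability(lambda x: session.run(output_names, {input_name: x})[0], image)`. If the CPU provider returns `None` while the CUDA provider returns `1`, the model and preprocessing are consistent and the fault lies in the CUDA execution path.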