Regression: Reasoning/thinking output provided as regular output

**LocalAI version:**

LocalAI v4.3.1
container: localai/localai:v4.3.1-gpu-vulkan

**Environment, CPU architecture, OS, and Version:**

Ubuntu 24.04 host with an AMD Ryzen 7 5800X CPU and an AMD Radeon RX 6600 GPU

```
Linux max-machine 6.8.0-117-generic #117-Ubuntu SMP PREEMPT_DYNAMIC Tue May  5 19:26:24 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux
```

**Describe the bug**

The model's reasoning output is returned inline in the `content` field, wrapped in `<think>`...`</think>` tags and followed by the regular output, instead of being returned separately in the `reasoning` field.
This is a regression: it worked in LocalAI v4.0.0 and broke at some point after that, no later than LocalAI v4.3.1.

**To Reproduce**

1. Start LocalAI, e.g. `docker run -ti --rm --network=host --privileged -v "$(pwd)/data/models:/models" -v "$(pwd)/data/backends:/backends" localai/localai:v4.3.1-gpu-vulkan --address 127.0.0.1:8080`
2. Download the `qwen3-4b` model.
3. Request chat completion: `curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "qwen3-4b", "messages": [{"role": "user", "content": "Hello"}]}'`
4. Observe that the `content` field of the response JSON contains both the reasoning output (wrapped in `<think>` tags) and the actual answer, and that there is no separate `reasoning` field:
```json
{"created":1779732183,"object":"chat.completion","id":"23fc7722-53af-48d7-be9e-6174f84e12b9","model":"qwen3-4b","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"\u003cthink\u003e\nOkay, the user sent \"Hello\". I need to respond appropriately. Since it's a greeting, I should reply with a friendly and welcoming message. Maybe start with a greeting, then ask how they're doing. Keep it simple and open-ended so they feel comfortable to share more. Also, make sure to mention that I'm here to help with any questions or needs they have. Avoid any technical jargon. Just a warm and inviting response.\n\u003c/think\u003e\n\nHello! How can I assist you today? 😊"}}],"usage":{"prompt_tokens":10,"completion_tokens":106,"total_tokens":116}}
```
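
To make step 4 checkable without reading the JSON by eye, here is a minimal Python sketch that classifies a chat completion response. The function name `classify_reasoning` and its three return labels are made up for illustration and are not part of LocalAI; the sample payload is an abbreviated copy of the response above with the reasoning text shortened.

```python
import json


def classify_reasoning(response: dict) -> str:
    """Report where the first choice carries its reasoning output.

    Returns "separate" if a non-empty `reasoning` field is present,
    "inline" if `content` contains a <think> tag, and "absent" otherwise.
    """
    message = response["choices"][0]["message"]
    content = message.get("content") or ""
    if message.get("reasoning"):
        return "separate"
    if "<think>" in content:
        return "inline"
    return "absent"


# Abbreviated v4.3.1 response (reasoning text shortened for readability).
raw = (
    '{"choices":[{"index":0,"message":{"role":"assistant",'
    '"content":"\\u003cthink\\u003e\\nOkay...\\n\\u003c/think\\u003e\\n\\nHello!"}}]}'
)
print(classify_reasoning(json.loads(raw)))  # prints "inline"
```

A response shaped like the v4.0.0 one, with a populated `reasoning` field, would be reported as `separate`.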

**Expected behavior**

The reasoning output should be returned in the `reasoning` field, separate from the actual answer in the `content` field, as it was in LocalAI v4.0.0.
For example, here is a response from LocalAI v4.0.0:
```json
{"created":1779731898,"object":"chat.completion","id":"653a17cd-c765-418d-b8df-3979ea5422dd","model":"qwen3-4b","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"\n\nHello! How can I assist you today? 😊","reasoning":"Okay, the user just said \"Hello\". I need to respond appropriately. Since they're greeting me, I should respond in a friendly and welcoming manner. Let me make sure to acknowledge their greeting and offer assistance.\n\nI should keep it simple and polite. Maybe start with a greeting back, then ask how I can help them. That way, it's open-ended and invites them to share more about what they need.\n\nI should avoid any technical jargon or complex language. The response should be easy to understand and conversational. Let me check for any possible misunderstandings. If they meant something specific by \"Hello\", but since it's just a greeting, I think a standard response is best.\n\nAlso, I should make sure to keep the tone positive and helpful. No need to add anything else unless they ask for more. Just a straightforward reply."}}],"usage":{"prompt_tokens":10,"completion_tokens":187,"total_tokens":197}}
```
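
Until the server-side split is restored, a client could approximate the expected shape itself. The following is a sketch of such a stopgap, assuming the reasoning always appears as a single leading `<think>`...`</think>` block as in the v4.3.1 response above; `split_reasoning` is a hypothetical helper and does not reflect how LocalAI parses reasoning internally.

```python
import re

# Matches one leading <think>...</think> block, allowing whitespace before it.
_THINK_RE = re.compile(r"\A\s*<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> tuple[str, str]:
    """Split a leading <think> block off `content`.

    Returns (reasoning, remaining_content). If no leading block is found,
    reasoning is empty and the content is returned unchanged.
    """
    match = _THINK_RE.match(content)
    if match is None:
        return "", content
    return match.group(1).strip(), content[match.end():]


reasoning, answer = split_reasoning("<think>\nOkay, a greeting.\n</think>\n\nHello!")
print(repr(reasoning))  # 'Okay, a greeting.'
print(repr(answer))     # '\n\nHello!'
```

The remaining content keeps its leading blank lines, which matches the `content` value in the v4.0.0 response.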

**Logs**


**Additional context**
