Skip to content

[Bug]:reasoning_content in conversation history causes oMLX JSON parse error on subsequent turns #46637

@zipzagster

Description

@zipzagster

Bug type

Regression (worked before, now fails)

Summary

When Qwen 3.5 returns reasoning_content in the openai-completions response, that content appears to be included in the conversation history for the next API request. The serialized JSON body sent to oMLX contains an invalid control character, causing oMLX (FastAPI/Pydantic) to reject the request with a 422 JSON parse error. This results in the assistant producing empty responses (\n\n) with usage: { input: 0, output: 0 } on every turn after the first.

Steps to reproduce

  1. Configure an agent with omlx/Qwen3.5-122B-A10B-8bit (or any Qwen model that returns reasoning_content)
  2. Start a fresh Telegram session (or /reset)
  3. Send a simple message like "what is the date?" — this works (first turn, no prior reasoning_content in history)
  4. Send a follow-up message like "show me the front door camera" — this fails (second turn, prior turn's reasoning_content is now in the conversation history

Expected behavior

normal response

Actual behavior

  • Turn 1: Model responds correctly. Response includes reasoning_content field.
  • Turn 2: OpenClaw constructs the next request with turn 1's assistant message (including reasoning_content) in the messages array. oMLX rejects the request body with:
    {"detail":[{"type":"json_invalid","loc":["body",92],"msg":"JSON decode error","input":{},"ctx":{"error":"Invalid control character at"}}]}
    
  • The assistant response is recorded as empty text (\n\n) with 0 tokens.
  • All subsequent turns in the same session also fail (the empty responses accumulate in context, compounding the issue).

OpenClaw version

2026.3.13

Operating system

macOS arm64, Node 22.22.1

Install method

gateway update

Model

Qwen3.5-122B-A10B-8bit

Provider / routing chain

oMLX local

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

### oMLX server log


2026-03-14 17:XX:XX - omlx.server - ERROR - Invalid control character at ...


### Gateway error log


2026-03-14T17:30:58.117-05:00 [agent/embedded] embedded run timeout: runId=6aa81677 sessionId=cf848871 timeoutMs=600000
2026-03-14T17:35:19.088-05:00 [agent/embedded] embedded run agent end: isError=true model=Qwen3.5-122B-A10B-8bit provider=omlx error=LLM request timed out.

Impact and severity

Severity High: app is essentially unusable

Additional information

No effective workaround found. Tested:

  • /think off (/think:off) — accepted by OpenClaw, but does not fix the issue. Qwen 3.5 generates reasoning_content regardless of the thinking level directive, and OpenClaw still includes it in subsequent request bodies.
  • /reset — temporarily fixes it for exactly one turn (the first turn in a fresh session has no prior reasoning_content in history). The second turn fails again.

Evidence

oMLX server log

2026-03-14 17:XX:XX - omlx.server - ERROR - Invalid control character at ...

Gateway error log

2026-03-14T17:30:58.117-05:00 [agent/embedded] embedded run timeout: runId=6aa81677 sessionId=cf848871 timeoutMs=600000
2026-03-14T17:35:19.088-05:00 [agent/embedded] embedded run agent end: isError=true model=Qwen3.5-122B-A10B-8bit provider=omlx error=LLM request timed out.

Session transcript (showing empty responses)

[22:38:27Z] assistant thinking: "Don wants to see an image from his Front Door camera..."
[22:38:27Z] assistant text: "\n\n"    ← empty
[22:38:27Z] usage: { input: 0, output: 0, totalTokens: 0 }

Direct oMLX API works fine

Sending the same conversation (with reasoning_content in assistant messages) directly via curl to oMLX succeeds. This suggests the issue is in how OpenClaw serializes the request body, not in oMLX's handling of the field itself.

Likely root cause

When OpenClaw constructs the messages array for the next openai-completions request, the reasoning_content string from the previous assistant response is included in the JSON body. This string may contain control characters (e.g., raw newlines or other chars < 0x20) that are not properly JSON-escaped during serialization. oMLX's FastAPI/Pydantic JSON parser (Python json.loads in strict mode) rejects these, while Node.js's JSON.stringify may not escape them the same way.

Specifically, Python's json.loads rejects unescaped control characters (0x00–0x1F) inside JSON strings per RFC 8259 §7, whereas Node.js's JSON.stringify produces valid JSON but the interaction between streaming response parsing and re-serialization may introduce raw bytes.

Suggested fix

  • Strip or sanitize reasoning_content before including it in subsequent API request bodies
  • Or: ensure all string values in the outbound request body are properly JSON-escaped (control chars → \uXXXX)
  • Or: when thinkingLevel is not explicitly enabled, omit reasoning_content from the conversation history entirely for providers using openai-completions API

Additional notes

  • The slug-gen embedded runs also consistently time out (15s) against oMLX throughout the day — possibly the same root cause
  • The webchat channel is unaffected because it uses a separate session (agent:main:main) that gets reset more frequently
  • Cron jobs (RSS, heartbeat) with sessionTarget: "isolated" are unaffected because each run starts a fresh session

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingregressionBehavior that previously worked and now fails

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions