[Bug]: Streaming repairJson injects control chars into unescaped Windows paths in tool-call arguments #88918

@yetval

Description

Bug type

Behavior bug (incorrect output/state without crash)

Summary

repairJson in the streaming JSON parser silently injects control characters (0x08, 0x09, 0x0A, 0x0C, 0x0D) into tool-call string arguments that contain an unescaped Windows path — so an exec/read/write/edit tool call for C:\bin\app.exe is executed against a corrupted path. Verified by running the real module on main (2026.5.31).

Steps to reproduce

The defect is in src/llm/utils/json-parse.ts. It is reached whenever a model streams tool-call arguments that JSON.parse rejects, so that repairJson runs; the usual trigger is an unescaped backslash, which weaker or local OpenAI- and Anthropic-compatible models routinely emit. All streaming providers feed arguments through parseStreamingJson (src/llm/providers/{anthropic,openai-completions,mistral,openai-responses-shared}.ts, src/agents/{anthropic,openai}-transport-stream.ts).

Minimal deterministic repro against the real code:

import { parseStreamingJson } from "./src/llm/utils/json-parse.ts";

// Model streams tool args with an unescaped Windows path (invalid JSON -> repair runs):
console.log(parseStreamingJson('{"path":"C:\\bin\\app.exe"}'));
$ node --import tsx repro.mjs
{ path: 'C:\u0008in\app.exe' }   // the "\b" became 0x08 (backspace)

Full matrix (real module output, char codes shown):

| Streamed args (raw) | Expected path | Actual path |
| --- | --- | --- |
| {"path":"C:\bin\app.exe"} | C:\bin\app.exe | C:<0x08>in\app.exe |
| {"path":"C:\temp\x"} | C:\temp\x | C:<0x09>emp\x |
| {"path":"C:\new\file"} | C:\new\file | C:<0x0A>ew<0x0C>ile |
| {"path":"D:\reports\q"} | D:\reports\q | D:<0x0D>eports\q |
| {"path":"C:\users\bob"} | C:\users\bob | C:\users<0x08>ob |

Expected behavior

repairJson already special-cases an invalid \u so that an unescaped Windows path round-trips to the literal backslash — this is the documented test contract in src/llm/utils/json-parse.test.ts:5-10 ({"path":"C:\users"} must parse to the literal string C:\users). The same repair must apply to \b \f \n \r \t: a backslash followed by these letters in a not-otherwise-valid JSON string came from a model that meant a literal backslash, and must be doubled (\\), not passed through as a control-char escape.

Actual behavior

src/llm/utils/json-parse.ts:3 lists b f n r t u in VALID_JSON_ESCAPES. The \u invalid case is handled separately (it doubles the backslash), but b f n r t fall into the pass-through branch at src/llm/utils/json-parse.ts:75-79:

if (VALID_JSON_ESCAPES.has(nextChar)) {
  repaired += `\\${nextChar}`;   // re-emits \b \f \n \r \t
  index += 1;
  continue;
}

JSON.parse then decodes \b→0x08, \t→0x09, \n→0x0A, \f→0x0C, \r→0x0D. The control char silently replaces the path separator + first letter of the next segment, and the corrupted string is assigned to block.arguments and handed to the executing tool.
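
The decoding step can be checked in isolation with plain JSON.parse, independent of repairJson. A minimal sketch; the helper name `charCodes` is hypothetical and not part of the codebase:

```typescript
// Hypothetical helper: list the UTF-16 code units of a string.
function charCodes(value: string): number[] {
  return Array.from(value, (ch) => ch.charCodeAt(0));
}

// String.raw keeps the backslashes literal, so JSON.parse sees \b \t \n \f \r.
const decoded: string = JSON.parse(String.raw`"\b\t\n\f\r"`);
console.log(charCodes(decoded)); // [ 8, 9, 10, 12, 13 ]

// A doubled backslash decodes to a literal backslash followed by the letter.
const literal: string = JSON.parse(String.raw`"C:\\bin"`);
console.log(charCodes(literal)); // [ 67, 58, 92, 98, 105, 110 ]
```

The second call shows the form the repair is expected to produce: once the backslash is doubled, the letter after it survives as ordinary text.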

OpenClaw version

2026.5.31-beta.4 (verified on main @ commit ec6ad88)

Operating system

Reproduced on Ubuntu 24.04 (defect is platform-independent; the corrupted data is Windows-path content emitted by the model, not the host OS).

Install method

pnpm dev (source checkout)

Model

Any model emitting unescaped backslashes in tool-call args (weaker/local OpenAI- and Anthropic-compatible models). Provider-independent.

Provider / routing chain

openclaw -> streaming tool-call parser (parseStreamingJson) -> tool executor

Logs, screenshots, and evidence

in={"path":"C:\bin\app.exe"} expected="C:\bin\app.exe" got="C:\u0008in\app.exe"    codes=[67,58,8,105,110,92,97,112,112,46,101,120,101]
in={"path":"C:\temp\x"}      expected="C:\temp\x"      got="C:\u0009emp\x"         codes=[67,58,9,101,109,112,92,120]
in={"path":"C:\new\file"}    expected="C:\new\file"    got="C:\u000Aew\u000Cile"   codes=[67,58,10,101,119,12,105,108,101]
in={"path":"D:\reports\q"}   expected="D:\reports\q"   got="D:\u000Deports\q"      codes=[68,58,13,101,112,111,114,116,115,92,113]
in={"path":"C:\users\bob"}   expected="C:\users\bob"   got="C:\users\u0008ob"      codes=[67,58,92,117,115,101,114,115,8,111,98]
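
The code-unit lists above make the invisible control characters unambiguous. A sketch of how such output could be produced; `showControlChars` and `codeUnits` are hypothetical helpers, and the format of the original repro script is an assumption:

```typescript
// Hypothetical helper: render control characters visibly, e.g. 0x08 -> "<0x08>".
function showControlChars(value: string): string {
  return value.replace(/[\u0000-\u001f]/g, (ch) => {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0");
    return `<0x${hex}>`;
  });
}

// Hypothetical helper: list UTF-16 code units, as in the codes=[...] column.
function codeUnits(value: string): number[] {
  return Array.from(value, (ch) => ch.charCodeAt(0));
}

// In this TypeScript literal "\b" is a real backspace (0x08) and "\\" is one backslash.
const corrupted = "C:\bin\\app.exe";
console.log(showControlChars(corrupted)); // C:<0x08>in\app.exe
console.log(codeUnits(corrupted).join(",")); // 67,58,8,105,110,92,97,112,112,46,101,120,101
```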

Impact and severity

  • Affected: any user whose model streams a tool-call string argument containing an unescaped Windows path segment beginning with b/f/n/r/t. Such segments are extremely common (\bin, \build, \foo, \new, \reports, \temp, \tmp). Provider-independent.
  • Severity: High. Silent data corruption of file/exec/read/write/edit tool arguments — the tool operates on the wrong path (wrong file, or a path containing a backspace/newline) with no error surfaced to the user.
  • Frequency: Deterministic for the affected input shape (5/5 cases above), gated on the model emitting unescaped backslashes (well-behaved models that emit \\ succeed in JSON.parse first and never hit repairJson).
  • Consequence: wrong-target or failed file operations; the user sees a correct path requested by the model but a corrupted path acted upon.

Additional information

Suggested fix: handle b f n r t the same way the invalid \u case is already handled — when the backslash is an unescaped-path indicator, double it (repaired += "\\\\") instead of passing the escape through. Equivalently, restrict the VALID_JSON_ESCAPES pass-through to ", \, / and double-escape the control-letter forms. Add the path cases (\bin, \temp, \new, \reports) to src/llm/utils/json-parse.test.ts, which currently only covers the \u case.
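
The suggested fix could be sketched roughly as below. This is a simplified stand-in, not the actual repairJson: the names `escapeAmbiguousBackslashes`, `parseWithRepair`, and `STRUCTURAL_ESCAPES` are hypothetical, and the sketch assumes every backslash in the failed payload sits inside a string value, so it does no string-boundary tracking. It repairs only after strict JSON.parse has failed, which leaves valid JSON with intentional \n or \t escapes untouched.

```typescript
// Escapes that pass through unchanged: quote, backslash, slash (assumed subset).
const STRUCTURAL_ESCAPES = new Set(['"', "\\", "/"]);

// Hypothetical simplified repair: double every backslash that does not
// introduce a structural escape, so \b \f \n \r \t \u become literal text.
function escapeAmbiguousBackslashes(raw: string): string {
  let repaired = "";
  for (let index = 0; index < raw.length; index += 1) {
    const char = raw[index];
    if (char !== "\\") {
      repaired += char;
      continue;
    }
    const nextChar = raw[index + 1];
    if (nextChar !== undefined && STRUCTURAL_ESCAPES.has(nextChar)) {
      repaired += `\\${nextChar}`; // keep \" \\ \/ as they are
      index += 1; // consume the escaped character
      continue;
    }
    repaired += "\\\\"; // treat the backslash as a literal path separator
  }
  return repaired;
}

// Hypothetical wrapper: strict parse first, repair only on failure.
function parseWithRepair(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return JSON.parse(escapeAmbiguousBackslashes(raw));
  }
}

const result = parseWithRepair(String.raw`{"path":"C:\bin\app.exe"}`) as { path: string };
console.log(result.path === String.raw`C:\bin\app.exe`); // true
```

In this sketch, an input whose backslashes all happen to form valid escapes (for example C:\new\file, where \n and \f are both legal JSON) already passes strict JSON.parse and never reaches the repair step, so a repair-only change would not alter its result.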

Metadata

Assignees

No one assigned

    Labels

      • P1: High-priority user-facing bug, regression, or broken workflow.
      • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
      • clawsweeper:queueable-fix: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
      • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
      • impact:data-loss: Can lose, corrupt, or silently drop user/session/config data.
      • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
