Fix #67488: Handle Unicode smart quotes in tool call argument JSON by mushuiyu886 · Pull Request #67551 · openclaw/openclaw

mushuiyu886 · 2026-04-16T06:16:40Z

Summary

Cron jobs using the edit tool on files containing Chinese quotation marks, markdown bold syntax, or mixed CJK + ASCII content were incorrectly marked as error with a JSON.parse SyntaxError, even though the agent completed successfully and delivery was sent.

Root Cause

Some models (e.g., MiniMax, Moonshot) occasionally emit tool-call arguments using Unicode smart quotes (", ", U+201C/U+201D) as JSON string delimiters instead of ASCII ". The existing JSON.parse call in tryExtractUsableToolCallArguments fails because JSON only recognizes ASCII " as a string delimiter.

Additionally, the extractBalancedJsonPrefix repair function only tracked ASCII " for string boundaries, causing incorrect brace-depth counting when Unicode quotes were present.

Changes

normalizeStructuralUnicodeQuotes(): New function that performs a character-level scan tracking string boundaries (recognizing both ASCII " and Unicode smart quotes). Only structural Unicode quotes at key/value boundaries are rewritten to ASCII ", preserving content-level smart quotes inside strings.
extractBalancedJsonPrefix(): Updated to recognize Unicode smart quotes (\u201C, \u201D, \u201E, \u201F) as string delimiters for correct brace-depth tracking.
tryExtractUsableToolCallArguments(): After the initial JSON.parse fails, attempts normalization before falling back to balanced-prefix extraction. The balanced-prefix extraction also uses normalized input for correct results.

Testing

The existing tool-call argument repair test suite covers the preserved/repaired paths.
Manual verification: the normalization correctly rewrites structural Unicode quotes while leaving content-level quotes intact.

Backward Compatibility

Fully backward compatible. The fix only activates when the initial JSON.parse fails and the raw string contains Unicode smart quotes — no behavior change for standard ASCII JSON arguments.

When models (e.g., MiniMax, Moonshot) emit tool-call arguments using Unicode smart quotes (U+201C/U+201D) as JSON string delimiters instead of ASCII `"`, JSON.parse fails with a SyntaxError. This caused cron jobs using the edit tool on files with special characters to be marked as error even though the agent completed successfully. - Add normalizeStructuralUnicodeQuotes() that scans the raw JSON, tracks string boundaries using both ASCII and Unicode quote chars, and rewrites only structural Unicode quotes to ASCII `"` while preserving content-level smart quotes inside strings. - Update extractBalancedJsonPrefix() to recognize Unicode smart quotes as string delimiters for correct brace-depth tracking. - In tryExtractUsableToolCallArguments(), attempt the normalized version before falling back to balanced-prefix extraction. Fixes openclaw#67488

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a2d261ae2d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

greptile-apps · 2026-04-16T06:26:34Z

Greptile Summary

Adds a normalizeStructuralUnicodeQuotes repair step for models (MiniMax, Moonshot) that emit tool-call arguments with Unicode smart quotes (U+201C/U+201D) as JSON string delimiters instead of ASCII ". The extractBalancedJsonPrefix function is also updated to recognise those code-points for correct brace-depth tracking.

The core normalizeStructuralUnicodeQuotes logic is correct for the primary target case (fully Unicode-delimited JSON), but when JSON.parse(raw) fails for a different reason (e.g. 1-3 char trailing garbage) and the raw string uses ASCII delimiters with U+201C/U+201D as content (common in CJK text), the normalisation step corrupts those content characters before passing normalized to extractBalancedJsonPrefix. The old code used raw here and would have repaired these inputs successfully; the new code returns undefined instead.
normalizeStructuralUnicodeQuotes(extracted.json) at line 236 is a no-op because extracted.json is already a slice of the already-normalised string; it can be dropped.

Confidence Score: 4/5

Mostly safe to merge for the primary MiniMax/Moonshot Unicode-delimiter fix, but a regression in the balanced-prefix repair path should be addressed first.

The P1 finding is a real behavioural regression: inputs that were repaired correctly before this PR (ASCII-delimited JSON with U+201C/U+201D content characters plus ≤3-char trailing garbage) will now silently fail repair. This is plausible for CJK content — the exact scenario mentioned in the PR description. The fix is a one-liner change to the extractBalancedJsonPrefix call site.

src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts — specifically the extractBalancedJsonPrefix(normalized) call at line 218 and the redundant double-normalization at line 236.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts
Line: 218

Comment:
**Regression: balanced-prefix extraction uses corrupted `normalized` instead of `raw`**

When the raw JSON uses ASCII delimiters but has U+201C/U+201D characters as legitimate string content (common in CJK tool arguments, e.g. `{"content": "File with \u201Csmart\u201D quotes"}`), `normalizeStructuralUnicodeQuotes` treats the content smart quotes as structural delimiters and corrupts the string — producing e.g. `{"content": "File with "smart" quotes"}`. Passing that corrupted string to `extractBalancedJsonPrefix` then yields a candidate that `JSON.parse` rejects (line 238), so repair returns `undefined`.

Before this PR, `extractBalancedJsonPrefix(raw)` was called on the original string where U+201C/U+201D were invisible to the ASCII-only quote tracker, so the balanced prefix was extracted cleanly and `JSON.parse` succeeded. The change to `normalized` silently breaks that path for any input where:
1. `JSON.parse(raw)` fails for a reason unrelated to smart-quote delimiters (e.g. 1-3 char trailing garbage), AND
2. The ASCII-delimited strings contain U+201C/U+201D as content.

Suggested approach: fall back to `extractBalancedJsonPrefix(raw)` (the original behaviour) and only use `normalized` in `JSON.parse`, or try `raw`-extracted JSON first and only re-attempt with `normalized` if that parse fails:

```
// Prefer extracting from the original string so ASCII-delimited JSON with
// Unicode content characters is not corrupted by normalisation.
const extractedRaw = extractBalancedJsonPrefix(raw);
const extracted = extractedRaw ?? extractBalancedJsonPrefix(normalized);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts
Line: 236-237

Comment:
**Redundant double-normalization**

`extracted.json` is a slice of `normalized`, which `normalizeStructuralUnicodeQuotes` already processed. Calling it again here is a no-op (all structural Unicode quotes were replaced in the first pass) and can be replaced with a direct assignment:

```suggestion
    const candidateJson = extracted.json;
    try {
```

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(cron): handle Unicode smart quotes i..." | Re-trigger Greptile}

Address review feedback from Greptile on PR openclaw#67551: 1. Prefer extractBalancedJsonPrefix(raw) over normalized to avoid corrupting CJK content smart quotes when the original JSON uses ASCII delimiters. Fall back to normalized only when raw extraction yields nothing. 2. Remove redundant double-normalization of extracted.json — it is already a slice of either raw or normalized. 3. Add fallback: if the raw-extracted JSON fails to parse, try normalizing Unicode quotes in the extracted slice and re-attempt.

…hars

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 51a75c09e1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Address Greptile P1 review feedback: both normalizeStructuralUnicodeQuotes and extractBalancedJsonPrefix now track whether a string was opened with ASCII `"` or a Unicode smart quote, and only close on the matching type. Before this fix, `{"text":"hello\u201D world"}x` would incorrectly close the string at the U+201D character even though the string was opened with ASCII `"`, causing mis-tokenization and failed repair of otherwise recoverable payloads containing CJK smart quotes in string content. - normalizeStructuralUnicodeQuotes: only replaces Unicode quotes that open/close their own strings; ASCII-opened strings with Unicode content are left untouched. - extractBalancedJsonPrefix: same opener-tracking logic applied.

…hars

clawsweeper · 2026-04-27T04:50:33Z

Codex review: needs changes before merge.

Summary
The PR changes embedded runner tool-call argument repair to normalize Unicode smart quotes in malformed argument JSON and teach balanced-prefix extraction to treat smart quotes as string delimiters.

Reproducibility: yes. A focused parser reproduction is enough: current main has no Unicode quote normalization and the shared scanner only tracks ASCII quotes, so structural smart-quoted argument JSON still fails before tool-call repair can produce usable args.

Next step before merge
This is a narrow, concrete parser repair whose remaining blocker is a current-main adaptation plus tests, so it is suitable for an automated repair lane.

Security
Cleared: The diff only changes TypeScript tool-call argument parsing and does not touch workflows, dependencies, lockfiles, install/build/release scripts, secrets, downloaded artifacts, or publishing metadata.

Review findings

[P2] Move smart-quote scanning onto the shared helper — src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts:120

Review details

Best possible solution:

Ship a rebased narrow parser repair on current main that adds delimiter-aware smart-quote normalization in the shared helper/current runner structure, with focused tests and the required changelog entry.

Do we have a high-confidence way to reproduce the issue?

Yes. A focused parser reproduction is enough: current main has no Unicode quote normalization and the shared scanner only tracks ASCII quotes, so structural smart-quoted argument JSON still fails before tool-call repair can produce usable args.

Is this the best way to solve the issue?

No, not as-is. The repair direction is reasonable, but the implementation needs to move onto src/shared/balanced-json.ts and the current runner call path instead of preserving the older local scanner layout.

Full review comments:

[P2] Move smart-quote scanning onto the shared helper — src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts:120
Current main imports extractBalancedJsonPrefix from src/shared/balanced-json.ts, but this diff updates the older local scanner inside the runner. Rebase the change and put the delimiter-aware scan where current callers read it, otherwise the core parser fix will be lost or conflict-resolved away.
Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

pnpm test src/shared/balanced-json.test.ts src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.test.ts
pnpm test src/agents/pi-embedded-runner/run/attempt.test.ts -- --runInBand is not allowed; use the repo wrapper without Jest flags if this broader runner surface is needed
pnpm check:changed

What I checked:

Current main has no smart-quote normalization in the repair parser: tryExtractUsableToolCallArguments first calls JSON.parse(raw) and then falls back to extractBalancedJsonPrefix(raw) plus JSON.parse(extracted.json); there is no Unicode delimiter normalization path. (src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts:54, 0a798af4fc7a)
Current main uses the shared balanced JSON helper: The runner imports extractBalancedJsonPrefix from src/shared/balanced-json.ts, so scanner behavior now lives outside the runner file touched by this PR's older local helper diff. (src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts:7, 0a798af4fc7a)
Shared scanner only recognizes ASCII string delimiters: extractBalancedJsonPrefix enters and exits string mode only on ASCII double quotes, so U+201C/U+201D delimiters are not handled by the current shared helper. (src/shared/balanced-json.ts:42, 0a798af4fc7a)
The PR diff patches the obsolete local scanner layout: The live PR file patch adds UNICODE_QUOTE_CHARS and changes a local extractBalancedJsonPrefix inside the runner file, which no longer matches current main's shared-helper structure. (src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts:120, ec04ee5c8ae8)
Related merged PR is not a parser supersession: The changelog entry for fix(agents): repair malformed tool-call args on openai-completions #70294 says it enabled existing malformed tool-call repair for openai-completions backends; it does not mention Unicode quote normalization or scanner changes. (CHANGELOG.md:2151, 0a798af4fc7a)
Feature-history routing: GitHub commit history shows f006678f3c9a introduced the shared balanced JSON helper and fix(agents): repair malformed tool-call args on openai-completions #70294 added the adjacent openai-completions repair enablement. (src/shared/balanced-json.ts:21, f006678f3c9a)

Likely related people:

steipete: Authored the shared balanced JSON extraction refactor and recently maintained the embedded stream wrapper/repair area that this PR must now target. (role: recent maintainer and shared-helper owner; confidence: high; commits: f006678f3c9a, 8879ed153d27, 2ba43f5d270c; files: src/shared/balanced-json.ts, src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts)
MonkeyLeeT: Authored the merged fix(agents): repair malformed tool-call args on openai-completions #70294 openai-completions repair enablement and focused tests in the same runner repair path. (role: adjacent behavior author; confidence: medium; commits: 49c7319ea5a2, 2ba43f5d270c; files: src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts, src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.test.ts)

Remaining risk / open question:

The parser-level gap is clear, but the final fix should still add focused regression tests for structural smart-quote delimiters, ASCII-delimited content smart quotes, and trailing-junk repair.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0a798af4fc7a.

openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Apr 16, 2026

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

Comment thread src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts Outdated

greptile-apps Bot reviewed Apr 16, 2026

View reviewed changes

Comment thread src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts Outdated

Comment thread src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts Outdated

mushuiyu886 added 4 commits April 16, 2026 14:34

Merge branch 'main' into fix/issue-67488-cron-serialization-special-c…

6ad84c6

…hars

Merge branch 'main' into fix/issue-67488-cron-serialization-special-c…

d2fb60c

…hars

Merge branch 'main' into fix/issue-67488-cron-serialization-special-c…

51a75c0

…hars

chatgpt-codex-connector Bot reviewed Apr 16, 2026

View reviewed changes

Comment thread src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts Outdated

mushuiyu886 added 3 commits April 16, 2026 19:51

Merge branch 'main' into fix/issue-67488-cron-serialization-special-c…

536e2bc

…hars

Merge branch 'main' into fix/issue-67488-cron-serialization-special-c…

ec04ee5

…hars

MonkeyLeeT mentioned this pull request Apr 22, 2026

fix(agents): repair malformed tool-call args on openai-completions #70294

Merged

25 tasks

clawsweeper Bot mentioned this pull request Apr 27, 2026

Cron job result serialization fails on special characters in edit tool arguments #67488

Closed

mushuiyu886 closed this May 9, 2026

ferminquant mentioned this pull request May 25, 2026

Fix smart-quoted tool call argument repair #86611

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix #67488: Handle Unicode smart quotes in tool call argument JSON#67551

Fix #67488: Handle Unicode smart quotes in tool call argument JSON#67551
mushuiyu886 wants to merge 8 commits into
openclaw:mainfrom
mushuiyu886:fix/issue-67488-cron-serialization-special-chars

mushuiyu886 commented Apr 16, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 16, 2026

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

clawsweeper Bot commented Apr 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

mushuiyu886 commented Apr 16, 2026

Summary

Root Cause

Changes

Testing

Backward Compatibility

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 16, 2026

Greptile Summary

Confidence Score: 4/5

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

clawsweeper Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

clawsweeper Bot commented Apr 27, 2026 •

edited

Loading