Skip to content

fix: suppress raw JSON parse errors from leaking to Discord channels (#59076) [AI-assisted]#59118

Merged
hxy91819 merged 5 commits intoopenclaw:mainfrom
singleGanghood:fix/issue-59076-json-parse-error-leak-discord
Apr 29, 2026
Merged

fix: suppress raw JSON parse errors from leaking to Discord channels (#59076) [AI-assisted]#59118
hxy91819 merged 5 commits intoopenclaw:mainfrom
singleGanghood:fix/issue-59076-json-parse-error-leak-discord

Conversation

@singleGanghood
Copy link
Copy Markdown
Contributor

When streaming tool calls with long CJK text, the Anthropic SDK's SSE parser can throw SyntaxError. This error was forwarded as-is to the Discord channel instead of being caught and rewritten to a user-friendly message.

Summary

  • Problem: Raw JSON parse errors (e.g. Expected ',' or '}' after property value in JSON at position 334) from the Anthropic SDK streaming layer are leaked to Discord channels as standalone messages.
  • Why it matters: Users see confusing, technical error messages in their Discord channels that provide no actionable information. The tool call still recovers and the response comes through, making this a cosmetic but disruptive UX issue.
  • What changed: Added isStreamingJsonParseError() detector in pi-embedded-helpers/errors.ts that intercepts JSON parse error patterns in both formatAssistantErrorText() and sanitizeUserFacingText(), replacing them with a friendly "LLM streaming response contained a malformed fragment. Please try again." message.
  • What did NOT change (scope boundary): No changes to the streaming pipeline itself, the Anthropic SDK, pi-ai library, or the Discord channel plugin. The fix is purely in the error formatting layer that sits between the agent runner and the channel delivery.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause / Regression History (if applicable)

  • Root cause: formatAssistantErrorText() in pi-embedded-helpers/errors.ts has a comprehensive pattern-matching chain for known error types (context overflow, billing, rate limit, role ordering, transport, etc.), but lacked a pattern for JSON parse errors from streaming SSE data. Unrecognized errors fall through to L949 which returns the raw error text (truncated to 600 chars).
  • Missing detection / guardrail: No pattern existed for SyntaxError messages from JSON.parse failures in the streaming pipeline.
  • Prior context: The error originates in @anthropic-ai/sdk/src/core/streaming.ts L69-75 where JSON.parse(sse.data) is called on each SSE event. The SDK re-throws the SyntaxError, which pi-ai's streamAnthropic catch block (L340) captures and converts to output.errorMessage. This flows through the agent lifecycle handler to formatAssistantErrorText().
  • Why this regressed now: Issue reporter notes it appeared after upgrading to 2026.3.31. Likely a change in the Anthropic SDK or SSE payload format for CJK content made the SSE data parsing more fragile.
  • If unknown, what was ruled out: The parseStreamingJson() function in pi-ai (used for input_json_delta partial JSON) is safe (triple-fallback). The problem is in the SDK's SSE layer, which parses the raw SSE data: line with strict JSON.parse before pi-ai ever sees the event.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts
  • Scenario the test should lock in: formatAssistantErrorText returns a friendly message (not raw error text) when the error matches JSON parse error patterns from streaming SDKs.
  • Why this is the smallest reliable guardrail: The formatAssistantErrorText function is the last-mile sanitizer before error text reaches channels. Testing at this layer catches all upstream error sources without needing to mock the full streaming pipeline.
  • Existing test that already covers this (if any): None — this is a new error pattern class.
  • If no new test is added, why not: 5 new tests added covering 3 JSON parse error variants + 2 sanitizeUserFacingText context tests.

User-visible / Behavior Changes

  • Before: Raw JSON parse errors like Expected ',' or '}' after property value in JSON at position 334 (line 1 column 335) appear as standalone Discord messages.
  • After: These errors are rewritten to "LLM streaming response contained a malformed fragment. Please try again." — a clear, non-technical message that tells the user what happened and what to do.

Diagram (if applicable)

Before:
  SDK SyntaxError → pi-ai catch → errorMessage → formatAssistantErrorText (no match) → raw text → Discord channel ❌

After:
  SDK SyntaxError → pi-ai catch → errorMessage → formatAssistantErrorText (isStreamingJsonParseError ✓) → friendly text → Discord channel ✅

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: Darwin 23.4.0 arm64
  • Runtime/container: Node v22.18.0, pnpm 10.28.0
  • Model/provider: Anthropic Claude Opus 4
  • Integration/channel (if any): Discord
  • Relevant config (redacted): Channel with [skills: legal, lawclaw] in topic (adds ~10KB of skill context with regex patterns)

Steps

  1. Create a Discord channel with [skills: legal, lawclaw] in the topic
  2. Send a message that triggers the agent to use Edit or Write with substantial Chinese text content
  3. Observe the raw JSON parse error appearing as a Discord message before the actual response

Expected

  • JSON parse errors from the streaming partial JSON parser should be caught internally and not forwarded to the Discord channel.

Actual

  • Raw error messages like Expected ',' or '}' after property value in JSON at position 334 (line 1 column 335) appear as standalone Discord messages.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

5 new test cases added and passing:

  • sanitizes streaming JSON parse errors from Anthropic SDK (#59076) — verifies Expected ',' or '}' ... pattern
  • sanitizes 'Expected double-quoted property name' JSON parse errors (#59076) — verifies Expected double-quoted... pattern
  • sanitizes 'Unexpected token' JSON parse errors (#59076) — verifies Unexpected token... pattern
  • sanitizeUserFacingText: rewrites JSON parse error in error context — verifies sanitize path
  • sanitizeUserFacingText: does not rewrite when not in error context — ensures normal text is not affected

All 131 related tests pass. pnpm build succeeds. AI-generated code verified by automated test runs.

Human Verification (required)

  • Verified scenarios: All 5 new test cases + 126 existing tests in the error formatting suite pass. Build verification passed.
  • Edge cases checked: (1) JSON parse error with various position/line formats, (2) sanitizeUserFacingText only rewrites in error context (doesn't affect normal assistant text mentioning JSON), (3) regex pattern doesn't false-positive on unrelated error messages.
  • What you did NOT verify: Live Discord channel testing with actual CJK streaming tool calls — requires a running gateway with Discord integration and an Anthropic API key.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Risks and Mitigations

  • Risk: The regex \b(?:expected|unexpected)\b.+\bin json\b.+\bposition\b could theoretically match an assistant message that happens to contain all three keywords in sequence (e.g., explaining JSON parsing to a user).
    • Mitigation: The pattern only fires inside formatAssistantErrorText() (which processes stopReason: "error" assistant messages) and inside sanitizeUserFacingText() only when errorContext: true. Normal assistant text is never routed through these error-handling paths. Additionally, the regex requires all three keywords (expected/unexpected, in JSON, position) in a single line, making false positives on natural language extremely unlikely.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 1, 2026

Greptile Summary

This PR adds a targeted fix for JSON parse errors from the Anthropic SDK's SSE streaming layer leaking as raw error messages to Discord channels. It introduces isStreamingJsonParseError() in errors.ts and wires it into both formatAssistantErrorText() and sanitizeUserFacingText() (error context only), replacing the raw SyntaxError text with a user-friendly message. Five new unit tests lock in the behaviour for three error format variants and the sanitizeUserFacingText guard.

  • New isStreamingJsonParseError() helper uses /\b(?:expected|unexpected)\b.+\bin json\b.+\bposition\b/i to detect V8 JSON.parse error strings from the SDK's SSE parser.
  • Check is inserted in formatAssistantErrorText() after the existing isInvalidStreamingEventOrderError guard and before the role-ordering check — appropriate placement in the chain.
  • In sanitizeUserFacingText() the check is correctly gated behind if (errorContext), so normal assistant text containing the word "JSON" is never affected.
  • The regex is intentionally narrow (requires all three keywords: expected/unexpected, in json, position) to minimise false-positive risk. Node v22 V8 error messages for other JSON.parse failure modes (e.g. "Unexpected end of JSON input" or the newer "X is not valid JSON" format) will not match, but those are out of scope for issue JSON parse error leaked to Discord when streaming tool calls with long CJK text #59076.
  • All five new tests pass; build verification confirmed by author.

Confidence Score: 5/5

Safe to merge — the change is purely additive, scoped to error formatting paths, and all tests pass.

No P0 or P1 issues found. The regex is appropriately scoped, the guard is only active inside error-context paths, and five new tests lock in the intended behaviour. All remaining observations are at most P2 style notes.

No files require special attention.

Reviews (1): Last reviewed commit: "fix: suppress raw JSON parse errors from..." | Re-trigger Greptile

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Apr 1, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 082dba2343

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/pi-embedded-helpers/errors.ts Outdated
@hxy91819 hxy91819 self-assigned this Apr 2, 2026
@hxy91819
Copy link
Copy Markdown
Member

hxy91819 commented Apr 2, 2026

Thanks for chasing this leak down. Routing the fix through src/agents/pi-embedded-helpers/errors.ts is the right scope, and reusing the existing user-facing sanitization layer is better than patching this in a channel-specific path.

That said, I do not think the current implementation is robust enough yet. isStreamingJsonParseError() is still matching specific JSON.parse() message shapes:

  • it depends on wording like expected|unexpected
  • it requires in JSON
  • it requires position

That is a pretty fragile contract for a user-facing fix, especially since JSON.parse() error text varies not only across Node/V8 versions, but also across different malformed-input cases on the same runtime. For example, Node 22 also emits messages like Unexpected end of JSON input and Unexpected non-whitespace character after JSON at position 4, and those would still fall through to raw user-visible text.

So my concern is not the scope of the fix, but the level it is implemented at: this currently looks more like a patch for the observed strings than a robust fix for the underlying error class.

I think the best path would be one of:

  1. Preferably classify this earlier as a structured "invalid/malformed streaming fragment" error and then reuse that classification here.
  2. If that is too large for this PR, broaden the matcher to cover the wider JSON.parse() SyntaxError family and add regression coverage for at least a truncated-input case plus a trailing-garbage case.

As written, I do not think we can confidently say this fixes the general leak class yet.

@vincentkoc
Copy link
Copy Markdown
Member

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #59118
Validation: pnpm -s vitest run src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers/errors.test.ts; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

@vincentkoc vincentkoc force-pushed the fix/issue-59076-json-parse-error-leak-discord branch from 082dba2 to d82d524 Compare April 27, 2026 22:41
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 28, 2026

Codex review: needs maintainer review before merge.

What this changes:

This PR classifies malformed Anthropic SSE JSON fragments as a stable transport error, renders friendly assistant/TUI/channel-facing text, adds regression tests, and adds a changelog entry.

Maintainer follow-up before merge:

This is already an open implementation PR with maintainer repair commits; the remaining action is human review, exact-head CI/rebase judgment, and landing rather than a separate automated fix PR.

Best possible solution:

Land the focused transport-boundary classification and shared rendering fix after maintainer review and exact-head validation, while keeping wider retry/failover work separate from this leak-suppression PR.

Acceptance criteria:

  • pnpm test src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers/errors.test.ts src/agents/anthropic-transport-stream.test.ts src/tui/tui-event-handlers.test.ts src/tui/tui-formatters.test.ts src/tui/tui.test.ts
  • pnpm check:changed

What I checked:

  • Current main parses Anthropic SSE JSON directly: parseAnthropicSseBody() still yields JSON.parse(data) for normal frames and tail data, so malformed SSE payloads can still throw a raw SyntaxError before classification on current main. (src/agents/anthropic-transport-stream.ts:561, a972c9ec4547)
  • Current formatter lacks malformed-fragment classification: formatAssistantErrorText() only has the invalid streaming event-order special case; no malformed_streaming_fragment, MALFORMED_STREAMING, or isStreamingJsonParseError symbol exists on main, and unclassified short errors still fall through to the raw text return. (src/agents/pi-embedded-helpers/errors.ts:1080, a972c9ec4547)
  • Channel-facing reply path remains live: When the last assistant message has stopReason === "error", buildAgentReplyPayloads() calls formatAssistantErrorText() and pushes the returned text as an isError reply item, so the formatter fallback can still reach chat/channel surfaces. (src/agents/pi-embedded-runner/run/payloads.ts:216, a972c9ec4547)
  • Main tests do not lock this leak class: The current formatter tests cover invalid streaming event order but not malformed Anthropic SSE JSON fragments or the proposed stable sentinel path. (src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts:344, a972c9ec4547)
  • PR head is current and focused: Remote PR head is c0bd453c5bb938447486d29aa0fb7f49baf29395; the fetched diff is limited to agent transport/error formatting, shared error rendering, TUI rendering/tests, a whitespace-only transport shared change, and the changelog. (c0bd453c5bb9)
  • Security review pass: The current PR diff does not touch GitHub workflows, dependency manifests/lockfiles, package metadata, lifecycle scripts, permissions, secrets handling, downloaded artifacts, generated/vendor code, or publishing surfaces; it changes TypeScript error classification/rendering and tests. (c0bd453c5bb9)

Likely related people:

  • steipete: Current-line blame in the checked-out main points to Peter Steinberger's recent main history across the Anthropic transport, assistant error formatter, shared error formatter, TUI error rendering, and channel payload path involved in this leak. (role: recent main maintainer / adjacent owner; confidence: medium; commits: 1fb58ca5eee0, a972c9ec4547; files: src/agents/anthropic-transport-stream.ts, src/agents/pi-embedded-helpers/errors.ts, src/agents/pi-embedded-runner/run/payloads.ts)
  • hxy91819: Reviewed the original approach, called out the fragile JSON.parse string matching, and authored later branch commits that move classification to the transport boundary and prevent sentinel leaks through TUI/raw formatter paths. (role: reviewer and recent branch repair author; confidence: high; commits: 9b358ca8eb0d, 5e2716775d40, c0bd453c5bb9; files: src/agents/anthropic-transport-stream.ts, src/agents/pi-embedded-helpers/errors.ts, src/agents/pi-embedded-helpers/sanitize-user-facing-text.ts)
  • vincentkoc: The PR discussion says vincentkoc pushed a narrow repair to keep the contributor path canonical, and the branch contains a commit preserving provider diagnostics while suppressing streaming parse leaks. (role: branch repair author / adjacent error-format maintainer; confidence: high; commits: 1735b05fb940; files: src/agents/pi-embedded-helpers/errors.ts, src/agents/pi-embedded-helpers/sanitize-user-facing-text.ts, src/shared/assistant-error-format.ts)

Remaining risk / open question:

Codex review notes: model gpt-5.5, reasoning high; reviewed against a972c9ec4547.

@openclaw-barnacle
Copy link
Copy Markdown

This assigned pull request has been automatically marked as stale after being open for 27 days.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added stale Marked as stale due to inactivity size: M and removed size: S labels Apr 29, 2026
@hxy91819 hxy91819 force-pushed the fix/issue-59076-json-parse-error-leak-discord branch from 90345ef to 7c4f54f Compare April 29, 2026 13:05
@hxy91819 hxy91819 force-pushed the fix/issue-59076-json-parse-error-leak-discord branch 2 times, most recently from cbe6a16 to c0bd453 Compare April 29, 2026 13:47
singleGanghood and others added 5 commits April 29, 2026 21:58
…penclaw#59076) [AI-assisted]

When streaming tool calls with long CJK text, the Anthropic SDK's SSE
parser can throw SyntaxError (e.g. 'Expected "," or "}" after property
value in JSON at position 334'). This error propagates through pi-ai's
catch block as an assistant error message and reaches formatAssistantErrorText,
which did not recognize the pattern and forwarded the raw error text to
the Discord channel.

- Add isStreamingJsonParseError() detector in pi-embedded-helpers/errors.ts
- Intercept in formatAssistantErrorText() to return a friendly message
- Also guard sanitizeUserFacingText() errorContext path for defense-in-depth
- Add 5 test cases covering various JSON parse error patterns and contexts

Verified: 131 related tests pass, pnpm build succeeds, zero regression.
AI-assisted: code fix and test generation verified by automated test runs
@hxy91819 hxy91819 force-pushed the fix/issue-59076-json-parse-error-leak-discord branch from c0bd453 to b8b3686 Compare April 29, 2026 13:58
@hxy91819 hxy91819 merged commit 0544c6d into openclaw:main Apr 29, 2026
67 checks passed
@hxy91819
Copy link
Copy Markdown
Member

Merged via squash.

Thanks @singleGanghood!

comradecowboy9 pushed a commit to comradecowboy9/excited-llama that referenced this pull request Apr 30, 2026
…penclaw#59076) [AI-assisted] (openclaw#59118)

Merged via squash.

Prepared head SHA: b8b3686
Co-authored-by: singleGanghood <156392444+singleGanghood@users.noreply.github.com>
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Reviewed-by: @hxy91819
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
…penclaw#59076) [AI-assisted] (openclaw#59118)

Merged via squash.

Prepared head SHA: b8b3686
Co-authored-by: singleGanghood <156392444+singleGanghood@users.noreply.github.com>
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Reviewed-by: @hxy91819
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…penclaw#59076) [AI-assisted] (openclaw#59118)

Merged via squash.

Prepared head SHA: b8b3686
Co-authored-by: singleGanghood <156392444+singleGanghood@users.noreply.github.com>
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Reviewed-by: @hxy91819
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: M stale Marked as stale due to inactivity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JSON parse error leaked to Discord when streaming tool calls with long CJK text

3 participants