Fix malformed Unicode in outgoing MCP responses #1447
Description
A browser action can succeed, but the follow-up MCP response fails to serialize with:
the request body is not valid JSON: invalid high surrogate in string
This appears to happen when tool output includes malformed Unicode from page-derived text.
Reproduction
- Start @playwright/mcp with --extension
- Navigate to a page containing problematic text content
- Perform a browser action such as browser_click
- Observe that the action succeeds, but the returned MCP response fails
Observed behavior
The browser action completes, but the client receives an error during response handling:
invalid high surrogate in string
Expected behavior
If page-derived text contains malformed Unicode, MCP should still return a valid JSON-RPC response instead of failing serialization.
Likely root cause
The outgoing MCP message contains a JavaScript string with a lone surrogate or other non-well-formed Unicode sequence. This is most likely coming from page-derived strings such as accessibility snapshot text, console text, page title, URL, or other extracted content.
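As an illustration of what a lone surrogate looks like in practice, the sketch below truncates a string in the middle of a surrogate pair and then detects the result. The helper name hasLoneSurrogate is hypothetical and is not part of Playwright MCP; it only shows one way such a string could arise from page-derived text and be recognized.

```typescript
// Hypothetical helper (not from the Playwright MCP codebase): returns true if
// `s` contains a UTF-16 surrogate code unit that is not part of a valid
// high + low pair.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // valid pair, skip the low half
      } else {
        return true;
      }
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}

// U+1F600 is stored as the pair D83D DE00. Cutting the string by UTF-16
// code units (for example, when truncating page text) can split the pair.
const emoji = "\uD83D\uDE00";
const truncated = emoji.slice(0, 1); // only the high surrogate remains

const emojiIsMalformed = hasLoneSurrogate(emoji);
const truncatedIsMalformed = hasLoneSurrogate(truncated);
```

Here emojiIsMalformed is false and truncatedIsMalformed is true. Any code path that slices, truncates, or concatenates page text by code unit could produce the second kind of string.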
Evidence
A local wrapper that sanitizes outgoing JSON-RPC string values before transport serialization fixes the issue without changing browser behavior. That strongly suggests the bug is in outgoing message serialization, not in the browser action itself.
Suggested fix
Sanitize outgoing MCP message strings before transport serialization:
- use String.prototype.toWellFormed() when available
- otherwise replace lone surrogates with \uFFFD
Apply this at the transport boundary so it covers stdio, SSE, and streamable HTTP responses.
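A minimal sketch of this approach is shown below, assuming the outgoing message is a plain JSON-compatible object. The names replaceLoneSurrogates, toWellFormedString, and sanitizeOutgoing are hypothetical and chosen for illustration; where exactly this would be hooked into each transport is not shown and would depend on the actual code.

```typescript
// Fallback: replace every unpaired surrogate code unit with U+FFFD.
function replaceLoneSurrogates(s: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += s[i] + s[i + 1]; // valid pair, keep both halves
        i++;
      } else {
        out += "\uFFFD"; // lone high surrogate
      }
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      out += "\uFFFD"; // lone low surrogate
    } else {
      out += s[i];
    }
  }
  return out;
}

// Prefer the native String.prototype.toWellFormed() when the runtime has it.
// The cast avoids depending on a TypeScript lib version that declares it.
function toWellFormedString(s: string): string {
  const native = (s as unknown as { toWellFormed?: () => string }).toWellFormed;
  return typeof native === "function" ? native.call(s) : replaceLoneSurrogates(s);
}

// Recursively sanitize all string values in a JSON-compatible message.
// Object keys are left untouched in this sketch.
function sanitizeOutgoing(value: unknown): unknown {
  if (typeof value === "string") return toWellFormedString(value);
  if (Array.isArray(value)) return value.map(sanitizeOutgoing);
  if (value !== null && typeof value === "object") {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src)) out[key] = sanitizeOutgoing(src[key]);
    return out;
  }
  return value; // numbers, booleans, null, undefined
}

// Example: a JSON-RPC response whose text ends with a lone high surrogate.
const sanitized = sanitizeOutgoing({
  jsonrpc: "2.0",
  id: 1,
  result: { content: [{ type: "text", text: "Clicked \uD83D" }] },
});
```

Running the sanitizer once, immediately before the message is serialized, keeps the change independent of which tool produced the text, which is the reason for placing it at the transport boundary rather than in individual tools.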