Bug
readStdin() in packages/cli/src/utils/readStdin.ts defines the stdin size limit as 8 * 1024 * 1024 (8 MB in bytes), but enforces it using string.length after process.stdin.setEncoding('utf8').
Root cause
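process.stdin.setEncoding('utf8');
// ...
// chunk is a string here, so chunk.length is in UTF-16 code units, not bytes
if (totalSize + chunk.length > MAX_STDIN_SIZE) {
  const remainingSize = MAX_STDIN_SIZE - totalSize;
  // slice() also indexes by code unit and can cut a surrogate pair in half
  data += chunk.slice(0, remainingSize);
}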
- string.length counts UTF-16 code units, not UTF-8 bytes
  - multi-byte characters (e.g., CJK, emoji) are undercounted
- string.slice() may split surrogate pairs, producing malformed output
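The mismatch described above is easy to reproduce in isolation. The snippet below is a minimal illustration, not code from the repository; the sample strings and the names `measure`, `half`, and `isLoneSurrogate` are made up for the example.

```typescript
// Illustrative only: compares the two ways of measuring a string.
function measure(s: string): { codeUnits: number; utf8Bytes: number } {
  return {
    codeUnits: s.length,                      // UTF-16 code units
    utf8Bytes: Buffer.byteLength(s, 'utf8'),  // bytes once encoded as UTF-8
  };
}

const ascii = measure('abc');    // 3 code units, 3 bytes
const cjk = measure('日本語');    // 3 code units, 9 bytes
const emoji = measure('😀');     // 2 code units (surrogate pair), 4 bytes

console.log({ ascii, cjk, emoji });

// slice() indexes by code unit, so cutting at 1 keeps only the high surrogate.
const half = '😀'.slice(0, 1);
const code = half.charCodeAt(0);
const isLoneSurrogate = code >= 0xd800 && code <= 0xdbff;
console.log(isLoneSurrogate); // true
```

With a length-based check, the CJK sample counts as 3 toward the limit even though it occupies 9 bytes.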
Impact
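- The 8 MB limit is not byte-accurate for non-ASCII input: one UTF-16 code unit can encode to as many as 3 UTF-8 bytes, so up to roughly 24 MB of input can pass the check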
- Truncation may corrupt characters at the boundary
Suggested fix
- Use Buffer.byteLength(chunk, 'utf8') for byte-accurate size tracking
- Truncate at the byte level (on a Buffer) to avoid splitting multi-byte characters
- Follow the approach taken for readStdinLines.ts (PR #23414, "feat(cli): allow -i/--prompt-interactive with piped stdin")
Scope
This issue applies to:
packages/cli/src/utils/readStdin.ts
Note:
readStdinLines.ts was addressed separately in PR #23414.
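A rough sketch of what byte-accurate tracking and truncation could look like is given below. It is not the code from PR #23414 and not a patch against readStdin.ts: the helper names `truncateUtf8` and `takeWithinLimit` are hypothetical, and the sketch assumes chunks still arrive as strings after setEncoding('utf8').

```typescript
const MAX_STDIN_SIZE = 8 * 1024 * 1024; // 8 MB, same value as in the issue

// Hypothetical helper: cut a UTF-8 buffer to at most maxBytes without
// splitting a multi-byte character. Assumes buf holds valid UTF-8.
function truncateUtf8(buf: Buffer, maxBytes: number): Buffer {
  if (buf.length <= maxBytes) return buf;
  let end = Math.max(0, maxBytes);
  // Continuation bytes look like 0b10xxxxxx. If the first excluded byte is
  // one, the cut falls inside a character, so back up to its lead byte.
  while (end > 0 && (buf[end] & 0xc0) === 0x80) end--;
  return buf.subarray(0, end);
}

// Hypothetical helper: decide how much of a chunk fits under the byte limit.
function takeWithinLimit(
  chunk: string,
  totalBytes: number,
  maxBytes: number = MAX_STDIN_SIZE,
): { text: string; bytes: number; truncated: boolean } {
  const remaining = Math.max(0, maxBytes - totalBytes);
  const size = Buffer.byteLength(chunk, 'utf8'); // bytes, not code units
  if (size <= remaining) {
    return { text: chunk, bytes: size, truncated: false };
  }
  // Only allocate a Buffer when truncation is actually needed.
  const kept = truncateUtf8(Buffer.from(chunk, 'utf8'), remaining);
  return { text: kept.toString('utf8'), bytes: kept.length, truncated: true };
}

// '日本語' is 9 bytes; with 4 bytes left only '日' (3 bytes) fits whole.
console.log(takeWithinLimit('日本語', 0, 4));
```

In a 'data' handler, the caller would add `bytes` to its running total and stop reading once `truncated` is true. Reading raw Buffers without setEncoding is another option, but then characters split across chunk boundaries need separate handling.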