fix(process): decode Windows command output with console codepage awareness#72393
fix(process): decode Windows command output with console codepage awareness#72393vincentkoc merged 2 commits intomainfrom
Conversation
🔒 Aisle Security AnalysisWe found 3 potential security issue(s) in this PR:
1. 🟠 Windows search-order hijack and potential hang in console codepage detection via spawnSync("cmd.exe")
Description
Impacts:
Vulnerable code: const result = spawnSync("cmd.exe", ["/d", "/s", "/c", "chcp"], {
windowsHide: true,
encoding: "utf8",
stdio: ["ignore", "pipe", "pipe"],
});RecommendationAvoid search-order resolution and add a bounded execution time.
Example fix: import path from "node:path";
function getSystemCmdExe(): string {
const systemRoot = process.env.SystemRoot || process.env.windir;
if (systemRoot) {
return path.join(systemRoot, "System32", "cmd.exe");
}
// Fallback (still better than relying on CWD); consider hard-failing if missing.
return process.env.ComSpec || "cmd.exe";
}
const result = spawnSync(getSystemCmdExe(), ["/d", "/s", "/c", "chcp"], {
windowsHide: true,
encoding: "utf8",
stdio: ["ignore", "pipe", "pipe"],
timeout: 2000,
maxBuffer: 64 * 1024,
});If this code can run in elevated/service contexts, consider also setting a safe 2. 🟡 Unbounded stdout/stderr buffering in runCommandWithTimeout can cause memory exhaustion
DescriptionThe
If an attacker can influence the executed command (or cause a spawned tool/plugin to emit大量 output), they can produce arbitrarily large stdout/stderr quickly and trigger high memory usage / out-of-memory termination (denial of service). Vulnerable code: child.stdout?.on("data", (d) => {
stdoutChunks.push(Buffer.isBuffer(d) ? d : Buffer.from(d));
});
...
stdout: decodeWindowsOutputBuffer({
buffer: Buffer.concat(stdoutChunks),
windowsEncoding,
}),RecommendationIntroduce an explicit output cap (similar to Node's Example approach: const maxBuffer = options.maxBuffer ?? 1024 * 1024; // 1 MiB default (pick an appropriate default)
let stdoutLen = 0;
let stderrLen = 0;
child.stdout?.on("data", (d: Buffer) => {
const b = Buffer.isBuffer(d) ? d : Buffer.from(d);
stdoutLen += b.length;
if (stdoutLen > maxBuffer) {
child.kill("SIGKILL");
return;
}
stdoutChunks.push(b);
});
child.stderr?.on("data", (d: Buffer) => {
const b = Buffer.isBuffer(d) ? d : Buffer.from(d);
stderrLen += b.length;
if (stderrLen > maxBuffer) {
child.kill("SIGKILL");
return;
}
stderrChunks.push(b);
});Alternatively, stream output to a temporary file or expose a streaming API/callback instead of buffering all output in memory. 3. 🟡 Terminal/log escape injection via unsanitized decoded child-process output in verbose logging
DescriptionThe This allows a child process (or any untrusted command being executed) to inject:
Vulnerable code paths:
Vulnerable code: const decodedStdout = decodeWindowsOutputBuffer({ buffer: stdout, windowsEncoding });
const decodedStderr = decodeWindowsOutputBuffer({ buffer: stderr, windowsEncoding });
if (shouldLogVerbose()) {
if (decodedStdout.trim()) {
logDebug(decodedStdout.trim());
}
if (decodedStderr.trim()) {
logError(decodedStderr.trim());
}
}RecommendationNeutralize control characters / ANSI escape sequences before emitting child-process output to terminals or persistent logs. Option A (safe-by-default for logs): escape non-printables and newlines. function sanitizeForLog(s: string): string {
// Remove ANSI escapes and other C0 controls except \n/\t if desired.
// Consider a well-tested lib like `strip-ansi` for ANSI removal.
const withoutAnsi = s.replace(/\x1B\[[0-?]*[ -/]*[@-~]/g, "");
// Replace other control chars (including CR) to prevent log forging.
return withoutAnsi.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F\r]/g, (c) =>
`\\x${c.charCodeAt(0).toString(16).padStart(2, "0")}`,
);
}
logDebug(sanitizeForLog(decodedStdout.trim()));
logError(sanitizeForLog(decodedStderr.trim()));Option B (terminal-focused): strip ANSI escapes and disallow OSC sequences too (window-title, hyperlinks), then emit. Also consider logging output with clear delimiters (prefix each line) to mitigate log forging. Analyzed PR: #72393 at commit Last updated on: 2026-04-27T06:08:56Z |
Greptile SummaryThis PR refactors Windows console encoding detection and CJK mojibake repair into a dedicated Confidence Score: 4/5Safe to merge; all findings are P2 style/robustness suggestions with no present defects on the changed paths. The encoding logic is well-tested with both unit and integration tests, the stateful-decoder invariant holds for all exercised cases, and the refactoring correctly removes the duplicate implementation from invoke.ts. Only P2 concerns remain: a redundant undefined guard, an implicit invariant in the streaming decoder that should be commented, a pre-existing regex ordering issue in parseWindowsCodePage, and a shared-state footgun in the adapter if onStdout is ever called more than once. src/infra/windows-encoding.ts (streaming decoder invariant comment) and src/process/supervisor/adapters/child.ts (shared decoder state) Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/infra/windows-encoding.ts
Line: 145-151
Comment:
**Redundant undefined guard in `getTrailingIncompleteUtf8Bytes` loop**
The `buffer[index] !== undefined` check in the while condition is redundant — the `index >= 0` guard already ensures the access is in bounds for a `Buffer`, and TypeScript types `Buffer` indexed access as `number` (not `number | undefined`). Removing it makes the intent clearer.
```suggestion
while (
index >= 0 &&
buffer[index] >= 0x80 &&
buffer[index] <= 0xbf &&
continuationBytes < 3
) {
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/infra/windows-encoding.ts
Line: 111-114
Comment:
**`decode` passes raw chunk, not `replayBuffer`, to `utf8Decoder`**
Inside the `try` block, `utf8Decoder.decode(buffer, { stream: true })` receives only the new chunk, while `getTrailingIncompleteUtf8Bytes(replayBuffer)` uses the concatenated `[pendingUtf8Bytes + buffer]`. This works correctly because the stateful `utf8Decoder` already holds `pendingUtf8Bytes` in its internal incomplete-sequence buffer (that's the maintained invariant), so feeding it just `buffer` is equivalent to feeding it `replayBuffer`. However, this dependency is non-obvious and worth a comment to prevent future regressions — e.g., resetting the `utf8Decoder` instance anywhere would silently break the invariant.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/infra/windows-encoding.ts
Line: 20
Comment:
**`parseWindowsCodePage` matches the first 3-5 digit number, not the last**
The regex `/\b(\d{3,5})\b/` picks the first match in the combined `stdout + "\n" + stderr` string. If `stderr` happens to contain a 3-5 digit number before the code page (e.g., `"Error 100:"`) the wrong value would be captured, leaving `cachedWindowsConsoleEncoding` as `null` (since 100 is not in `WINDOWS_CODEPAGE_ENCODING_MAP`). Using a more anchored pattern — such as matching the last number or keying on the colon-delimited format — would be more robust, though this issue is pre-existing from the original `invoke.ts` implementation.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/process/supervisor/adapters/child.ts
Line: 112-113
Comment:
**`stdoutDecoder`/`stderrDecoder` are shared across multiple `onStdout`/`onStderr` calls**
The decoders are created once at the adapter level. If `onStdout` or `onStderr` is ever invoked more than once (e.g., by two consumers of the same `ChildAdapter`), each call registers a new `"data"` listener on `child.stdout`, but both share the same stateful `stdoutDecoder`. The first listener to handle a `"data"` event will consume the decoder's state, leaving the second listener's call to `stdoutDecoder.decode(chunk)` to return empty output. The current call sites appear to invoke these at most once per adapter, so this is not an active bug, but the lack of a guard or comment makes this fragile as consumers grow.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(clownfish): address review for ghcra..." | Re-trigger Greptile |
aeaf469 to
d1eedef
Compare
…reness (openclaw#72393) * fix(process): decode Windows command output with console codepage awareness * fix(clownfish): address review for ghcrawl-199248-agentic-merge (1)
…reness (openclaw#72393) * fix(process): decode Windows command output with console codepage awareness * fix(clownfish): address review for ghcrawl-199248-agentic-merge (1)
Summary
Credit
This replacement carries forward and credits ideas and tests from #43611 by @iready, #50586 by @kevinten10, #50885 by @zhangyongjie1997, #56538 by @knightplat-blip / heiqishi666, and #64661/#64709 by @slepybear.
Validation
Fixes #50519. Supersedes #43611, #50586, #50885, #56538, #64661, and #64709 after this PR lands and validation passes.
ProjectClownfish replacement details: