fix: use TextDecoder for proper GBK encoding support on Windows #56538
knightplat-blip wants to merge 54 commits into openclaw:main from
- Import TextDecoder from node:util
- Fix runExec(): use encoding: 'buffer' and decode with TextDecoder
- Fix runCommandWithTimeout(): create decoder once, use stream decoding
- Handle multi-byte Chinese characters correctly with streaming decode
- Add Bun compatibility with try/catch fallback
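The "create decoder once, use stream decoding" point can be illustrated with a minimal sketch. This is not the PR's code: `decodeChunks` and `decodeChunksNaively` are hypothetical helpers, the sketch uses the global `TextDecoder` rather than the `node:util` import, and it feeds UTF-8 bytes so that it runs on any runtime. On Windows the label would presumably be the console code page (for example `gbk`), which assumes a runtime that ships that encoding.

```typescript
// Hypothetical helper (not from the PR): decode a sequence of output chunks
// with one shared decoder, so a multi-byte character that is split across a
// chunk boundary is reassembled instead of becoming U+FFFD.
function decodeChunks(chunks: Uint8Array[], encoding: string = "utf-8"): string {
  const decoder = new TextDecoder(encoding); // created once, reused for every chunk
  let text = "";
  for (const chunk of chunks) {
    // stream: true keeps an incomplete trailing byte sequence buffered
    // inside the decoder until the next chunk arrives.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without stream: true flushes anything still buffered.
  text += decoder.decode();
  return text;
}

// For comparison: a fresh decoder per chunk loses the split character.
function decodeChunksNaively(chunks: Uint8Array[], encoding: string = "utf-8"): string {
  return chunks.map((chunk) => new TextDecoder(encoding).decode(chunk)).join("");
}

// "中" is E4 B8 AD in UTF-8; split it across two chunks.
const splitChunks: Uint8Array[] = [new Uint8Array([0xe4]), new Uint8Array([0xb8, 0xad])];
console.log(decodeChunks(splitChunks)); // the character survives the split
console.log(decodeChunksNaively(splitChunks)); // replacement characters only
```

The same reasoning applies to GBK, where most Chinese characters occupy two bytes and a pipe read can end between them.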
Greptile Summary
This PR fixes garbled Chinese characters on Windows by switching to …
Confidence Score: 5/5
Safe to merge; all remaining findings are minor style issues.
The core encoding logic is sound: UTF-8 validity is checked before falling back to the code-page encoding, the catch fallback is now UTF-8 (not GBK), streaming decode uses …
No files require special attention.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/process/exec.ts
Line: 5-6
Comment:
**Unused imports**
`TextDecoder` and `resolveWindowsConsoleEncoding` are both imported but never referenced in this file. Encoding is performed entirely through `decodeCapturedOutputBuffer`, which calls `resolveWindowsConsoleEncoding` internally.
```suggestion
import { promisify } from "node:util";
import { decodeCapturedOutputBuffer } from "../node-host/invoke.js";
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/node-host/invoke.ts
Line: 136-143
Comment:
**Chinese comments in English codebase**
Several new comments are written in Chinese (`// 先尝试用 GBK 解码 chcp 输出(Windows 中文默认编码)`, `// 不管标称编码是什么,先检测是否为有效 UTF-8`, etc.), while the rest of the file and repository use English comments. For consistency and readability across contributors, these should be in English. The same pattern appears in `isValidUtf8Buffer` and `decodeCapturedOutputBuffer`.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (12): Last reviewed commit: "Merge branch 'main' into fix/windows-exe..."
- Import decodeCapturedOutputBuffer and resolveWindowsConsoleEncoding
- Use decodeCapturedOutputBuffer for runExec()
- Use resolveWindowsConsoleEncoding with try/catch fallback for runCommandWithTimeout()
- Properly support all Windows code pages (GBK, UTF-8, Shift-JIS, etc.)
- Add graceful fallback for unsupported encodings
- Fully address PR review feedback
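The "graceful fallback for unsupported encodings" item can be sketched roughly as follows. Everything here is an assumption for illustration: `CODE_PAGE_LABELS`, `codePageToLabel`, and `createDecoderWithFallback` are hypothetical names, and the project's actual `resolveWindowsConsoleEncoding` may map code pages differently.

```typescript
// Hypothetical table from a few Windows code pages to WHATWG encoding labels.
// The real resolveWindowsConsoleEncoding in the project may differ.
const CODE_PAGE_LABELS: Map<number, string> = new Map([
  [936, "gbk"], // Simplified Chinese
  [932, "shift_jis"], // Japanese
  [65001, "utf-8"],
]);

// Unknown code pages fall back to UTF-8 in this sketch.
function codePageToLabel(codePage: number): string {
  return CODE_PAGE_LABELS.get(codePage) ?? "utf-8";
}

// Build a decoder for the requested label, falling back to UTF-8 when the
// runtime does not ship that encoding.
function createDecoderWithFallback(label: string): TextDecoder {
  try {
    return new TextDecoder(label);
  } catch {
    // The TextDecoder constructor throws a RangeError for unsupported labels.
    return new TextDecoder("utf-8");
  }
}

const decoder = createDecoderWithFallback(codePageToLabel(936));
console.log(decoder.encoding); // "gbk" where supported, otherwise "utf-8"
```

Falling back to UTF-8 rather than throwing means command output is still returned, possibly with replacement characters, on runtimes that lack the code-page encoding.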
@greptileai review
- Export resolveWindowsConsoleEncoding for use in process/exec
- Fix TypeScript import error
- Address PR review feedback
@greptileai review
@greptileai review
Hi there! 👋 Just a gentle ping about this PR fixing the Windows exec Chinese character encoding issue. The fix uses the existing project utilities (decodeCapturedOutputBuffer and resolveWindowsConsoleEncoding) to properly handle GBK/CP936 encoding on Windows, with graceful fallback to UTF-8. I've verified the fix works correctly locally - Chinese output now displays properly! @vincentkoc Would appreciate if you could take a look when you have a moment. Thanks! 🙏 (Also, let me know if anything needs to be adjusted!)
…page 936), fixed so that each chunk is decoded with the correct encoding, resolving garbled Chinese output
@greptileai review
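The previous fix only ran the UTF-8 validity check when the nominal encoding was UTF-8. However, if a child process (for example agent-chat.cmd) switches itself to UTF-8 with chcp 65001 at startup, its output is actually UTF-8 while the encoding cached by the parent OpenClaw process is still GBK. The UTF-8 output is then decoded as GBK and is still garbled.
Fix:
- Regardless of the cached nominal encoding, first check whether the actual buffer is valid UTF-8
- If it is valid UTF-8, decode it as UTF-8 directly, which automatically handles a child process changing the code page
- If it is not valid UTF-8, try the cached nominal encoding
- This covers all scenarios: default GBK, a child process switching to UTF-8, and the user manually changing the console code page
Verification:
- agent-chat list (child process in UTF-8): Chinese displays correctly ✓
- echo 中文 (default GBK) still works ✓
- 7-Zip listing an archive with Chinese file names works ✓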
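The "check for valid UTF-8 first" strategy described in this commit could look roughly like the sketch below. It is an illustration under assumptions, not the PR's `isValidUtf8Buffer` or `decodeCapturedOutputBuffer`: the names `isValidUtf8` and `decodeUtf8First` are hypothetical, and the validity check relies on the standard `fatal: true` option of `TextDecoder`.

```typescript
// Hypothetical sketch of "UTF-8 first" decoding; not the PR's implementation.
function isValidUtf8(bytes: Uint8Array): boolean {
  try {
    // fatal: true makes decode() throw a TypeError on malformed input
    // instead of silently inserting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

function decodeUtf8First(bytes: Uint8Array, nominalEncoding: string): string {
  // Step 1: trust the bytes over the cached console encoding.
  if (isValidUtf8(bytes)) {
    return new TextDecoder("utf-8").decode(bytes);
  }
  // Step 2: otherwise try the cached nominal encoding (e.g. "gbk").
  try {
    return new TextDecoder(nominalEncoding).decode(bytes);
  } catch {
    // Step 3: label unsupported by this runtime, fall back to lossy UTF-8.
    return new TextDecoder("utf-8").decode(bytes);
  }
}

// "中文" encoded as UTF-8 while the parent still believes the console is GBK.
const utf8Output = new Uint8Array([0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87]);
console.log(decodeUtf8First(utf8Output, "gbk")); // decoded as UTF-8, not as GBK
```

One caveat of this heuristic: some byte sequences are valid in both GBK and UTF-8, so a short GBK output can in principle be misread as UTF-8. Pure ASCII output is unaffected because it decodes identically in both.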
e8f3e89 to 5c86287
@greptileai review
@greptileai review
…textdecoder
# Conflicts:
#	src/process/supervisor/adapters/child.ts
#	src/process/supervisor/adapters/pty.ts
@greptileai review
Closing this as duplicate or superseded after Codex automated review.
This PR is superseded by the open maintainer replacement #72393, which explicitly credits this TextDecoder/codepage-aware direction and tracks the same Windows CJK exec-output bug against the canonical issue #50519. Current main has not shipped the fix yet, so this is a superseded-PR close rather than implemented-on-main.
Best possible solution: Close this PR as superseded by #72393. Maintainers should continue review and validation on #72393 as the canonical Windows command-output decoding fix, preserve the credit it already gives to this contribution, keep #50519 open until the replacement lands, and split any residual PTY-specific encoding concern into a narrow follow-up only if current behavior still reproduces there.
What I checked:
So I’m closing this here and keeping the remaining discussion on the canonical linked item.
Codex Review notes: model gpt-5.5, reasoning high; reviewed against 4b9c85776d29.
Summary
Fixes the Windows exec tool garbled Chinese characters issue (#56462). Uses TextDecoder from node:util to properly handle GBK encoding on Windows, which is not natively supported as a BufferEncoding by Node.js.
Changes
- runExec(): use encoding: 'buffer' to get raw binary data
- runCommandWithTimeout(): use { stream: true } to handle multi-byte characters correctly
Testing
- Write-Output "这是中文测试" now works properly
Related Issues
Closes #56462
Closes #50519 (duplicate)