fix: use TextDecoder for proper GBK encoding support on Windows #56538

Closed
knightplat-blip wants to merge 54 commits into openclaw:main from knightplat-blip:fix/windows-exec-gbk-textdecoder

Conversation

@knightplat-blip

Summary

Fixes garbled Chinese characters in the Windows exec tool output (#56462). Uses TextDecoder from node:util to decode GBK output on Windows, because GBK is not one of the BufferEncoding values that Node.js supports natively.

Changes

  1. Import TextDecoder from node:util
  2. Fix runExec():
    • Use encoding: 'buffer' to get raw binary data
    • Decode with TextDecoder using platform-specific encoding (GBK on Windows, UTF-8 elsewhere)
  3. Fix runCommandWithTimeout():
    • Create TextDecoder instance once outside stream callbacks
    • Use streaming decode with { stream: true } to handle multi-byte characters correctly
    • Final decode on 'end' event to flush remaining bytes
  4. Add Bun compatibility (try/catch fallback to UTF-8 if GBK not supported)
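
A minimal sketch of the streaming decode and fallback described in the list above, assuming a runtime whose TextDecoder supports the `gbk` label (official Node.js builds with full ICU do). The helper names `createDecoder` and `decodeChunks` are hypothetical and are not taken from this PR's diff.

```typescript
// Hypothetical helper: build a decoder for the requested label and fall back
// to UTF-8 when the runtime rejects it (the Bun compatibility case above).
function createDecoder(label: string): TextDecoder {
  try {
    return new TextDecoder(label);
  } catch {
    return new TextDecoder("utf-8");
  }
}

// Hypothetical helper: decode a sequence of chunks with ONE shared decoder so
// that a multi-byte character split across two chunks is reassembled.
function decodeChunks(chunks: Uint8Array[], label: string): string {
  const decoder = createDecoder(label);
  let text = "";
  for (const chunk of chunks) {
    // { stream: true } keeps an incomplete trailing byte sequence buffered.
    text += decoder.decode(chunk, { stream: true });
  }
  // A final call without arguments flushes whatever is still buffered.
  text += decoder.decode();
  return text;
}

// "中文" in GBK is D6 D0 CE C4; the first character is split across chunks.
const firstChunk = new Uint8Array([0xd6]);
const secondChunk = new Uint8Array([0xd0, 0xce, 0xc4]);
const streamed = decodeChunks([firstChunk, secondChunk], "gbk");
```

Creating a new decoder inside each `data` callback would lose the buffered lead byte, which is why the decoder is created once and reused.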

Testing

  • Tested on Windows 1 with PowerShell 7.6.0
  • Verified Chinese output displays correctly: Write-Output "这是中文测试" now works properly
  • Verified backwards compatibility: UTF-8 still works on non-Windows platforms

Related Issues

Closes #56462
Closes #50519 (duplicate)

- Import TextDecoder from node:util
- Fix runExec(): use encoding: 'buffer' and decode with TextDecoder
- Fix runCommandWithTimeout(): create decoder once, use stream decoding
- Handle multi-byte Chinese characters correctly with streaming decode
- Add Bun compatibility with try/catch fallback
@greptile-apps
Contributor

greptile-apps Bot commented Mar 28, 2026

Greptile Summary

This PR fixes garbled Chinese characters on Windows by switching to encoding: 'buffer' in exec calls and decoding via a UTF-8 validity check + TextDecoder with the detected code-page encoding. It also addresses previous review feedback: resolveWindowsConsoleEncoding is now exported, isValidUtf8Buffer does a full round-trip comparison (no partial sampling), the catch fallback is utf-8 instead of hardcoded GBK, and child.ts uses TextDecoder with { stream: true } to handle multi-byte characters across chunk boundaries.
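
As an illustration of the round-trip comparison mentioned above, here is a small sketch. It is not the PR's `isValidUtf8Buffer`; the function name `isValidUtf8` and its use of `Uint8Array` with the standard `TextDecoder`/`TextEncoder` pair are assumptions made to keep the example self-contained.

```typescript
// Hypothetical round-trip check: decode leniently, re-encode, and compare
// every byte. Invalid sequences decode to U+FFFD, which re-encodes to
// different bytes, so the comparison fails for non-UTF-8 input.
function isValidUtf8(bytes: Uint8Array): boolean {
  // ignoreBOM: true keeps a leading BOM in the decoded text so that it
  // survives the round trip instead of being silently dropped.
  const text = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
  const roundTrip = new TextEncoder().encode(text);
  if (roundTrip.length !== bytes.length) {
    return false;
  }
  return roundTrip.every((byte, index) => byte === bytes[index]);
}

const utf8Sample = new TextEncoder().encode("中文");
const gbkSample = new Uint8Array([0xd6, 0xd0, 0xce, 0xc4]); // "中文" in GBK
const utf8Ok = isValidUtf8(utf8Sample);
const gbkOk = isValidUtf8(gbkSample);
```

Note that a check like this can only prove a buffer is well-formed UTF-8. Some short GBK byte sequences also happen to be well-formed UTF-8, so the result is a heuristic rather than a guarantee.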

Confidence Score: 5/5

Safe to merge; all remaining findings are minor style issues.

The core encoding logic is sound — UTF-8 validity is checked before falling back to the code-page encoding, the catch fallback is now UTF-8 (not GBK), streaming decode uses { stream: true }, and previous reviewer concerns have been addressed. Only P2 style findings remain (unused imports in exec.ts, Chinese comments).

No files require special attention.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/process/exec.ts
Line: 5-6

Comment:
**Unused imports**

`TextDecoder` and `resolveWindowsConsoleEncoding` are both imported but never referenced in this file. Encoding is performed entirely through `decodeCapturedOutputBuffer`, which calls `resolveWindowsConsoleEncoding` internally.

```suggestion
import { promisify } from "node:util";
import { decodeCapturedOutputBuffer } from "../node-host/invoke.js";
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/node-host/invoke.ts
Line: 136-143

Comment:
**Chinese comments in English codebase**

Several new comments are written in Chinese (`// 先尝试用 GBK 解码 chcp 输出(Windows 中文默认编码)`, `// 不管标称编码是什么,先检测是否为有效 UTF-8`, etc.), while the rest of the file and repository use English comments. For consistency and readability across contributors, these should be in English. The same pattern appears in `isValidUtf8Buffer` and `decodeCapturedOutputBuffer`.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (12): Last reviewed commit: "Merge branch 'main' into fix/windows-exe..." | Re-trigger Greptile

Comment thread src/process/exec.ts Outdated
Comment thread src/process/exec.ts Outdated
heiqishi666 and others added 2 commits March 29, 2026 01:37
- Import decodeCapturedOutputBuffer and resolveWindowsConsoleEncoding
- Use decodeCapturedOutputBuffer for runExec()
- Use resolveWindowsConsoleEncoding with try/catch fallback for runCommandWithTimeout()
- Properly support all Windows code pages (GBK, UTF-8, Shift-JIS, etc.)
- Add graceful fallback for unsupported encodings
- Fully address PR review feedback
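
To make the code-page handling in the commit message above concrete, the following sketch maps the number printed by `chcp` to a TextDecoder label. It is not the project's `resolveWindowsConsoleEncoding`; the function name `encodingFromChcp` and the table contents are assumptions, and the table is only an illustrative subset.

```typescript
// Illustrative subset of Windows code pages mapped to WHATWG encoding labels.
// This table is an assumption for the example, not the project's own mapping.
const CODE_PAGE_LABELS: Record<number, string> = {
  932: "shift_jis",
  936: "gbk",
  949: "euc-kr",
  950: "big5",
  1252: "windows-1252",
  65001: "utf-8",
};

// Hypothetical helper: take the trailing number from `chcp` output such as
// "Active code page: 936" and return a label, defaulting to UTF-8.
function encodingFromChcp(chcpOutput: string): string {
  const match = /(\d+)\s*$/.exec(chcpOutput.trim());
  const codePage = match ? Number(match[1]) : NaN;
  return CODE_PAGE_LABELS[codePage] ?? "utf-8";
}

const fromEnglish = encodingFromChcp("Active code page: 936\r\n");
const fromChinese = encodingFromChcp("活动代码页: 65001");
const fromUnknown = encodingFromChcp("unexpected output");
```

Matching only the trailing digits avoids depending on the localized wording of the `chcp` message.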
@knightplat-blip
Author

@greptileai review

Comment thread src/process/exec.ts
- Export resolveWindowsConsoleEncoding for use in process/exec
- Fix TypeScript import error
- Address PR review feedback
@knightplat-blip
Author

@greptileai review

@knightplat-blip
Author

@greptileai review

@knightplat-blip
Author

Hi there! 👋

Just a gentle ping about this PR fixing the Windows exec Chinese character encoding issue.

The fix uses the existing project utilities (decodeCapturedOutputBuffer and resolveWindowsConsoleEncoding) to properly handle GBK/CP936 encoding on Windows, with graceful fallback to UTF-8.

I've verified the fix works correctly locally - Chinese output now displays properly!

@vincentkoc Would appreciate if you could take a look when you have a moment. Thanks! 🙏

(Also, let me know if anything needs to be adjusted!)

@knightplat-blip
Author

@greptileai review

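The previous fix ran the UTF-8 validity check only when the nominal encoding was UTF-8. However, if a child process
(for example agent-chat.cmd) switches to UTF-8 at startup with chcp 65001, its output is actually UTF-8 while the
encoding cached by the parent OpenClaw process is still GBK, so UTF-8 bytes are decoded as GBK and the output is still garbled.

Fix:
- Regardless of the cached nominal encoding, first check whether the actual buffer is valid UTF-8
- If it is valid UTF-8, decode it as UTF-8 directly, which automatically handles a child process changing the code page
- If it is not valid UTF-8, try the cached nominal encoding
- This covers all scenarios: default GBK, a child process switching to UTF-8, and a user manually changing the console code page

Verification:
- agent-chat list (child process emits UTF-8): Chinese displays correctly ✓
- echo 中文 (default GBK): still works ✓
- 7-Zip listing an archive with Chinese file names: works ✓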
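
The decode order described in this commit message can be sketched as follows. This is a simplified illustration, not the PR's `decodeCapturedOutputBuffer`: the function name `decodeOutput` is hypothetical, and it swaps in TextDecoder's `fatal: true` option as the UTF-8 validity check. It also assumes the runtime supports the `gbk` label.

```typescript
// Hypothetical helper: try strict UTF-8 first, then the cached nominal
// encoding, and finally lenient UTF-8 if that encoding label is unsupported.
function decodeOutput(bytes: Uint8Array, nominalEncoding: string): string {
  try {
    // fatal: true throws a TypeError on any malformed UTF-8 sequence.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    // Not valid UTF-8: fall through to the nominal (cached) encoding.
  }
  try {
    return new TextDecoder(nominalEncoding).decode(bytes);
  } catch {
    // Unsupported label: lenient UTF-8 is the last resort.
    return new TextDecoder("utf-8").decode(bytes);
  }
}

// A child process that switched to UTF-8 while the cached encoding is GBK.
const fromUtf8Child = decodeOutput(new TextEncoder().encode("中文"), "gbk");
// Default console output in GBK (D6 D0 CE C4 is "中文").
const fromGbkConsole = decodeOutput(new Uint8Array([0xd6, 0xd0, 0xce, 0xc4]), "gbk");
```

Checking the bytes before trusting the cached code page is what lets both cases decode correctly without re-querying the console encoding for every command.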
@knightplat-blip knightplat-blip force-pushed the fix/windows-exec-gbk-textdecoder branch from e8f3e89 to 5c86287 Compare March 29, 2026 18:06
@knightplat-blip
Author

@greptileai review

@knightplat-blip
Author

@greptileai review

@greptile-apps
Contributor

greptile-apps Bot commented Apr 8, 2026

Tip:

Greploops — Automatically fix all review issues by running /greploops in Claude Code. It iterates: fix, push, re-review, repeat until 5/5 confidence.

Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal.

@knightplat-blip
Author

@greptileai review

@clawsweeper
Contributor

clawsweeper Bot commented Apr 26, 2026

Closing this as duplicate or superseded after Codex automated review.

This PR is superseded by the open maintainer replacement #72393, which explicitly credits this TextDecoder/codepage-aware direction and tracks the same Windows CJK exec-output bug against the canonical issue #50519. Current main has not shipped the fix yet, so this is a superseded-PR close rather than implemented-on-main.

Best possible solution:

Close this PR as superseded by #72393. Maintainers should continue review and validation on #72393 as the canonical Windows command-output decoding fix, preserve the credit it already gives to this contribution, keep #50519 open until the replacement lands, and split any residual PTY-specific encoding concern into a narrow follow-up only if current behavior still reproduces there.

What I checked:

So I’m closing this here and keeping the remaining discussion on the canonical linked item.

Codex Review notes: model gpt-5.5, reasoning high; reviewed against 4b9c85776d29.

@clawsweeper clawsweeper Bot closed this Apr 27, 2026
@knightplat-blip knightplat-blip deleted the fix/windows-exec-gbk-textdecoder branch April 29, 2026 04:14
@knightplat-blip knightplat-blip restored the fix/windows-exec-gbk-textdecoder branch April 29, 2026 04:14

2 participants