Skip to content

fix(node-host): decode Windows exec output with active code page#30652

Merged
Takhoffman merged 1 commit intoopenclaw:mainfrom
Sid-Qin:fix/30640-windows-exec-encoding
Mar 2, 2026
Merged

fix(node-host): decode Windows exec output with active code page#30652
Takhoffman merged 1 commit intoopenclaw:mainfrom
Sid-Qin:fix/30640-windows-exec-encoding

Conversation

@Sid-Qin
Copy link
Contributor

@Sid-Qin Sid-Qin commented Mar 1, 2026

Summary

  • Problem: exec / system.run output was always decoded as UTF-8, so Windows shells running on legacy code pages (for example CP936/GBK) rendered Chinese text as garbled characters.
  • Why it matters: Users can run commands successfully but cannot trust tool output, which blocks debugging and file verification workflows on Windows.
  • What changed: src/node-host/invoke.ts now resolves the active Windows console code page (chcp) once per process and decodes captured stdout/stderr buffers with the matching encoding before returning results.
  • What did NOT change: Command execution policy, allowlist/security checks, and non-Windows decoding behavior remain unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • On Windows, system.run/exec output now preserves non-UTF-8 console text (e.g. GBK Chinese output) instead of showing replacement characters.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: macOS (dev) + Windows code page behavior validated via unit tests
  • Runtime: Node.js 24.x
  • Integration/channel: node-host system.run path

Steps

  1. On Windows with code page 936, execute a command that emits Chinese text through exec/system.run.
  2. Observe output rendering before vs after this patch.

Expected

  • Output keeps original Chinese characters.

Actual

  • Before fix: output contained garbled characters (/乱码).
  • After fix: output is decoded with the active console code page and renders correctly.

Evidence

  • pnpm exec vitest run src/node-host/invoke.sanitize-env.test.ts
  • pnpm exec vitest run src/node-host/invoke-system-run.test.ts

Human Verification (required)

  • Verified scenarios: code page parsing from chcp output; GBK decoding path; existing node-host system-run tests.
  • Edge cases checked: unknown/unsupported encoding falls back to UTF-8; non-Windows platforms remain UTF-8.
  • What I did not verify: live manual run on a physical Windows machine in this environment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: revert this PR commit.
  • Files/config to restore: src/node-host/invoke.ts and test updates.
  • Known bad symptoms: Windows output may revert to UTF-8-only decoding and show mojibake again.

Risks and Mitigations

  • Risk: chcp output may be unavailable in some shells.
  • Mitigation: decoding safely falls back to UTF-8 when code page resolution fails.

Decode system.run stdout/stderr on Windows using the current console code page so
non-UTF-8 shells (for example CP936/GBK PowerShell) no longer produce garbled text.
Adds regression coverage for code page parsing and GBK output decoding.

Closes openclaw#30640
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a603fc8c09

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +254 to +257
const stdout = decodeCapturedOutputBuffer({
buffer: Buffer.concat(stdoutChunks),
windowsEncoding,
});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid decoding all Windows output as console code page

runCommand now always decodes captured Windows output with windowsEncoding, which is derived from chcp and applied to every spawned process. On non-UTF-8 consoles, this mis-decodes tools that already emit UTF-8 bytes (for example Node-based CLIs and other cross-platform binaries), so valid UTF-8 output is converted into mojibake; this is a behavior regression from the previous UTF-8 decoding path for those commands. Restricting code-page decoding to cmd.exe-style output (or adding UTF-8 fallback heuristics per process) would avoid corrupting these cases.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 1, 2026

Greptile Summary

This PR fixes Windows command output decoding by detecting and using the active console code page instead of always assuming UTF-8. The implementation adds a chcp detection mechanism that runs once per process and caches the result, then decodes captured stdout/stderr buffers with the appropriate encoding (e.g., GBK, Shift_JIS) before returning results.

Key changes:

  • Added parseWindowsCodePage() to extract code page numbers from chcp output
  • Added resolveWindowsConsoleEncoding() to detect Windows console encoding once per process with caching
  • Added decodeCapturedOutputBuffer() to decode buffers using the detected encoding with UTF-8 fallback
  • Modified runCommand() to collect raw Buffer chunks instead of strings, then decode at finalization time
  • Added comprehensive test coverage for code page parsing and GBK decoding

Design strengths:

  • Robust error handling with UTF-8 fallback at every layer
  • Non-Windows platforms unchanged (early return)
  • UTF-8 code page (65001) explicitly handled to maintain existing behavior
  • Module-level caching prevents redundant chcp calls
  • Buffer-based collection preserves raw bytes for correct decoding

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk - it's a well-contained bug fix with appropriate fallbacks and no security surface changes.
  • The implementation is solid with excellent error handling, comprehensive test coverage, and backward compatibility. Score is 4 instead of 5 only because manual verification on a physical Windows machine with legacy code pages hasn't been performed (as noted in PR description), though unit tests provide strong confidence.
  • No files require special attention - both changes are well-implemented with appropriate error handling and test coverage.

Last reviewed commit: a603fc8

@Takhoffman Takhoffman merged commit 7b5a410 into openclaw:main Mar 2, 2026
28 of 29 checks passed
@Takhoffman
Copy link
Contributor

PR #30652 - fix(node-host): decode Windows exec output with active code page (#30652)

Merged via squash.

  • Merge commit: 7b5a410
  • Verified: pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts
  • Changelog: no entry required for this canonicalization run.

Thanks @Sid-Qin!

hanqizheng pushed a commit to hanqizheng/openclaw that referenced this pull request Mar 2, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
obviyus pushed a commit that referenced this pull request Mar 2, 2026
) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
Linux2010 pushed a commit to Linux2010/openclaw that referenced this pull request Mar 2, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
execute008 pushed a commit to execute008/openclaw that referenced this pull request Mar 2, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
dawi369 pushed a commit to dawi369/davis that referenced this pull request Mar 3, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
OWALabuy pushed a commit to kcinzgg/openclaw that referenced this pull request Mar 4, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026
…nclaw#30652) thanks @Sid-Qin

Verified:
- pnpm vitest run src/node-host/invoke.sanitize-env.test.ts src/node-host/invoke-system-run.test.ts

Co-authored-by: Sid-Qin <53659198+Sid-Qin@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] exec tool output encoding issue on Windows - Chinese characters display as garbled text

2 participants