fix(encoding): correct UTF-8 handling for accented characters #68
…all tools
Windows PowerShell outputs UTF-16-LE and native tools (reg.exe, netsh,
pnputil, schtasks, sfc, dism) use the system's OEM code page (e.g.
CP1252). Node.js decodes stdout as UTF-8 by default, corrupting accented
characters like "Système" → garbled text in drive labels and elsewhere.
- Add exec-utf8.ts: central utility with psUtf8() for PowerShell and
execNativeUtf8() for native tools (chcp 65001 + cmd.exe arg escaping)
- Wrap all PowerShell -Command calls with psUtf8() (25 files)
- Route all reg/schtasks/pnputil/netsh calls through execNativeUtf8()
- Fix SFC/DISM streaming with StringDecoder for multi-byte UTF-8 chunks
- Update ASCII-only regex patterns to Unicode-aware (\p{L}\p{N}/u) so
accented display names are no longer silently rejected
- Harden execNativeUtf8 against cmd.exe shell injection by escaping
metacharacters in dynamic arguments (registry names, task paths, etc.)
Closes #66
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
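The StringDecoder fix for SFC/DISM streaming can be illustrated with a minimal sketch. It is not the PR's actual code: `decodeChunks` and the sample chunks are hypothetical, and only Node's built-in `string_decoder` module is assumed. It shows why decoding each chunk on its own corrupts a character whose UTF-8 bytes straddle a chunk boundary.

```typescript
import { StringDecoder } from "node:string_decoder";

// Hypothetical helper: decode a sequence of stdout chunks without corrupting
// characters whose UTF-8 bytes are split across two chunks.
function decodeChunks(chunks: Buffer[]): string {
  const decoder = new StringDecoder("utf8");
  let text = "";
  for (const chunk of chunks) {
    // write() holds back an incomplete trailing sequence until the next chunk.
    text += decoder.write(chunk);
  }
  // end() flushes whatever is still buffered.
  return text + decoder.end();
}

// "è" is two bytes in UTF-8 (0xC3 0xA8); split the buffer between them.
const bytes = Buffer.from("Système", "utf8");
const split = [bytes.subarray(0, 5), bytes.subarray(5)];

// Naive per-chunk decoding turns each half of "è" into U+FFFD.
const naive = split.map((chunk) => chunk.toString("utf8")).join("");
// Stateful decoding reassembles the character.
const safe = decodeChunks(split);
```

In a real stream handler the same decoder instance would be kept alive across `data` events rather than rebuilt per chunk.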
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fb12913d7a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Address PR review feedback:
- P1: execNativeUtf8 now detects % in arguments and falls back to direct execFileAsync (no cmd.exe shell) to avoid %VAR% expansion corrupting literal percent sequences like %APPDATA%\App\app.exe in registry values. The chcp 65001 code-page switch is skipped for these calls, but % in arguments occurs almost exclusively in write operations whose output is plain ASCII.
- P2: Revert WiFi profile name validation to original (block only " and control chars). Shell metacharacters like ( ) & are now safely handled by cmdEscapeArg inside execNativeUtf8, so names like "Home (5G)" are no longer incorrectly filtered out.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
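The two review fixes above can be sketched as small pure functions. Both `pickLaunchMode` and `isAllowedProfileName` are hypothetical names introduced for illustration, and the exact set of control characters rejected is an assumption; the PR's real implementation is not shown here.

```typescript
// Hypothetical launch modes: "direct" stands for execFileAsync without a
// shell, "cmd-utf8" for the cmd.exe + chcp 65001 path.
type LaunchMode = "direct" | "cmd-utf8";

function pickLaunchMode(args: readonly string[]): LaunchMode {
  // cmd.exe would expand %NAME% sequences, so any literal percent sign
  // routes the call to the shell-less path described in P1.
  return args.some((arg) => arg.includes("%")) ? "direct" : "cmd-utf8";
}

// P2: reject only double quotes and control characters.
// Assumption: "control chars" means U+0000..U+001F plus U+007F.
const FORBIDDEN_PROFILE_CHARS = /["\u0000-\u001f\u007f]/;

function isAllowedProfileName(name: string): boolean {
  return name.length > 0 && !FORBIDDEN_PROFILE_CHARS.test(name);
}
```

With these rules a registry write carrying `%APPDATA%\App\app.exe` takes the direct path, and a profile named `Home (5G)` passes validation.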
💡 Codex Review
Reviewed commit: 96154d978d
Restructure execNativeUtf8 to pass arguments through temporary environment variables (__KA0, __KA1, …) instead of concatenating user-controlled data into the cmd.exe command string. The command line now contains only hardcoded %__KAn% references, eliminating the CodeQL "Uncontrolled command line" critical finding.
Also adds a tool whitelist and fixes 3 test files that were missing vi.mock for the new exec-utf8 module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
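A rough sketch of the environment-variable approach described above, under several assumptions: `buildInvocation`, `ALLOWED_TOOLS`, the quoting around each reference, and the exact `chcp` prefix are all hypothetical, since the PR's command string and whitelist contents are not shown. It only illustrates how the command line can be assembled from fixed text while the values travel separately.

```typescript
// Hypothetical allow-list; the actual list in the PR is not shown.
const ALLOWED_TOOLS = new Set(["reg", "schtasks", "pnputil", "netsh"]);

interface CmdInvocation {
  commandLine: string; // fixed text plus %__KAn% references only
  env: Record<string, string>; // carries the actual argument values
}

function buildInvocation(tool: string, args: readonly string[]): CmdInvocation {
  if (!ALLOWED_TOOLS.has(tool)) {
    throw new Error(`tool not allowed: ${tool}`);
  }
  const env: Record<string, string> = {};
  const refs = args.map((value, index) => {
    const name = `__KA${index}`;
    env[name] = value; // the value never appears in the command string
    return `"%${name}%"`; // assumption: each reference is double-quoted
  });
  const commandLine = ["chcp 65001 >nul", "&&", tool, ...refs].join(" ");
  return { commandLine, env };
}
```

The returned `env` would be merged into the child process environment when cmd.exe is spawned. Note that cmd.exe substitutes `%NAME%` before it parses the line, so values containing double quotes would still need separate handling in this sketch.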
💡 Codex Review
Reviewed commit: f5cde65189
Add /v:off flag to cmd.exe calls in execNativeUtf8 to explicitly disable delayed expansion. Without this, systems with cmd /v:on defaults would re-expand ! characters in environment variable values, potentially corrupting Wi-Fi names, registry data, or other arguments containing exclamation marks.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Add src/main/services/exec-utf8.ts: a central encoding utility with psUtf8() for PowerShell and execNativeUtf8() for native tools, including cmd.exe argument escaping to prevent shell injection from dynamic values (registry names, task paths, WiFi names, etc.)
- Update ASCII-only regex patterns (/^[A-Za-z0-9...]/) to Unicode-aware (/^[\p{L}\p{N}...]/u) so accented display names in startup items and task names are no longer silently rejected

Changes by category
PowerShell UTF-8 (psUtf8() wrapper): forces [Console]::OutputEncoding = UTF-8 before every PS command:
- cli.ts, index.ts, ipc/index.ts, ipc/debloater.ipc.ts, ipc/disk-analyzer.ipc.ts, ipc/game-mode.ipc.ts, ipc/malware-scanner.ipc.ts, ipc/recycle-bin.ipc.ts, ipc/registry-cleaner.ipc.ts, ipc/service-manager.ipc.ts, ipc/shortcut-cleaner.ipc.ts, ipc/startup-manager.ipc.ts, platform/win32/commands.ts, platform/win32/security.ts, platform/win32/network.ts, services/cloud-agent.ts, services/perf-monitor.ts, services/program-uninstaller.ts, services/restore-point.ts, services/software-updater.ts, services/uninstall-leftovers.ts

Native tool UTF-8 (execNativeUtf8()): runs via cmd /c chcp 65001 with escaped arguments:
- reg.exe: registry-cleaner, startup-manager, privacy-shield, network-cleanup, malware-scanner, program-uninstaller, uninstall-leftovers
- schtasks: index.ts, registry-cleaner, privacy-shield
- pnputil: driver-manager
- netsh: platform/win32/network

Streaming UTF-8 (StringDecoder): prevents multi-byte character corruption from Buffer chunk splitting:
- disk-analyzer.ipc.ts (SFC + DISM progress streaming)

Unicode regex: allows accented characters in display names/task names:
- startup-manager.ipc.ts (display name validation, isSafeTaskName)
- registry-cleaner.ipc.ts (SAFE_TASK_PATH_RE)

Test plan
- npx tsc --noEmit: no new type errors introduced

🤖 Generated with Claude Code
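The Unicode regex change listed in the summary can be illustrated with a small comparison. The patterns below are illustrative only: the PR elides the rest of each character class with "...", so the punctuation allowed here (space, dot, underscore, hyphen) and the names `ASCII_NAME_RE` and `UNICODE_NAME_RE` are assumptions.

```typescript
// ASCII-only pattern: rejects any name containing an accented letter.
const ASCII_NAME_RE = /^[A-Za-z0-9][A-Za-z0-9 ._-]*$/;

// Unicode-aware pattern: \p{L} matches any letter and \p{N} any number.
// The "u" flag is required for \p{...} property escapes.
const UNICODE_NAME_RE = /^[\p{L}\p{N}][\p{L}\p{N} ._-]*$/u;

const displayName = "Mise à jour système";

const acceptedByAscii = ASCII_NAME_RE.test(displayName);
const acceptedByUnicode = UNICODE_NAME_RE.test(displayName);

// Shell metacharacters stay outside the allowed class in both patterns.
const acceptsMetachars = UNICODE_NAME_RE.test("updater & calc");
```

Widening the letter and digit classes this way keeps the validation as strict as before for punctuation while no longer rejecting names such as the French example.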