Skip to content

fix(update): protect dashboard wmic scan against UnicodeDecodeError on Windows non-UTF-8 locales (#17049)#17123

Closed
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/update-stale-dashboard-wmic-encoding-17049
Closed

fix(update): protect dashboard wmic scan against UnicodeDecodeError on Windows non-UTF-8 locales (#17049)#17123
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/update-stale-dashboard-wmic-encoding-17049

Conversation

@briandevans

Copy link
Copy Markdown
Contributor

Summary

  • hermes update's _warn_stale_dashboard_processes() shells out to wmic with text=True and no encoding=, so on Windows non-UTF-8 locales (e.g. cp936) the subprocess reader thread crashes with UnicodeDecodeError, leaves result.stdout = None, and hermes update aborts with the exact AttributeError: 'NoneType' object has no attribute 'split' reported in [Bug]: UnicodeDecodeError and AttributeError in Windows gateway process scanning #17049.
  • Pass encoding="utf-8", errors="ignore" (matching the ASCII-only key parsing the function does), and add a result.stdout is None defensive guard.
  • New regression tests reproduce the AttributeError on clean main and pass with the fix.

The bug

The reported #17049 stack trace lists two errors stacked:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 ...   (reader thread)
...
File "hermes_cli/gateway.py", line 290, in _scan_gateway_pids
  for line in result.stdout.split("\n"):
AttributeError: 'NoneType' object has no attribute 'split'      (main thread)

The same buggy pattern exists in two places:

File:line Function Triggered by
hermes_cli/gateway.py:279 _scan_gateway_pids hermes setup
hermes_cli/main.py:5239 _warn_stale_dashboard_processes hermes update

#17074 patches the gateway one. This PR patches the second one. Until both land, Chinese-locale (and other non-UTF-8) Windows users remain unable to run hermes update — they hit the AttributeError before the upgrade can warn-or-skip its way to completion.

The fix

hermes_cli/main.py:

result = subprocess.run(
    ["wmic", "process", "get", "ProcessId,CommandLine", "/FORMAT:LIST"],
    capture_output=True, text=True, timeout=10,
    encoding="utf-8", errors="ignore",
)
if result.returncode != 0 or result.stdout is None:
    return
  • errors="ignore" is safe here because the parser only matches the literal ASCII prefixes CommandLine= and ProcessId=. Dropping undecodable bytes can never produce a false positive PID match — and it cannot regress the existing tests, which all use plain ASCII fixtures.
  • The is None guard is defensive cover for any future / older Python where the reader thread still fails for some other reason.

Test plan

  • Focused regression tests added (tests/hermes_cli/test_update_stale_dashboard.py::TestWindowsWmicEncoding):
    • test_wmic_invoked_with_utf8_ignore_errors — asserts subprocess.run is called with encoding="utf-8", errors="ignore".
    • test_wmic_returns_none_stdout_does_not_crash — simulates the post-decode-crash result.stdout=None state and asserts no AttributeError.
  • Both new tests fail against clean origin/main (7d46484) with AttributeError: 'NoneType' object has no attribute 'split' (the exact error from [Bug]: UnicodeDecodeError and AttributeError in Windows gateway process scanning #17049), proving the regression guard.
  • All 10 existing _warn_stale_dashboard_processes tests still pass.
  • Adjacent suites: tests/hermes_cli/test_cmd_update.py, test_update_autostash.py, test_update_check.py, test_update_gateway_restart.py, test_update_hangup_protection.py, test_update_config_clears_custom_fields.py — 93 pass; the 3 unrelated failures (test_update_refreshes_repo_and_tui_node_dependencies, test_cmd_update_succeeds_with_extras, test_cmd_update_retries_optional_extras_individually_when_all_fails) reproduce identically on clean origin/main and do not touch the wmic scan.

Related / Positioning

PR Scope Difference
#17074 (oyoyo) hermes_cli/gateway.py::_scan_gateway_pids Fixes the hermes setup callsite; gateway PID detection.
This PR hermes_cli/main.py::_warn_stale_dashboard_processes Fixes the hermes update callsite; dashboard staleness warning. Adds a Windows-locale regression test (the existing test file only covered the macOS/Linux ps branch).

Same root cause, two callsites — these are intentionally separate, complementary PRs. There is no overlap in touched lines.

Related

…n Windows non-UTF-8 locales (NousResearch#17049)

`hermes update` calls `_warn_stale_dashboard_processes()` to warn about
dashboard processes still running the pre-update Python backend. On
Windows, that scan shells out to `wmic process get ProcessId,CommandLine
/FORMAT:LIST` with `text=True` and no explicit encoding.

`wmic` emits text in the system code page (e.g. cp936 on zh-CN locales),
not UTF-8. Without an explicit `encoding=`, Python's default UTF-8
decoder crashes the subprocess reader thread with
`UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 ...`. In
Python 3.11 that crash is silently absorbed: `subprocess.run()` returns
a `CompletedProcess` with `result.stdout = None`, the next line calls
`result.stdout.split("\n")`, and `hermes update` aborts with the
exact `AttributeError: 'NoneType' object has no attribute 'split'`
trace reported in NousResearch#17049.

Fix: pass `encoding="utf-8", errors="ignore"` so undecodable bytes
cannot take down the reader thread (the parsing only matches the ASCII
prefixes `CommandLine=` and `ProcessId=`, so dropping non-UTF-8 bytes
is safe), and short-circuit when `result.stdout is None` as a defensive
guard for environments where the reader thread still fails for other
reasons.

This is the same root cause as NousResearch#17074 (which patches
`hermes_cli/gateway._scan_gateway_pids` for the `hermes setup` path).
That PR does not touch `_warn_stale_dashboard_processes`, so
`hermes update` remains broken on the same locales until this lands.

Regression test in `tests/hermes_cli/test_update_stale_dashboard.py`:
- `test_wmic_invoked_with_utf8_ignore_errors` asserts the explicit
  encoding/errors kwargs reach `subprocess.run`.
- `test_wmic_returns_none_stdout_does_not_crash` simulates the
  reader-thread-crashed `result.stdout=None` aftermath and asserts the
  function returns silently instead of raising AttributeError.

Both new tests fail against clean origin/main (7d46484) reproducing
the original AttributeError; both pass with this patch. The remaining
3 failures in `tests/hermes_cli/test_cmd_update.py` and
`test_update_autostash.py` are pre-existing baselines on origin/main —
they reproduce identically without this change and are unrelated to
the wmic scan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 19:13

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Windows-specific crash in hermes update when scanning for stale dashboard processes via wmic, where non‑UTF‑8 output could trigger a UnicodeDecodeError and cascade into stdout=NoneAttributeError.

Changes:

  • Make the Windows wmic subprocess call decode deterministically with encoding="utf-8", errors="ignore" and defensively return when stdout is None.
  • Add regression tests covering the encoding/errors invocation and the stdout=None scenario.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
hermes_cli/main.py Hardens the Windows wmic branch in _warn_stale_dashboard_processes() to avoid decode-thread crashes and None stdout handling.
tests/hermes_cli/test_update_stale_dashboard.py Adds Windows-locale regression tests ensuring wmic is invoked with the right decoding settings and that stdout=None does not crash.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread hermes_cli/main.py Outdated
Comment on lines +5239 to +5244
# wmic emits text in the system code page (e.g. cp936 on zh-CN
# locales), not UTF-8. Without an explicit encoding, Python's
# default UTF-8 decoder crashes the subprocess reader thread
# with UnicodeDecodeError, leaving result.stdout=None and the
# subsequent .split() call crashing `hermes update` with an
# AttributeError on Windows non-UTF-8 locales (#17049).

Copilot AI Apr 28, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanatory comment is a bit misleading: the failure in #17049 happens when the subprocess output is decoded as UTF-8 (e.g., Python UTF-8 mode), and the key fix here is errors="ignore" (not merely “explicit encoding”). Consider rewording to reflect that the default encoding may be locale-dependent and that errors is what prevents the reader-thread UnicodeDecodeError / stdout=None cascade.

Suggested change
# wmic emits text in the system code page (e.g. cp936 on zh-CN
# locales), not UTF-8. Without an explicit encoding, Python's
# default UTF-8 decoder crashes the subprocess reader thread
# with UnicodeDecodeError, leaving result.stdout=None and the
# subsequent .split() call crashing `hermes update` with an
# AttributeError on Windows non-UTF-8 locales (#17049).
# wmic may emit text in the system code page (for example cp936
# on zh-CN systems), not UTF-8. In text mode, subprocess output
# decoding depends on Python's configuration (locale-dependent by
# default, or UTF-8 in UTF-8 mode). The important protection here
# is errors="ignore": it prevents a reader-thread
# UnicodeDecodeError from leaving result.stdout=None and turning
# the later .split() into an AttributeError (#17049).

Copilot uses AI. Check for mistakes.
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/cli CLI entry point, hermes_cli/, setup wizard labels Apr 28, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Related to #17074 and #17049 — same wmic UnicodeDecodeError root cause. This PR fixes the second call site in _warn_stale_dashboard_processes() that #17074 doesn't cover.

…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans

Copy link
Copy Markdown
Contributor Author

@copilot Finding addressed in commit b7c97fdca:

Finding (line 5239 — explanatory comment is misleading): Reworded the inline comment to accurately describe the failure mode. Your reading is correct — the failure isn't "Python's default UTF-8 decoder crashes" (the default is locale-dependent, not always UTF-8); it's that whatever encoding subprocess.PIPE selects at the reader-thread level cannot decode wmic's locale-emitted bytes, and errors="ignore" is the actual load-bearing protection. Comment now says "In text mode, subprocess output decoding depends on Python's configuration (locale-dependent by default, or UTF-8 in UTF-8 mode). The important protection here is errors="ignore"".

Drive-by: Also fixed the new TestWindowsWmicEncoding tests, which were passing locally on Python 3.12 but failing under CI's xdist+Python 3.11. Switched from patch("hermes_cli.main.sys") to the codebase's canonical monkeypatch.setattr(sys, "platform", "win32") (matches tests/hermes_cli/test_auth_ssl_macos.py). Both regression tests now pass alongside the rest of tests/hermes_cli/test_update_stale_dashboard.py (12/12 green).

@briandevans

Copy link
Copy Markdown
Contributor Author

CI audit — all 44 test job failures on commit b7c97fdca are pre-existing baselines on clean origin/main (a830f25f7). Zero failures intersect with touched code (hermes_cli/main.py::_warn_stale_dashboard_processes or tests/hermes_cli/test_update_stale_dashboard.py).

The two failures that look dashboard-related are also baselines:

Test Symptom Root cause on main
tests/hermes_cli/test_update_autostash.py::test_cmd_update_succeeds_with_extras AttributeError: 'types.SimpleNamespace' object has no attribute 'stdout' at hermes_cli/main.py:5309 Test fixture mocks subprocess.run to return SimpleNamespace(returncode=0) (no stdout attribute). Commit 66b114238 ("fix(cli): warn about stale dashboard processes after hermes update") added a new subprocess.run call inside _cmd_update_impl that reads result.stdout, but didn't extend the autostash test fixtures to include stdout="". Reproduces on clean origin/main with identical traceback.
tests/hermes_cli/test_update_autostash.py::test_cmd_update_retries_optional_extras_individually_when_all_fails same as above same as above

The Nix failures (nix (ubuntu-latest), nix-lockfile-check) are also infra baselines: npm-deps.drv hash mismatch + fix-lockfiles script crashed without reporting. Neither is in touched code.

Happy to put up a separate one-line fix for the autostash fixture if useful — adding stdout="" to the two SimpleNamespace(returncode=0) mocks would lift the baseline.

teknium1 pushed a commit that referenced this pull request Apr 29, 2026
…ys.platform

Two fix-ups for #17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@teknium1

Copy link
Copy Markdown
Contributor

Merged via #17435 (#17435) alongside @Kailigithub's companion fix for the sibling wmic call site in gateway.py. Both your commits were cherry-picked with your authorship preserved (rebase-merge), including the regression tests. Thanks!

@teknium1 teknium1 closed this Apr 29, 2026
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…ys.platform

Two fix-ups for NousResearch#17123:

1. Reword the inline comment in `_warn_stale_dashboard_processes` to
   accurately describe the failure mode (locale-dependent decoder, not a
   "default UTF-8 decoder") and identify `errors="ignore"` as the
   load-bearing protection. Per Copilot's review.

2. Switch `TestWindowsWmicEncoding` from `patch("hermes_cli.main.sys")`
   to `monkeypatch.setattr(sys, "platform", "win32")` — the codebase's
   canonical pattern (e.g. `tests/hermes_cli/test_auth_ssl_macos.py`).
   The MagicMock-replacement approach passed locally on Python 3.12 but
   the platform-equality check failed under CI's xdist+Python 3.11,
   leaving both new tests red despite the fix being present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: UnicodeDecodeError and AttributeError in Windows gateway process scanning

4 participants