Skip to content

fix: exclude parent process chain from concurrent instance detection on Windows#31712

Closed
Strontvod wants to merge 1 commit into
NousResearch:mainfrom
Strontvod:fix/windows-concurrent-instance-parent-chain
Closed

fix: exclude parent process chain from concurrent instance detection on Windows#31712
Strontvod wants to merge 1 commit into
NousResearch:mainfrom
Strontvod:fix/windows-concurrent-instance-parent-chain

Conversation

@Strontvod

Copy link
Copy Markdown
Contributor

Bug Description

On Windows, every hermes update reports a false positive concurrent instance error and aborts, because the setuptools-generated hermes.exe launcher is detected as a separate running instance of Hermes.

Root Cause

The setuptools launcher (hermes.exe) is a separate native process that spawns python.exe (the interpreter that runs the update code). os.getpid() returns the Python PID, but the launcher (which holds the file lock that prevents the update from renaming/quarantining the .exe) is the parent process. The old code only excluded the Python PID itself via exclude_pid = os.getpid(), so the launcher process was always flagged as a concurrent instance.

Fix

Build an exclusion set containing the Python process and its entire ancestor chain by walking psutil.Process.parent() upward. This ensures the running hermes update invocation never reports any process in its own tree — whether it's the Python interpreter, the .exe launcher, or any shell in between.

How to Verify

  1. On a Windows machine, run hermes update without this fix — it fails with a concurrent-instance error.
  2. Apply the patch — the same command succeeds.
  3. Running hermes update with a second Hermes process (e.g. desktop app) still correctly blocks, since the other process is outside this process's parent chain.

Test Plan

  • Existing tests pass
  • Manual verification on Windows (the existing test suite mocks psutil; the real fix is exercised by the update flow itself)

Risk Assessment

Low — Only affects the Windows concurrent-instance detection path. The change expands the exclusion set from a single PID to the full parent chain; all other logic (shim comparison, process enumeration) is unchanged. Off-Windows the function returns [] early, so no effect on Linux/macOS.

…on Windows

On Windows, the setuptools-generated hermes.exe launcher is a separate native
process that spawns python.exe (the interpreter running the update code).
os.getpid() returns the Python PID, but the launcher (which holds the file
lock) is the parent. Without walking the parent chain, every 'hermes update'
reports its own launcher as a concurrent instance - a false positive.

This patch builds an exclusion set containing the Python process and its
entire ancestor chain, so the running invocation never reports itself.
@alt-glitch alt-glitch added type/bug Something isn't working comp/cli CLI entry point, hermes_cli/, setup wizard P2 Medium — degraded but workaround exists labels May 24, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Salvage of #28819 + #31315, fixing P0 data-loss issue #28818. Supersedes #31315, #30664, #29242, #31358 (all competing fixes for the same kanban scratch-cleanup data loss vector).

@Strontvod

Copy link
Copy Markdown
Contributor Author

Hey @alt-glitch was this meant for a different PR? #31712 is about the Windows concurrent instance false-positive fix, not kanban scratch cleanup. Just flagging in case it landed here by accident.

@teknium1

Copy link
Copy Markdown
Contributor

Merged via #31806 — your commit was cherry-picked onto current main with authorship preserved (323cce7e9). Thanks for the fix!

Follow-up commit (46f8948ba) on top:

  • Broadened the parent-walk except to bare Exception. Heads-up: the original except (psutil.NoSuchProcess, psutil.AccessDenied, ValueError) evaluates those attributes lazily at handler-match time — if psutil is a partial stub (as in the existing test_update_concurrent_quarantine.py tests, which use SimpleNamespace(process_iter=...)), the handler itself raises AttributeError and escapes. That broke 4 existing tests; the broader catch fixes them without touching the test fixtures, and matches the helper's documented "never raises" contract.
  • Added 5 tests for the parent-chain walk: launcher exclusion, unrelated sibling still reported, 3-deep ancestry, cycle handling, partial-stub graceful degrade.
  • AUTHOR_MAP entry mapping your email to @Strontvod for the release notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants