fix(run_agent): gate concurrent checkpoint preflight on block_result (fixes #34827) #35255

Merged: teknium1 merged 1 commit into main from hermes/hermes-04847aaf on May 30, 2026

Conversation

@teknium1

Contributor

Summary

On Windows, hermes update no longer flags its own launcher shim as a concurrent instance, closing the recurring "Another hermes.exe is running" false positive (#29341, #34795).

Root cause: the distlib Scripts\hermes.exe launcher spawns python.exe and waits; detection runs in the python child, so the launcher shim appears in process_iter. The prior fix walked ancestors one hop at a time, calling current.parent() inside a try block whose except clause did break. The first psutil AccessDenied/NoSuchProcess (common on Windows across session/elevation boundaries) therefore ended the walk early and left the launcher in the candidate set.
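The failure mode described above can be reproduced in a minimal sketch. It does not use the real hermes_cli code or psutil: `FakeProc`, `AccessDenied`, and `ancestors_per_hop` are hypothetical stand-ins, and the walk is only an assumed reconstruction of the per-hop pattern the description names.

```python
class AccessDenied(Exception):
    """Stand-in for psutil.AccessDenied so the sketch runs without psutil."""


class FakeProc:
    """Hypothetical process stub exposing only what the walk needs."""

    def __init__(self, pid, parent=None, parent_readable=True):
        self.pid = pid
        self._parent = parent
        self._parent_readable = parent_readable

    def parent(self):
        if not self._parent_readable:
            raise AccessDenied(self.pid)
        return self._parent


def ancestors_per_hop(proc):
    """Collect ancestor PIDs one hop at a time, bailing on the first error."""
    pids = set()
    current = proc
    while True:
        try:
            current = current.parent()
        except Exception:
            break  # one unreadable hop ends the whole walk
        if current is None:
            break
        pids.add(current.pid)
    return pids


# cmd.exe (10) -> hermes.exe launcher shim (20) -> python child (30)
launcher = FakeProc(20, parent=FakeProc(10))

readable_child = FakeProc(30, parent=launcher)
print(ancestors_per_hop(readable_child) == {20, 10})  # True

# If the very first hop is unreadable, the launcher (20) is never collected,
# so it stays in the candidate set and is reported as a concurrent instance.
denied_child = FakeProc(30, parent=launcher, parent_readable=False)
print(ancestors_per_hop(denied_child) == set())  # True
```

Because the walk stops at the first exception, every ancestor above the unreadable hop is lost as well, not just the hop that failed.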

Changes

  • hermes_cli/main.py _detect_concurrent_hermes_instances: use proc.parents() (whole ancestor list in one call) and evaluate each ancestor independently, so one unreadable hop never strands the launcher. Only exclude ancestors whose exe is itself a shim — a genuine second hermes.exe under a non-Hermes parent (Desktop backend child) is still flagged.
  • _format_concurrent_instances_message: print a copy-pasteable taskkill /PID … /F for the exact stale PIDs, so a user who already closed everything can self-remediate before retrying.
  • Tests: rework the fake-psutil harness to model parents()/exe(); add a regression test for the one-bad-hop case; assert the taskkill remediation line.
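The changes above can be sketched roughly as follows. This is not the code from hermes_cli/main.py: `FakeProc`, `SHIM_NAMES`, `is_shim`, `shim_ancestor_pids`, `detect_concurrent`, and `taskkill_hint` are hypothetical names, matching a shim by executable basename is an assumption, and the stub mirrors only the `parents()` and `exe()` methods that psutil's `Process` provides. The exact wording of the real remediation message is also not shown in this PR, so the `taskkill` line is just one plausible shape.

```python
import ntpath

SHIM_NAMES = {"hermes.exe"}  # assumption: shims are recognised by basename


class AccessDenied(Exception):
    """Stand-in for psutil.AccessDenied so the sketch runs without psutil."""


class FakeProc:
    """Hypothetical stub modelling parents()/exe() like the test harness."""

    def __init__(self, pid, exe_path, parents=(), exe_readable=True):
        self.pid = pid
        self._exe_path = exe_path
        self._parents = list(parents)
        self._exe_readable = exe_readable

    def parents(self):
        return list(self._parents)

    def exe(self):
        if not self._exe_readable:
            raise AccessDenied(self.pid)
        return self._exe_path


def is_shim(proc):
    """True only if the exe is readable and its basename is a known shim."""
    try:
        exe = proc.exe()
    except Exception:
        return False  # an unreadable process is neither excluded nor flagged
    return ntpath.basename(exe).lower() in SHIM_NAMES


def shim_ancestor_pids(proc):
    """Fetch the whole ancestor list once, then judge each hop on its own."""
    try:
        ancestors = proc.parents()
    except Exception:
        return set()
    return {p.pid for p in ancestors if is_shim(p)}


def detect_concurrent(me, all_procs):
    """Return PIDs of shim processes that are not us or our own launcher."""
    excluded = shim_ancestor_pids(me) | {me.pid}
    return [p.pid for p in all_procs if p.pid not in excluded and is_shim(p)]


def taskkill_hint(pids):
    """Build a copy-pasteable taskkill command for the given PIDs."""
    return "taskkill " + " ".join(f"/PID {pid}" for pid in pids) + " /F"


cmd = FakeProc(10, r"C:\Windows\System32\cmd.exe")
launcher = FakeProc(20, r"C:\venv\Scripts\hermes.exe")
opaque = FakeProc(25, "", exe_readable=False)  # the one bad hop
me = FakeProc(30, r"C:\venv\Scripts\python.exe", parents=[opaque, launcher, cmd])
sibling = FakeProc(40, r"C:\other\Scripts\hermes.exe")  # genuine second instance

stale = detect_concurrent(me, [cmd, launcher, opaque, me, sibling])
print(stale)                 # [40]
print(taskkill_hint(stale))  # taskkill /PID 40 /F
```

In this sketch the unreadable ancestor (PID 25) no longer hides the launcher (PID 20), while the sibling under a different parent (PID 40) is still reported.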

Validation

| | Before | After |
|---|---|---|
| launcher ancestor (exe = shim) | sometimes flagged (walk bailed) | excluded |
| parents() / a hop raises AccessDenied | strands launcher → false positive | no crash, evaluated independently |
| genuine sibling hermes.exe | reported | still reported |
| stuck user | hunt PIDs in Task Manager | copy-paste taskkill /PID … /F |

tests/hermes_cli/test_update_concurrent_quarantine.py: 18/18 pass

Conservative shim-only ancestor approach credited to the parallel attempts in #29358 (@xxxigm) and #31808 (@jquesnelle). Closes #29341, #34795.

Infographic

[Image: concurrent-checkpoint-preflight-fix]

…ixes #34827)

In the concurrent tool-execution path, checkpoint preflight (write_file,
patch, destructive terminal) fired BEFORE plugin guardrail block_result
was computed. A blocked write_file could still dirty checkpoint state
(doc_modified_this_turn, _last_write_file_call_id, turn_counter).

Move checkpoint preflight to AFTER block_result computation, gated on
`if block_result is None:` — matching the invariant the sequential path
already enforces.
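
The reordering described in the commit message can be illustrated with a minimal sketch. It is not the run_agent.py code: `CheckpointState`, `checkpoint_preflight`, `prepare_call`, `demo_guardrail`, and `CHECKPOINT_TOOLS` are hypothetical, and the state fields only borrow the names mentioned above. The one detail taken from the source is the `if block_result is None:` gate placed after the guardrail result is computed.

```python
CHECKPOINT_TOOLS = {"write_file", "patch"}  # hypothetical subset of gated tools


class CheckpointState:
    """Hypothetical holder for the state the commit message says got dirtied."""

    def __init__(self):
        self.doc_modified_this_turn = False
        self.last_write_file_call_id = None


def checkpoint_preflight(state, call):
    """Record that a mutating tool call is about to run."""
    state.doc_modified_this_turn = True
    if call["name"] == "write_file":
        state.last_write_file_call_id = call["id"]


def demo_guardrail(call):
    """Hypothetical plugin guardrail: block writes under /etc/."""
    if call["args"].get("path", "").startswith("/etc/"):
        return {"blocked": True, "reason": "protected path"}
    return None


def prepare_call(state, call, guardrail):
    """Compute block_result first, then run preflight only if not blocked."""
    block_result = guardrail(call)
    if block_result is None:
        if call["name"] in CHECKPOINT_TOOLS:
            checkpoint_preflight(state, call)
    return block_result


state = CheckpointState()

blocked = {"id": "call_1", "name": "write_file", "args": {"path": "/etc/hosts"}}
print(prepare_call(state, blocked, demo_guardrail) is not None)  # True
print(state.doc_modified_this_turn)  # False: blocked call left state clean

allowed = {"id": "call_2", "name": "write_file", "args": {"path": "notes.md"}}
print(prepare_call(state, allowed, demo_guardrail))  # None
print(state.last_write_file_call_id)  # call_2
```

Running preflight before the guardrail, as the concurrent path reportedly did, would have set `doc_modified_this_turn` for `call_1` even though that write never happened.
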
@github-actions

Contributor

🔎 Lint report: hermes/hermes-04847aaf vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9498 on HEAD, 9498 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4925 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added the type/bug, P2, and comp/agent labels on May 30, 2026
@alt-glitch

Collaborator

Salvage of #34856 (by @beardthelion). Both fix #34827 — concurrent checkpoint preflight ordering. If this merges, #34856 should be closed as superseded.

@teknium1 merged commit 6baf001 into main on May 30, 2026
23 checks passed
@teknium1 deleted the hermes/hermes-04847aaf branch on May 30, 2026 at 09:38

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium, degraded but workaround exists
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

[Bug]: Windows: hermes update always reports "Another hermes.exe is running" - launcher shim PID not excluded from concurrent-instance detection

3 participants