bug: desktop auto-update kills its own backend via _kill_stale_dashboard_processes #37532

@jcrabapple

Bug Description

When launching hermes desktop, the in-app auto-update mechanism (applyUpdatesPosixInApp) runs hermes update, which calls _kill_stale_dashboard_processes(). That function scans the process table for anything matching hermes dashboard and sends SIGTERM — including the hermes dashboard process that the Electron app itself spawned as its backend.

This creates a boot→kill→reboot→crash loop where the desktop can never stay connected.

Steps to Reproduce

  1. Be at least one commit behind origin/main
  2. Run hermes desktop
  3. Desktop spawns hermes dashboard --no-open --tui --host 127.0.0.1 --port <random> as its backend
  4. Backend becomes ready, desktop connects successfully
  5. Desktop renderer detects pending updates and triggers hermes:updates:apply
  6. applyUpdatesPosixInApp runs hermes update --yes
  7. hermes update calls _kill_stale_dashboard_processes() which SIGTERMs the desktop's own backend
  8. Desktop detects backend died, restarts it
  9. Update rebuilds the desktop app, triggering another restart
  10. Final backend starts but then crashes: ⚡ Interrupted during API call
  11. All subsequent hermes:api IPC calls time out: Error: Timed out connecting to Hermes backend after 15000ms

Root Cause

_find_stale_dashboard_pids() in hermes_cli/main.py (line ~7131) only exempts the PID of the process it is running in (self_pid); every other matching process is treated as stale:

if any(p in command for p in patterns) and pid != self_pid:
    dashboard_pids.append(pid)

It does not account for the case where hermes update was invoked by the Electron desktop app, whose child hermes dashboard process should NOT be killed.
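
To make the failure mode concrete, here is a minimal, simplified model of that filter. It is a sketch, not the real implementation: the function signature, the single "hermes dashboard" pattern, the PIDs, the port numbers, and the process table are all hypothetical stand-ins. It only shows that a check limited to self_pid cannot tell the desktop's backend apart from a genuinely stale dashboard.

```python
def find_stale_dashboard_pids(processes, self_pid, patterns=("hermes dashboard",)):
    """Simplified model of the filter quoted above (hypothetical signature)."""
    dashboard_pids = []
    for pid, command in processes:
        # Same condition as the quoted snippet: pattern match, minus our own PID.
        if any(p in command for p in patterns) and pid != self_pid:
            dashboard_pids.append(pid)
    return dashboard_pids


# Hypothetical process table as (pid, command line) pairs.
processes = [
    (1000, "electron hermes-desktop"),
    # Backend spawned by the desktop; it should survive the update.
    (1001, "hermes dashboard --no-open --tui --host 127.0.0.1 --port 43123"),
    # The updater itself, started by the desktop's in-app update.
    (1002, "hermes update --yes"),
    # A standalone dashboard that really is stale after the update.
    (1003, "hermes dashboard --port 9000"),
]

stale = find_stale_dashboard_pids(processes, self_pid=1002)
print(stale)  # [1001, 1003]: the desktop's own backend is included
```

In this model the updater's own PID never matches the pattern anyway, so the self_pid exemption does nothing to protect the backend that the desktop spawned.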

Desktop Log Evidence

[hermes] [boot] Hermes backend is ready. Finalizing desktop startup
[hermes] [updates] update: Updating Hermes (git + dependencies)…
... (update runs) ...
[hermes] Hermes backend exited (SIGTERM)
[hermes] Hermes backend exited (SIGTERM)
[hermes] [updates] update: ⟲ Stopping 2 dashboard process(es) (the running backend no longer matches the updated frontend)
[hermes] [updates] update: ✓ stopped PID 717447
[hermes] [updates] update: ✓ stopped PID 717805
... (rebuild happens, desktop reboots) ...
[hermes] [boot] Hermes backend is ready. Finalizing desktop startup
[hermes] ⚡ Interrupted during API call.

After this, every IPC call from the renderer fails with:

Error occurred in handler for 'hermes:api': Error: Timed out connecting to Hermes backend after 15000ms

Suggested Fix

Any of:

  1. Exclusion PID list: _find_stale_dashboard_pids() accepts an optional exclude_pids parameter. The desktop passes its backend child PID.
  2. Env var marker: The desktop sets HERMES_DESKTOP_MANAGED=1 on its spawned backend process. _find_stale_dashboard_pids() skips processes with that env var.
  3. Desktop handles post-update restart: Instead of having hermes update kill dashboards, the desktop kills its own backend after the update completes and restarts it cleanly.
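
Options 1 and 2 could be combined in a single filtering step. The sketch below is one possible shape, under these assumptions: filter_stale_pids, parse_environ, read_proc_environ, and the read_environ parameter are hypothetical names, not existing Hermes functions, and the marker lookup relies on /proc/<pid>/environ, which is Linux-only and normally readable only for processes owned by the same user. Other platforms would need a different lookup.

```python
from pathlib import Path

MARKER = "HERMES_DESKTOP_MANAGED"  # env var name proposed in option 2


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:  # skip empty or malformed entries
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_proc_environ(pid: int) -> bytes:
    """Linux-only: raw environment of `pid`, or b"" if it cannot be read."""
    try:
        return Path(f"/proc/{pid}/environ").read_bytes()
    except OSError:
        return b""


def filter_stale_pids(candidates, self_pid, exclude_pids=(), read_environ=read_proc_environ):
    """Drop our own PID, explicitly excluded PIDs, and desktop-managed PIDs."""
    excluded = set(exclude_pids) | {self_pid}
    stale = []
    for pid in candidates:
        if pid in excluded:
            continue  # option 1: caller-supplied exclusion list
        if parse_environ(read_environ(pid)).get(MARKER) == "1":
            continue  # option 2: process was spawned by the desktop
        stale.append(pid)
    return stale


# Usage with a fake environment reader, so the example runs on any platform.
fake_env = {
    1001: b"PATH=/usr/bin\0HERMES_DESKTOP_MANAGED=1\0",  # desktop's backend
    1003: b"PATH=/usr/bin\0",                            # standalone dashboard
}
result = filter_stale_pids(
    [1001, 1003, 1004],
    self_pid=1002,
    exclude_pids=[1004],
    read_environ=lambda pid: fake_env.get(pid, b""),
)
print(result)  # [1003]
```

An unreadable environment is treated here as "not managed", so such a process would still be stopped. Whether that is the right default is a design choice for the maintainers.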

Environment

  • Hermes v0.15.1
  • Linux 7.0.10-201.fc44.x86_64 (Aurora/Fedora immutable)
  • Electron 40.9.3
  • Python 3.11.14

Workaround

Run hermes update separately before launching the desktop, so the auto-update has nothing to trigger on:

hermes update --yes
hermes desktop

Metadata

Labels

  • P2: Medium, degraded but workaround exists
  • comp/cli: CLI entry point, hermes_cli/, setup wizard
  • type/bug: Something isn't working
