Bug: slash_worker lifecycle gaps and system fragility observed during intensive Dashboard usage (#21370) #22855

@leether

Description

During high-frequency interaction tests with the hermes dashboard on macOS, we hit a failure state that exposes gaps in subprocess lifecycle management. #21370 identifies the leak itself; our experience suggests the leaked processes also make the system fragile, because they can still be running when a later operation such as an update modifies the environment.

Observations

  1. Accumulation: Intensive UI usage orphans dozens of slash_worker processes (parented to PID 1).
  2. Update Collision: A routine "hermes update" was run while orphans were present. The update hung and left the environment corrupted.
  3. Failure State: Subsequent commands failed with .../venv/bin/python3: No module named pip.
  4. Recovery Issue: Orphans survived atexit hooks and remained active, complicating manual recovery and environment sync.
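
To check for the accumulation described in observation 1, the orphans can be listed by filtering ps output for processes whose parent is PID 1. This is a minimal sketch, not part of the patch: the helper names are hypothetical, and matching on the substring "slash_worker" in the command line is an assumption about how the workers appear in ps.

```python
import subprocess
from typing import List, Tuple


def parse_ps_output(text: str, needle: str = "slash_worker") -> List[Tuple[int, str]]:
    """Return (pid, command) for rows parented to PID 1 whose command contains needle."""
    orphans = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # pid, ppid, rest of the command line
        if len(parts) < 3:
            continue
        pid_s, ppid_s, command = parts
        if not (pid_s.isdigit() and ppid_s.isdigit()):
            continue
        if int(ppid_s) == 1 and needle in command:
            orphans.append((int(pid_s), command))
    return orphans


def find_orphaned_workers(needle: str = "slash_worker") -> List[Tuple[int, str]]:
    """Run ps and report processes that look like orphaned workers."""
    # Separate -o flags are used because "name=" consumes the rest of its argument.
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "command="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout, needle)
```

Calling find_orphaned_workers() before running an update gives a quick count of how many workers have been re-parented to PID 1.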

Architectural Gaps

  • Unmanaged Resource: slash_worker is spawned outside tools.process_registry, making it invisible to global teardown logic.
  • Fragile Cleanup: Reliance on atexit is insufficient for update-induced restarts or hard crashes.
  • Lack of Self-Termination: The worker cannot detect when its parent has died.
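
The "Fragile Cleanup" point can be reproduced in isolation. The sketch below is a standalone demonstration rather than Hermes code: it starts a child interpreter that registers an atexit hook, then compares a normal exit with a hard kill. The marker file and function names are illustrative only.

```python
import os
import subprocess
import sys
import tempfile

# Child program: register an atexit hook that creates a marker file,
# announce readiness, then either exit normally or wait to be killed.
CHILD_SOURCE = """
import atexit, sys, time
marker, mode = sys.argv[1], sys.argv[2]
atexit.register(lambda: open(marker, "w").close())
print("ready", flush=True)
if mode == "wait":
    time.sleep(60)
"""


def atexit_hook_ran(hard_kill: bool) -> bool:
    """Return True if the child's atexit hook created the marker file."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "cleanup-ran")
        mode = "wait" if hard_kill else "exit"
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, marker, mode],
            stdout=subprocess.PIPE,
            text=True,
        )
        proc.stdout.readline()  # block until the hook has been registered
        if hard_kill:
            proc.kill()  # SIGKILL on POSIX: the interpreter gets no chance to clean up
        proc.wait(timeout=30)
        proc.stdout.close()
        return os.path.exists(marker)
```

The hook runs on a normal interpreter exit but not after a hard kill, which is why cleanup that lives only in atexit cannot cover crashes or forced restarts.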

Proposed Fix

I have verified a two-layer "Defense-in-Depth" fix on macOS, following AGENTS.md:

  1. Unified Management: Extended ProcessRegistry with register_host_process to track these workers.
  2. Fingerprinted Watchdog: Added a thread to slash_worker.py monitoring parent PID + create_time.
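
The two layers above could look roughly like the following. This is a hedged sketch under stated assumptions, not the patch itself: HostProcessRegistry is a hypothetical stand-in for the real ProcessRegistry in tools.process_registry (whose actual interface is not shown in this issue), terminate_all and the watchdog helpers are invented names, and the parent's start time is read with ps rather than whatever create_time source the patch uses.

```python
import os
import signal
import subprocess
import threading
import time
from typing import Callable, List, Optional, Set, Tuple

Fingerprint = Tuple[int, Optional[str]]  # (parent pid, parent start time)


def process_start_time(pid: int) -> Optional[str]:
    """Start time as printed by `ps -o lstart=`, or None if unavailable."""
    try:
        result = subprocess.run(
            ["ps", "-o", "lstart=", "-p", str(pid)],
            capture_output=True, text=True,
        )
    except OSError:
        return None
    return result.stdout.strip() or None


def parent_is_gone(
    expected: Fingerprint,
    current_ppid: Callable[[], int] = os.getppid,
    start_time_of: Callable[[int], Optional[str]] = process_start_time,
) -> bool:
    """True if we were re-parented or the parent PID now names a different process."""
    pid, started = expected
    if current_ppid() != pid:
        return True  # orphaned: the kernel re-parented this process
    return start_time_of(pid) != started  # same PID, different process (PID reuse)


def start_parent_watchdog(on_parent_death: Callable[[], None],
                          interval: float = 2.0) -> threading.Thread:
    """Worker side: poll the parent fingerprint and react once it changes."""
    ppid = os.getppid()
    expected = (ppid, process_start_time(ppid))

    def watch() -> None:
        while not parent_is_gone(expected):
            time.sleep(interval)
        on_parent_death()

    thread = threading.Thread(target=watch, name="parent-watchdog", daemon=True)
    thread.start()
    return thread


class HostProcessRegistry:
    """Hypothetical registry: tracks host PIDs so global teardown can see them."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pids: Set[int] = set()

    def register_host_process(self, pid: int) -> None:
        with self._lock:
            self._pids.add(pid)

    def terminate_all(self, sig: int = signal.SIGTERM) -> List[int]:
        """Signal every tracked PID; return the ones that still existed."""
        with self._lock:
            pids, self._pids = sorted(self._pids), set()
        signalled = []
        for pid in pids:
            try:
                os.kill(pid, sig)
                signalled.append(pid)
            except ProcessLookupError:
                pass  # already exited
        return signalled
```

In the worker, a call such as start_parent_watchdog(lambda: os._exit(0)) would make the process exit on its own even when the parent never reaches its teardown path. Polling adds up to one interval of delay, which seems an acceptable trade for not depending on platform-specific parent-death signals.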

I have the patch ready for PR. Please let me know if you would like me to submit it.

Metadata

Assignees

No one assigned

    Labels

    P2: Medium, degraded but workaround exists
    comp/gateway: Gateway runner, session dispatch, delivery
    comp/tui: Terminal UI (ui-tui/ + tui_gateway/)
    sweeper:implemented-on-main: behavior already present on current main
    type/bug: Something isn't working
