fix(tui/pty): reap leaked slash_worker subprocesses on PTY chat disconnect #42132

Merged: teknium1 merged 4 commits into main from hermes/hermes-04b2659b on Jun 8, 2026

teknium1 commented Jun 8, 2026

Summary

Dashboard /chat PTY disconnects no longer leak tui_gateway.slash_worker subprocesses. Two independent, cheap guards close the leak for both Linux and macOS.

Root cause: /api/pty spawns hermes --tui, which spawns its own tui_gateway, which in turn spawns _SlashWorker grandchildren. ptyprocess makes the PTY child a session leader (setsid()), but PtyBridge.close() signalled only that single leader PID, so the grandchild slash workers never received SIGHUP/TERM/KILL and were orphaned on every refresh or proxy drop. The in-process orphan reaper (#38591) covers only the dashboard's own gateway sessions, not this subprocess tree.
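For readers who want to see the process-group guard concretely, here is a minimal sketch of signalling the whole group rather than the leader alone. It is not the PR's code: `terminate_process_group`, `_group_alive`, `_reap`, and the `grace` parameter are hypothetical names, and the sketch assumes the leader called setsid() so that its PID doubles as the process-group ID.

```python
import os
import signal
import time


def _group_alive(pgid: int) -> bool:
    """Return True if at least one process still belongs to the group."""
    try:
        os.killpg(pgid, 0)  # signal 0 checks existence without delivering anything
        return True
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # something is there, we just may not signal it


def _reap(pid: int) -> None:
    """Collect the leader's exit status if it is our child, so it is not left a zombie."""
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        pass


def terminate_process_group(leader_pid: int, grace: float = 0.5) -> None:
    """Escalate SIGHUP -> SIGTERM -> SIGKILL across the leader's whole group."""
    try:
        pgid = os.getpgid(leader_pid)
    except ProcessLookupError:
        # Assumption: the leader called setsid(), so its PID is also the PGID
        # of any descendants that outlived it.
        pgid = leader_pid
    if pgid == os.getpgrp():
        raise RuntimeError("refusing to signal our own process group")
    for sig in (signal.SIGHUP, signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(pgid, sig)  # reaches grandchildren, unlike os.kill(leader_pid, sig)
        except ProcessLookupError:
            return  # group is already empty
        deadline = time.monotonic() + grace
        while time.monotonic() < deadline:
            _reap(leader_pid)
            if not _group_alive(pgid):
                return
            time.sleep(0.02)
```

Under these assumptions, a bridge's close path could call `terminate_process_group(child_pid)` where it previously signalled only `child_pid`. The guard against signalling the caller's own group is a defensive choice of this sketch, not something the PR description states.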

Changes

Validation

| Vector | Before | After |
|---|---|---|
| PTY process-group grandchild | survives close() | reaped (killpg); E2E confirmed |
| Orphaned worker (parent dies) | lingers forever | self-exits within grace; E2E confirmed |
  • Targeted: tests/hermes_cli/test_pty_bridge.py + tests/test_slash_worker_watchdog.py → 23 passed.
  • E2E: real PTY-spawned grandchild reaped by bridge.close(); real orphaned watchdog process self-terminated after parent exit.

Closes #32377. Salvaged from #24135 (@paulb26) and #35626 (@banditburai/firefly); supersedes the slash-worker-leak PR cluster.

[Infographic: slash-worker-leak-sealed]

paulb26 and others added 3 commits June 8, 2026 06:44
…tchdog

Daemon thread polls _is_orphaned (original ppid check + psutil create_time PID-reuse
guard, no PR_SET_PDEATHSIG). On orphan, drains an in-flight command up to a grace
window then os._exit(0). Started before the HermesCLI build to cover the spawn window.

Task: swl-qrf.8
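The watchdog described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in tui_gateway/slash_worker.py: `start_watchdog`, `parent_create_time`, the `busy` event, and the injectable `on_orphan` and `orphaned` hooks are hypothetical, and the psutil import is made optional here only so the sketch runs without it.

```python
import os
import threading
import time

try:
    import psutil  # third-party; used only for the PID-reuse guard
except ImportError:  # assumption: the sketch degrades to the ppid check alone
    psutil = None


def parent_create_time(pid):
    """Creation time of `pid`, or None if it cannot be determined."""
    if psutil is None:
        return None
    try:
        return psutil.Process(pid).create_time()
    except psutil.Error:
        return None


def is_orphaned(original_ppid, original_ctime,
                getppid=os.getppid, ctime_of=parent_create_time):
    # Once the parent dies, the worker is reparented and its ppid changes.
    if getppid() != original_ppid:
        return True
    # PID-reuse guard: same PID with a different creation time is a new process.
    current = ctime_of(original_ppid)
    if original_ctime is None or current is None:
        return False  # cannot verify; rely on the ppid check alone
    return current != original_ctime


def start_watchdog(busy, grace=5.0, poll=1.0,
                   on_orphan=lambda: os._exit(0), orphaned=None):
    """Start a daemon thread that exits the process once the parent is gone.

    `busy` is a threading.Event that the worker sets while a command is in flight.
    """
    original_ppid = os.getppid()
    original_ctime = parent_create_time(original_ppid)
    check = orphaned or (lambda: is_orphaned(original_ppid, original_ctime))

    def run():
        while not check():
            time.sleep(poll)
        # Give an in-flight command up to `grace` seconds to finish.
        deadline = time.monotonic() + grace
        while busy.is_set() and time.monotonic() < deadline:
            time.sleep(0.05)
        on_orphan()

    thread = threading.Thread(target=run, name="orphan-watchdog", daemon=True)
    thread.start()
    return thread
```

Starting the thread before any slow initialization, as the commit message says of the HermesCLI build, means a parent that dies during startup is still noticed. The hooks are injectable so the logic can be exercised without actually killing a parent process.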

github-actions Bot commented Jun 8, 2026

🔎 Lint report: hermes/hermes-04b2659b vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10399 on HEAD, 10397 on base (🆕 +2)

🆕 New issues (2):

Rule Count
unresolved-import 2
First entries
tui_gateway/slash_worker.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `psutil`
tests/test_slash_worker_watchdog.py:1: [unresolved-import] unresolved-import: Cannot resolve imported module `psutil`

✅ Fixed issues: none

Unchanged: 5430 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Development

Successfully merging this pull request may close these issues.

Dashboard hangs: tui_gateway.slash_worker subprocesses leak on PTY chat disconnect (524 via reverse proxy)