Dashboard: leaked tui_gateway.slash_worker processes accumulate, exhaust file descriptors on macOS launchd #24775

## Summary

The dashboard (`hermes dashboard --tui`) leaks `tui_gateway.slash_worker` subprocesses when chat sessions end. Over hours or days these accumulate until the launchd-default file-descriptor cap (256 on macOS) is exhausted. From then on, every new chat session spawn fails with `OSError: [Errno 24] Too many open files`, raised inside `subprocess.Popen` (reached via `subprocess.run`) in `hermes_cli/main.py:_make_tui_argv`, and the user sees `[session ended]` immediately upon opening `/chat`.

## Environment

- Hermes v0.13.0
- macOS Darwin 24.6.0 (arm64)
- Python 3.11.15
- Launched via launchd user agent (`ai.hermes.dashboard.plist` installed by `install-hermes` / setup wizard)
- `launchctl limit maxfiles` = `256 unlimited` (the macOS default for launchd-spawned user agents — note this is much lower than the interactive-shell `ulimit -n` of 1048576)
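
Because the launchd limit differs from the interactive shell's, it helps to read the limit from inside the affected process rather than from a terminal. A minimal sketch using the standard-library `resource` module; the helper names `nofile_limits` and `describe` are illustrative and not part of Hermes:

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"maxfiles soft={describe(soft)} hard={describe(hard)}")
```

Run from the same launchd context as the dashboard, this should report the soft limit the process actually inherits, which is the number that matters for `Errno 24`.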

## Reproduction

1. Install Hermes on macOS with the bundled launchd setup, no explicit `SoftResourceLimits.NumberOfFiles` in the dashboard plist.
2. Open `localhost:9119/chat` in a browser, exchange a few turns, close the tab.
3. Repeat over several hours / days (or many short sessions in succession).
4. Observe: `tui_gateway.slash_worker` processes persist after the parent session/tab ends — `ps -eo pid,etime,command | grep tui_gateway` shows accumulating PIDs.
5. Once the per-process file-descriptor budget is exhausted, every new `/chat` open shows `[session ended]` instantly, and the dashboard error log shows the OSError below.
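
The check in step 4 can be scripted so the orphan count is easy to track over time. This is a rough sketch that parses the same `ps -eo pid,etime,command` output; `parse_ps` and `list_workers` are hypothetical helper names, and the substring match is as loose as the `grep` it replaces:

```python
import subprocess


def parse_ps(output, needle="tui_gateway.slash_worker"):
    """Return (pid, etime, command) tuples for rows whose command contains needle."""
    matches = []
    for line in output.splitlines():
        parts = line.split(None, 2)  # pid, etime, then the full command line
        # The header row is skipped because its first column is not numeric.
        if len(parts) == 3 and parts[0].isdigit() and needle in parts[2]:
            matches.append((int(parts[0]), parts[1], parts[2]))
    return matches


def list_workers(needle="tui_gateway.slash_worker"):
    """Run ps and return the matching worker rows."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etime,command"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, needle)


if __name__ == "__main__":
    workers = list_workers()
    print(f"{len(workers)} matching processes")
    for pid, etime, command in workers:
        print(pid, etime, command)
```

A count that keeps growing while no chat tabs are open is the signature described above.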

## Evidence

In my install I just observed **75+ orphan `tui_gateway.slash_worker` processes**, the oldest with an `etime` over 27 hours, alongside the corresponding `subprocess.Popen` → `os.pipe()` → `OSError: [Errno 24]` failures in `dashboard-launchd.err`. Killing the orphans and restarting the dashboard restored `/chat` immediately. Kanban DB writes were also failing (`hermes_dashboard_plugin_kanban: unable to open database file`); that is the same FD-exhaustion symptom, not database corruption.

Stack trace from `dashboard-launchd.err`:

```
File ".../hermes_cli/web_server.py", line 3146, in pty_ws
  argv, cwd, env = _resolve_chat_argv(resume=resume, sidecar_url=sidecar_url)
File ".../hermes_cli/web_server.py", line 3051, in _resolve_chat_argv
  argv, cwd = _make_tui_argv(PROJECT_ROOT / "ui-tui", tui_dev=False)
File ".../hermes_cli/main.py", line 1092, in _make_tui_argv
  result = subprocess.run(...)
File ".../subprocess.py", line 1715, in _get_handles
  c2pread, c2pwrite = os.pipe()
OSError: [Errno 24] Too many open files
```

## Suggested fix

Two-part fix, both needed:

1. **Reap `tui_gateway.slash_worker` (and `tui_gateway.entry`) processes when their session ends.** The dashboard's WebSocket-close / session-end handler should track the spawned worker PID, send SIGTERM, wait briefly, then SIGKILL if it does not exit. Right now they are leaking silently. This is the root cause.
2. **Set `SoftResourceLimits.NumberOfFiles` in the bundled `ai.hermes.dashboard.plist` template** (and gateway plists, for parity) so freshly-installed dashboards have headroom even if any residual leak path remains. Suggested: 8192 soft / 16384 hard. This is a defense-in-depth measure.
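
The SIGTERM, wait, SIGKILL sequence from item 1 could look roughly like the sketch below. It assumes the session handler holds a `subprocess.Popen` handle for the worker; the dashboard may track workers differently (for example via a PTY or asyncio subprocess), and `reap` is a hypothetical name, not an existing Hermes function:

```python
import subprocess
import sys


def reap(proc, grace_seconds=2.0):
    """Stop proc with SIGTERM, escalate to SIGKILL after grace_seconds.

    Returns the child's exit code once it has been waited on.
    """
    if proc.poll() is None:
        proc.terminate()  # SIGTERM on POSIX
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX
            proc.wait()  # collect the exit status so no zombie remains
    # Close the parent-side pipe ends so their descriptors are released too.
    for stream in (proc.stdin, proc.stdout, proc.stderr):
        if stream is not None:
            try:
                stream.close()
            except OSError:
                pass  # the child is already gone; nothing left to flush
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a worker: a child that would otherwise outlive its session.
    worker = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        stdout=subprocess.PIPE,
    )
    print("exit code:", reap(worker))
```

Closing the pipe ends matters as much as killing the child here, since the descriptors that run out belong to the dashboard process itself.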

## Workaround (for users hitting this now)

```bash
pkill -KILL -f "tui_gateway\.slash_worker"
pkill -KILL -f "tui_gateway\.entry"
launchctl kickstart -k "gui/$(id -u)/ai.hermes.dashboard"
```

Optionally add `SoftResourceLimits.NumberOfFiles` to `~/Library/LaunchAgents/ai.hermes.dashboard.plist`, then reload the agent via `launchctl bootout` followed by `launchctl bootstrap` (or the legacy `launchctl load -w`).
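
For anyone who prefers not to hand-edit the XML, the plist change can be sketched with the standard-library `plistlib`. The 8192/16384 values are the ones suggested above, `with_nofile_limits` is a hypothetical helper, and the `HardResourceLimits` key is assumed to be the launchd counterpart for the hard value. Back up the plist before overwriting it:

```python
import plistlib
from pathlib import Path


def with_nofile_limits(plist_bytes, soft=8192, hard=16384):
    """Return plist bytes with NumberOfFiles set under Soft/HardResourceLimits."""
    plist = plistlib.loads(plist_bytes)
    # setdefault keeps any other resource limits already present in the file.
    plist.setdefault("SoftResourceLimits", {})["NumberOfFiles"] = soft
    plist.setdefault("HardResourceLimits", {})["NumberOfFiles"] = hard
    return plistlib.dumps(plist)


if __name__ == "__main__":
    path = Path.home() / "Library/LaunchAgents/ai.hermes.dashboard.plist"
    if path.exists():
        # Printed rather than written in place, so nothing changes by accident.
        print(with_nofile_limits(path.read_bytes()).decode())
```

The new limits only take effect after the agent is reloaded.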