Summary
A single TUI session appears to eagerly spawn two separate hermes mcp serve subprocesses during normal operation:
- one under tui_gateway.entry
- one under tui_gateway.slash_worker
This is distinct from the already-known orphan-cleanup problem. In this case, the duplicate children are live and parented, not stale zombies.
The result is unnecessary subprocess fan-out, extra MCP sessions, and avoidable resource growth. It is also a likely contributor to downstream contention issues such as intermittent SQLite WAL write pressure (fact_store lock symptoms).
Why this looks like a bug
This does not appear to be an intentional isolation boundary. The observed behavior is that one logical TUI session creates two Hermes MCP server children because both startup paths reach MCP discovery / tool bootstrap.
That is incorrect lifecycle behavior, not merely a performance enhancement request.
Environment
- Repo: NousResearch/hermes-agent
- Host: Linux VPS
- Hermes TUI/gateway in active use
- Config includes Hermes itself as an MCP server via ~/.hermes/config.yaml
Relevant config shape:
mcp_servers:
  hermes:
    command: hermes
    args: [mcp, serve]
Evidence
Code-path inspection showed:
- model_tools.py calls discover_mcp_tools() at import time
- tui_gateway/server.py creates one persistent slash worker per TUI session
- tui_gateway/slash_worker.py creates one HermesCLI per TUI session
- both tui_gateway.entry and tui_gateway.slash_worker therefore appear able to trigger MCP discovery / stdio server startup
Live process mapping showed 8 active hermes mcp serve children at one point:
- 3 under python3 -m tui_gateway.slash_worker
- 3 under python3 -m tui_gateway.entry
- 2 under direct hermes / hermes --resume sessions
Representative mapping from ps:
PID=3643062 PPID=3643047 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3643966 PPID=3643953 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3656716 PPID=3656702 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3681559 PPID=3642998 child of python3 -m tui_gateway.entry
PID=3681598 PPID=3643872 child of python3 -m tui_gateway.entry
PID=3681628 PPID=3656657 child of python3 -m tui_gateway.entry
PID=3674837 PPID=3674807 child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes
PID=3677397 PPID=3677390 child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes --resume ...
The important part is not the absolute count, but the topology: each inspected TUI session effectively had two Hermes MCP children, one from entry and one from slash worker.
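To make the topology check repeatable, the mapping above can be tallied per parent kind. The following is a minimal sketch, assuming the input uses the same "PID=... PPID=... child of ..." line shape as the representative mapping; the helper names (classify_parent, summarize) are illustrative and not part of Hermes.

```python
import re
from collections import Counter

# Matches lines shaped like the representative mapping:
# "PID=<pid> PPID=<ppid> child of <parent command line>"
LINE_RE = re.compile(r"PID=(\d+)\s+PPID=(\d+)\s+child of\s+(.*)")


def classify_parent(parent_cmd: str) -> str:
    """Bucket a parent command line by the startup path it represents."""
    if "tui_gateway.slash_worker" in parent_cmd:
        return "slash_worker"
    if "tui_gateway.entry" in parent_cmd:
        return "entry"
    return "direct"


def summarize(mapping_text: str) -> Counter:
    """Count hermes mcp serve children per parent category."""
    counts = Counter()
    for line in mapping_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[classify_parent(match.group(3))] += 1
    return counts


# The representative mapping from this report, copied verbatim.
SAMPLE = """\
PID=3643062 PPID=3643047 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3643966 PPID=3643953 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3656716 PPID=3656702 child of python3 -m tui_gateway.slash_worker --session-key ...
PID=3681559 PPID=3642998 child of python3 -m tui_gateway.entry
PID=3681598 PPID=3643872 child of python3 -m tui_gateway.entry
PID=3681628 PPID=3656657 child of python3 -m tui_gateway.entry
PID=3674837 PPID=3674807 child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes
PID=3677397 PPID=3677390 child of direct /usr/bin/python3 /home/openclaw/.local/bin/hermes --resume ...
"""

if __name__ == "__main__":
    print(dict(summarize(SAMPLE)))
```

A matching count under entry and slash_worker is consistent with the one-child-per-startup-path pattern described here, though it does not by itself prove which entry and slash_worker processes belong to the same session.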
Existing mitigation is insufficient
There is already a restart-scoped cleanup mitigation:
ExecStartPre=/bin/bash -c 'pkill -f "hermes.*mcp" || true'
That helps clean up before a gateway restart, but it does not prevent normal runtime duplication during active sessions.
Expected behavior
For a normal TUI session, Hermes should either:
- create one shared MCP stdio child for the session, or
- explicitly avoid MCP discovery in one of the two startup paths unless needed
A single logical session should not eagerly double-spawn Hermes MCP subprocesses by default.
Actual behavior
Both tui_gateway.entry and tui_gateway.slash_worker appear to reach MCP bootstrap, causing duplicate hermes mcp serve children during ordinary session startup.
Impact
- unnecessary subprocess duplication
- extra MCP sessions and pipe handles
- avoidable memory / process growth over time
- likely contributor to transient lock/contention symptoms in other subsystems
- operational confusion, because restart cleanup can hide the symptom without fixing the source
Suspected root cause
The likely root cause is the combination of:
- Hermes being configured as an MCP server in config.yaml
- eager discover_mcp_tools() in model_tools.py
- slash_worker creating its own HermesCLI
- both the main TUI path and slash-worker path performing tool bootstrap independently
Proposed direction
Near-term safe fix:
- prevent slash_worker from eagerly triggering MCP discovery unless it actually needs MCP-backed tools
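One possible shape for the near-term fix is an opt-out gate around discovery that the slash worker's launcher could set. This is only a sketch under stated assumptions: the environment variable HERMES_SKIP_MCP_DISCOVERY and the bootstrap_tools wrapper are hypothetical names invented for illustration, and the discover callable stands in for discover_mcp_tools(), whose real signature is not shown in this report.

```python
import os
from typing import Callable, List, Mapping, Optional

# Hypothetical flag name; not an existing Hermes setting.
SKIP_ENV = "HERMES_SKIP_MCP_DISCOVERY"


def should_discover_mcp(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return False when the current process has opted out of MCP discovery."""
    env = os.environ if environ is None else environ
    return env.get(SKIP_ENV, "").strip().lower() not in {"1", "true", "yes"}


def bootstrap_tools(
    discover: Callable[[], List[str]],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Run discovery only if this process has not opted out.

    `discover` is a stand-in for the real discovery call; it is invoked
    lazily here instead of as an import-time side effect.
    """
    if not should_discover_mcp(environ):
        return []
    return discover()
```

With a gate like this, the process that spawns the slash worker could set the flag in the child's environment, so only the entry path starts the hermes mcp serve child. Whether the slash worker ever needs MCP-backed tools would have to be confirmed before adopting this.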
More durable architectural fix:
- make MCP server lifecycle shared / singleton per relevant scope, instead of per bootstrap path
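The shared-lifecycle idea can be illustrated with a small registry that starts at most one handle per (scope, server name) key. This is a minimal in-process sketch: ScopedRegistry and its methods are hypothetical names, and the start callable stands in for whatever actually spawns the stdio child. Because tui_gateway.entry and tui_gateway.slash_worker run as separate processes, a real fix would additionally need one process to own the child and the other to reuse it, which this sketch does not address.

```python
import threading
from typing import Callable, Dict, Generic, List, Tuple, TypeVar

T = TypeVar("T")


class ScopedRegistry(Generic[T]):
    """Hands out one shared handle per (scope, server name) key."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handles: Dict[Tuple[str, str], T] = {}

    def get_or_start(self, scope: str, server: str, start: Callable[[], T]) -> T:
        """Return the existing handle for this key, starting it only once."""
        key = (scope, server)
        # The lock is held across start() so concurrent callers cannot
        # both observe a missing key and spawn duplicate children.
        with self._lock:
            if key not in self._handles:
                self._handles[key] = start()
            return self._handles[key]

    def release_scope(self, scope: str) -> List[T]:
        """Remove and return all handles for a scope so the caller can stop them."""
        with self._lock:
            keys = [k for k in self._handles if k[0] == scope]
            return [self._handles.pop(k) for k in keys]
```

Keying on the session (or another agreed scope) rather than on the bootstrap path is the point of the design: two bootstrap paths asking for the same server in the same scope would receive the same handle.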
Related issues
This seems related to, but distinct from:
This issue is specifically about duplicate creation during normal TUI session startup, not just orphan reaping or performance tuning.