[Bug]: OpenCode accumulates one context-mode MCP child per session — never reaped (26 idle children, 1.6 GB RSS in repro) #565

@omercnet

Summary

A single long-lived opencode serve host accumulates one context-mode MCP child process per session/subagent it opens. Children are never reaped while the host stays alive. On this machine, a single opencode host (uptime 1d 18h) has 26 idle context-mode children, all parented to the same opencode PID, totalling ~1.6 GB RSS.

This is a new class of leak, distinct from #103 (orphan-on-host-death), #311 (npm-exec wrapper reparent), #534 (nested pi --help), #559 (upgrade-time leftover), and #562 (Pi subagent DB lock). None of those code paths fires here — the parent is alive, the chain is direct (no npm-exec wrapper), and sibling discovery is hard-coded to Claude plugin paths.

Observed (live snapshot)

26 context-mode children, single opencode parent (PID 23986), ~1.6 GB total RSS
ps --ppid 23986 -o pid,ppid,etime,pcpu,rss,command
    PID    PPID     ELAPSED %CPU   RSS COMMAND
  25257   23986  1-18:51:26  0.1 64752 node /home/USER/.npm-global/bin/context-mode
  25273   23986  1-18:51:26  0.1 62708 node /home/USER/.npm-global/bin/context-mode
  25301   23986  1-18:51:26  0.1 62768 node /home/USER/.npm-global/bin/context-mode
  25412   23986  1-18:51:25  0.1 63248 node /home/USER/.npm-global/bin/context-mode
  25477   23986  1-18:51:25  0.1 61112 node /home/USER/.npm-global/bin/context-mode
  27197   23986  1-18:51:00  0.1 61492 node /home/USER/.npm-global/bin/context-mode
  27686   23986  1-18:50:52  0.1 63648 node /home/USER/.npm-global/bin/context-mode
  28601   23986  1-18:50:34  0.1 62584 node /home/USER/.npm-global/bin/context-mode
  28810   23986  1-18:50:33  0.1 61740 node /home/USER/.npm-global/bin/context-mode
  28993   23986  1-18:50:32  0.1 63092 node /home/USER/.npm-global/bin/context-mode
  29006   23986  1-18:50:32  0.1 62152 node /home/USER/.npm-global/bin/context-mode
  29142   23986  1-18:50:31  0.1 61628 node /home/USER/.npm-global/bin/context-mode
  29557   23986  1-18:50:23  0.1 65152 node /home/USER/.npm-global/bin/context-mode
  29911   23986  1-18:50:18  0.1 62728 node /home/USER/.npm-global/bin/context-mode
  30065   23986  1-18:50:18  0.1 61384 node /home/USER/.npm-global/bin/context-mode
  30556   23986  1-18:50:12  0.1 61936 node /home/USER/.npm-global/bin/context-mode
  30765   23986  1-18:50:09  0.1 63000 node /home/USER/.npm-global/bin/context-mode
  30810   23986  1-18:50:09  0.1 62480 node /home/USER/.npm-global/bin/context-mode
  30933   23986  1-18:50:08  0.1 60728 node /home/USER/.npm-global/bin/context-mode
  31378   23986  1-18:50:01  0.1 62108 node /home/USER/.npm-global/bin/context-mode
  31593   23986  1-18:49:59  0.1 62528 node /home/USER/.npm-global/bin/context-mode
  31785   23986  1-18:49:56  0.1 60280 node /home/USER/.npm-global/bin/context-mode
  32149   23986  1-18:49:52  0.1 61992 node /home/USER/.npm-global/bin/context-mode
  41686   23986  1-18:39:16  0.1 65956 node /home/USER/.npm-global/bin/context-mode
2217937   23986    01:30:25  0.2 83616 node /home/USER/.npm-global/bin/context-mode
2236947   23986       21:08  0.2 94200 node /home/USER/.npm-global/bin/context-mode
  • All 26 PPID = 23986 (the single .opencode serve process)
  • All idle (0.1–0.2% CPU)
  • Oldest alive 1d 18h, matching the opencode parent's uptime
  • Total RSS ≈ 1.6 GB
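
For anyone who wants to check the child count and RSS total on their own host, here is a minimal sketch that sums the RSS column of the `ps` output format used above. `summarizePs` and `PsSummary` are hypothetical names for illustration and not part of context-mode; the sample rows are copied from the listing above. `ps` reports RSS in KiB.

```typescript
// Hypothetical helper (not part of context-mode): summarizes
// `ps -o pid,ppid,etime,pcpu,rss,command` output for matching commands.
interface PsSummary {
  count: number;
  totalRssKiB: number;
}

function summarizePs(output: string, needle: string): PsSummary {
  let count = 0;
  let totalRssKiB = 0;
  for (const line of output.split("\n")) {
    const cols = line.trim().split(/\s+/);
    // Columns: PID PPID ELAPSED %CPU RSS COMMAND...
    // Skip the header row, blank lines, and anything too short to parse.
    if (cols.length < 6 || !/^\d+$/.test(cols[0] ?? "")) continue;
    const command = cols.slice(5).join(" ");
    if (!command.includes(needle)) continue;
    count += 1;
    totalRssKiB += Number(cols[4]);
  }
  return { count, totalRssKiB };
}

// Three rows taken verbatim from the snapshot above.
const sample = [
  "    PID    PPID     ELAPSED %CPU   RSS COMMAND",
  "  25257   23986  1-18:51:26  0.1 64752 node /home/USER/.npm-global/bin/context-mode",
  "  25273   23986  1-18:51:26  0.1 62708 node /home/USER/.npm-global/bin/context-mode",
  "2217937   23986    01:30:25  0.2 83616 node /home/USER/.npm-global/bin/context-mode",
].join("\n");

const summary = summarizePs(sample, "context-mode");
console.log(`${summary.count} children, ${(summary.totalRssKiB / 1024).toFixed(1)} MiB RSS`);
```

Fed the full 26-row listing, the same function should land near the ≈ 1.6 GB total quoted above.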

Expected

Idle MCP children should not accumulate for the lifetime of a long-running host. One of:

  • the server self-shuts after a configurable idle period, or
  • a startup-time sibling reaper collapses duplicate children with the same parent, or
  • the host-side integration shares a single MCP client across sessions.

Root cause analysis

Why none of the existing lifecycle paths fire on opencode

OpenCode integrates context-mode through two surfaces simultaneously (per shipped configs/opencode/opencode.json):

{
  "mcp":    { "context-mode": { "type": "local", "command": ["context-mode"] } },
  "plugin": ["context-mode"]
}
| Surface | Mechanism | Runs where |
|---------|-----------|------------|
| mcp | stdio MCP server | Separate child process per MCP client |
| plugin | TS plugin (build/adapters/opencode/plugin.js) | In-process inside opencode |

OpenCode opens a fresh MCP client per session / subagent task. context-mode's only shutdown paths in src/lifecycle.ts:

src/util/sibling-mcp.ts (#559) also misses this. Its POSIX discovery regex is hard-coded to Claude plugin paths:

node.*plugins/(cache|marketplace

OpenCode's child argv is node /home/USER/.npm-global/bin/context-mode. Discovery never matches it, so running /ctx-upgrade from inside opencode would not reap these children either. The sibling killer also fires only on upgrade, not at MCP startup.

Why this is not a duplicate of #103, #311, #366, #471, #534, #559, or #562

No existing issue covers the "host-alive, MCP children accumulate per session, indefinitely" pattern.

Proposed course of action

COA-1 — Idle self-shutdown timer in src/lifecycle.ts (recommended first)
  • Track timestamp of last JSON-RPC request handled.
  • If no request for CONTEXT_MODE_IDLE_TIMEOUT_MS (default e.g. 15 min), call gracefulShutdown().
  • Skip when process.stdin.isTTY is true (interactive dev) and when CONTEXT_MODE_IDLE_TIMEOUT_MS is set to 0.
  • Cost on reconnect: next tool call spawns a fresh child (~1–3 s). Acceptable; matches opencode's existing per-session spawn behaviour.
  • Zero-config win for every host (opencode, KiloCode, future).
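
A minimal sketch of the idle-timer logic described in the bullets above, kept free of Node globals so it can be unit-tested with an injected clock. `resolveIdleTimeout`, `IdleTracker`, and `DEFAULT_IDLE_TIMEOUT_MS` are hypothetical names, not existing code in src/lifecycle.ts. Falling back to the default on malformed values is my assumption, not something decided here.

```typescript
// Sketch only: all names here are hypothetical, not context-mode's actual API.
const DEFAULT_IDLE_TIMEOUT_MS = 15 * 60 * 1000; // the "e.g. 15 min" default proposed above

// Returns the idle timeout in ms, or null when the timer should be disabled.
function resolveIdleTimeout(envValue: string | undefined, stdinIsTTY: boolean): number | null {
  if (stdinIsTTY) return null; // interactive dev: never self-shutdown
  if (envValue === undefined || envValue.trim() === "") return DEFAULT_IDLE_TIMEOUT_MS;
  const parsed = Number(envValue);
  // Assumption: malformed or negative values fall back to the default.
  if (!Number.isFinite(parsed) || parsed < 0) return DEFAULT_IDLE_TIMEOUT_MS;
  return parsed === 0 ? null : parsed; // 0 explicitly disables the timer
}

// Tracks the timestamp of the last JSON-RPC request handled.
class IdleTracker {
  private readonly timeoutMs: number;
  private lastRequestAt: number;

  constructor(timeoutMs: number, now: number) {
    this.timeoutMs = timeoutMs;
    this.lastRequestAt = now;
  }

  // Call once per handled request.
  touch(now: number): void {
    this.lastRequestAt = now;
  }

  // True once no request has been seen for at least timeoutMs.
  isExpired(now: number): boolean {
    return now - this.lastRequestAt >= this.timeoutMs;
  }
}

// Simulated timeline (ms) with a 1 s timeout.
const tracker = new IdleTracker(1000, 0);
tracker.touch(500);
console.log(tracker.isExpired(1400), tracker.isExpired(1500));
```

In the server this would presumably be wired up by calling `touch(Date.now())` from the request handler and checking `isExpired(Date.now())` on an unref'd interval that calls gracefulShutdown() when it returns true.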
COA-2 — Generalize sibling-mcp.ts discovery + run at server startup

Today's POSIX regex misses npm-global/bin/context-mode and bun ... server.bundle.mjs shapes. Generalize to also match:

node.*(context-mode/start\.mjs|context-mode/server\.bundle\.mjs|bin/context-mode)
bun.*context-mode/server\.bundle\.mjs

Then:

  1. Run discovery at MCP server startup (not only on /ctx-upgrade).
  2. Kill siblings where ppid === own ppid AND argv matches AND pid !== own pid AND age > N s with no recent activity.
  3. Add a context-mode prune CLI subcommand for manual cleanup.

Composes with COA-1: idle timer handles long-running accumulation, startup sweep handles the case where a session reconnects to a host that already has 25 stale siblings.
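
A rough sketch of the selection predicate from step 2, using the two generalized patterns proposed above. `ProcInfo` and `selectStaleSiblings` are hypothetical names. How sibling-mcp.ts actually enumerates processes is not shown here, and the "no recent activity" condition is left out because this issue does not say how activity would be observed from outside the process.

```typescript
// Sketch only: hypothetical types and names, not the real sibling-mcp.ts API.
interface ProcInfo {
  pid: number;
  ppid: number;
  argv: string; // full command line as reported by ps
  ageSeconds: number;
}

// The two generalized patterns from the proposal above.
const SIBLING_PATTERNS: RegExp[] = [
  /node.*(context-mode\/start\.mjs|context-mode\/server\.bundle\.mjs|bin\/context-mode)/,
  /bun.*context-mode\/server\.bundle\.mjs/,
];

// Returns PIDs that share our parent, match the argv patterns, are not us,
// and are older than minAgeSeconds.
function selectStaleSiblings(
  procs: ProcInfo[],
  self: { pid: number; ppid: number },
  minAgeSeconds: number,
): number[] {
  return procs
    .filter(
      (p) =>
        p.ppid === self.ppid &&
        p.pid !== self.pid &&
        p.ageSeconds > minAgeSeconds &&
        SIBLING_PATTERNS.some((re) => re.test(p.argv)),
    )
    .map((p) => p.pid);
}

const CTX = "node /home/USER/.npm-global/bin/context-mode";
const procs: ProcInfo[] = [
  { pid: 25257, ppid: 23986, argv: CTX, ageSeconds: 154286 }, // from the snapshot (1-18:51:26)
  { pid: 2217937, ppid: 23986, argv: CTX, ageSeconds: 5425 }, // from the snapshot (01:30:25)
  { pid: 2236947, ppid: 23986, argv: CTX, ageSeconds: 1268 }, // "self" in this example
  { pid: 4242, ppid: 1, argv: CTX, ageSeconds: 9999 }, // made-up row: different parent
  { pid: 4343, ppid: 23986, argv: "node /usr/bin/other-tool", ageSeconds: 9999 }, // made-up row: argv mismatch
];

console.log(selectStaleSiblings(procs, { pid: 2236947, ppid: 23986 }, 60));
```

Restricting the match to the same ppid keeps the sweep from touching context-mode children that belong to a different host process.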

COA-5 — North star: port ctx_* tools into the TS plugin, drop the mcp block on opencode

OpenClaw already does this via src/adapters/openclaw/mcp-tools.ts (mcp-bridge pattern). The TS plugin runs in-process and could expose tool registration the same way, eliminating the stdio child entirely on opencode/KiloCode. Largest engineering investment but the cleanest structural fix for any ts-plugin paradigm host.

Immediate user mitigation (no code change)

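# Reap idle context-mode MCP children of the running opencode serve
opencode_pid=$(pgrep -fx '.*\.opencode serve.*' | head -1)
# Fall back to a looser match ("pgrep ... | head -1 || ..." never falls back: head exits 0)
[ -n "$opencode_pid" ] || opencode_pid=$(pgrep -f 'opencode.*serve' | head -1)
[ -n "$opencode_pid" ] && pgrep -P "$opencode_pid" -f 'context-mode' | xargs -r kill -TERM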

Suggested fix order

Ship COA-1 + COA-2 as a v1.0.x bugfix (both are small and mechanical, and they compose). Track COA-5 as the structural follow-up.


Template fields

Platform: OpenCode

context-mode version: 1.0.131 (latest)

Exact prompt that triggered the bug: N/A — bug is a process-lifecycle accumulation observed over normal multi-session use of opencode over ~2 days, not triggered by any specific prompt. Reproduction is "run opencode for hours, open multiple sessions, observe ps --ppid <opencode-pid>".

Steps to reproduce:

  1. Install: npm install -g context-mode@1.0.131 (or any version ≥ 1.0.19).
  2. Configure opencode with the shipped configs/opencode/opencode.json (mcp.context-mode.type=local + plugin: ["context-mode"]).
  3. Run opencode serve (or opencode interactively) and use it normally across multiple sessions and subagent tasks over a working day.
  4. Inspect children of the opencode PID:
    opencode_pid=$(pgrep -f '\.opencode serve' | head -1)
    ps --ppid "$opencode_pid" -o pid,ppid,etime,pcpu,rss,command | grep context-mode
  5. Observe N children (one per session/subagent), all idle, all PPID = opencode, RSS growing linearly with session count.

Full error output: None. There is no error — the symptom is RAM/CPU pressure from accumulating idle MCP children. No crash, no -32000, no log entry.

What I tried before filing:

Pre-submission checklist: All four items satisfied (latest version, searched existing issues, repro steps provided, debug script run).

Operating System: Linux (Ubuntu/Debian-class, kernel 6.x)

JS Runtime: node v24.14.1 (mise-managed); bun 1.3.10 available for the opencode host itself.

Debug script output (scripts/ctx-debug.sh)
context-mode diagnostic v2.0.0
─────────────────────────────
25 passed, 4 failed, 1 warnings

Failed (local dev tree only — not relevant to this bug):
  ✗ build/ directory exists
  ✗ better-sqlite3 .node binary exists
  ✗ require('better-sqlite3') succeeds
  ✗ FTS5 in-memory test

The four failures are because the script was run from the development repo before npm run build. The installed npm-global v1.0.131 used by opencode is fully functional (it's actively running 26 healthy MCP processes that respond to JSON-RPC); these failures only reflect the absence of built artifacts in the local repo and are unrelated to the lifecycle leak described above.

Key fields from the JSON output:

  • Installed version: 1.0.131 (latest)
  • OS: Linux
  • Bash: 5.2.21(1)-release
  • Node.js: v24.14.1 at /home/USER/.local/share/mise/installs/node/24.14.1/bin/node
  • Bun: 1.3.10 at /home/USER/.bun/bin/bun
  • better-sqlite3 ABI: 137
  • Environment: NODE_OPTIONS=unset, CONTEXT_MODE_NODE=unset
