lsp: idle subprocesses are never reaped — long-lived gateways accumulate ~200 MB per server forever #25016

@kshitijk4poor

Summary

agent/lsp/manager.py defines DEFAULT_IDLE_TIMEOUT = 600 (line 66) and assigns it to self._idle_timeout (line 167), but no reaper exists. Idle LSP subprocesses live forever inside a long-running gateway / CLI session.

A long-lived gateway process accumulates one LSP subprocess per (language, workspace) ever touched. Realistic memory cost:

  • pyright ~200 MB
  • gopls ~80 MB
  • tsserver ~150 MB
  • rust-analyzer ~300+ MB

A gateway that sees edits across 5–10 different repos in different languages over a day-long session ends up holding ~1–2 GB of dead LSP subprocesses with no path to reclaim until the gateway restarts.
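As a rough check on that estimate, the per-server figures above can be summed for a hypothetical session. The mix of workspaces below is an assumption chosen for illustration, not a measurement, and the rust-analyzer figure is the lower bound of "300+".

```python
# Approximate resident memory per idle server, in MB, taken from the list above.
FOOTPRINT_MB = {
    "pyright": 200,
    "gopls": 80,
    "tsserver": 150,
    "rust-analyzer": 300,  # lower bound; the issue says "300+"
}


def estimate_resident_mb(servers):
    """Sum the approximate footprint of every server that was spawned and never reaped."""
    return sum(FOOTPRINT_MB[name] for name in servers)


# Hypothetical day-long session: one entry per (language, workspace) pair touched.
session = ["pyright"] * 3 + ["gopls"] * 2 + ["tsserver"] * 2 + ["rust-analyzer"]

total_mb = estimate_resident_mb(session)
print(f"{len(session)} idle servers, about {total_mb} MB held")
```

Eight workspaces in this mix already land inside the 1–2 GB range, and nothing brings the number back down until the gateway restarts.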

Repro

Spawn a gateway, edit a .py file in repo A, a .go file in repo B, and a .rs file in repo C, then leave the gateway running for an hour. Watch ps aux | grep -E 'pyright|gopls|rust-analyzer': all three subprocesses are still alive, well past the 600 idle timeout.
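For a repeatable check, the ps step could be scripted along these lines. This is only a sketch: the helper names (parse_ps, snapshot) are made up for illustration, the process-name pattern is a guess at how the servers appear in the process table, and it assumes a POSIX ps that accepts repeated -o options.

```python
import re
import subprocess

# Names to look for in the command line; adjust to match how the servers are launched.
LSP_PATTERN = re.compile(r"pyright|gopls|rust-analyzer|tsserver")


def parse_ps(output):
    """Return (pid, rss_kb, command) for every line of ps output that looks like an LSP server."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)  # pid, rss, then the full command line
        if len(parts) < 3:
            continue
        pid, rss_kb, command = parts
        if not (pid.isdigit() and rss_kb.isdigit()):
            continue
        if LSP_PATTERN.search(command):
            rows.append((int(pid), int(rss_kb), command))
    return rows


def snapshot():
    """Run ps once and return the LSP-looking processes currently alive."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "rss=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout)
```

Calling snapshot() right after the edits and again an hour later should return the same PIDs if the bug is present.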

Locations

  • agent/lsp/manager.py:66: DEFAULT_IDLE_TIMEOUT = 600
  • agent/lsp/manager.py:157: idle_timeout: float = DEFAULT_IDLE_TIMEOUT,
  • agent/lsp/manager.py:167: self._idle_timeout = idle_timeout

grep -n '_reaper_loop\|_reap_idle' agent/lsp/manager.py returns no hits — the constant is wired into the constructor signature but nothing consumes it.

Proposed fix

Background reaper coroutine scheduled on _BackgroundLoop:

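import asyncio

async def _reaper_loop(self):
    # Wake every half timeout, so a client is reaped at most 1.5x the timeout after its last use.
    while True:
        await asyncio.sleep(self._idle_timeout / 2)
        self._reap_idle()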

_reap_idle() is the unit-testable single pass — iterate self._last_used, tear down clients whose timestamp is older than now - self._idle_timeout. Schedule via _BackgroundLoop.schedule() in __init__ when enabled; cancel the handle in shutdown().
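A minimal sketch of how the single pass and the scheduling could fit together. IdleReaper is a hypothetical stand-in, not the real manager class: close_client, touch, start, stop and the injectable clock are assumptions made for illustration. A plain asyncio task is used in place of _BackgroundLoop.schedule(), whose signature is not shown in this issue.

```python
import asyncio
import logging
import time
from typing import Callable, Dict, Hashable, List, Optional

log = logging.getLogger(__name__)


class IdleReaper:
    """Hypothetical stand-in for the manager's reaping logic (not the manager.py API)."""

    def __init__(
        self,
        idle_timeout: float,
        close_client: Callable[[Hashable], None],  # assumed teardown callback
        clock: Callable[[], float] = time.monotonic,  # injectable so tests need no sleeping
    ) -> None:
        self._idle_timeout = idle_timeout
        self._close_client = close_client
        self._clock = clock
        self._last_used: Dict[Hashable, float] = {}
        self._task: Optional[asyncio.Task] = None

    def touch(self, key: Hashable) -> None:
        """Record that the client for key, e.g. (language, workspace), was just used."""
        self._last_used[key] = self._clock()

    def _reap_idle(self) -> List[Hashable]:
        """Single pass: tear down every client idle longer than the timeout."""
        cutoff = self._clock() - self._idle_timeout
        # Collect first so the dict is not mutated while it is being iterated.
        stale = [key for key, ts in self._last_used.items() if ts < cutoff]
        for key in stale:
            del self._last_used[key]
            try:
                self._close_client(key)
            except Exception:
                # One failing teardown should not stop the pass or kill the loop.
                log.exception("failed to tear down idle LSP client %r", key)
        return stale

    async def _reaper_loop(self) -> None:
        while True:
            await asyncio.sleep(self._idle_timeout / 2)
            self._reap_idle()

    def start(self) -> None:
        """Schedule the loop; must be called while an event loop is running."""
        self._task = asyncio.get_running_loop().create_task(self._reaper_loop())

    def stop(self) -> None:
        """Cancel the scheduled loop, as shutdown() would."""
        if self._task is not None:
            self._task.cancel()
            self._task = None
```

Keeping the pass separate from the loop means a test can advance a fake clock and call _reap_idle() directly. If the real manager updates _last_used from other threads, the pass would also need the manager's lock, which this sketch leaves out.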

A note on cleanup: agent/lsp/__init__.py:75 already registers an atexit hook that tears down everything on process exit, so this issue is purely about reclaiming memory during a long-lived session, not at exit.

This was verified-real in scubamount's PR #24467 (defect D3). Credit to @scubamount for the original analysis. Filing as a standalone issue with a focused fix scope rather than the bundled refactor in that PR.

Metadata

Labels: P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), type/bug (Something isn't working)
