[Bug]: ACP sessions leave orphaned claude processes, swap exhaustion #44790

@danielvos1998

Description

Bug type

Behavior bug (incorrect output/state without crash)

Summary

ACP (Agent Client Protocol) sessions spawned via acpx / claude-agent-acp leave orphaned claude CLI processes running after session completion. These processes accumulate over time, consuming all available swap memory and degrading system performance.

Impact: On a VPS with 3.8 GB RAM and 8 GB swap, after ~14 hours of normal usage, 65 orphaned claude processes were found consuming 4.8 GB of swap (60% of total capacity).

Steps to reproduce

  1. Install OpenClaw v2026.3.12+ with acpx plugin enabled
  2. Run multiple ACP agent sessions sequentially or concurrently:
    /acp spawn claude --mode persistent --thread here
    # (complete the session or let it time out)
  3. Wait for sessions to complete
  4. Check for lingering processes:
    ps aux | grep '[c]laude' | wc -l
    ps -eo pid,etime,rss,args | grep '[c]laude' | head -20
  5. Observe: claude and claude-agent-acp processes persist long after their sessions ended

Expected behavior

When an ACP session completes (success, error, timeout, or manual close), all child processes in the session's process tree should be:

  • Sent SIGTERM (graceful shutdown)
  • Reaped and cleaned up within 5 seconds
  • Removed from process list and swap
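The expected shutdown sequence above can be sketched in shell. This is only an illustration under stated assumptions, not OpenClaw or acpx code: the helper names `is_running` and `terminate_with_grace` are hypothetical, the 5-second default comes from the list above, and the SIGKILL fallback follows the "Proper Fix" proposed further down in this report.

```shell
# Hypothetical helpers for illustration; not part of OpenClaw or acpx.

# Succeeds if PID exists and is not a zombie waiting to be reaped.
is_running() {
    kill -0 "$1" 2>/dev/null || return 1
    case "$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')" in
        Z*|"") return 1 ;;
    esac
    return 0
}

# Send SIGTERM, poll for up to GRACE seconds (default 5), then SIGKILL.
# Returns 0 once the process is gone, non-zero if it is still running.
terminate_with_grace() {
    pid=$1
    grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || true
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        is_running "$pid" || return 0
        sleep 1
        waited=$((waited + 1))
    done
    is_running "$pid" || return 0
    kill -KILL "$pid" 2>/dev/null || true
    sleep 1
    ! is_running "$pid"
}
```

Calling `terminate_with_grace "$pid"` on each process of a finished session would give a well-behaved process time to exit, while a process that ignores SIGTERM is still removed. Reaping the exit status remains the job of the parent process.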

Actual behavior

  • claude-agent-acp (Node.js wrapper) and claude (CLI) processes remain running indefinitely
  • ❌ Each orphaned pair consumes ~170–250 MB of swap when idle
  • ❌ Processes accumulate linearly (~1–2 new orphans per ACP session)
  • ❌ After 12–24 hours of active use, swap is completely exhausted (7–8 GB full)
  • ❌ System becomes unresponsive; the OOM killer may trigger and crash services

OpenClaw version

v2026.3.12

Operating system

Ubuntu 24.04 LTS (x86_64)

Install method

No response

Model

Bundled with @zed-industries/claude-agent-acp

Provider / routing chain

User Request (Telegram) > OpenClaw Main Agent (Haiku) > sessions_spawn(runtime="acp", agentId="claude") > Gateway ACP Dispatcher > acpx Plugin Runtime > npm exec @zed-industries/claude-agent-acp > claude CLI Process (Child) > [SESSION COMPLETES] > ❌ CLEANUP FAILS HERE

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

# Count of orphaned processes
$ ps aux | grep -c '[c]laude'
65
# System-wide memory and swap usage (free does not break swap down per process)
$ free -h
              total      used      free
Mem:          3.8Gi     3.5Gi     153Mi
Swap:         8.0Gi     6.3Gi     1.7Gi
$ ps -eo pid,etime,args | grep '[c]laude' | awk '{print $1, $2}' | sort -k2 -t: -rn | head -10
33597  1:15  (oldest)
51523  1:13
73002  1:10
(... 62 more processes, ages 30min–13+ hours)
# Process tree: orphaned session hierarchy
$ ps -ef --forest | grep -A2 'claude-agent'
root      xxxxx  2154  npm exec @zed-industries/claude-agent-acp
root      yyyyy xxxxx  claude
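Because `free -h` reports system-wide totals, a per-process figure needs another source. One option on Linux is the `VmSwap` field in `/proc/<pid>/status`. The sketch below sums that field for a list of PIDs. The function name `sum_swap_kb` and the `PROC_ROOT` override are hypothetical and exist only for illustration; this is not how the figures above were collected.

```shell
# Sum the VmSwap field (kB) from /proc/<pid>/status for the given PIDs.
# PROC_ROOT is a hypothetical override so the helper can be tested offline.
sum_swap_kb() {
    root=${PROC_ROOT:-/proc}
    total=0
    for pid in "$@"; do
        [ -r "$root/$pid/status" ] || continue   # process already exited
        kb=$(awk '/^VmSwap:/ {print $2}' "$root/$pid/status" 2>/dev/null) || kb=0
        total=$((total + ${kb:-0}))
    done
    echo "$total"
}

# Example: swap held by every process whose command line contains "claude".
# sum_swap_kb $(pgrep -f claude)
```

Dividing the result by 1024 gives MiB, which can be compared against the ~170–250 MB per orphaned pair reported above.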

Impact and severity

Aspect | Details
--- | ---
Severity | 🔴 CRITICAL: production-blocking
Impact Scope | All ACP harnesses (claude, codex, pi, opencode, etc.)
System Impact | Swap exhaustion → OOM kills → system unresponsiveness
User Impact | System becomes unresponsive/unusable within 12–24 hours of normal ACP usage
Data Loss Risk | Moderate: OOM killer may terminate critical processes (gateway, OpenClaw services)
Frequency | 100% reproducible with active ACP sessions
Time to Impact | 12–24 hours depending on ACP session frequency
Affected Users | Any OpenClaw deployment using ACP agents (especially small VPS)
Workaround Available? | ✅ Yes: cron cleanup script (30–60 min manual mitigation)
Permanent Fix Complexity | Medium: requires changes to both OpenClaw gateway and acpx plugin

Minimum Viable Fix: Force-kill ACP child processes 30 seconds after session end (in gateway.session.cleanup())
Proper Fix: Implement graceful shutdown with a SIGTERM → wait(5s) → SIGKILL cascade in the acpx plugin

Additional information

Environment

Key | Value
--- | ---
OS | Ubuntu 24.04 LTS (x86_64)
Kernel | 6.8.0-101-generic
RAM | 3.8 GB
Swap | 8.0 GB
Node.js | v24.14.0
OpenClaw | v2026.3.12 (6472949)
ACP Plugin | acpx (enabled, bundled)
Claude CLI | Bundled with @zed-industries/claude-agent-acp

Suggested Fix Areas

  • OpenClaw gateway (session.cleanup): Ensure kill(-pgid, SIGTERM) on ACP session cleanup
  • acpx plugin: Register shutdown hooks; validate process groups
  • claude-agent-acp: Implement graceful shutdown signal handling
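The `kill(-pgid, SIGTERM)` idea in the first bullet can be tried from a shell, where a negative PID argument to `kill` addresses a whole process group. The sketch below is an assumption-laden illustration: `pgid_of` and `kill_group` are hypothetical names, and it only helps if each ACP session actually runs in its own process group, which this report has not verified.

```shell
# Hypothetical helpers for illustration; not part of OpenClaw or acpx.

# Print the process group ID of PID (empty if PID does not exist).
pgid_of() {
    ps -o pgid= -p "$1" 2>/dev/null | tr -d ' ' || true
}

# Signal every process in PID's process group (default: TERM).
# Refuses to act on an unknown PID or on the caller's own group.
kill_group() {
    target=$(pgid_of "$1")
    own=$(pgid_of "$$")
    [ -n "$target" ] || return 1
    [ "$target" != "$own" ] || return 1
    kill -s "${2:-TERM}" -- "-$target"
}

# Example: start a wrapper in its own session/group, then stop the tree.
# setsid sh -c 'sleep 300 & sleep 300 & wait' &
# kill_group <pid of that sh>
```

The guard against the caller's own group matters: if the session shares a group with its launcher, a group-wide signal would also take down the launcher.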

Temporary Workaround

Until a fix is deployed, run this periodic cleanup from cron:

# Kill claude processes older than 30 minutes (this also matches long-running sessions that are still active)
0,30 * * * * ps -eo pid,etimes,args | grep '[c]laude' | awk '$2 > 1800 {print $1}' | xargs -r kill -9 2>/dev/null
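The selection step of the one-liner above can be pulled into a small function, which makes the age threshold and pattern explicit and lets the filter be checked against canned input before it is pointed at live processes. The name `stale_pids` is hypothetical, and the sketch matches only the command column rather than the whole `ps` line.

```shell
# Read "PID ELAPSED_SECONDS COMMAND..." lines on stdin and print the PIDs
# whose command matches PATTERN and whose age exceeds MIN_AGE seconds.
stale_pids() {
    awk -v pat="$1" -v min="$2" '$2 + 0 > min + 0 {
        cmd = $0
        sub(/^[ \t]*[0-9]+[ \t]+[0-9]+[ \t]+/, "", cmd)   # drop PID and age
        if (cmd ~ pat) print $1
    }'
}

# Example: ask stale processes to exit with SIGTERM instead of kill -9.
# ps -e -o pid= -o etimes= -o args= | stale_pids claude 1800 | xargs -r kill -TERM
```

Sending SIGTERM first gives the processes a chance to shut down cleanly; a second pass with `kill -KILL` could follow for anything that remains. Like the original one-liner, this cannot tell an orphaned process from a persistent session that is still in use.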

Metadata

Labels: bug (Something isn't working), bug:behavior (Incorrect behavior without a crash)
