fix(windows): prevent conhost.exe zombie leak + fix agentEntry type narrowing #30060
edincampara wants to merge 1 commit into openclaw:main
Greptile Summary
Confidence Score: 5/5
Last reviewed commit: e3c2676
…dows

On Windows, spawning a detached child process with stdio:'inherit' causes Node.js to allocate a conhost.exe (Windows Console Host) for the child. When child.unref() is called, the Node event loop detaches but the conhost process is never reaped -- it stays alive as a zombie.

With cron jobs running nightly, these accumulate rapidly. In production we observed 404 zombie conhost.exe processes after ~7 hours, consuming 3.5 GB of RAM and causing CPU spikes when they all become schedulable simultaneously.

Fix: add windowsHide:true to the spawn options in restartGatewayProcessWithFreshPid(). This suppresses console window (and conhost.exe) allocation on Windows entirely. It is a no-op on macOS and Linux, so there is zero cross-platform risk.

Tested on: Windows 10 22H2, Node.js v24.13.0
Before: 404 conhost.exe @ 3.5 GB RAM after 7h uptime with 9 cron jobs
After: 0 conhost.exe accumulation
92a8b59 to 36591e0
Thanks @nikolasdehor! Agreed on the type narrowing. I've already rebased and dropped those commits since #30048's approach is cleaner. The branch now has only the windowsHide fix on top of latest main, conflict-free.
This pull request has been automatically marked as stale due to inactivity.
Changes
1. fix(windows): prevent conhost.exe zombie process leak
On Windows, the gateway accumulates hundreds of zombie `conhost.exe` (Windows Console Host) processes over time, one per cron execution. These are never reaped and consume significant RAM.
Root cause: In `src/infra/process-respawn.ts`, `restartGatewayProcessWithFreshPid()` spawns a detached child with `stdio: 'inherit'`. On Windows this causes Node.js to allocate a `conhost.exe` for the child's console I/O. When `child.unref()` is called, the Node event loop detaches from the child, but the `conhost.exe` stays alive as a zombie indefinitely.
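Observed impact (real production system, 9 cron jobs, ~7h uptime): 404 zombie `conhost.exe` processes consuming 3.5 GB of RAM, with CPU spikes when they all become schedulable simultaneously.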
Fix: Add `windowsHide: true` to the `spawn()` options. This suppresses console window (and `conhost.exe`) allocation on Windows entirely. It is a no-op on macOS/Linux, so there is zero cross-platform risk.
```ts
// src/infra/process-respawn.ts
const child = spawn(process.execPath, args, {
  env: process.env,
  detached: true,
  stdio: 'inherit',
  windowsHide: true, // prevents conhost.exe allocation on Windows
});
```
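The same option set can be isolated in a small helper so that the presence of `windowsHide` is easy to unit-test. This is only a sketch: `RespawnOptions` and `buildRespawnOptions` are hypothetical names that do not appear in the PR, and the local type mirrors just the subset of Node's `spawn()` options used above.

```ts
// Hypothetical local type: the subset of spawn() options used by the respawn call.
type RespawnOptions = {
  env?: Record<string, string | undefined>;
  detached: boolean;
  stdio: "inherit" | "ignore" | "pipe";
  windowsHide: boolean;
};

// Hypothetical helper (not part of the PR): builds the options in one place.
function buildRespawnOptions(
  env?: Record<string, string | undefined>
): RespawnOptions {
  return {
    env,
    detached: true, // child outlives the parent gateway process
    stdio: "inherit", // same stdio setting as in the PR
    windowsHide: true, // the fix; only has an effect on Windows
  };
}
```

Under these assumptions the call site would become `spawn(process.execPath, args, buildRespawnOptions(process.env))`, and a test can assert on the returned object without spawning anything.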
2. fix(types): guard agentEntry against false before accessing heartbeat
Pre-existing TypeScript error caught by CI:
```
src/gateway/server-cron.ts(197,24): error TS2339: Property 'heartbeat' does not exist on type 'false | AgentConfig'
```
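`agentEntry` has type `false | AgentConfig`. `Array.prototype.find()` itself returns `undefined`, not `false`, when no element matches, so the `false` member presumably comes from a short-circuit guard around the lookup. Optional chaining only short-circuits on `null` and `undefined`, so the spread `...agentEntry?.heartbeat` still attempts a property access on `false`, which is what TS2339 reports.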
Fix: Explicit type guard before spreading:
```ts
// before
...agentEntry?.heartbeat,
// after
...(agentEntry && agentEntry !== false ? agentEntry.heartbeat : undefined),
```
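To see the narrowing in isolation, here is a minimal, self-contained sketch. The `HeartbeatConfig` and `AgentConfig` shapes and the `mergeHeartbeat` function are hypothetical stand-ins, not the real types from `src/gateway/server-cron.ts`. Note that a plain truthiness check is enough to exclude both `false` and `undefined`.

```ts
// Hypothetical shapes for illustration; the real AgentConfig is not shown in the PR.
type HeartbeatConfig = { every?: string; target?: string };
type AgentConfig = { id: string; heartbeat?: HeartbeatConfig };

// Merges per-agent heartbeat overrides onto defaults.
// The truthiness check narrows `false | AgentConfig | undefined` to AgentConfig,
// so `.heartbeat` is only accessed on an actual config object.
function mergeHeartbeat(
  defaults: HeartbeatConfig,
  agentEntry: false | AgentConfig | undefined
): HeartbeatConfig {
  return {
    ...defaults,
    // Spreading `undefined` into an object literal adds no properties.
    ...(agentEntry ? agentEntry.heartbeat : undefined),
  };
}
```

With this shape, passing `false`, `undefined`, or an agent without a `heartbeat` field all fall back to the defaults, while a configured agent overrides them.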
Tested on
Windows 10 22H2, Node.js v24.13.0
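A before/after check of the `conhost.exe` count could be scripted roughly as follows. This is a sketch, not the method used for the numbers in this PR: it assumes the input is the text produced by `tasklist /FI "IMAGENAME eq conhost.exe" /FO CSV /NH`, where each matching process is one CSV row beginning with the quoted image name. `countImageRows` is a hypothetical helper name.

```ts
// Counts rows for one image name in `tasklist /FO CSV /NH` output.
// Assumption: each process row starts with the quoted image name, e.g.
//   "conhost.exe","4120","Console","1","5,432 K"
// Any other line (such as an informational "no tasks" message) is ignored.
function countImageRows(
  tasklistCsv: string,
  imageName: string = "conhost.exe"
): number {
  const prefix = '"' + imageName.toLowerCase() + '",';
  return tasklistCsv
    .split(/\r?\n/)
    .filter((line) => line.trim().toLowerCase().indexOf(prefix) === 0)
    .length;
}
```

Sampling this count before and after a few cron executions should show whether the number keeps growing or stays flat once `windowsHide: true` is in place.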
References