[Bug]: Browser CDP connection silently dies after idle period — gateway reports cdpReady but act/snapshot times out #23427

@darkneo29

Summary

After ~5-6 hours of browser idle time, the Playwright-to-Chrome CDP WebSocket connection silently dies. browser status continues reporting cdpReady: true, but all browser act and browser snapshot commands time out immediately. openclaw gateway restart does NOT fix the issue because it reconnects to the same zombie Chrome process — Chrome must be killed entirely and relaunched to restore functionality.

Steps to reproduce

  1. Launch browser via browser start (profile: openclaw)
  2. Use browser normally — snapshot, act, navigate all work fine
  3. Leave browser idle for ~5-6 hours (e.g., last interaction at 7 PM, next attempt at 12:15 AM via cron)
  4. Attempt any browser act or browser snapshot command → times out
  5. Run browser status → reports cdpReady: true (misleading)
  6. Run openclaw gateway restart → gateway comes back but browser commands still time out
  7. Only fix: pkill -9 -f "Google Chrome" → wait 3s → browser start (fresh Chrome + new CDP connection)

Expected behavior

  1. browser status should accurately reflect CDP WebSocket health — if the command channel is dead, report cdpReady: false
  2. Browser commands should either maintain the CDP connection via keepalive pings, or auto-recover (kill Chrome + relaunch) when a stale connection is detected
  3. openclaw gateway restart should either kill the managed Chrome process or detect the stale WebSocket and force a fresh launch

Actual behavior

  1. browser status reports cdpReady: true even when the Playwright WebSocket command channel is dead
  2. All browser act and browser snapshot calls time out silently — no error indicating a stale CDP connection
  3. openclaw gateway restart reconnects to the same zombie Chrome process with the same dead WebSocket, so browser commands continue to fail
  4. The only recovery path is manually killing Chrome (pkill -9 -f "Google Chrome"), waiting a few seconds, then calling browser start for a fresh CDP session

Root cause analysis: The CDP WebSocket channel between Playwright and Chrome decays during long idle periods. On macOS, this is likely caused by App Nap throttling the idle Chrome process, or Chrome’s internal cleanup of inactive debug connections. The gateway health check only verifies the CDP HTTP endpoint responds, not that the actual WebSocket command channel is alive.
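The gap described above (HTTP endpoint answers, command channel does not) can be sketched as a two-stage health check. This is a minimal illustration, not OpenClaw's actual implementation: `respondsWithin`, `checkCdpHealth`, and the `CdpHealth` shape are hypothetical names, and both probes are injected by the caller. One plausible pairing, as an assumption, is an HTTP request to Chrome's `/json/version` debugging endpoint for the first probe and a `Runtime.evaluate` round trip over the existing Playwright CDP session for the second.

```typescript
// Hypothetical sketch: none of these names come from OpenClaw.
type Probe = () => Promise<unknown>;

// Resolves true only if the probe succeeds before the deadline.
// A rejected or hanging probe both count as "not responding".
function respondsWithin(probe: Probe, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    Promise.resolve()
      .then(probe)
      .then(
        () => { clearTimeout(timer); resolve(true); },
        () => { clearTimeout(timer); resolve(false); },
      );
  });
}

interface CdpHealth {
  httpReachable: boolean;       // the debugging HTTP endpoint answered
  commandChannelAlive: boolean; // a real command round-tripped over the WebSocket
  cdpReady: boolean;            // true only when both checks pass
}

async function checkCdpHealth(
  httpProbe: Probe,
  commandProbe: Probe,
  timeoutMs = 2000,
): Promise<CdpHealth> {
  const httpReachable = await respondsWithin(httpProbe, timeoutMs);
  // Skip the command probe when Chrome is not reachable at all.
  const commandChannelAlive =
    httpReachable && (await respondsWithin(commandProbe, timeoutMs));
  return { httpReachable, commandChannelAlive, cdpReady: commandChannelAlive };
}
```

With a check shaped like this, the zombie state from this report (HTTP reachable, commands hanging) would surface as `cdpReady: false` instead of `true`.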

OpenClaw version

2026.2.13

Operating system

macOS 15.3 (Darwin 25.3.0, Apple Silicon arm64)

Install method

npm global

Logs, screenshots, and evidence

Impact and severity

Affected: Any user running scheduled cron jobs or automated tasks that use browser automation after an idle period (especially overnight crons)
Severity: High — blocks all browser-dependent automation silently with no actionable error
Frequency: 100% reproducible after ~5-6 hours of idle time on macOS
Consequence: Cron jobs that depend on browser automation (e.g., daily training quizzes, web scraping, form submissions) fail completely. The misleading cdpReady: true status makes debugging difficult. Users must manually intervene to kill Chrome and restart.

Additional information

Current workarounds

  1. Kill Chrome before browser use: pkill -9 -f "Google Chrome" → sleep 3 → browser start
  2. Disable macOS App Nap for Chrome: defaults write com.google.Chrome NSAppSleepDisabled -bool YES (may prevent idle decay)
  3. Self-recovery in cron prompts: Added error-handling that detects timeout, kills Chrome, relaunches, and retries
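The third workaround (detect the timeout, recover, retry) can be sketched as a small wrapper. This is an illustrative sketch under assumptions: `attempt`, `runWithRecovery`, and `Outcome` are hypothetical names, `command` stands in for any browser act or snapshot call, and `recover` stands in for the kill, wait, and browser start sequence from the first workaround.

```typescript
// Hypothetical sketch: names are illustrative, not part of OpenClaw.
type Outcome<T> = { timedOut: true } | { timedOut: false; value: T };

// Runs the command once and reports whether it beat the deadline.
// Errors other than a timeout are passed through unchanged.
function attempt<T>(command: () => Promise<T>, timeoutMs: number): Promise<Outcome<T>> {
  return new Promise<Outcome<T>>((resolve, reject) => {
    const timer = setTimeout(() => resolve({ timedOut: true }), timeoutMs);
    Promise.resolve()
      .then(command)
      .then(
        (value) => { clearTimeout(timer); resolve({ timedOut: false, value }); },
        (err) => { clearTimeout(timer); reject(err); },
      );
  });
}

// On a timeout: run the recovery step, then retry exactly once.
async function runWithRecovery<T>(
  command: () => Promise<T>,
  recover: () => Promise<void>,
  timeoutMs: number,
): Promise<T> {
  const first = await attempt(command, timeoutMs);
  if (!first.timedOut) return first.value;

  await recover(); // e.g. kill Chrome, wait a few seconds, start the browser again

  const second = await attempt(command, timeoutMs);
  if (!second.timedOut) return second.value;
  throw new Error(`command still timing out after recovery (${timeoutMs} ms)`);
}
```

Limiting the wrapper to a single retry keeps a cron job from looping forever if the relaunch itself fails; the second timeout is reported as an explicit error.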

Suggested fixes

  1. CDP WebSocket keepalive: Periodic lightweight ping on the Playwright CDP channel (e.g., Runtime.evaluate({expression: "1"}) every few minutes) to keep the connection alive and detect failures early
  2. Health check improvement: browser status should verify the actual WebSocket command channel, not just the HTTP endpoint. If the WebSocket is dead, report cdpReady: false
  3. Auto-recovery: When a browser act or snapshot times out due to a dead CDP channel, automatically kill Chrome and relaunch instead of returning a timeout error
  4. Gateway restart should clean up Chrome: openclaw gateway restart should kill the managed Chrome process to ensure a fresh connection on restart
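As a rough sketch of the first suggestion, a keepalive loop could ping the channel on an interval and report the first ping that fails or misses its deadline. Everything here is hypothetical (`startKeepalive`, `KeepaliveOptions`, and the callback names are not OpenClaw APIs). The `ping` callback is supplied by the caller; with Playwright it might wrap a `Runtime.evaluate` call on a CDP session, which is an assumption about how the gateway holds its connection.

```typescript
// Hypothetical sketch: names are illustrative, not part of OpenClaw.
interface KeepaliveOptions {
  ping: () => Promise<unknown>; // cheap round trip over the command channel
  intervalMs: number;           // pause between successful pings
  timeoutMs: number;            // deadline for a single ping
  onDead: () => void;           // called once when a ping fails or times out
}

// Starts the loop and returns a function that stops it.
function startKeepalive(opts: KeepaliveOptions): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const stop = (): void => {
    stopped = true;
    clearTimeout(timer);
  };

  const declareDead = (): void => {
    if (stopped) return;
    stop();
    opts.onDead();
  };

  const tick = (): void => {
    if (stopped) return;
    timer = setTimeout(declareDead, opts.timeoutMs); // deadline for this ping
    Promise.resolve()
      .then(opts.ping)
      .then(
        () => {
          if (stopped) return; // deadline already fired, or the caller stopped us
          clearTimeout(timer);
          timer = setTimeout(tick, opts.intervalMs);
        },
        declareDead, // a rejected ping is treated like a dead channel
      );
  };

  timer = setTimeout(tick, opts.intervalMs);
  return stop;
}
```

The `onDead` callback is where the second and third suggestions would hook in: flip the reported status to not ready and, if desired, trigger a relaunch. Whether pinging actually prevents the idle failure on macOS is untested here; at minimum it would detect the failure close to when it happens rather than at the next cron run.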
