Summary
Gateway WS self-contention when calling the cron tool from within an active LLM session is still unresolved. The original issues (#5703 and #6508) were each closed as a duplicate of the other, so no actual fix ever landed.
The bug persists as of v2026.3.x. We hit it daily when triggering cron jobs from active sessions.
Reproduction (still works)
- From an active LLM session (e.g., Discord or Telegram), call the cron tool (run/list/add)
- The tool opens a second WS connection to the same gateway
- Gateway's single-threaded event loop is busy processing the current LLM turn
- The second WS request sits in the queue and is not answered in time, so the client times out after 10s
- The job itself still runs successfully; only the ack times out
Error: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789
Also reproducible via CLI: openclaw cron run <jobId> from within an active session.
Root cause (unchanged from #6508)
The cron tool routes through a new WS connection to the gateway instead of using the existing session's IPC/WS channel. The gateway's Node.js event loop is occupied by the current LLM turn, so it can't respond to the second connection within the timeout window.
This is not a resource issue — CPU/memory are fine. It's purely single-threaded event loop contention.
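The contention described above can be illustrated with a small virtual-time model. This is a sketch under stated assumptions, not gateway code: `simulateSecondConnection`, its parameters, and the 30-second turn length are all hypothetical, and it assumes the queued request is handled as soon as the turn releases the loop.

```typescript
// Virtual-time model of a single-threaded gateway. All names and the 30 s turn
// length are illustrative assumptions, not taken from the gateway's source.
interface AckResult {
  processedAtMs: number;   // when the gateway finally handles the request
  ackLatencyMs: number;    // how long the client would have to wait for the ack
  clientTimedOut: boolean; // true if the client gives up before the ack arrives
}

function simulateSecondConnection(
  turnEndsAtMs: number,    // the loop is busy with the LLM turn from t=0 until here
  requestAtMs: number,     // when the cron request arrives on the second connection
  handleMs: number,        // time to handle the request once the loop is free
  clientTimeoutMs: number  // client-side ack timeout
): AckResult {
  // A queued request cannot start before the loop is free.
  const startAtMs = Math.max(requestAtMs, turnEndsAtMs);
  const processedAtMs = startAtMs + handleMs;
  const ackLatencyMs = processedAtMs - requestAtMs;
  return {
    processedAtMs,
    ackLatencyMs,
    clientTimedOut: ackLatencyMs > clientTimeoutMs,
  };
}

// Idle gateway: the request is handled right away.
const idle = simulateSecondConnection(0, 1000, 17, 10000);
// Busy gateway: a 30 s turn holds the loop; the request arrives at 1 s.
const busy = simulateSecondConnection(30000, 1000, 17, 10000);
```

In this model the busy case is still processed (at 30017 ms), which matches the observation that the job runs even though the client stopped waiting at the 10-second mark.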
Evidence from #6508 discussion
- Internal IPC path (server-bridge-methods) works instantly from active sessions
- External WS connections work fine when no session is active (17ms response)
- The timeout only occurs when the same gateway is already processing an LLM turn
- handshake=connected in logs confirms it's not an auth issue: the connection is established, but the gateway never responds
Preferred fix (from #6508 community discussion)
Option B: In-process function calls for gateway-native tools
- The internal cron tool already has an IPC path that works perfectly
- Route embedded tool calls (cron, gateway config, etc.) through in-process IPC instead of WS
- Zero overhead, immune to event loop contention
- This is how some tools (e.g., gateway.config.get) already work internally
Alternative: Option A — Multiplex tool calls on the existing session WS channel instead of opening a new connection. More complex but also viable.
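A rough sketch of what Option B could look like, assuming a router that prefers in-process handlers and falls back to WS only for tools that have none. `ToolRouter`, `registerNative`, and the handler shapes are hypothetical names for illustration and do not come from the gateway codebase.

```typescript
// Hypothetical sketch: none of these names come from the gateway's codebase.
type ToolArgs = Record<string, unknown>;
type ToolResult = { via: "in-process" | "ws"; value: unknown };
type InProcessHandler = (args: ToolArgs) => unknown;
type WsFallback = (tool: string, args: ToolArgs) => unknown;

class ToolRouter {
  private readonly handlers = new Map<string, InProcessHandler>();
  private readonly wsFallback: WsFallback;

  constructor(wsFallback: WsFallback) {
    this.wsFallback = wsFallback;
  }

  // Register a gateway-native tool that can be served without a WS round trip.
  registerNative(tool: string, handler: InProcessHandler): void {
    this.handlers.set(tool, handler);
  }

  // Prefer the in-process handler; only open a WS request when none exists.
  call(tool: string, args: ToolArgs): ToolResult {
    const handler = this.handlers.get(tool);
    if (handler !== undefined) {
      return { via: "in-process", value: handler(args) };
    }
    return { via: "ws", value: this.wsFallback(tool, args) };
  }
}

// Stand-in WS transport that just records what it would have sent.
const router = new ToolRouter((tool) => `sent ${tool} over WS`);
router.registerNative("cron.run", (args) => ({ started: args["jobId"] }));

const nativeCall = router.call("cron.run", { jobId: "job-1" });
const externalCall = router.call("some.external.tool", {});
```

Because the native call never opens a second connection, it does not depend on the gateway answering a new WS request while the current turn is in progress.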
Current workaround
# Use CLI via exec tool instead of native cron tool
openclaw cron run <jobId> --timeout 3000 2>&1 || true
This spawns a separate process with its own event loop. The timeout error is cosmetic — the job always runs. But it's noisy and confusing for LLM agents that may interpret the timeout as a failure and retry.
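To keep an agent from retrying on the cosmetic timeout, the caller could classify the CLI output before deciding what to do. The sketch below is an assumption-laden example: `classifyCronRun` and `shouldRetry` are hypothetical helpers, the pattern is based only on the error text quoted in the reproduction section, and the exit code must be captured before `|| true` masks it.

```typescript
type CronRunOutcome = "ok" | "ack-timeout" | "failed";

// Based on the quoted error: "Error: gateway timeout after 10000ms".
const ACK_TIMEOUT_PATTERN = /gateway timeout after \d+ms/;

// Hypothetical helper: exitCode and output are whatever the caller captured
// from running the CLI command; nothing here invokes the CLI itself.
function classifyCronRun(exitCode: number, output: string): CronRunOutcome {
  if (ACK_TIMEOUT_PATTERN.test(output)) {
    return "ack-timeout";
  }
  return exitCode === 0 ? "ok" : "failed";
}

// An ack timeout means the job most likely started, so retrying risks a duplicate.
function shouldRetry(outcome: CronRunOutcome): boolean {
  return outcome === "failed";
}

const timedOut = classifyCronRun(
  1,
  "Error: gateway timeout after 10000ms\nGateway target: ws://127.0.0.1:18789"
);
```

Here `timedOut` is classified as "ack-timeout", so `shouldRetry(timedOut)` returns false and the agent would not re-run the job.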
Impact
- Every cron tool call from an active session hits this
- Risk of duplicate jobs if LLM retries on false timeout
- Affects cron.run, cron.list, cron.add, and potentially other gateway-native tool calls under load
- Users/agents must use CLI workaround, which adds latency and error noise
References