Background
OpenClaw aborts a model call when its stream emits no bytes for ~120s (idleTimedOut: true in trajectory). For most providers this is appropriate dead-stream protection, but for Google Gemini preview models we observe legitimate completions that are silent on the wire for >120s — likely server-side buffering of thinking tokens — so the watchdog kills them prematurely.
The threshold is not user-configurable in this build:
- Not in openclaw config schema under agents.defaults (or anywhere I could find).
- No matching env var (OPENCLAW_*_IDLE* only covers OpenAI WS pool, browser/CDP, etc.).
- Plumbed only via runtime-internal idleTimeoutMs parameters / session-binding lifecycle.
Result: users hit unrecoverable stalls with no escape hatch.
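To make the failure mode concrete, here is a minimal sketch of how an idle watchdog and a wall-clock timeout interact, replayed over byte-arrival timestamps. It is an illustrative model only: `replayStream`, `Outcome`, and the parameter names are hypothetical and are not taken from OpenClaw's source.

```typescript
// Hypothetical model of the two timeouts; not OpenClaw's actual implementation.
type Outcome =
  | { kind: "completed"; atMs: number }
  | { kind: "idleTimedOut"; atMs: number }
  | { kind: "wallClockTimedOut"; atMs: number };

/**
 * Replays a stream from the times (ms since request start) at which bytes
 * arrived. The last entry is treated as the moment the stream finished.
 * An empty array models a dead stream. idleTimeoutMs = null disables the
 * idle watchdog, leaving only the wall-clock limit.
 */
function replayStream(
  byteTimesMs: number[],
  idleTimeoutMs: number | null,
  wallClockMs: number,
): Outcome {
  let lastActivityMs = 0; // the idle timer starts when the request is sent
  // A dead stream is modelled as one event that never arrives.
  const events = byteTimesMs.length > 0 ? byteTimesMs : [Infinity];
  for (const t of events) {
    const idleDeadlineMs =
      idleTimeoutMs === null ? Infinity : lastActivityMs + idleTimeoutMs;
    if (t > Math.min(idleDeadlineMs, wallClockMs)) {
      // Whichever deadline comes first aborts the call.
      return idleDeadlineMs <= wallClockMs
        ? { kind: "idleTimedOut", atMs: idleDeadlineMs }
        : { kind: "wallClockTimedOut", atMs: wallClockMs };
    }
    lastActivityMs = t; // any byte resets the idle timer
  }
  return { kind: "completed", atMs: lastActivityMs };
}

// A completion that is buffered server-side for 150s, then streams for 5s.
const buffered = [150000, 155000];
console.log(replayStream(buffered, 120000, 600000).kind); // "idleTimedOut"
console.log(replayStream(buffered, 600000, 600000).kind); // "completed"
```

In this model the same buffered completion is killed at 120s with the current threshold and succeeds once the threshold exceeds the silent window, which is the behavior the request below is asking to make tunable.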
Reproduction (2026.5.4-beta.1, commit 9cc3ae1)
Provider: google (google-generative-ai), model: gemini-3.1-pro-preview-customtools (and other Gemini -preview variants — pro / flash / flash-lite all reproduce).
User asks main agent for a moderate-output reply ("List the 46…"). Trajectory shows:
T+0      session.started, prompt.submitted (input ~25k tokens)
T+~120s  model.completed { aborted: true, idleTimedOut: true, timedOut: true,
                           externalAbort: false, usage: null }
         session.ended status=error
T+~1s    auto-fallback to next model in agents.defaults.model.fallbacks
T+~120s  same idleTimedOut
T+~1s    next fallback
T+~120s  same
=> turn final-fails after ~6 minutes
Failed runs report no usage and zero output tokens — fully silent for the entire idle window. Same pattern occurs in cron jobs: an isolated Lawzane outreach cron yesterday ran for ~6.8h on a single Gemini call before the wall-clock timeout fired (in that run, idleTimedOut did not trigger because thinking-token heartbeat-like data kept the stream "active" but no useful content was emitted — different but related failure).
Setting agents.defaults.thinkingDefault to medium (instead of adaptive) did not eliminate the stalls. We are testing thinking: off next.
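The ~6 minute figure in the trajectory follows from simple arithmetic: one primary attempt plus two fallbacks, each silent until the watchdog fires. A small sketch of that calculation, where `worstCaseStallMs` is a hypothetical helper and the ~1s switch overhead is read off the trajectory rather than measured precisely:

```typescript
// Hypothetical helper: worst-case time before a turn final-fails when every
// attempt stays silent until the idle watchdog fires.
function worstCaseStallMs(
  fallbackCount: number, // entries in agents.defaults.model.fallbacks
  idleTimeoutMs: number, // ~120000 in the reported build
  switchOverheadMs: number, // ~1000 per fallback hop in the trajectory
): number {
  const attempts = 1 + fallbackCount; // primary model plus each fallback
  return attempts * idleTimeoutMs + fallbackCount * switchOverheadMs;
}

// Primary + 2 fallbacks, as in the reproduction: 3 * 120s + 2 * 1s.
console.log(worstCaseStallMs(2, 120000, 1000)); // 362000 (about 6 minutes)
// No fallbacks configured: the turn fails after a single idle window.
console.log(worstCaseStallMs(0, 120000, 1000)); // 120000 (2 minutes)
```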
Request
Expose the model stream idle-timeout as a user-configurable setting:
- Primary: agents.defaults.idleTimeoutMs, a global default in openclaw.json, settable via openclaw config set. Defaults to current behavior (120000?); users can raise it, e.g. to 600000, to tolerate slow-stream providers.
- Per-agent override: agents.list[<id>].idleTimeoutMs.
- Per-cron-job override: --idle-timeout-ms flag on openclaw cron edit, parallel to the existing --timeout-seconds.
- (Optional but valuable): 0 or null to disable the idle watchdog entirely (rely solely on wall-clock timeoutSeconds).
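One way the proposed settings could resolve at runtime is sketched below. Everything here is hypothetical: the precedence order (cron flag, then per-agent, then global default, then built-in) is an assumption of this sketch, the 120000 built-in is the unconfirmed value from the request, and `resolveIdleTimeoutMs` does not exist in OpenClaw today.

```typescript
// Sketch of the proposed resolution logic; names mirror the request above.
// undefined = not set at this layer; 0 or null = idle watchdog disabled.
type IdleTimeoutSetting = number | null | undefined;

const BUILTIN_IDLE_TIMEOUT_MS = 120000; // assumed current default ("120000?")

interface IdleTimeoutSources {
  cronFlag?: IdleTimeoutSetting; // --idle-timeout-ms on the cron job
  agentOverride?: IdleTimeoutSetting; // agents.list[<id>].idleTimeoutMs
  globalDefault?: IdleTimeoutSetting; // agents.defaults.idleTimeoutMs
}

/** Returns the effective idle timeout in ms, or null when the watchdog is disabled. */
function resolveIdleTimeoutMs(src: IdleTimeoutSources): number | null {
  // Assumed precedence: the most specific layer that is set wins.
  const layers = [src.cronFlag, src.agentOverride, src.globalDefault];
  const chosen = layers.find((v) => v !== undefined);
  const value = chosen === undefined ? BUILTIN_IDLE_TIMEOUT_MS : chosen;
  if (value === null || value === 0) return null; // rely on wall-clock timeout only
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError(`invalid idleTimeoutMs: ${value}`);
  }
  return value;
}

console.log(resolveIdleTimeoutMs({})); // 120000
console.log(resolveIdleTimeoutMs({ globalDefault: 600000 })); // 600000
console.log(resolveIdleTimeoutMs({ globalDefault: 600000, agentOverride: 0 })); // null
```

Distinguishing "not set" (undefined) from "disabled" (0 or null) is what lets a single agent or cron job opt out of the watchdog without changing the global default.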
Why expose rather than just bump the default
Different providers have different reasonable silence windows. OpenAI / Anthropic stream reasoning tokens visibly; Gemini preview buffers them. A one-size-fits-all bump trades reliability on slow-stream providers against time and cost wasted on genuinely dead streams. Per-agent / per-cron tuning is the right ergonomic.
Workaround for users hitting this
- Trim agents.defaults.model.fallbacks to [] so a stall fails in ~2 min instead of cycling for ~6 min.
- Move chat / sensitive workloads off Gemini preview models to a non-Google provider (anthropic, openai).
- For mechanical cron jobs, set thinking: off on the job to avoid the thinking-buffer pathology.
Related