Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
On a gateway that has been idle long enough for the subagent runtime's plugin host to cool, the first memory-core dreaming-narrative subagent.run of the morning sweep routinely exceeds the hard-coded 60 s NARRATIVE_TIMEOUT_MS. Loading non-bundled tsx extensions consumes most of the timeout before Claude is invoked, so the first workspace deterministically gets no light-phase dream-diary entry, while every subsequent workspace in the same sweep completes in ~10-15 s.
Steps to reproduce
- Install 4 non-bundled extensions under ~/.openclaw/extensions/*/src/index.ts (ones that tsx must compile on first load). Keep them out of plugins.allow.
- Enable the memory-core nightly dreaming cron:
    "plugins": {
      "entries": {
        "memory-core": {
          "enabled": true,
          "config": {
            "dreaming": { "enabled": true, "frequency": "0 5 * * *" }
          }
        }
      }
    }
- Configure ≥ 2 workspaces in agents.list (the one that sorts first in the sweep becomes the affected victim).
- Let the gateway sit idle ≥ ~30 min before the cron fires (e.g. run the gateway overnight with no inbound messages). The subagent runtime's plugin host cools during the idle window.
- Wait for the dreaming cron to fire.
- Inspect the journal/log around the fire time.
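For the last step, a small script can make the per-workspace outcome easier to read than scanning the journal by eye. The following is a minimal sketch, not part of OpenClaw: the function name summarizeLightPhase and the regular expressions are assumptions derived only from the log lines quoted in this report, and the timeout line is attributed to the most recently staged workspace because that line carries no workspace tag itself.

```typescript
// Sketch only: line formats are assumed from the log excerpts in this report.
// summarizeLightPhase is a hypothetical helper, not an OpenClaw API.

interface LightPhaseResult {
  workspace: string;
  outcome: "written" | "timeout" | "unknown";
}

const STAGED_RE =
  /memory-core: light dreaming staged \d+ candidate\(s\) \[workspace=([^\]]+)\]/;
const TIMEOUT_RE =
  /memory-core: narrative generation ended with status=timeout for light phase/;
const WRITTEN_RE = /memory-core: dream diary entry written for light phase/;

function summarizeLightPhase(lines: string[]): LightPhaseResult[] {
  const results: LightPhaseResult[] = [];
  let current: LightPhaseResult | undefined;

  for (const line of lines) {
    const staged = STAGED_RE.exec(line);
    if (staged) {
      // A new workspace enters the light phase; track it until an outcome appears.
      current = { workspace: staged[1] ?? "<unknown>", outcome: "unknown" };
      results.push(current);
      continue;
    }
    // Only the first outcome line after a "staged" line is attributed to it.
    if (!current || current.outcome !== "unknown") continue;
    if (TIMEOUT_RE.test(line)) current.outcome = "timeout";
    else if (WRITTEN_RE.test(line)) current.outcome = "written";
  }
  return results;
}
```

Fed the lines around the cron fire time, an affected morning should show "timeout" for the first workspace and "written" for the rest.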
Expected behavior
Every workspace's light-phase narrative completes within a similar latency envelope (~10-15 s end-to-end for 50-80 staged candidates), with the light-phase dream-diary entry written to DREAMS.md. This is what the 2nd–Nth workspaces in the same sweep do on the same run.
Actual behavior
The first-in-sweep workspace consistently hits status=timeout on light phase and gets no light-phase diary entry. REM phase recovers for that workspace because by then the plugin host has warmed. All other workspaces in the same sweep succeed.
Timeline from one affected morning (relevant lines only, timestamps verbatim, paths anonymized):
05:00:00.158 [plugins] memory-core: managed dreaming cron could not be reconciled (cron service unavailable).
05:00:01.149 [plugins] memory-core: light dreaming staged 70 candidate(s) [workspace=~/.openclaw/workspace-<first>]
05:00:34.931 [whatsapp] Web connection closed (status 408). Retry 1/12 in 2.41s… # incidental; not related
05:00:46.875 [plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: <four non-bundled ids> (~/.openclaw/extensions/<id>/src/index.ts ...). Set plugins.allow to explicit trusted ids.
05:00:47.721 [<plugin-a>] register() called, registering hook + tools
05:00:47.735 [<plugin-b>] register() called, registering hook + tool
05:01:07.956 [plugins] memory-core: narrative generation ended with status=timeout for light phase.
05:01:25.763 [plugins] memory-core: REM dreaming wrote reflections from N recent memory trace(s) [workspace=~/.openclaw/workspace-<first>]
05:01:38.190 [plugins] memory-core: dream diary entry written for rem phase [workspace=~/.openclaw/workspace-<first>]
05:01:38.972 [plugins] memory-core: light dreaming staged 79 candidate(s) [workspace=~/.openclaw/workspace-<second>]
05:01:51.115 [plugins] memory-core: dream diary entry written for light phase [workspace=~/.openclaw/workspace-<second>] # ← 12 s; plugin host now warm
Net: between light dreaming staged and the plugins.allow is empty + register() lines there are ~46 s of silence (the tsx import + plugin-host init). After that, only ~20 s remain in the timeout budget, which runs out before Claude returns. Total wall time from stage to timeout = ~67 s > NARRATIVE_TIMEOUT_MS = 60_000, which suggests the wait itself starts a few seconds after staging.
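The intervals in the timeline can be re-derived from the timestamps. A minimal sketch follows; toMs and elapsedMs are illustrative helper names, and the code assumes all timestamps fall on the same day in HH:MM:SS.mmm form, as they do in the excerpt.

```typescript
// Sketch only: recomputes intervals from the timeline above.
// Assumes same-day "HH:MM:SS.mmm" timestamps; helper names are illustrative.

function toMs(stamp: string): number {
  const m = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/.exec(stamp);
  if (!m) throw new Error(`unrecognized timestamp: ${stamp}`);
  const hours = Number(m[1]);
  const minutes = Number(m[2]);
  const seconds = Number(m[3]);
  const millis = Number(m[4]);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

function elapsedMs(from: string, to: string): number {
  return toMs(to) - toMs(from);
}

// First workspace: staged -> plugins.allow warning (the silent window).
const silentWindowMs = elapsedMs("05:00:01.149", "05:00:46.875"); // 45726

// First workspace: staged -> status=timeout.
const stagedToTimeoutMs = elapsedMs("05:00:01.149", "05:01:07.956"); // 66807

// Second workspace: staged -> light-phase diary entry written (warm host).
const warmRunMs = elapsedMs("05:01:38.972", "05:01:51.115"); // 12143

console.log({ silentWindowMs, stagedToTimeoutMs, warmRunMs });
```

The silent window alone accounts for about three quarters of the 60 s budget, while the warm run for the second workspace finishes in roughly 12 s.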
Reproduced 3/4 consecutive mornings on the same install. On the 4th morning, ~25 min of inbound channel activity in the hour before the sweep kept the runtime warm, and the first workspace completed in 23 s.
The session transcript written for the timed-out run (sessionKey dreaming-narrative-light-<hash>-<nowMs>) confirms that the Claude call itself, once it actually starts, completes in ~5-13 s — the bottleneck is everything in front of it, not the model.
OpenClaw version
2026.4.14 (323493f)
Operating system
Ubuntu 24.04 LTS
Install method
npm global (/usr/lib/node_modules/openclaw)
Model
anthropic/claude-sonnet-4-6
Provider / routing chain
openclaw -> anthropic
Additional provider/model setup details
None relevant. Bug is runtime-init / plugin-loader timing, not model- or provider-dependent. Reproduces with stock anthropic provider config; prompt is the built-in NARRATIVE_SYSTEM_PROMPT in extensions/memory-core/src/dreaming-narrative.ts.
Logs, screenshots, and evidence
Source references:
extensions/memory-core/src/dreaming-narrative.ts:87 — const NARRATIVE_TIMEOUT_MS = 60_000;
extensions/memory-core/src/dreaming-narrative.ts:877 — the subagent.waitForRun({ runId, timeoutMs: NARRATIVE_TIMEOUT_MS }) call
extensions/memory-core/src/dreaming-narrative.ts generateAndAppendDreamNarrative — the call site that emits narrative generation ended with status=timeout when the wait elapses
Log excerpts already inlined above. Session transcripts corroborate: the Claude call only starts writing to the session jsonl ~40-65 s after light dreaming staged, i.e. the full NARRATIVE_TIMEOUT_MS budget is spent on tsx-loading plugins, not on the narrative generation itself.
Impact and severity
- Affected: any deployment that (a) has non-bundled extensions under ~/.openclaw/extensions/ loaded from TypeScript sources via tsx, (b) runs the memory-core nightly dreaming cron, and (c) typically sits idle overnight before the cron fires. This is the default shape for any install that adds even one custom TypeScript extension and leaves the gateway up 24/7.
- Severity: Moderate — the first workspace in the sweep silently loses one of its two daily dream-diary entries (light phase). Quality/parity regression, not a crash.
- Frequency: Deterministic on cold mornings (3/3 observed); masked only by luck when unrelated channel activity happens to warm the runtime just before the cron fires.
- Consequence: One workspace's light-phase narrative and its associated short-term promotions for that day never land. Asymmetric across workspaces (always the same one in the sort order), so a single user's dream-diary silently thins out over time relative to others.
Additional information
Suggested fixes (in rough order of preference, not prescriptive):
- Eager-load non-bundled plugins at gateway startup so the first runtime consumer doesn't pay the tsx compile cost. plugins.allow already exists as the trust boundary; it could also gate eager-load. Setting plugins.allow in openclaw.json on an affected install already causes the gateway to list the expected plugins as loaded on startup ([gateway] ready (N plugins: ...; 18.7s)), so the mechanism exists; it just doesn't preempt the per-session plugin-host init on the subagent runtime path.
- Make NARRATIVE_TIMEOUT_MS configurable via plugins.entries["memory-core"].config.narrativeTimeoutMs (same shape as the existing dreaming.* config). This is a partial mitigation: it masks the slow cold-start instead of fixing it, but would at least make the first dream of the morning reliable for installs where eager-loading is undesirable.
- Warm the subagent runtime's plugin host in parallel with gateway boot, so scheduled crons that fire soon after boot also hit a hot host.
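To make the second suggestion concrete, here is a minimal sketch of how a configurable timeout could be resolved. Everything in it is an assumption for illustration: narrativeTimeoutMs is the key proposed in this report and does not exist today, resolveNarrativeTimeoutMs and the 10-minute cap are hypothetical, and the real memory-core config plumbing may look quite different.

```typescript
// Sketch only: narrativeTimeoutMs is the *proposed* config key, not an existing one.
// The helper name and the upper bound are hypothetical.

const DEFAULT_NARRATIVE_TIMEOUT_MS = 60_000; // value of the constant cited in this report
const MAX_NARRATIVE_TIMEOUT_MS = 10 * 60_000; // assumed safety cap, not from the source

interface MemoryCoreConfigSketch {
  narrativeTimeoutMs?: unknown; // untrusted user config, so validate before use
}

function resolveNarrativeTimeoutMs(
  config: MemoryCoreConfigSketch | undefined,
): number {
  const raw = config?.narrativeTimeoutMs;
  // Fall back to today's behavior for missing or nonsensical values.
  if (typeof raw !== "number" || !Number.isFinite(raw) || raw <= 0) {
    return DEFAULT_NARRATIVE_TIMEOUT_MS;
  }
  // Clamp so a typo cannot stall the sweep indefinitely.
  return Math.min(Math.floor(raw), MAX_NARRATIVE_TIMEOUT_MS);
}
```

With something like this in place, the existing wait could pass resolveNarrativeTimeoutMs(config) instead of the constant, and an affected install could set the value to, say, 120000 without changing behavior for anyone else.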
Current client-side workaround: a lightweight agentTurn cron scheduled 2 min before the dreaming cron, which forces the subagent runtime to initialize on a throwaway session. Works, but it's an ugly papering-over of a cold-start that should not be in the hot path of the built-in dreaming cron.