Bug type
Behavior bug (silent state corruption, no crash, no error logs)
Beta release blocker
No (workaround exists)
Summary
A heartbeat run that returns any non-token text to a session whose origin.to
is the auto-reply pseudo-target "heartbeat" puts the agent's main session
into a permanent pendingFinalDelivery: true state. Every subsequent
heartbeat tick retries the dead delivery first. The retry fails silently (pendingFinalDeliveryLastError stays null while pendingFinalDeliveryAttemptCount climbs) and bumps updatedAt = now, which keeps
the 30-second skip-window check (runHeartbeatOnce, lines 866-870 of heartbeat-runner-DpQCcYf2.js) perpetually true. Result: the heartbeat
scheduler logs heartbeat: started intervalMs: 3600000 cleanly at every
gateway boot, but no actual heartbeat run ever happens again. We observed 64 consecutive hours of silence (2026-05-05 17:54 → 2026-05-08 13:53 WIB)
before investigating. cron list, doctor, and system heartbeat last all
surface nothing about the stuck state.
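The interaction is easier to see in a toy model. The following is a minimal Python sketch, not the runner's actual code (which is JavaScript): the function names tick and simulate, and the exact ordering of the retry relative to the skip-window check, are our assumptions, reconstructed only from the behavior described above.

```python
SKIP_WINDOW_MS = 30_000      # the 30-second skip window described above
INTERVAL_MS = 3_600_000      # intervalMs from the scheduler log line


def tick(session: dict, now_ms: int) -> bool:
    """Model one scheduler tick; return True if a heartbeat run happens."""
    if session.get("pendingFinalDelivery"):
        # Assumed order: the dead delivery is retried first. It fails
        # silently, records no error, and still bumps updatedAt.
        session["pendingFinalDeliveryAttemptCount"] += 1
        session["updatedAt"] = now_ms
    # Skip-window check: a session touched in the last 30 s is skipped.
    if now_ms - session["updatedAt"] < SKIP_WINDOW_MS:
        return False
    session["updatedAt"] = now_ms
    return True


def simulate(session: dict, ticks: int) -> int:
    """Run `ticks` hourly ticks and count how many produce a real run."""
    runs = 0
    now_ms = session["updatedAt"]
    for _ in range(ticks):
        now_ms += INTERVAL_MS
        runs += tick(session, now_ms)
    return runs


stuck = {"pendingFinalDelivery": True, "pendingFinalDeliveryAttemptCount": 0,
         "pendingFinalDeliveryLastError": None, "updatedAt": 0}
healthy = {"pendingFinalDelivery": False, "pendingFinalDeliveryAttemptCount": 0,
           "pendingFinalDeliveryLastError": None, "updatedAt": 0}

print(simulate(stuck, 64), stuck["pendingFinalDeliveryAttemptCount"])
print(simulate(healthy, 64))
```

In this model the stuck session never runs while its attempt count keeps climbing, and the healthy session runs on every tick.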
Reproduction (deterministic on v2026.5.7)
1. Configure an agent with heartbeat: { every: "60m" } and target unset
(defaults to "none"). The agent has no real Telegram/Discord delivery
target for its heartbeat.
2. Let the agent's heartbeat fire normally a few times. The agent:<id>:main entry in ~/.openclaw/agents/<agent>/sessions/sessions.json
accumulates origin: { label: "heartbeat", from: "heartbeat", to: "heartbeat" }.
3. Force the heartbeat to return any text other than the bare token. In our
case the agent appended a preamble: "All clear.\n\nHEARTBEAT_OK" instead
of bare HEARTBEAT_OK. (See "Even bare HEARTBEAT_OK still triggers
pending" below: a non-token preamble accelerates it but isn't required.)
4. The session enters pendingFinalDelivery: true with the preamble text.
5. Wait one heartbeat interval. Inspect:
python3 -c "import json; e=json.load(open('/home/openclaw/.openclaw/agents/<agent>/sessions/sessions.json'))['agent:<agent>:main']; print({k:e.get(k) for k in ['pendingFinalDelivery','pendingFinalDeliveryAttemptCount','pendingFinalDeliveryLastError','updatedAt']})"
pendingFinalDeliveryAttemptCount climbs by 1 every hour, but pendingFinalDeliveryLastError stays null. updatedAt matches each
retry timestamp.
In our reproduction we hit attemptCount: 64 before noticing.
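For repeated checks, the inspection one-liner above can be unrolled into a small helper. This is only a more readable equivalent of the same lookup; the function name read_pending_state is ours, and the path and session key are passed in rather than hard-coded.

```python
import json
from pathlib import Path

# The four fields the one-liner prints.
FIELDS = (
    "pendingFinalDelivery",
    "pendingFinalDeliveryAttemptCount",
    "pendingFinalDeliveryLastError",
    "updatedAt",
)


def read_pending_state(sessions_path, session_key):
    """Return the pending-delivery fields of one session entry.

    Fields missing from the entry come back as None, like dict.get().
    """
    sessions = json.loads(Path(sessions_path).read_text(encoding="utf-8"))
    entry = sessions[session_key]
    return {field: entry.get(field) for field in FIELDS}


# Example call (substitute the real agent name):
# read_pending_state(
#     "/home/openclaw/.openclaw/agents/<agent>/sessions/sessions.json",
#     "agent:<agent>:main",
# )
```

Running it once per heartbeat interval shows the attempt count and updatedAt moving while the last-error field stays null.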
Even bare HEARTBEAT_OK still triggers pending
After tightening the agent's HEARTBEAT.md to forbid preamble and forcing a
fresh session run (manual openclaw system event --mode now --text "test" --url ws://127.0.0.1:18789 --token $OPENCLAW_GATEWAY_TOKEN on a freshly-created
session entry), the agent returned the literal token HEARTBEAT_OK and still
ended with pendingFinalDelivery: true, pendingFinalDeliveryText: "HEARTBEAT_OK".
So the stripHeartbeatToken call in heartbeat-Dynyl6hI.js:52-87 either runs
after the pending-queue write or its empty-after-strip output isn't gating the
queueing. The runner should treat "output that strips to empty" as
effectively-empty and skip the final-delivery queue entirely.
The only thing that prevented the death loop from re-forming after the fresh
session was that the new session has origin: null and lastTo: null
(rather than origin.to: "heartbeat"), so there's nothing for dispatch to
retry against. Pending stays cosmetically true but updatedAt doesn't get
bumped, and the next 60m tick fires normally.
Two distinct issues compounding
Bug A — pending-delivery flag set even when output is the bare token
agent-runner.runtime-DQsCsHUA.js:4093-4095 sets pendingFinalDelivery: true and
pendingFinalDeliveryText: pendingText whenever pendingText is
non-empty by the runner's metric.
For heartbeat sessions, "output that strips to the empty string" should
count as effectively-empty. Currently pendingText = "HEARTBEAT_OK"
reaches that block.
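The gating we have in mind can be sketched as follows. This is an illustrative Python model only: strip_heartbeat_token is a simplified stand-in for the real stripHeartbeatToken (whose exact rules we have not reproduced), and should_queue_final_delivery and pending_fields are hypothetical names.

```python
HEARTBEAT_TOKEN = "HEARTBEAT_OK"


def strip_heartbeat_token(text: str) -> str:
    """Simplified stand-in: drop a leading or trailing token, then trim."""
    stripped = text.strip()
    if stripped.startswith(HEARTBEAT_TOKEN):
        stripped = stripped[len(HEARTBEAT_TOKEN):]
    if stripped.endswith(HEARTBEAT_TOKEN):
        stripped = stripped[:-len(HEARTBEAT_TOKEN)]
    return stripped.strip()


def should_queue_final_delivery(text: str, is_heartbeat: bool) -> bool:
    """Queue only when something deliverable is left."""
    if is_heartbeat:
        # Heartbeat output that strips to "" counts as effectively empty.
        return strip_heartbeat_token(text) != ""
    return text.strip() != ""


def pending_fields(text: str, is_heartbeat: bool) -> dict:
    """Fields to write to the session; empty dict means skip the queue."""
    if not should_queue_final_delivery(text, is_heartbeat):
        return {}
    return {"pendingFinalDelivery": True, "pendingFinalDeliveryText": text}


print(pending_fields("HEARTBEAT_OK", is_heartbeat=True))
print(pending_fields("All clear.\n\nHEARTBEAT_OK", is_heartbeat=True))
```

With this gate the bare token writes nothing. A reply with a preamble is still queued, so Bug A alone does not cover the preamble case.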
Bug B — silent retry against pseudo-target with no error captured
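dispatch-8E8vi2HV.js:227-246 (clearPendingFinalDeliveryAfterSuccess)
only clears the flag on success. There is no corresponding recordPendingFinalDeliveryFailure that captures the error string into pendingFinalDeliveryLastError, so failures look identical to "still
trying" and never surface in logs.
When delivery.to === "heartbeat" (the auto-reply pseudo-channel set on
the session origin) and no real channel adapter resolves, the dispatch
path returns silently. Compare #78532, "[Bug]: deliverySucceeded=true returned when no adapter was invoked (early returns in deliverOutboundPayloads masquerade as success)" (closed 2026-05-07), which addressed
a similar deliverySucceeded=true masquerade. This is the same family
of telemetry-vs-state mismatch on the failure side.
The 30-second skip-window check in runHeartbeatOnce
is correct in principle, but combined with retry logic that bumps updatedAt = now on each silent failure, it becomes a perpetual block.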
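To make the missing failure path concrete, here is a rough Python model of a retry that records its error. It is a sketch under assumptions rather than the dispatch code: NoAdapterError, deliver, and the adapters mapping are hypothetical, and clear_pending_final_delivery only approximates what clearPendingFinalDeliveryAfterSuccess might do.

```python
PENDING_PREFIX = "pendingFinalDelivery"


class NoAdapterError(Exception):
    """Hypothetical error: no channel adapter resolves for the target."""


def deliver(target, text, adapters):
    """Hand text to the adapter for target, or raise if there is none."""
    adapter = adapters.get(target)
    if adapter is None:
        raise NoAdapterError(f"no channel adapter for target {target!r}")
    adapter(text)


def clear_pending_final_delivery(session):
    """Stand-in for the success path: drop every pending field."""
    for key in [k for k in session if k.startswith(PENDING_PREFIX)]:
        del session[key]


def record_pending_final_delivery_failure(session, error):
    """The missing counterpart: keep the error string on the session."""
    session["pendingFinalDeliveryLastError"] = f"{type(error).__name__}: {error}"


def retry_pending_delivery(session, adapters):
    """Retry once. True on success; False with the error recorded."""
    session["pendingFinalDeliveryAttemptCount"] = (
        session.get("pendingFinalDeliveryAttemptCount", 0) + 1
    )
    target = (session.get("origin") or {}).get("to")
    try:
        deliver(target, session["pendingFinalDeliveryText"], adapters)
    except Exception as error:
        record_pending_final_delivery_failure(session, error)
        return False
    clear_pending_final_delivery(session)
    return True


session = {"origin": {"to": "heartbeat"}, "pendingFinalDelivery": True,
           "pendingFinalDeliveryText": "All clear.",
           "pendingFinalDeliveryAttemptCount": 0,
           "pendingFinalDeliveryLastError": None}
print(retry_pending_delivery(session, adapters={}))
print(session["pendingFinalDeliveryLastError"])
```

Once the error string is stored, a climbing attempt count is no longer indistinguishable from "still trying".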
Workaround (proven on v2026.5.7)
Stop gateway → drop the entire agent:<agent>:main entry from sessions.json and remove its associated *.jsonl/*.trajectory.jsonl
files → restart gateway. The runner re-creates the session on the next
tick with origin: null, breaking the dispatch retry loop.
Clearing only the pendingFinalDelivery* fields is insufficient — we
verified within 8 minutes that the same heartbeat output re-creates the
stuck state, because origin.to: "heartbeat" is still on the session and
keeps re-dispatching.
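The sessions.json part of the workaround can be scripted. The helper below is a sketch we would only run with the gateway stopped; the function name drop_session_entry and the .bak backup convention are ours. It deliberately does not touch the *.jsonl/*.trajectory.jsonl files, since how those map to a session is not spelled out here, so remove them by hand.

```python
import json
import shutil
from pathlib import Path


def drop_session_entry(sessions_path, session_key):
    """Remove one entry from sessions.json, keeping a .bak copy.

    Returns the removed entry, or None if the key was not present
    (in which case the file is left untouched).
    """
    path = Path(sessions_path)
    sessions = json.loads(path.read_text(encoding="utf-8"))
    if session_key not in sessions:
        return None
    # Back up the original file next to it before rewriting.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    removed = sessions.pop(session_key)
    path.write_text(json.dumps(sessions, indent=2), encoding="utf-8")
    return removed


# Example call (gateway stopped, substitute the real agent name):
# drop_session_entry(
#     "/home/openclaw/.openclaw/agents/<agent>/sessions/sessions.json",
#     "agent:<agent>:main",
# )
```

Dropping the whole entry, rather than only the pendingFinalDelivery* fields, is what removes origin.to: "heartbeat" and stops the re-dispatch.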
Suggested fixes
(Bug A) In the agent-runner pending-delivery write, gate on isHeartbeatContentEffectivelyEmpty(stripHeartbeatToken(text).text). If
stripped output is empty, skip the pending queue write entirely.
(Bug B-1) Capture dispatch failures into pendingFinalDeliveryLastError so silent failures become visible.
(Bug B-2) When delivery.to === "heartbeat" and no channel plugin
resolves, treat the delivery as successful and call clearPendingFinalDeliveryAfterSuccess:
reaching the pseudo-target is itself the acknowledgement, and persistent retry is the bug.
(Hardening) openclaw doctor should warn when any session has pendingFinalDelivery: true AND now - pendingFinalDeliveryCreatedAt > 1h
AND pendingFinalDeliveryLastError === null. That's the diagnostic
triple that masks this bug.
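As a rough illustration of the hardening check, the diagnostic triple can be expressed as a filter over the parsed sessions.json. This is a sketch, not a proposed doctor implementation: find_stuck_sessions is a hypothetical name, and we assume pendingFinalDeliveryCreatedAt is an epoch timestamp in milliseconds.

```python
ONE_HOUR_MS = 60 * 60 * 1000


def find_stuck_sessions(sessions: dict, now_ms: int) -> list:
    """Return keys of sessions matching the diagnostic triple.

    Assumes pendingFinalDeliveryCreatedAt is epoch milliseconds.
    """
    stuck = []
    for key, entry in sessions.items():
        created = entry.get("pendingFinalDeliveryCreatedAt")
        if (
            entry.get("pendingFinalDelivery") is True
            and created is not None
            and now_ms - created > ONE_HOUR_MS
            and entry.get("pendingFinalDeliveryLastError") is None
        ):
            stuck.append(key)
    return stuck


sessions = {
    "agent:a:main": {"pendingFinalDelivery": True,
                     "pendingFinalDeliveryCreatedAt": 0,
                     "pendingFinalDeliveryLastError": None},
    "agent:b:main": {"pendingFinalDelivery": True,
                     "pendingFinalDeliveryCreatedAt": 0,
                     "pendingFinalDeliveryLastError": "timeout"},
}
print(find_stuck_sessions(sessions, now_ms=2 * ONE_HOUR_MS))
```

Only the first entry is flagged, because the second has a recorded error and is therefore already visible as a failure.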
Environment
fast: 60m heartbeat, no target set, runs deepseek-v4-flash via opencode-go provider
user-facing channel.
Related issues (not duplicates)
mechanism may be the underlying cause)
pending-deliverables queue durability)
exec-completion interrupt (CLOSED — related fix area)