Closed
Bug Description
requestHeartbeatNow() with non-interval reasons (e.g. exec:SESSION:exit, exec-event) bypasses the heartbeat interval check in the runner's run() function, causing rapid-fire heartbeat runs instead of respecting the configured interval (e.g. 30 minutes).
Root Cause
In the heartbeat runner, the run() function only enforces the interval for reason === "interval":
const isInterval = reason === "interval";
// ...
for (const agent of state.agents.values()) {
  if (isInterval && now < agent.nextDueMs) continue; // only checked for "interval"
  // ... runOnce fires regardless of nextDueMs for non-interval reasons
}
Meanwhile, in the subagent registry:
// Every exec tool completion triggers this:
requestHeartbeatNow({ reason: `exec:${session.id}:exit` });
// Every exec system event triggers this:
requestHeartbeatNow({ reason: "exec-event" });
And in the schedule() function's finally block:
finally {
  running = false;
  if (pendingWake || scheduled) schedule(delay, "normal"); // immediately reschedules
}
The Cascade
- Normal heartbeat fires (reason: "interval"): the agent runs, possibly using the exec tool or spawning sub-agents
- Exec tool completes → requestHeartbeatNow({ reason: "exec:xxx:exit" }) → sets pendingWake
- Since running === true (heartbeat is still in progress), the schedule() function defers
- When the current heartbeat run finishes, the finally block sees pendingWake and immediately reschedules
- Next run fires with reason "exec:xxx:exit", not "interval", so the interval check is skipped
- advanceAgentSchedule sets nextDueMs = now + intervalMs, but this is irrelevant since the check is bypassed
- If any concurrent sub-agents or exec processes are running, their completions keep feeding requestHeartbeatNow()
- Result: infinite loop of heartbeat runs, each completing in ~3-7 seconds
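The cascade can be reproduced with a small deterministic model. This is only a sketch under stated assumptions, not the actual runner: simulateCascade, runDurationMs, execExitsPerRun and gateAllReasons are hypothetical names, time is simulated rather than real, and every run is assumed to produce the same number of exec completions.

```typescript
// Hypothetical, simplified model of the heartbeat runner (not the OpenClaw source).
interface SimOptions {
  intervalMs: number;      // configured heartbeat interval
  runDurationMs: number;   // simulated duration of one heartbeat run (must be > 0)
  execExitsPerRun: number; // exec completions assumed to occur during each run
  windowMs: number;        // simulated wall-clock window
  gateAllReasons: boolean; // false = behaviour described above; true = nextDueMs checked for every reason
}

interface SimResult {
  runs: number;
  reasons: string[];
}

function simulateCascade(opts: SimOptions): SimResult {
  if (opts.runDurationMs <= 0) throw new Error("runDurationMs must be positive");
  let now = 0;
  let nextDueMs = 0;
  let pendingWake: string | null = "interval"; // first scheduled tick
  const reasons: string[] = [];

  while (now < opts.windowMs) {
    if (pendingWake === null) {
      // Nothing pending: idle until the interval timer fires.
      now = Math.max(now, nextDueMs);
      pendingWake = "interval";
      continue;
    }
    const reason: string = pendingWake;
    pendingWake = null;
    const isInterval = reason === "interval";
    if ((opts.gateAllReasons || isInterval) && now < nextDueMs) {
      continue; // wake dropped; the next interval tick picks up the work
    }
    reasons.push(reason);
    now += opts.runDurationMs;           // the run takes some time
    nextDueMs = now + opts.intervalMs;   // mirrors advanceAgentSchedule
    if (opts.execExitsPerRun > 0) {
      pendingWake = "exec:xxx:exit";     // an exec exit arrived while running === true
    }
  }
  return { runs: reasons.length, reasons };
}

const base = { intervalMs: 1_800_000, runDurationMs: 5_000, execExitsPerRun: 1, windowMs: 60_000 };
const flood = simulateCascade({ ...base, gateAllReasons: false });
const gated = simulateCascade({ ...base, gateAllReasons: true });
console.log(flood.runs, gated.runs); // 12 runs vs 1 run in one simulated minute
```

With the gate applied only to "interval", the model runs back-to-back for as long as each run yields at least one exec exit. With the gate applied to every reason, it falls back to one run per interval.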
Observed Impact
- 244 heartbeat runs in ~2.5 hours instead of the expected ~5 (30-minute interval)
- All runs used claude-opus-4-6-thinking (the configured main model)
- Quota exhausted, estimated ~$90 in wasted tokens
- Normal heartbeat interval: 1,800,000ms (30 min), actual interval during flood: ~3-10 seconds
- Triggered by sub-agent exec completions feeding back into the heartbeat wake system
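For reference, the expected run count and the overrun can be derived from the figures above. The 2.5-hour window is approximate, so the results are rough estimates rather than measurements.

```typescript
// Figures taken from this report; windowMs is approximate ("~2.5 hours").
const intervalMs = 1_800_000;            // heartbeat.every (30 min)
const windowMs = 2.5 * 60 * 60 * 1000;   // 9,000,000 ms
const observedRuns = 244;

const expectedRuns = windowMs / intervalMs;            // 5
const overrunFactor = observedRuns / expectedRuns;     // about 48.8x
const meanSpacingSec = windowMs / observedRuns / 1000; // about 36.9 s

console.log({ expectedRuns, overrunFactor, meanSpacingSec });
```

The mean spacing over the whole window (roughly 37 seconds) is longer than the ~3-10 seconds seen during the flood, which would be consistent with the runs arriving in bursts rather than continuously.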
Evidence from Logs
# 244 heartbeat runs, all in the same session
244 messageChannel=heartbeat (sessionId=caa61517-...)
# Runs per minute during flood peak:
10 2026-02-16T03:40
21 2026-02-16T04:10
10 2026-02-16T04:38
10 2026-02-16T04:42
10 2026-02-16T04:44
# 352 exec tool calls across all runs that day, each generating exit events
Suggested Fix
Non-interval reasons like exec-event and exec:SESSION:exit should still respect the heartbeat interval (or at minimum a cooldown period). Options:
1. Enforce nextDueMs for all reasons (simplest): Remove the isInterval gate so all reasons respect the schedule
2. Add a minimum cooldown: Even for exec-event reasons, require at least N seconds since the last heartbeat run
3. Debounce exec-event wakes: Coalesce multiple exec completions into a single heartbeat wake with a longer delay (e.g. 30-60 seconds)
4. Separate exec-event handling: Don't use the heartbeat runner for exec-event delivery; handle it through a different path that doesn't trigger full heartbeat runs
Option 2 or 3 seems most appropriate since exec-events should eventually trigger a heartbeat check, just not at the rate of one per exec completion.
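A minimal sketch of how options 2 and 3 could be combined, assuming a small gate object sits between requestHeartbeatNow() and the runner. WakeGate, cooldownMs, request, tryStart and msUntilAllowed are hypothetical names and not part of OpenClaw. The clock is passed in so the behaviour is easy to test.

```typescript
// Hypothetical helper: coalesces wake requests and enforces a minimum cooldown.
class WakeGate {
  private readonly cooldownMs: number;
  private lastRunMs = Number.NEGATIVE_INFINITY;
  private readonly pendingReasons = new Set<string>();

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  /** Record a wake request; repeated requests collapse into one pending wake. */
  request(reason: string): void {
    this.pendingReasons.add(reason);
  }

  /** Returns the coalesced reasons if a run may start now, otherwise null. */
  tryStart(nowMs: number): string[] | null {
    if (this.pendingReasons.size === 0) return null;
    if (nowMs - this.lastRunMs < this.cooldownMs) return null; // still cooling down
    const reasons = Array.from(this.pendingReasons);
    this.pendingReasons.clear();
    this.lastRunMs = nowMs;
    return reasons;
  }

  /** Milliseconds until the earliest permitted start (0 if allowed now). */
  msUntilAllowed(nowMs: number): number {
    return Math.max(0, this.lastRunMs + this.cooldownMs - nowMs);
  }
}

// Usage with a 30-second cooldown (value chosen for illustration only).
const gate = new WakeGate(30_000);
gate.request("exec:a:exit");
gate.request("exec-event");
console.log(gate.tryStart(0));      // both reasons handled by a single run
gate.request("exec:b:exit");
console.log(gate.tryStart(5_000));  // null: inside the cooldown
console.log(gate.tryStart(30_000)); // the deferred wake is released
```

In this sketch a deferred wake is kept rather than dropped, so exec events still lead to a heartbeat check eventually. A caller would presumably use msUntilAllowed() as the delay when rescheduling, instead of rescheduling immediately from the finally block.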
Environment
- OpenClaw version: latest beta (wizard version 2026.2.13)
- Node.js: v22.22.0
- OS: Ubuntu 24.04
- Config: heartbeat.every = 1800000 (30 min)