-
-
Notifications
You must be signed in to change notification settings - Fork 54.5k
Description
Bug: Cron timer stops firing after config change (v2026.2.3-1)
Environment: macOS Sequoia (arm64), Node 25.5.0, OpenClaw 2026.2.3-1
Description
After changing the default model in config (from claude-opus-4-5 to anthropic-cloud/claude-opus-4-6), all cron jobs stopped executing. The scheduler initializes correctly but the timer callback (onTimer) never fires.
Steps to Reproduce
- Have working cron jobs (isolated agentTurn, every 5min, model:
local-llm/gemini-2.5-pro) - Change
agents.defaults.model.primaryfrom one model to another - Restart gateway (SIGUSR1 or cold restart)
- Observe: cron jobs never execute again
Observed Behavior
cron: startedlog entry shows correctjobscount andnextWakeAtMscron.statusreturnsenabled: truewith validnextWakeAtMs- No timer tick ever fires — zero cron-related log entries after startup
cron run {jobId}returns{ ran: false, reason: "not-due" }even whennextRunAtMsis in the pastnextRunAtMsinjobs.jsonkeeps advancing forward (gets recomputed) without execution- No errors in logs
What I Tried (all failed)
- SIGUSR1 restart
- Full cold restart (
launchctl bootout+bootstrap) - Deleted ALL jobs +
runs/directory +jobs.json.bak, recreated fresh - Both
sessionTarget: "isolated"(agentTurn) and"main"(systemEvent) - Both
wakeMode: "now"and"next-heartbeat" - Increased
cron.maxConcurrentRunsto 4 - Minimal test job:
{ schedule: { kind: "every", everyMs: 60000 }, payload: { kind: "agentTurn", message: "Reply TEST_OK", model: "local-llm/gemini-2.5-pro" } }
Suspected Root Cause
In onTimer():
await locked(state, async () => {
await ensureLoaded(state, { forceReload: true });
await runDueJobs(state); // <-- never finds due jobs?
await persist(state);
armTimer(state);
});ensureLoaded with forceReload: true calls recomputeNextRuns(state) which recalculates all nextRunAtMs values to the future (via computeNextRunAtMs(schedule, nowMs)). By the time runDueJobs checks now >= nextRunAtMs, the values have already been pushed forward.
However, this code path existed before and worked — so something else may be involved (possibly related to the config change invalidating internal state).
Config (relevant parts)
{
"agents": {
"defaults": {
"model": {
"primary": "anthropic-cloud/claude-opus-4-6",
"fallbacks": ["anthropic-cloud/claude-opus-4-5-20251101", ...]
},
"heartbeat": { "every": "1h", "model": "anthropic-cloud/claude-haiku-4-5-20251101" }
}
},
"cron": { "maxConcurrentRuns": 4 }
}Workaround
Moved critical monitoring jobs to macOS system crontab (running Python scripts directly, no LLM needed). Daily tasks moved to HEARTBEAT.md with manual time-based checks.