Bug: Cron timer stops firing after config model change (v2026.2.3-1) #10584

@alexioklini

Environment: macOS Sequoia (arm64), Node 25.5.0, OpenClaw 2026.2.3-1

Description

After changing the default model in config (from claude-opus-4-5 to anthropic-cloud/claude-opus-4-6), all cron jobs stopped executing. The scheduler initializes correctly but the timer callback (onTimer) never fires.

Steps to Reproduce

  1. Have working cron jobs (isolated agentTurn, every 5min, model: local-llm/gemini-2.5-pro)
  2. Change agents.defaults.model.primary from one model to another
  3. Restart gateway (SIGUSR1 or cold restart)
  4. Observe: cron jobs never execute again

Observed Behavior

  • cron: started log entry shows correct jobs count and nextWakeAtMs
  • cron.status returns enabled: true with valid nextWakeAtMs
  • No timer tick ever fires — zero cron-related log entries after startup
  • cron run {jobId} returns { ran: false, reason: "not-due" } even when nextRunAtMs is in the past
  • nextRunAtMs in jobs.json keeps advancing forward (gets recomputed) without execution
  • No errors in logs

What I Tried (all failed)

  • SIGUSR1 restart
  • Full cold restart (launchctl bootout + bootstrap)
  • Deleted ALL jobs + runs/ directory + jobs.json.bak, recreated fresh
  • Both sessionTarget: "isolated" (agentTurn) and "main" (systemEvent)
  • Both wakeMode: "now" and "next-heartbeat"
  • Increased cron.maxConcurrentRuns to 4
  • Minimal test job: { schedule: { kind: "every", everyMs: 60000 }, payload: { kind: "agentTurn", message: "Reply TEST_OK", model: "local-llm/gemini-2.5-pro" } }

Suspected Root Cause

In onTimer():

await locked(state, async () => {
    await ensureLoaded(state, { forceReload: true });
    await runDueJobs(state);  // <-- never finds due jobs?
    await persist(state);
    armTimer(state);
});

ensureLoaded with forceReload: true calls recomputeNextRuns(state), which recomputes every job's nextRunAtMs from the current time (via computeNextRunAtMs(schedule, nowMs)), pushing it into the future. By the time runDueJobs checks now >= nextRunAtMs, every job has already been rescheduled past now, so nothing is ever due.
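A minimal sketch of the suspected interaction, using hypothetical stand-ins for the real internals (the function names mirror the ones above, but the bodies are assumptions, not OpenClaw's actual code):

```javascript
// Hypothetical simplification: an "every" schedule whose next run is
// always computed strictly after nowMs.
function computeNextRunAtMs(schedule, nowMs) {
  return nowMs + schedule.everyMs;
}

// Mirrors the suspected behavior of recomputeNextRuns(state): every job's
// nextRunAtMs is recomputed from the current time, losing overdue state.
function recomputeNextRuns(jobs, nowMs) {
  for (const job of jobs) {
    job.nextRunAtMs = computeNextRunAtMs(job.schedule, nowMs);
  }
}

// Mirrors the due check in runDueJobs: a job runs only if now >= nextRunAtMs.
function runDueJobs(jobs, nowMs) {
  return jobs.filter((job) => nowMs >= job.nextRunAtMs).map((job) => job.id);
}

const jobs = [{ id: "test", schedule: { everyMs: 60000 }, nextRunAtMs: 1000 }];
const nowMs = 5000; // job is overdue: nextRunAtMs (1000) < nowMs (5000)

// Forced reload recomputes BEFORE the due check, pushing the job forward:
recomputeNextRuns(jobs, nowMs);      // nextRunAtMs becomes 65000
const ran = runDueJobs(jobs, nowMs); // [] -- the overdue job is no longer due

console.log(ran.length); // 0
```

If the real recomputeNextRuns behaves this way on forceReload, it would also explain why nextRunAtMs in jobs.json keeps advancing without any execution.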

However, this code path predates the change and worked, so something else may be involved as well; possibly the config change invalidated some internal scheduler state.
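If the recompute-before-check ordering is indeed the problem, one possible shape of a fix (a sketch only, with hypothetical names, not a patch against the actual codebase) is to collect due jobs before any recompute and advance nextRunAtMs only for jobs that actually ran:

```javascript
// Hypothetical fix sketch: the due check runs against the persisted
// nextRunAtMs values; only jobs that execute get rescheduled.
function computeNextRunAtMs(schedule, nowMs) {
  return nowMs + schedule.everyMs;
}

function tick(jobs, nowMs) {
  // 1. Find due jobs using the values loaded from disk, untouched.
  const due = jobs.filter((job) => nowMs >= job.nextRunAtMs);
  // 2. Execute each due job, then advance its schedule.
  for (const job of due) {
    job.nextRunAtMs = computeNextRunAtMs(job.schedule, nowMs);
  }
  return due.map((job) => job.id);
}

const jobs = [{ id: "test", schedule: { everyMs: 60000 }, nextRunAtMs: 1000 }];
console.log(tick(jobs, 5000));    // ["test"] -- the overdue job runs once
console.log(jobs[0].nextRunAtMs); // 65000 -- rescheduled after running
```

Jobs that are not yet due keep their stored nextRunAtMs, so an overdue job survives a forced reload instead of being silently rescheduled.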

Config (relevant parts)

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic-cloud/claude-opus-4-6",
        "fallbacks": ["anthropic-cloud/claude-opus-4-5-20251101", ...]
      },
      "heartbeat": { "every": "1h", "model": "anthropic-cloud/claude-haiku-4-5-20251101" }
    }
  },
  "cron": { "maxConcurrentRuns": 4 }
}

Workaround

Moved critical monitoring jobs to macOS system crontab (running Python scripts directly, no LLM needed). Daily tasks moved to HEARTBEAT.md with manual time-based checks.
