Summary
applyJobResult in src/cron/service/timer.ts conflates two semantics on the MIN_REFIRE_GAP_MS constant:
- Lower bound for the next cron fire, protecting against the case where
computeJobNextRunAtMs returns a value in the same second as the run just finished (the #17821 fix).
- Fallback value for when
computeJobNextRunAtMs returns undefined entirely.
The second usage is wrong: when the schedule cannot be computed at all, falling back to endedAt + 2s causes the scheduler to re-fire the job 2 seconds later. That run then fails/skips for the same reason, and we're in a tight loop. The structurally identical bug exists in the error-backoff branch.
#17821 is the closed predecessor and covers the "landed in the same second" case. It does not cover the undefined case, and the existing regression tests (timer.regression.test.ts: spin-loop-17821 and spin-gap-17821) do not exercise it.
Root cause walkthrough
1. Silent undefined return from computeNextRunAtMs
src/cron/schedule.ts → computeNextRunAtMs:
const cron = resolveCronFromSchedule(schedule);
if (!cron) return;
let next = cron.nextRun(new Date(nowMs));
if (!next) return;
If croner's cron.nextRun() returns null (observed for certain tz/ICU combinations — see environment below), the function silently returns undefined. No exception is thrown.
2. Silent fall-through in computeStaggeredCronNextRunAtMs
src/cron/schedule.ts → computeStaggeredCronNextRunAtMs:
for (let attempt = 0; attempt < 4; attempt += 1) {
const baseNext = computeNextRunAtMs(job.schedule, cursorMs);
if (baseNext === undefined) return;
const shifted = baseNext + offsetMs;
if (shifted > nowMs) return shifted;
cursorMs = Math.max(cursorMs + 1, baseNext + 1000);
}
// implicit undefined return
The 4-attempt loop can exit without returning, implicitly giving undefined.
3. applyJobResult bad fallback in success path
src/cron/service/timer.ts, cron success branch:
} else if (isJobEnabled(job)) {
let naturalNext: number | undefined;
try {
naturalNext = opts?.preserveSchedule && job.schedule.kind === "every"
? computeNextWithPreservedLastRun(result.endedAt)
: computeJobNextRunAtMs(job, result.endedAt);
} catch (err) {
recordScheduleComputeError({ state, job, err });
// no return; execution continues
}
if (job.schedule.kind === "cron") {
const minNext = result.endedAt + MIN_REFIRE_GAP_MS;
job.state.nextRunAtMs =
naturalNext !== undefined ? Math.max(naturalNext, minNext) : minNext;
// ^^^^^^^ bug
} else {
job.state.nextRunAtMs = naturalNext;
}
}
When naturalNext === undefined, nextRunAtMs becomes result.endedAt + 2000. Next tick (2 seconds later) the scheduler re-fires the job.
There's also a subtle secondary issue: recordScheduleComputeError already sets job.state.nextRunAtMs = undefined for the THROW case, but because the catch does not return, the subsequent assignment in the same branch overwrites it back to minNext.
4. Same shape in the error-backoff branch
} else if (result.status === "error" && isJobEnabled(job)) {
// ...
const backoffNext = result.endedAt + backoff;
job.state.nextRunAtMs =
normalNext !== undefined ? Math.max(normalNext, backoffNext) : backoffNext;
// ^^^^^^^^^^^^ same issue
}
Here the fallback is errorBackoffMs(consecutiveErrors) rather than MIN_REFIRE_GAP_MS, which mitigates the tight loop via exponential backoff — but for the first few errors the interval is still short enough to cause noticeable channel spam, and the structural confusion is the same.
Observed symptom
A single cron job fires ~10 times in ~10 minutes. Each run completes in 4–6 seconds (agent turn duration); between runs the scheduler wakes up on its newly-set short nextRunAtMs and re-fires. Visible as channel spam (Telegram/WeChat messages repeating). On macOS specifically; same version on Ubuntu appears unaffected.
I don't have a deterministic minimal repro for naturalNext === undefined — a test * * * * * cron in Asia/Shanghai tz fired cleanly every 60s in my environment during verification. The undefined path appears to need a specific croner/ICU edge case that's not reliably triggered by a simple test job, but the structural bug is present in the code regardless of how often the path is hit.
Proposed fix
Treat undefined as "unknown, clear the schedule" rather than as "use the minimum gap". Success path:
if (job.schedule.kind === "cron") {
if (naturalNext !== undefined) {
const minNext = result.endedAt + MIN_REFIRE_GAP_MS;
job.state.nextRunAtMs = Math.max(naturalNext, minNext);
} else {
// Schedule computation failed; clearing nextRunAtMs avoids a
// respin loop. The job will be rescheduled on the next heartbeat /
// timer arm when the schedule can be recomputed from scratch.
job.state.nextRunAtMs = undefined;
state.deps.log.warn(
{ jobId: job.id, jobName: job.name },
"cron: nextRun unresolved — clearing to avoid respin loop",
);
}
} else {
job.state.nextRunAtMs = naturalNext;
}
The error-backoff branch should get a parallel treatment. Additionally the catch blocks around computeJobNextRunAtMs could early-return after recordScheduleComputeError (or set a flag checked below) so the subsequent assignment doesn't race with the error handler.
Suggested regression test
In src/cron/service/timer.regression.test.ts, alongside spin-loop-17821 / spin-gap-17821:
it("clears nextRunAtMs when schedule compute returns undefined (does not spin)", async () => {
// Arrange: enabled cron job whose computeJobNextRunAtMs path returns undefined
// (mock via a stubbed croner/schedule helper, or a test hook).
// ...
applyJobResult(state, job, { status: "ok", startedAt, endedAt });
expect(job.state.nextRunAtMs).toBeUndefined();
});
An analogous test for the error-backoff branch would prevent regression in both paths.
Environment
- OpenClaw:
2026.4.11 (installed via Homebrew, /opt/homebrew/lib/node_modules/openclaw/)
- OS: macOS 26.4.1 (Darwin 25.4.0), arm64, MacBook Pro 14" (M5)
- Node:
v25.9.0
- Affected job combinations observed:
sessionTarget: isolated + payload.kind: agentTurn and sessionTarget: main + payload.kind: systemEvent + wakeMode: next-heartbeat
- Not observed on Ubuntu running the same OpenClaw version, suggesting croner/ICU/tz edge case is platform-specific at the trigger layer — though the structural bug above is platform-independent.
Related
#17821 — closed; added the MIN_REFIRE_GAP_MS safety net for the same-second case. This issue reports the undefined case that fix missed.
#52097 — open; "Cron job runs repeatedly after manual trigger". Symptom similar, root cause not confirmed the same but worth cross-linking for triage.
Local patch (for reference)
I've been running the success-path fix above on the shipped bundle since the bug was diagnosed, with cron jobs behaving correctly. The new warn log line has not fired in my environment, consistent with the note above that the undefined path is rare on my specific setup — the patch functions as a safety net rather than a hot path.
Summary
applyJobResultinsrc/cron/service/timer.tsconflates two semantics on theMIN_REFIRE_GAP_MSconstant:computeJobNextRunAtMsreturns a value in the same second as the run just finished (the#17821fix).computeJobNextRunAtMsreturnsundefinedentirely.The second usage is wrong: when the schedule cannot be computed at all, falling back to
endedAt + 2scauses the scheduler to re-fire the job 2 seconds later. That run then fails/skips for the same reason, and we're in a tight loop. The structurally identical bug exists in the error-backoff branch.#17821is the closed predecessor and covers the "landed in the same second" case. It does not cover theundefinedcase, and the existing regression tests (timer.regression.test.ts:spin-loop-17821andspin-gap-17821) do not exercise it.Root cause walkthrough
1. Silent
undefinedreturn fromcomputeNextRunAtMssrc/cron/schedule.ts→computeNextRunAtMs:If
croner'scron.nextRun()returnsnull(observed for certain tz/ICU combinations — see environment below), the function silently returnsundefined. No exception is thrown.2. Silent fall-through in
computeStaggeredCronNextRunAtMssrc/cron/schedule.ts→computeStaggeredCronNextRunAtMs:The 4-attempt loop can exit without returning, implicitly giving
undefined.3.
applyJobResultbad fallback in success pathsrc/cron/service/timer.ts, cron success branch:When
naturalNext === undefined,nextRunAtMsbecomesresult.endedAt + 2000. Next tick (2 seconds later) the scheduler re-fires the job.There's also a subtle secondary issue:
recordScheduleComputeErroralready setsjob.state.nextRunAtMs = undefinedfor the THROW case, but because thecatchdoes notreturn, the subsequent assignment in the same branch overwrites it back tominNext.4. Same shape in the error-backoff branch
Here the fallback is
errorBackoffMs(consecutiveErrors)rather thanMIN_REFIRE_GAP_MS, which mitigates the tight loop via exponential backoff — but for the first few errors the interval is still short enough to cause noticeable channel spam, and the structural confusion is the same.Observed symptom
A single cron job fires ~10 times in ~10 minutes. Each run completes in 4–6 seconds (agent turn duration); between runs the scheduler wakes up on its newly-set short
nextRunAtMsand re-fires. Visible as channel spam (Telegram/WeChat messages repeating). On macOS specifically; same version on Ubuntu appears unaffected.I don't have a deterministic minimal repro for
naturalNext === undefined— a test* * * * *cron inAsia/Shanghaitz fired cleanly every 60s in my environment during verification. Theundefinedpath appears to need a specific croner/ICU edge case that's not reliably triggered by a simple test job, but the structural bug is present in the code regardless of how often the path is hit.Proposed fix
Treat
undefinedas "unknown, clear the schedule" rather than as "use the minimum gap". Success path:The error-backoff branch should get a parallel treatment. Additionally the
catchblocks aroundcomputeJobNextRunAtMscould early-returnafterrecordScheduleComputeError(or set a flag checked below) so the subsequent assignment doesn't race with the error handler.Suggested regression test
In
src/cron/service/timer.regression.test.ts, alongsidespin-loop-17821/spin-gap-17821:An analogous test for the error-backoff branch would prevent regression in both paths.
Environment
2026.4.11(installed via Homebrew,/opt/homebrew/lib/node_modules/openclaw/)v25.9.0sessionTarget: isolated + payload.kind: agentTurnandsessionTarget: main + payload.kind: systemEvent + wakeMode: next-heartbeatRelated
#17821— closed; added theMIN_REFIRE_GAP_MSsafety net for the same-second case. This issue reports theundefinedcase that fix missed.#52097— open; "Cron job runs repeatedly after manual trigger". Symptom similar, root cause not confirmed the same but worth cross-linking for triage.Local patch (for reference)
I've been running the success-path fix above on the shipped bundle since the bug was diagnosed, with cron jobs behaving correctly. The new
warnlog line has not fired in my environment, consistent with the note above that theundefinedpath is rare on my specific setup — the patch functions as a safety net rather than a hot path.