Bug Description
When the gateway is restarted while multiple cron jobs have nextRunAtMs values in the past (i.e., they were scheduled to fire while the gateway was down or being restarted), the gateway attempts to fire all overdue jobs simultaneously on startup. This overwhelms the gateway process and makes it completely unresponsive — the UI shows no cron jobs, openclaw cron list times out after 30s, and the WebSocket endpoint stops responding to cron-related requests.
Other gateway functions (e.g., openclaw status, sessions.list from UI) continue to work, suggesting the cron subsystem specifically is blocked.
Steps to Reproduce
- Set up multiple cron jobs with nextRunAtMs state values
- Stop the gateway (or let it crash during a cron run)
- Wait until several cron jobs become overdue (their nextRunAtMs is in the past)
- Restart the gateway
- Immediately try openclaw cron list or check the UI → times out / shows no cron jobs
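To confirm the precondition before restarting, the overdue jobs can be counted from the parsed contents of jobs.json. This is a minimal sketch: it assumes only the layout used in the workaround script in this report (a top-level jobs list whose entries carry state.nextRunAtMs in epoch milliseconds), and the helper name overdue_jobs is hypothetical, not part of OpenClaw.

```python
import time


def overdue_jobs(data, now_ms=None):
    """Return the jobs whose state.nextRunAtMs lies strictly in the past.

    `data` is the parsed jobs.json content. The layout (jobs -> state ->
    nextRunAtMs) is an assumption taken from the workaround script.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current time in epoch milliseconds
    overdue = []
    for job in data.get("jobs", []):
        next_run = job.get("state", {}).get("nextRunAtMs")
        # Jobs without a recorded next run time are not counted as overdue.
        if next_run is not None and next_run < now_ms:
            overdue.append(job)
    return overdue
```

Loading the file with json.load and passing the result to overdue_jobs gives the number of jobs that would fire at once on the next startup.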
Expected Behavior
The gateway should do at least one of the following:
- Stagger overdue jobs — fire them one at a time with a configurable delay between each, rather than all at once
- Skip overdue jobs — detect that the scheduled time has passed and advance nextRunAtMs to the next occurrence based on the cron expression
- Rate-limit concurrent cron executions — enforce a maximum number of simultaneous cron sessions (ideally configurable, default 1)
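The staggering and rate-limiting options above can be combined in one small scheduler loop. The following is an illustrative sketch in Python, not the gateway's actual (Node.js) implementation: run_overdue, run_job, max_concurrent, and stagger_s are all hypothetical names, and run_job stands in for whatever starts a cron session.

```python
import asyncio


async def run_overdue(jobs, run_job, max_concurrent=1, stagger_s=0.0):
    """Run overdue jobs with a concurrency cap and a delay between starts.

    All names here are hypothetical; `run_job` is any async callable that
    executes a single job and returns its result.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # At most `max_concurrent` jobs are inside this block at once.
        async with semaphore:
            return await run_job(job)

    tasks = []
    for index, job in enumerate(jobs):
        # Stagger: wait before scheduling every job after the first.
        if index > 0 and stagger_s > 0:
            await asyncio.sleep(stagger_s)
        tasks.append(asyncio.create_task(guarded(job)))
    # Results come back in the same order as `jobs`.
    return await asyncio.gather(*tasks)
```

With max_concurrent=1 the overdue jobs run strictly one after another, which matches the suggested default. Because the loop yields between jobs, other requests (such as listing cron jobs) would still get a chance to be served in an equivalent event-loop design.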
Actual Behavior
- All overdue cron jobs fire simultaneously on startup
- The gateway process (Node.js, ~430MB RSS) becomes completely unresponsive to cron-related WebSocket requests
- openclaw cron list returns "Error: gateway timeout after 30000ms"
- UI shows no cron jobs
- Gateway logs show "shutdown timed out; exiting without full cleanup" on subsequent restart attempts
- The only known way to recover is to manually edit jobs.json so that all nextRunAtMs values are in the future, then restart
Workaround
Before restarting the gateway, push all overdue nextRunAtMs values into the future:
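import json
from datetime import datetime, timezone, timedelta
from pathlib import Path
# open() does not expand "~", so resolve the home directory explicitly.
path = Path('~/.openclaw/cron/jobs.json').expanduser()
data = json.loads(path.read_text())
# Earliest allowed run time: 10 minutes from now, in epoch milliseconds.
now = datetime.now(timezone.utc)
min_ms = int((now + timedelta(minutes=10)).timestamp() * 1000)
for job in data['jobs']:
    # setdefault avoids a KeyError for jobs that have no state yet.
    state = job.setdefault('state', {})
    # Push any overdue (or imminent) run time out to min_ms.
    if state.get('nextRunAtMs', 0) < min_ms:
        state['nextRunAtMs'] = min_ms
path.write_text(json.dumps(data, indent=2))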
Then restart the gateway.
Environment
- OpenClaw version: 2026.2.15
- Node.js: 22.17.1
- OS: Ubuntu 24.04 (Linux 6.8.0-94-generic x64)
- Cron jobs: 45 jobs configured, 10 were overdue at time of restart
- Gateway config: local loopback, single instance