[Bug]: Cron jobs enqueued but never execute — lane never dispatches, runningAtMs written on enqueue causes permanent stale marker loop

### Bug type

Regression (worked before, now fails)

### Summary

Cron jobs return enqueued: true from both scheduled triggers and openclaw cron run but never actually execute — no session start, no LLM call, no tool activity appears in logs.

### Steps to reproduce

1. Have one or more cron jobs configured in jobs.json with enabled: true
2. Run: openclaw cron run <jobId>
3. Observe response: {"ok": true, "enqueued": true, "runId": "manual:..."}
4. Wait 15-30 seconds
5. Run: openclaw logs --max-bytes 50000 | tail -40
6. Observe: only heartbeat and timer-armed lines — no session start, no LLM call, no execution
7. Run: openclaw gateway restart
8. Observe log: "cron: clearing stale running marker on startup" for the same jobId
9. Repeat steps 2-8 — cycle repeats indefinitely

### Expected behavior

After openclaw cron run <jobId> returns enqueued: true, the cron lane should pick up the job within seconds, start an isolated session, send the payload to the configured agent and model, and log execution activity including session start, LLM request, and completion.

### Actual behavior

The job is enqueued and runningAtMs is immediately written to jobs.json. The cron lane never dispatches it. No session is created, no LLM call is made. On the next gateway restart, OpenClaw logs "cron: clearing stale running marker on startup" for the job — then the same cycle repeats on the next enqueue. Deleting ~/.openclaw/cron/runs/ and manually removing runningAtMs from jobs.json does not resolve the issue — the marker is re-written on the next enqueue and the job still never executes.

### OpenClaw version

2026.3.8 (3caab92)

### Operating system

Ubuntu 24.04 LTS

### Install method

npm install -g openclaw@latest — running as systemd service via openclaw-gateway.service

### Model

vllm/Qwen/Qwen3.5-9B (self-hosted via vLLM on RunPod, OpenAI-compatible endpoint)

### Provider / routing chain

vllm provider → https://<pod-id>-8000.proxy.runpod.net/v1 → vLLM serving Qwen/Qwen3.5-9B

### Config file / key location

~/.openclaw/openclaw.json → models.providers.vllm, plugins.entries, agents.smith-vciso

### Additional provider/model setup details

Provider defined with "api": "openai-completions", "contextWindow": 131072, "maxTokens": 16000. Agent smith-vciso uses "model": "vllm/Qwen/Qwen3.5-9B". The same model works correctly for interactive chat sessions — only cron execution is broken. The issue reproduces with any cron job regardless of which agent or model is configured.

### Logs, screenshots, and evidence

```shell
07:43:36 warn cron clearing stale running marker on startup
  jobId: 57731dfe-e8d7-4ba7-92ef-eb6e408919be
  runningAtMs: 1773215045982

07:44:31 info { "ok": true, "enqueued": true,
  "runId": "manual:57731dfe-...:1773215071210:1" }

[next 5 minutes: only cron timer-armed + web heartbeat lines]
[no session start, no LLM call, no execution of any kind]
```

### Impact and severity

Affected: All cron jobs across all agents — 100% failure rate
Severity: Blocks workflow entirely — the entire scheduled automation layer is non-functional
Frequency: Always reproduces — every enqueue, every scheduled trigger
Consequence: All scheduled HR agent workflows (morning prompts, evening reminders, daily reports, escalations) fail silently. No WhatsApp messages sent to team members. No reports generated. Production automation completely down.

### Additional information

Interactive chat sessions via WhatsApp and Telegram work correctly — only the cron execution lane is affected. The bug persists across gateway restarts, runs/ directory deletion, and manual runningAtMs removal from jobs.json. Previously working cron jobs stopped executing after a period of LLM timeouts (RunPod pod was stopped while cron jobs were scheduled to run), suggesting the lane may have entered a broken state from unresolved LLM timeout errors and never recovered.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: Cron jobs enqueued but never execute — lane never dispatches, runningAtMs written on enqueue causes permanent stale marker loop #42960

Bug type

Summary

Steps to reproduce

Expected behavior

Actual behavior

OpenClaw version

Operating system

Install method

Model

Provider / routing chain

Config file / key location

Additional provider/model setup details

Logs, screenshots, and evidence

Impact and severity

Additional information

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: Cron jobs enqueued but never execute — lane never dispatches, runningAtMs written on enqueue causes permanent stale marker loop #42960

Description

Bug type

Summary

Steps to reproduce

Expected behavior

Actual behavior

OpenClaw version

Operating system

Install method

Model

Provider / routing chain

Config file / key location

Additional provider/model setup details

Logs, screenshots, and evidence

Impact and severity

Additional information

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions