Fix isolated cron cold runner setup timeout by dredozubov · Pull Request #86893 · openclaw/openclaw

dredozubov · 2026-05-26T12:41:56Z

Summary

let long isolated cron jobs wait up to 5 minutes for cold Codex runner setup
keep the existing 60s setup watchdog floor for short jobs
add regression coverage for the long-job watchdog window
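
The patch itself is not reproduced on this page, so the following is only a sketch of one derivation that is consistent with the numbers stated here: a 60s floor, a 5 minute cap, and the proof output's expectedPatchedSetupBoundMs of 65000 for timeoutSeconds: 130. The function name, the constant names, and the "half of the job timeout" rule are assumptions, not the actual implementation.

```typescript
// Hypothetical helper. Only the 60s floor, the 5 minute cap, and the
// 130s -> 65000ms data point come from this PR; the halving rule is an
// assumption chosen because it reproduces that data point.
const SETUP_WATCHDOG_FLOOR_MS = 60_000;
const SETUP_WATCHDOG_CAP_MS = 5 * 60_000;

function deriveSetupWatchdogMs(jobTimeoutMs: number): number {
  const half = Math.floor(jobTimeoutMs / 2);
  // Clamp the derived window between the floor and the cap.
  return Math.min(SETUP_WATCHDOG_CAP_MS, Math.max(SETUP_WATCHDOG_FLOOR_MS, half));
}

console.log(deriveSetupWatchdogMs(30_000));    // 60000: short job keeps the floor
console.log(deriveSetupWatchdogMs(130_000));   // 65000: matches the proof run
console.log(deriveSetupWatchdogMs(3_600_000)); // 300000: long job hits the cap
```

Under this assumed rule, any job with a timeout of 120s or less still gets the old 60s window, which is how the floor would be preserved for short jobs.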

Tests

node scripts/run-vitest.mjs src/cron/service/timer.regression.test.ts

Real behavior proof

Behavior addressed: isolated cron agent-turn setup no longer aborts at the old fixed 60s runner-start watchdog when the job has a longer timeout budget; the run stays alive past 60s and then reaches the new bounded setup timeout.

Real environment tested: local OpenClaw cron runtime on PR head 2c969858b6389c7e3e056c1a5e51ab894868bc6b, macOS, Node 26.0.0, isolated temp cron store, no external credentials or network services.

Exact steps or command run after this patch: ran a real-time node --import tsx --input-type=module script from the PR checkout that instantiated CronService, created an isolated agentTurn cron job with timeoutSeconds: 130, started a manual run, withheld the runner-start signal, sampled state after 61s, then let cron reach the derived setup bound.
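
The shape of that experiment can be sketched without the real CronService, whose API is not shown on this page. The example below is a generic setup watchdog with hypothetical names throughout: it races a runner-start signal against a bound, withholds the signal, samples the abort state before the bound, and then lets the bound fire. Durations are scaled down to milliseconds so it finishes quickly.

```typescript
// Hypothetical sketch; none of these names come from the OpenClaw codebase.
type SetupOutcome = "runner-started" | "setup-timeout";

function waitForRunnerStart(
  runnerStarted: Promise<void>,
  setupBoundMs: number,
  controller: AbortController,
): Promise<SetupOutcome> {
  return new Promise<SetupOutcome>((resolve) => {
    const timer = setTimeout(() => {
      controller.abort(); // abort the pre-runner setup phase
      resolve("setup-timeout");
    }, setupBoundMs);
    runnerStarted.then(() => {
      clearTimeout(timer); // runner started in time: disarm the watchdog
      resolve("runner-started");
    });
  });
}

async function demo(): Promise<void> {
  const controller = new AbortController();
  // Withhold the runner-start signal: this promise never settles.
  const withheld = new Promise<void>(() => {});
  const pending = waitForRunnerStart(withheld, 60, controller);

  // Sample the state before the bound, as the proof run did at 61s.
  await new Promise<void>((r) => setTimeout(r, 30));
  console.log("before bound, aborted:", controller.signal.aborted);

  // Let the watchdog reach its bound.
  const outcome = await pending;
  console.log("outcome:", outcome, "aborted:", controller.signal.aborted);
}

void demo();
```

With the signal withheld, the early sample reports no abort and the final outcome is the setup timeout, which mirrors the after61s and final sections of the evidence.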

Evidence after fix: copied terminal output from the real-time cron runtime run:

{
  "openclawHead": "2c969858b6389c7e3e056c1a5e51ab894868bc6b",
  "scenario": "isolated cron agent-turn waits before runner start",
  "configuredTimeoutSeconds": 130,
  "expectedPatchedSetupBoundMs": 65000,
  "after61s": {
    "elapsedMs": 61035,
    "runnerHookEnteredMs": 41,
    "abortObserved": false,
    "cleanupCalls": 0
  },
  "final": {
    "elapsedMs": 65067,
    "abortObservedAtMs": 65044,
    "cleanupCalls": 1,
    "cleanupTimeoutMs": 130000,
    "runResult": {
      "ok": true,
      "ran": true
    },
    "lastRunStatus": "error",
    "lastStatus": "error",
    "lastError": "cron: isolated agent setup timed out before runner start",
    "persistedJobs": 1
  },
  "events": [
    {
      "action": "started",
      "elapsedMs": 34,
      "jobId": "2ddaca0a-a12e-4b7c-b7d9-2ab0746ee121"
    },
    {
      "elapsedMs": 65061,
      "level": "warn",
      "message": "cron: job run returned error status",
      "payload": {
        "jobId": "2ddaca0a-a12e-4b7c-b7d9-2ab0746ee121",
        "jobName": "PR 86893 live setup-window proof",
        "error": "cron: isolated agent setup timed out before runner start",
        "diagnosticsSummary": "cron: isolated agent setup timed out before runner start"
      }
    },
    {
      "action": "finished",
      "elapsedMs": 65062,
      "jobId": "2ddaca0a-a12e-4b7c-b7d9-2ab0746ee121",
      "status": "error",
      "error": "cron: isolated agent setup timed out before runner start",
      "durationMs": 65014
    }
  ]
}

Observed result after fix: at 61.035s cron had not aborted and cleanup had not run; at about 65.044s cron aborted the pre-runner setup phase and recorded cron: isolated agent setup timed out before runner start, matching the patched derived bound for a 130s isolated job instead of the previous fixed 60s abort.
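
The two claims in that result can be checked directly against the copied output. The snippet below is a small sanity check, not part of the PR; the values are copied from the JSON above and the variable names are illustrative.

```typescript
// Values copied from the terminal output in this PR description.
const evidence = {
  configuredTimeoutSeconds: 130,
  expectedPatchedSetupBoundMs: 65_000,
  after61s: { elapsedMs: 61_035, abortObserved: false, cleanupCalls: 0 },
  final: { abortObservedAtMs: 65_044, cleanupCalls: 1 },
};

const OLD_FIXED_WATCHDOG_MS = 60_000; // the previous fixed 60s abort

// Claim 1: the run outlived the old watchdog without aborting or cleaning up.
const survivedOldWatchdog =
  evidence.after61s.elapsedMs > OLD_FIXED_WATCHDOG_MS &&
  !evidence.after61s.abortObserved &&
  evidence.after61s.cleanupCalls === 0;

// Claim 2: the abort landed at the derived bound; compute how far past it.
const abortOvershootMs =
  evidence.final.abortObservedAtMs - evidence.expectedPatchedSetupBoundMs;

console.log({ survivedOldWatchdog, abortOvershootMs });
// prints { survivedOldWatchdog: true, abortOvershootMs: 44 }
```

The 44 ms overshoot is measured on the script's own clock, so it includes whatever time passed before the job started.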

What was not tested: a full external Codex provider cold start with live model credentials was not run; this proof isolates the cron watchdog path before runner start, which is the changed runtime behavior.

clawsweeper · 2026-05-26T12:43:58Z

Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 1:14 AM ET / 05:14 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +8, Tests +66. Total +74 across 2 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

[P1] No close action taken because the review did not complete.

Maintainer options:

Decide the mitigation before merge: Retry the Codex review after fixing the execution failure.
Pause or close: Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

[P1] Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8eb5ff08c86b.

Label changes

Label changes:

add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
remove P1: Current review triage priority is none.
remove rating: 🦪 silver shellfish: Current PR rating is rating: 🌊 off-meta tidepool, so this older rating label is no longer current.
remove merge-risk: 🚨 availability: Current PR review selected no merge-risk labels.
remove status: 📣 needs proof: Current PR status no longer selects a status label.

Label justifications:

rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Evidence reviewed

PR surface:

Source +8, Tests +66. Total +74 across 2 files.

Area	Files	Added	Removed	Net
Source	1	10	2	+8
Tests	1	66	0	+66
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	2	76	2	+74

What I checked:

failure reason: timeout.
codex failure detail: Codex review failed for this PR with exit 1.
codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper · 2026-05-26T12:49:06Z

ClawSweeper PR egg: ✨ hatched ✨ glimmer Cosmic Shellbean. Rarity: ✨ glimmer. Trait: polishes edge cases.

Details

Hatchability:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

About:

Eggs appear after real-behavior proof passes. They are collectible flavor only.
Review momentum changes the shell state: follow-up work warms it, re-review makes it wobble, and a clean final review lets it hatch.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

dredozubov · 2026-05-28T06:57:21Z

@clawsweeper re-review

clawsweeper · 2026-05-28T06:57:24Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26559789885
Updated: 2026-05-28T07:04:06.531Z

clawsweeper (bot) added the labels "rating: 🧂 unranked krab" (Not merge-ready due to missing proof or serious correctness/safety concerns.) and "status: 📣 needs proof" (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.) on May 26, 2026.

openclaw-barnacle (bot) added the label "triage: needs-real-behavior-proof" (Candidate: external PR needs after-fix proof from a real setup.) on May 26, 2026.

clawsweeper (bot) added the labels "P1" (High-priority user-facing bug, regression, or broken workflow.) and "merge-risk: 🚨 availability" (🚨 May cause crashes, hangs, restart loops, stalls, or process outages.) on May 26, 2026.

Joshtt23 mentioned this pull request May 26, 2026

fix(runtime): isolate workers and bound cron top-off #86955

Open

RomneyDa mentioned this pull request May 28, 2026

test(cron): speed up isolated fallback tests #87520

Merged

dredozubov force-pushed the codex/cron-codex-isolation branch from 634e73c to 2c96985 on May 28, 2026 06:37

openclaw-barnacle (bot) added the labels "size: S" and "proof: supplied" (External PR includes structured after-fix real behavior proof.) and removed the label "triage: needs-real-behavior-proof" (Candidate: external PR needs after-fix proof from a real setup.) on May 28, 2026.

fix(cron): allow cold isolated runner setup

e864cf5

RomneyDa added the dependency-guard-backfill label May 29, 2026

dredozubov force-pushed the codex/cron-codex-isolation branch from 2c96985 to e864cf5 on May 29, 2026 04:52

RomneyDa removed the dependency-guard-backfill label May 29, 2026

This was referenced May 30, 2026

feat(cron): make isolated-agent setup watchdog configurable #88396

Open

fix: restart gateway after isolated cron setup timeout #89055

Open

mbelinky mentioned this pull request Jun 3, 2026

feat(cron): support command jobs #89712

Merged

dredozubov wants to merge 1 commit into openclaw:main from dredozubov:codex/cron-codex-isolation

2 participants
