Skip to content

fix(cron): respect recovered tool errors#84156

Open
Elarwei001 wants to merge 1 commit into
openclaw:mainfrom
Elarwei001:fix/cron-recovered-tool-error
Open

fix(cron): respect recovered tool errors#84156
Elarwei001 wants to merge 1 commit into
openclaw:mainfrom
Elarwei001:fix/cron-recovered-tool-error

Conversation

@Elarwei001

@Elarwei001 Elarwei001 commented May 19, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: cron isolated runs can mark a recovered task as failed when the final payload list only contains isError tool payloads, even though the agent produced terminal assistant-visible output.
  • Solution: separate fatal run signals from recoverable tool-error payloads before choosing the cron final status/output.
  • What changed: resolveCronPayloadOutcome now records errorPayloadRecovery and treats later payload output, message-delivery warnings, or terminal assistant-visible text as recovery paths when no typed fatal signal exists.
  • What did NOT change (scope boundary): meta.error, fatal failureSignal, denial markers, structured/media delivery payload handling, and real unrecovered trailing errors remain fatal.

Motivation

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: recovered cron output should persist as ok instead of error when the only remaining error evidence is recoverable tool payload text.
  • Real environment tested: local OpenClaw checkout on macOS, Node from the repository toolchain, branch fix/cron-recovered-tool-error at commit 3cb013e77a.
  • Exact steps or command run after this patch:
node --import tsx - <<'NODE'
import { resolveCronPayloadOutcome } from './src/cron/isolated-agent/helpers.ts';

const outcome = resolveCronPayloadOutcome({
  payloads: [
    { text: 'redacted tool failure', isError: true },
    { text: 'redacted retry failure', isError: true },
  ],
  finalAssistantVisibleText: 'Recovered final answer from the cron task.',
  preferFinalAssistantVisibleText: true,
});

console.log(JSON.stringify({
  hasFatalErrorPayload: outcome.hasFatalErrorPayload,
  status: outcome.hasFatalErrorPayload ? 'error' : 'ok',
  outputText: outcome.outputText,
  deliveryPayloads: outcome.deliveryPayloads,
  errorPayloadRecovery: outcome.errorPayloadRecovery,
  embeddedRunError: outcome.embeddedRunError ?? null,
}, null, 2));
NODE
  • Evidence after fix (copied live output):
{
  "hasFatalErrorPayload": false,
  "status": "ok",
  "outputText": "Recovered final answer from the cron task.",
  "deliveryPayloads": [
    {
      "text": "Recovered final answer from the cron task."
    }
  ],
  "errorPayloadRecovery": {
    "hadErrorPayload": true,
    "recovered": true,
    "recoveredBy": "terminalAssistantText"
  },
  "embeddedRunError": null
}
  • Observed result after fix: the recovered output is selected, hasFatalErrorPayload is false, and the final status derived by cron is ok.
  • What was not tested: a full scheduled daemon run against a live private cron job; the proof uses a redacted local reproduction of the outcome classifier to avoid exposing private task content.
  • Before evidence (optional but encouraged): the prior regression test expected the same all-error-payload shape to return only the last error payload even when finalAssistantVisibleText was present.

Root Cause (if applicable)

  • Root cause: resolveCronPayloadOutcome classified any last isError payload as fatal unless a later non-error payload existed, so all-error-payload runs could not be recovered by terminal assistant-visible text.
  • Missing detection / guardrail: there was no explicit recovery state distinguishing recoverable tool payload errors from fatal run-level signals.
  • Contributing context (if known): cron delivery already receives finalAssistantVisibleText, but fatal payload classification ran before that text could act as recovered terminal output.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/cron/isolated-agent.helpers.test.ts
  • Scenario the test should lock in: all payloads are isError, final assistant-visible text is present and preferred, no typed fatal signal exists, and cron resolves an ok outcome with recovered final text.
  • Why this is the smallest reliable guardrail: the bug is in the pure outcome classifier used by cron finalization and interim retry logic.
  • Existing test that already covers this (if any): existing tests covered later non-error payload recovery, message-delivery warning recovery, run-level fatal errors, and denial markers; this PR updates the all-error-payload regression case and adds a denial guard.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

Cron runs that recover from intermediate tool errors can now be recorded as successful and surface the final assistant result instead of the last recoverable tool error.

Diagram (if applicable)

Before:
[tool error payloads] -> [final assistant recovered] -> [last error payload wins] -> [cron status: error]

After:
[tool error payloads] -> [final assistant recovered] -> [recovery classified] -> [cron status: ok]

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS local checkout
  • Runtime/container: repository Node/pnpm toolchain
  • Model/provider: N/A for classifier reproduction
  • Integration/channel (if any): cron isolated-agent outcome classifier
  • Relevant config (redacted): preferFinalAssistantVisibleText: true

Steps

  1. Pass redacted all-error payloads plus terminal assistant-visible text into resolveCronPayloadOutcome.
  2. Verify hasFatalErrorPayload resolves false and output/delivery use final assistant text.
  3. Run focused cron/helper validation and repository checks.

Expected

  • Recovered terminal assistant-visible text should produce a non-fatal cron outcome when no typed fatal signal exists.

Actual

  • After this patch, the redacted reproduction returns status: "ok", recoveredBy: "terminalAssistantText", and no embedded run error.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • Recovered all-error payloads now prefer terminal assistant text and resolve non-fatal.
    • Later successful payload recovery still resolves non-fatal.
    • Message delivery warning recovery still resolves non-fatal.
    • Denial text, run-level errors, and fatal failure signals remain fatal.
  • Edge cases checked:
    • Structured/media payloads are not collapsed into final text.
    • Earlier partial output plus a real trailing error remains fatal when final assistant text is only the same partial output.
  • What you did not verify:
    • Full live private cron daemon run after patch.
    • Full pnpm test; focused tests and pnpm build/pnpm check were run instead.
    • codex review --base origin/main; the installed Codex CLI does not expose a review subcommand, and a read-only codex exec review attempt failed with 401 Unauthorized.

AI-assisted: yes. I understand the changed classifier path and the added tests.

Validation run locally:

node scripts/test-projects.mjs src/cron/isolated-agent.helpers.test.ts src/cron/isolated-agent/run.meta-error-status.test.ts src/cron/isolated-agent/run.message-tool-policy.test.ts src/agents/pi-embedded-runner/run/payloads.errors.test.ts
pnpm tsgo:test:src
pnpm check
pnpm build

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

No bot review conversations were open when this PR body was updated.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: terminal assistant text could hide a real fatal condition.
    • Mitigation: recovery is blocked when meta.error, fatal failureSignal, denial markers, structured delivery payloads, or an unrecovered real trailing error should stay fatal.

@clawsweeper

clawsweeper Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge.

Latest ClawSweeper review: 2026-05-19 14:18 UTC / May 19, 2026, 10:18 AM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR updates resolveCronPayloadOutcome and its focused tests so recovered isError tool payloads can be marked nonfatal when terminal assistant-visible text, later payload output, or presentation-warning recovery proves the cron run recovered.

Reproducibility: yes. Current main has a focused resolver test and helper logic showing all-error payloads with final assistant text still select the last error, and finalizeCronRun maps that fatal helper result to persisted cron status: "error".

PR rating
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Summary: The patch is focused and has sufficient helper-level real output proof, with the remaining concern being maintainer acceptance of the cron status compatibility change.

Rank-up moves:

  • none
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (live_output): The updated PR body includes after-fix terminal/live output from a local checkout invoking the actual resolver and showing the recovered status/output path; this is sufficient for the helper-level behavior, though not full cron daemon proof.

Risk before merge

  • Merging changes cron status semantics: some all-error-payload shapes that currently persist as error can become ok when terminal assistant-visible text proves recovery, which may affect users who alert on the older status behavior.
  • The proof now exercises the real resolver in a local checkout, but it does not show a full scheduled daemon run persisting the recovered status through openclaw cron runs.
  • The semantic boundary is subtle: maintainers should explicitly accept that terminal assistant-visible text can recover tool-error payloads while typed run failures, fatal failure signals, denial markers, structured delivery payloads, and unrecovered trailing errors remain fatal.

Maintainer options:

  1. Accept Recovered Status Contract (recommended)
    Merge after maintainer review if the intended contract is that terminal assistant-visible recovery can turn previously fatal all-error payload lists into successful cron runs.
  2. Ask For End-To-End Cron Proof
    Before merge, request a redacted openclaw cron run --wait or openclaw cron runs proof showing the recovered case persists as ok and a denial or typed failure still persists as error.
  3. Pause For Narrower Status Policy
    Pause this PR if maintainers want all tool-error payloads to remain fatal unless a stronger explicit status sentinel or configuration contract is added.

Next step before merge
No automated repair is needed; the implementation path is the open PR, and the remaining decision is maintainer acceptance of the compatibility-sensitive cron status contract.

Security
Cleared: The diff only changes cron TypeScript helper logic and tests; it does not add dependencies, workflows, secrets handling, downloaded code, permissions, or package-resolution changes.

Review details

Best possible solution:

Land the resolver-level fix once maintainers accept the recovered-terminal-assistant status contract for cron monitoring and are satisfied that the focused helper proof covers the risk.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a focused resolver test and helper logic showing all-error payloads with final assistant text still select the last error, and finalizeCronRun maps that fatal helper result to persisted cron status: "error".

Is this the best way to solve the issue?

Yes, with a maintainer policy check. The PR fixes the implicated resolver instead of only changing summary text, while preserving typed run failures, fatal failure signals, denial markers, and structured delivery safeguards.

Label justifications:

  • P2: This is a normal-priority cron monitoring correctness fix with limited blast radius but real operator impact.
  • merge-risk: 🚨 compatibility: The patch intentionally changes persisted cron statuses for a class of recovered tool-error runs that existing monitoring may currently treat as failures.
  • rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit, patch quality is 🐚 platinum hermit, and The patch is focused and has sufficient helper-level real output proof, with the remaining concern being maintainer acceptance of the cron status compatibility change.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The updated PR body includes after-fix terminal/live output from a local checkout invoking the actual resolver and showing the recovered status/output path; this is sufficient for the helper-level behavior, though not full cron daemon proof.
  • proof: sufficient: Contributor real behavior proof is sufficient. The updated PR body includes after-fix terminal/live output from a local checkout invoking the actual resolver and showing the recovered status/output path; this is sufficient for the helper-level behavior, though not full cron daemon proof.

What I checked:

  • Current-main bug path: Current main treats any error payload with no later successful payload and no presentation-warning exception as fatal before final assistant-visible text can be selected, which matches the false-negative shape described by the linked reports. (src/cron/isolated-agent/helpers.ts:295, 13c97c5a8d04)
  • Regression expectation on main: The existing resolver test explicitly expects all-error payloads plus finalAssistantVisibleText to return the last error payload, confirming the current behavior the PR changes. (src/cron/isolated-agent.helpers.test.ts:284, 13c97c5a8d04)
  • Status mapping: finalizeCronRun maps hasFatalErrorPayload directly into cron run status: "error" and persisted error diagnostics, so the helper classification affects cron runs and cron run --wait behavior. (src/cron/isolated-agent/run.ts:956, 13c97c5a8d04)
  • PR diff: The PR adds errorPayloadRecovery, moves typed run/failure normalization ahead of recovery checks, permits terminal assistant-visible recovery when no fatal signal or structured delivery exists, and updates resolver tests for later-payload, presentation-warning, terminal-text, and denial cases. (src/cron/isolated-agent/helpers.ts:279, 3cb013e77a12)
  • Contributor proof: The PR body now includes after-fix terminal/live output from a local OpenClaw checkout invoking the actual resolver at head, showing hasFatalErrorPayload: false, status: "ok", recovered terminal assistant text, and no embedded run error for the all-error-payload recovery shape. (3cb013e77a12)
  • Feature history: Recoverable transient tool-error handling dates back to d509a81a..., final assistant-visible cron delivery behavior was expanded by 81c7304a..., and the current helper/test split was created by 48d9966a.... (src/cron/isolated-agent/helpers.ts:252, 48d9966aa1de)

Likely related people:

  • takhoffman: Co-authored the earlier transient tool-error recovery fix and approved/co-authored the current helper split commit that now owns this path. (role: adjacent owner and reviewer; confidence: high; commits: d509a81a1289, 48d9966aa1de; files: src/cron/isolated-agent/run.ts, src/cron/isolated-agent/helpers.ts, src/cron/isolated-agent.helpers.test.ts)
  • Sid-Qin: The recoverable transient tool-error policy appears to date to the March cron fix credited to Sid-Qin, which changed status handling when later successful payloads follow tool errors. (role: introduced behavior; confidence: medium; commits: d509a81a1289; files: src/cron/isolated-agent/run.ts, src/cron/isolated-agent.uses-last-non-empty-agent-text-as.test.ts)
  • welfo-beo: Authored the final assistant-visible cron announce delivery change that added the relevant finalAssistantVisibleText preference path and tests. (role: feature contributor; confidence: medium; commits: 81c7304a18b8; files: src/cron/isolated-agent/helpers.ts, src/cron/isolated-agent.helpers.test.ts, src/cron/isolated-agent/run.ts)
  • jalehman: Reviewed/co-authored the final assistant-visible cron announce delivery change, so they are a useful routing candidate for the channel-output side of this behavior. (role: reviewer and adjacent contributor; confidence: medium; commits: 81c7304a18b8; files: src/cron/isolated-agent/helpers.ts, src/cron/isolated-agent.helpers.test.ts)
  • TurboTheTurtle: Co-authored the current helper/test split commit that introduced the files now touched by this PR. (role: recent area contributor; confidence: medium; commits: 48d9966aa1de; files: src/cron/isolated-agent/helpers.ts, src/cron/isolated-agent.helpers.test.ts, src/cron/isolated-agent/run.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 13c97c5a8d04.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. labels May 19, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 19, 2026
@Elarwei001

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 19, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Brave Shellbean

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: purrs at green checks.
Image traits: location proof lagoon; accessory lint brush; palette seafoam, black, and opal; mood focused; pose stepping out of a freshly hatched shell; shell woven fiber shell; lighting calm overcast light; background tiny shells and proof notes.
Share on X: post this hatch
Copy: My PR egg hatched a 🌱 uncommon Brave Shellbean in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Cron task summary ignores agent self-correction - picks tool error over final assistant reply

1 participant