fix #90452: Regression: Heartbeat exec completion still shows generic fallback text instead of actual output by mushuiyu886 · Pull Request #90897 · openclaw/openclaw

mushuiyu886 · 2026-06-06T09:43:41Z

Summary

Problem: heartbeat-triggered exec task failures could still be summarized as a generic 🛠️ ... failed fallback instead of showing the command's captured error/output.
Solution: mark embedded runner payload construction when the trigger is heartbeat and allow exec-like heartbeat tool failures to include the collected error details.
What changed: buildEmbeddedRunPayloads now receives isHeartbeatTrigger, threads it into the tool-error warning policy, and has a regression test for the HEARTBEAT.md tail failure shape from the issue.
What did NOT change: cron timeout handling, non-heartbeat compact tool-error behavior, channel delivery, and heartbeat scheduling are unchanged.

Real behavior proof

Behavior or issue addressed: Heartbeat exec task failures now surface the actual exec error details in the generated payload instead of only emitting the generic fallback text.
Real environment tested: Local Linux OpenClaw worktree at /media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452, calling the production embedded-runner payload builder with node --import tsx.

Exact steps or command run after this patch:

env -C "/media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452" node --import tsx -e 'import { buildEmbeddedRunPayloads } from "./src/agents/embedded-agent-runner/run/payloads.ts"; const payloads = buildEmbeddedRunPayloads({ assistantTexts: [], toolMetas: [], lastAssistant: undefined, currentAssistant: undefined, lastToolError: { toolName: "exec", meta: "show last 20 lines of ~/.openclaw/workspace/memory/2026-06-04.md", error: "tail: cannot open '\''/home/user/.openclaw/workspace/memory/2026-06-04.md'\'' for reading: No such file or directory" }, isCronTrigger: false, isHeartbeatTrigger: true, sessionKey: "session:heartbeat", inlineToolResultsAllowed: false, verboseLevel: "off", reasoningLevel: "off", toolResultFormat: "plain" }); console.log(JSON.stringify(payloads, null, 2)); if (!payloads[0]?.text?.includes("No such file or directory")) process.exit(1);'

Evidence after fix:

[
  {
    "text": "⚠️ 🛠️ show last 20 lines of ~/.openclaw/workspace/memory/2026-06-04.md failed: tail: cannot open '/home/user/.openclaw/workspace/memory/2026-06-04.md' for reading: No such file or directory",
    "isError": true
  }
]

Observed result after fix: The heartbeat exec failure payload includes the real tail error and No such file or directory detail that the previous generic fallback omitted.
What was not tested: I did not wait for a wall-clock HEARTBEAT.md interval or deliver through an external messaging channel; the proof exercises the local embedded-runner heartbeat payload path that formats the exec failure before channel delivery.

Regression Test Plan

Coverage level: focused regression test plus related heartbeat/agent shards.
Target test file: src/agents/embedded-agent-runner/run/payloads.test.ts.
Scenario locked in: heartbeat-triggered exec failure for show last 20 lines of ~/.openclaw/workspace/memory/2026-06-04.md includes No such file or directory in the payload.
Why this is the smallest reliable guardrail: the bug was in the embedded-runner payload warning policy, so this directly covers the formatting decision that previously dropped the exec error details.

Commands run:

env -C "/media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452" node scripts/run-vitest.mjs src/agents/embedded-agent-runner/run/payloads.test.ts src/infra/heartbeat-events-filter.test.ts src/infra/heartbeat-runner.ghost-reminder.test.ts src/agents/bash-tools.test.ts
env -C "/media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452" ./node_modules/.bin/oxfmt --check --threads=1 src/agents/embedded-agent-runner/run.ts src/agents/embedded-agent-runner/run/payloads.ts src/agents/embedded-agent-runner/run/payloads.test.ts
env -C "/media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452" node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo
git -C "/media/vdc/code/ai/aispace/openclaw-worktrees/issue-90452" diff --check

Observed test output:

[test] passed 3 Vitest shards in 34.97s

Checking formatting...

All matched files use the correct format.

tsgo:core and git diff --check completed with no output and exit code 0.

Merge risk

Why it matters / User impact: HEARTBEAT.md exec tasks that fail should tell the user what the command reported, especially for filesystem or shell errors where the command output is the actionable part.
Risk labels considered: merge-risk:runtime-behavior, merge-risk:low.
Risk explanation: The change only affects embedded-runner tool-error payload formatting when the active trigger is heartbeat; cron timeout handling and ordinary compact tool-error behavior keep their existing conditions.
Why acceptable: The output was already collected in lastToolError.error; this only makes heartbeat exec-like failures eligible to surface that existing detail instead of dropping it.
Fix classification: Root-cause regression fix with a focused regression test.
Maintainer-ready confidence: High for the local payload-formatting path: production function proof, focused regression test, related heartbeat tests, formatting, typecheck, and diff whitespace checks all pass.
Patch quality notes: Small 3-file patch, no public API, config, schema, dependency, channel delivery, or scheduler changes.
Why this is root-cause fix: The missing trigger classification in the payload warning policy caused the real exec error to be omitted; threading isHeartbeatTrigger through that policy fixes the decision point directly.

Root Cause

Root cause: embedded runner payloads only included exec tool error details for verbose mode or cron timeout cases; heartbeat-triggered exec failures were not marked as a detail-eligible trigger, so lastToolError.error was dropped.
Missing detection / guardrail: there was no regression test covering heartbeat-triggered exec-like tool failures with captured stderr/stdout details.

clawsweeper · 2026-06-06T09:45:41Z

Codex review: passed. Reviewed June 8, 2026, 1:47 AM ET / 05:47 UTC.

Summary
The PR threads heartbeat trigger state into embedded-runner payload formatting so heartbeat exec-like failures include captured error details, with a focused regression test.

PR surface: Source +12, Tests +18. Total +30 across 3 files.

Reproducibility: yes. Source inspection shows current main and v2026.6.1 only include raw exec details for verbose mode or timed-out cron cases, so a non-timeout heartbeat exec failure follows the generic warning path; I did not run the wall-clock heartbeat scenario in this read-only review.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Next step before merge

[P2] No repair lane is needed because the automerge-opted PR has no actionable review findings and sufficient proof.

Security
Cleared: The diff only threads a trigger flag through payload formatting and adds a test; it does not change dependencies, secrets, permissions, or command execution.

Review details

Best possible solution:

Land the narrow payload-builder fix after the usual automerge gates, keeping scheduler and channel delivery behavior unchanged.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main and v2026.6.1 only include raw exec details for verbose mode or timed-out cron cases, so a non-timeout heartbeat exec failure follows the generic warning path; I did not run the wall-clock heartbeat scenario in this read-only review.

Is this the best way to solve the issue?

Yes. The payload builder is the narrow decision point for whether collected tool-error details are surfaced, and threading the existing trigger into that boundary is cleaner than changing scheduler or channel delivery code.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a4f0e508dfcb.

Label changes

Label justifications:

P1: The linked regression hides actionable heartbeat exec failure output from users in an agent automation workflow.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (live_output): The PR body supplies after-fix live output from the production payload builder showing the heartbeat exec failure includes the captured error detail.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body supplies after-fix live output from the production payload builder showing the heartbeat exec failure includes the captured error detail.

Evidence reviewed

PR surface:

Source +12, Tests +18. Total +30 across 3 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	2	13	1	+12
Tests	1	18	0	+18
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	3	31	1	+30

What I checked:

Repository policy: Read the full root AGENTS.md plus scoped agent and embedded-runner AGENTS.md files; the review applied the embedded-runner guidance to prefer focused helper coverage for payload assembly. (AGENTS.md:1, a4f0e508dfcb)
Current main behavior: On current main, tool error details are included only for full verbose mode or timed-out exec-like cron/session-key cases; heartbeat trigger state is not available to this policy. (src/agents/embedded-agent-runner/run/payloads.ts:131, a4f0e508dfcb)
PR implementation: The branch adds isHeartbeatTrigger to the tool-error detail policy and returns true for exec-like heartbeat failures before preserving the existing timed-out cron condition. (src/agents/embedded-agent-runner/run/payloads.ts:131, 85c5d6fb9fe7)
Runtime caller: runEmbeddedAgent passes params.trigger === "heartbeat" into buildEmbeddedRunPayloads at the existing payload boundary. (src/agents/embedded-agent-runner/run.ts:2972, 85c5d6fb9fe7)
Regression coverage: The new test covers a heartbeat exec failure with a captured tail error and asserts that the payload includes the actionable error detail. (src/agents/embedded-agent-runner/run/payloads.test.ts:450, 85c5d6fb9fe7)
Current-main merge check: merge-tree against current main shows the same focused three-file patch without conflict or current-main behavior deletion.

Likely related people:

vincentkoc: Current main blame and file history for the payload builder, tool-error summary, mutation classifier, and runEmbeddedAgent caller point to recent commits by Vincent Koc. (role: recent area contributor; confidence: high; commits: 5d7e0b73a7b0, 2e08f0f4221f; files: src/agents/embedded-agent-runner/run/payloads.ts, src/agents/embedded-agent-runner/run.ts, src/agents/tool-error-summary.ts)
Michael Appel: Commit 19a2e9d changed heartbeat exec completion detection and tests, which is adjacent to the reported heartbeat exec output path. (role: adjacent exec-completion contributor; confidence: medium; commits: 19a2e9ddb5a8; files: src/infra/heartbeat-events-filter.ts, src/infra/heartbeat-events-filter.test.ts)
Chinar Amrutkar: Commit 103bebd added HEARTBEAT.md task batching, the user-facing automation surface involved in the linked regression report. (role: heartbeat task feature contributor; confidence: medium; commits: 103bebd651a5; files: src/auto-reply/heartbeat.ts, src/infra/heartbeat-runner.ts)
steipete: Recent heartbeat event-filter and ghost-reminder history includes repeated commits by Peter Steinberger around heartbeat routing and prompt filtering. (role: adjacent heartbeat history contributor; confidence: medium; commits: c1b3a49320, e2362d352d, 77876bd05c; files: src/infra/heartbeat-events-filter.ts, src/infra/heartbeat-runner.ghost-reminder.test.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Takhoffman · 2026-06-08T01:54:38Z

@clawsweeper automerge

clawsweeper · 2026-06-08T01:54:41Z

🦞✅
ClawSweeper merged this PR after the passing review.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=85c5d6fb9fe7f9ba0c224ea8d65ac02f33a08840)
Merge status: merged by ClawSweeper automerge
Merged at: 2026-06-08T05:47:54Z
Merge commit: b2c1de77acc7

What merged:

The PR threads heartbeat trigger state into embedded-runner payload formatting so heartbeat exec-like failures include captured error details, with a focused regression test.
PR surface: Source +12, Tests +18. Total +30 across 3 files.
Reproducibility: yes. Source inspection shows current main and v2026.6.1 only include raw exec details for v ... follows the generic warning path; I did not run the wall-clock heartbeat scenario in this read-only review.

Automerge notes:

PR branch already contained follow-up commit before automerge: fix Regression: Heartbeat exec completion still shows generic fallback text instead of actual output #90452: Regression: Heartbeat exec completion still shows generic…

The automerge loop is complete.

Automerge progress:

2026-06-08 01:55:16 UTC review queued 0e7a0e4cc94a (queued)

2026-06-08 02:03:08 UTC review passed 0e7a0e4cc94a (structured ClawSweeper verdict: pass (sha=0e7a0e4cc94ad1bfc04b873901960fca98509...)

2026-06-08 02:03:28 UTC repair queued 0e7a0e4cc94a (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592

2026-06-08 02:05:00 UTC repair started (running) in 1s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 automerge-openclaw-openclaw-90897

2026-06-08 02:05:23 UTC validation plan (passed) in 23s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 pnpm check:changed; pnpm lint; pnpm check:test-types

2026-06-08 02:05:40 UTC Codex write preflight (passed) in 41s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 danger-full-access

2026-06-08 02:09:46 UTC Codex edit 1 1134bb4f9c3c (complete) in 4m 46s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 exit 0

2026-06-08 02:13:10 UTC validation and review 1 85c5d6fb9fe7 (base moved) in 8m 10s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 rebased

2026-06-08 02:13:16 UTC repair completed 85c5d6fb9fe7 (branch updated) in 8m 16s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 initial automerge rebase is delegated to Codex repair

2026-06-08 02:13:15 UTC review queued 85c5d6fb9fe7 (after repair)

2026-06-08 02:20:44 UTC automerge wait 85c5d6fb9fe7 (ready) in 15m 45s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 checks and exact-head review are ready

2026-06-08 05:47:41 UTC review passed 85c5d6fb9fe7 (structured ClawSweeper verdict: pass (sha=85c5d6fb9fe7f9ba0c224ea8d65ac02f33a08...)

2026-06-08 02:20:46 UTC repair finished 85c5d6fb9fe7 (pushed) in 15m 46s Run: https://github.com/openclaw/clawsweeper/actions/runs/27111936592 repair_contributor_branch

2026-06-08 05:38:19 UTC review queued 85c5d6fb9fe7 (queued)

2026-06-08 05:47:56 UTC merged 85c5d6fb9fe7 (merged by ClawSweeper automerge)

… generic fallback text instead of actual output

… generic fallback text instead of actual output (openclaw#90897) Summary: - The PR threads heartbeat trigger state into embedded-runner payload formatting so heartbeat exec-like failures include captured error details, with a focused regression test. - PR surface: Source +12, Tests +18. Total +30 across 3 files. - Reproducibility: yes. Source inspection shows current main and v2026.6.1 only include raw exec details for v ... follows the generic warning path; I did not run the wall-clock heartbeat scenario in this read-only review. Automerge notes: - PR branch already contained follow-up commit before automerge: fix openclaw#90452: Regression: Heartbeat exec completion still shows generic… Validation: - ClawSweeper review passed for head 85c5d6f. - Required merge gates passed before the squash merge. Prepared head SHA: 85c5d6f Review: openclaw#90897 (comment) Co-authored-by: 杨浩宇0668001029 <yang.haoyu@xydigit.com> Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com> Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com> Approved-by: takhoffman Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>

openclaw-barnacle Bot added agents Agent runtime and tooling size: XS proof: supplied External PR includes structured after-fix real behavior proof. labels Jun 6, 2026

mushuiyu886 and others added 2 commits June 8, 2026 02:13

fix(agents): surface heartbeat exec error details

b4d1d8d

fix openclaw#90452: Regression: Heartbeat exec completion still shows…

85c5d6f

… generic fallback text instead of actual output

clawsweeper Bot force-pushed the feat/issue-90452 branch from 0e7a0e4 to 85c5d6f Compare June 8, 2026 02:13

clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels Jun 8, 2026

clawsweeper Bot merged commit b2c1de7 into openclaw:main Jun 8, 2026
167 of 169 checks passed

github-actions Bot mentioned this pull request Jun 8, 2026

📡 Upstream Digest — 2026-06-08 08:46 UTC curtismercier/openclaw-mods#1037

Open

clawsweeper Bot mentioned this pull request Jun 8, 2026

fix(reply-queue): remove the drained item by reference instead of front index #91450

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix #90452: Regression: Heartbeat exec completion still shows generic fallback text instead of actual output#90897

fix #90452: Regression: Heartbeat exec completion still shows generic fallback text instead of actual output#90897
clawsweeper[bot] merged 2 commits into
openclaw:mainfrom
mushuiyu886:feat/issue-90452

mushuiyu886 commented Jun 6, 2026

Uh oh!

clawsweeper Bot commented Jun 6, 2026 •

edited

Loading

Uh oh!

Takhoffman commented Jun 8, 2026

Uh oh!

clawsweeper Bot commented Jun 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

mushuiyu886 commented Jun 6, 2026

Summary

Real behavior proof

Regression Test Plan

Merge risk

Root Cause

Uh oh!

clawsweeper Bot commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Takhoffman commented Jun 8, 2026

Uh oh!

clawsweeper Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

clawsweeper Bot commented Jun 6, 2026 •

edited

Loading

clawsweeper Bot commented Jun 8, 2026 •

edited

Loading