fix(heartbeat): suppress stale final replay #81313

Merged
steipete merged 6 commits into openclaw:main from kesslerio:fix/heartbeat-stale-replay
May 27, 2026

@kesslerio kesslerio commented May 13, 2026

Contributor

What

  • Prevent heartbeat turns from directly replaying durable pending final text.
  • Keep ack-only pending heartbeat state cleanup intact.
  • Skip heartbeat runs when the base or isolated heartbeat session still has an active reply run.
  • Simplify duplicate heartbeat skip/event handling with a local helper.
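The active-run skip and the shared skip helper described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/infra/heartbeat-runner.ts: only the name isReplyRunActive comes from this PR, its signature here is assumed, and skipHeartbeat, runHeartbeatSketch, the reason strings, and the isolated session key are hypothetical.

```typescript
// Hypothetical result shapes; the real heartbeat result type is not shown in this PR.
type HeartbeatSkip = { status: "skipped"; reason: string };
type HeartbeatRan = { status: "ran" };
type HeartbeatResult = HeartbeatSkip | HeartbeatRan;

// Stand-in for the reply-run registry. The real isReplyRunActive lives in the
// repository; taking a single session key is an assumption made for this sketch.
const activeReplyRuns = new Set<string>();
function isReplyRunActive(sessionKey: string): boolean {
  return activeReplyRuns.has(sessionKey);
}

// Hypothetical local helper: every skip path builds its result in one place.
function skipHeartbeat(reason: string): HeartbeatSkip {
  return { status: "skipped", reason };
}

// Skip when either the base session or the isolated heartbeat session
// still has an active reply run; otherwise let the heartbeat proceed.
function runHeartbeatSketch(baseKey: string, isolatedKey?: string): HeartbeatResult {
  if (isReplyRunActive(baseKey)) {
    return skipHeartbeat("base-reply-run-active");
  }
  if (isolatedKey !== undefined && isReplyRunActive(isolatedKey)) {
    return skipHeartbeat("isolated-reply-run-active");
  }
  return { status: "ran" };
}
```

Checking the isolated key before creating the isolated run is what keeps a heartbeat from replacing a session that is still replying.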

Why

A stale pending final answer can belong to a prior user task and route, while a later heartbeat may deliver to the agent's default heartbeat target. Returning that pending text from the heartbeat reply path can leak a previous answer into the wrong channel.

This keeps durable pending final recovery available for paths that opt into it, but makes heartbeat runs suppress direct pending-final replay.
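A rough sketch of that decision, under stated assumptions: the field names pendingFinalDelivery, pendingFinalDeliveryText, pendingFinalDeliveryAttemptCount, isHeartbeat, and suppressPendingFinalDeliveryReplay appear in this PR, but the function resolvePendingFinal, the state shape, and the definition of "ack-only" as text equal to HEARTBEAT_OK are hypothetical and may not match src/auto-reply/reply/get-reply.ts.

```typescript
// Assumed session-state shape, using the field names from the PR transcript.
interface SessionState {
  pendingFinalDelivery?: boolean;
  pendingFinalDeliveryText?: string;
  pendingFinalDeliveryAttemptCount?: number;
}

interface ReplyOptions {
  isHeartbeat?: boolean;
  suppressPendingFinalDeliveryReplay?: boolean;
}

// Assumption: an "ack-only" pending final is just the heartbeat ack token.
const HEARTBEAT_ACK = "HEARTBEAT_OK";
function isAckOnly(text: string | undefined): boolean {
  return text !== undefined && text.trim() === HEARTBEAT_ACK;
}

// Returns text to replay directly, or undefined to continue into the normal reply path.
function resolvePendingFinal(state: SessionState, opts: ReplyOptions): string | undefined {
  if (!state.pendingFinalDelivery) return undefined;

  if (opts.isHeartbeat && isAckOnly(state.pendingFinalDeliveryText)) {
    // Ack-only cleanup stays intact: nothing user-visible is lost by clearing it.
    state.pendingFinalDelivery = false;
    delete state.pendingFinalDeliveryText;
    return undefined;
  }

  if (opts.suppressPendingFinalDeliveryReplay) {
    // Leave the durable state untouched so another recovery path can still handle it.
    return undefined;
  }

  // Paths that do not suppress replay record the attempt and return the text.
  state.pendingFinalDeliveryAttemptCount = (state.pendingFinalDeliveryAttemptCount ?? 0) + 1;
  return state.pendingFinalDeliveryText;
}
```

The point of the suppress branch is that it neither returns the text nor bumps the attempt count, so the pending final remains available to a later, correctly routed delivery.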

Related to #74257.

Evidence

Live reproduction from a Telegram deployment:

  1. A user task was run in one Telegram topic and produced a long final answer there.
  2. In the same minute, the same OpenClaw agent emitted a shortened/stale version of that prior final answer in a different Telegram topic.
  3. The leaked message content matched the previous task result, but it was delivered through the heartbeat/default route rather than the original task route.
  4. This is consistent with the heartbeat path later picking up pendingFinalDeliveryText from session state and returning it as the heartbeat reply.

The reproduction is the same failure class as #74257: asynchronous/heartbeat follow-up delivery can surface prior task output in the wrong Telegram context.

Real behavior proof

  • Behavior or issue addressed: Heartbeat/default-route delivery must not replay a non-ack pending final answer that was created by a previous user task in another Telegram context.

  • Real environment tested: Local OpenClaw checkout for PR #81313 at head b2a620dd5044e975aba215edd2edbbb0437814b1, running under Node on Linux. The live proof used @kesslerClawBot in the real Telegram staging forum group -1003908474243, #General topic 1, with a temporary isolated session store so production state was not modified.

  • Exact steps or command run after this patch: Ran a redacted runtime proof script from /tmp/openclaw-pr81313 that seeded a heartbeat session with a unique non-ack pending final marker, routed heartbeat delivery to the staging Telegram topic, invoked runHeartbeatOnce, and used the real Telegram sender through the heartbeat outbound adapter. The model reply was stubbed to HEARTBEAT_OK so the proof focused on stale pending-final suppression and actual Telegram delivery behavior.

  • Evidence after fix: Redacted terminal output from the post-patch branch; public proof comment on #81313.

    head=b2a620dd5044e975aba215edd2edbbb0437814b1
    sessionKey=agent:main:main
    chat=-1003908474243
    topic=1
    staleMarker=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
    before.pendingFinalDelivery=true
    before.pendingFinalDeliveryText=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
    getReplyFromConfig.isHeartbeat=true
    getReplyFromConfig.suppressPendingFinalDeliveryReplay=true
    [telegram] outbound send ok accountId=default chatId=-1003908474243 messageId=18 operation=sendMessage deliveryKind=text chunkCount=1
    heartbeatResult={"status":"ran","durationMs":891}
    sent[0].to=-1003908474243
    sent[0].text=HEARTBEAT_OK
    sent[0].messageThreadId=1
    sent[0].messageId=18
    sentCount=1
    staleMarkerSent=false
    after.pendingFinalDelivery=true
    after.pendingFinalDeliveryText=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
    after.pendingFinalDeliveryAttemptCount=0
    
  • Observed result after fix: The heartbeat path passed suppressPendingFinalDeliveryReplay=true, sent only HEARTBEAT_OK through the real Telegram transport, and did not send the unique stale pending-final marker. The marker remained in durable pending state with attempt count 0, so heartbeat did not consume or replay it.

  • What was not tested: I did not deploy this patch to the private production Telegram gateway before opening the PR; the original live reproduction contains private customer data and is summarized rather than pasted. The live staging proof uses a stubbed heartbeat model reply to keep the proof deterministic while still exercising the heartbeat runner, pending-final suppression flag, outbound adapter, and real Telegram send.

  • Proof limitations or environment constraints: The proof uses a temporary isolated session store and a deterministic heartbeat reply stub, so it proves the stale-final suppression flag, heartbeat runner path, outbound adapter, and real Telegram send without modifying production state.
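The shape of that check can be approximated without Telegram. The sketch below swaps the real Telegram sender for an in-memory recorder, so it illustrates only the assertion logic, not the transport; RecordingSender, SentMessage, and checkNoStaleReplay are hypothetical names, and the message fields mirror the sent[0] keys in the transcript.

```typescript
// Message shape assumed from the sent[0].to / text / messageThreadId transcript keys.
interface SentMessage {
  to: string;
  text: string;
  messageThreadId?: number;
}

// In-memory stand-in for the outbound sender; the real proof used the live Telegram sender.
class RecordingSender {
  readonly sent: SentMessage[] = [];
  send(message: SentMessage): void {
    this.sent.push(message);
  }
}

// Summarizes what went out and whether any message carried the stale marker.
function checkNoStaleReplay(
  sent: SentMessage[],
  staleMarker: string,
): { sentCount: number; staleMarkerSent: boolean } {
  return {
    sentCount: sent.length,
    staleMarkerSent: sent.some((message) => message.text.includes(staleMarker)),
  };
}
```

Using a unique, timestamped marker as the seeded pending text is what makes staleMarkerSent unambiguous: any occurrence in outbound text can only have come from a replay.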

Tested against

  • Upstream base: 1eb27da
  • Branch head: kesslerio:fix/heartbeat-stale-replay@b2a620d

Tests

  • pnpm test src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
  • pnpm exec oxfmt --check --threads=1 src/auto-reply/get-reply-options.types.ts src/auto-reply/reply/get-reply.ts src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts src/infra/heartbeat-runner.test-utils.ts
  • pnpm check:changed --base upstream/main
  • pnpm lint --threads=8
  • pnpm run lint:extensions:bundled
  • pnpm check:test-types

Notes

  • Fork-local CI had a clean code/test signal for the affected shards after retargeting to a clean upstream snapshot base.
  • A repeated fork-local CI timeout occurred in unrelated test/scripts/prompt-snapshots.test.ts during build-artifacts; this PR does not touch prompt snapshot code or fixtures.

AI Assistance

Implemented with AI assistance. Reviewed with focused correctness, test-adequacy, and ce-simplify subagent passes; one isolated-heartbeat guard finding and one duplicate skip-return simplification were fixed before opening this PR.

@openclaw-barnacle openclaw-barnacle Bot added size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context. labels May 13, 2026
@clawsweeper

clawsweeper Bot commented May 13, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed May 27, 2026, 7:06 PM ET / 23:06 UTC.

Summary
The branch removes heartbeat direct replay of non-ack pending final text, keeps ack cleanup, adds an isolated active-run guard, and updates focused heartbeat/reply tests.

PR surface: Source -20, Tests +104. Total +84 across 5 files.

Reproducibility: yes. Current main has a source-clear path where heartbeat returns non-ack pending final text, and the PR includes sanitized live incident context plus after-fix Telegram terminal proof; I did not rerun the live Telegram scenario.

Review metrics: 1 noteworthy metric.

  • Heartbeat replay behavior: 1 direct replay path removed; 0 new config knobs. This is an unconditional message-delivery behavior change, so maintainers should consciously accept the recovery tradeoff before merge.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Mantis proof suggestion
A live Telegram transcript would materially confirm the message-delivery invariant that stale pending finals are not emitted to the heartbeat target. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live QA: verify a heartbeat with pending stale final text sends only the heartbeat result and does not replay prior topic text.

Risk before merge

  • Suppressing heartbeat replay fixes wrong-target leakage, but it can leave a non-ack pending final durable until a non-heartbeat or future route-aware recovery path handles it.
  • The inspected PR head still had required checks in progress, so merge should wait for normal CI completion.

Maintainer options:

  1. Merge after delivery acceptance (recommended)
    Once required checks pass, maintainers can accept that heartbeats suppress non-ack pending-final replay and rely on non-heartbeat or future route-aware recovery for durable final delivery.
  2. Require route-aware recovery first
    If heartbeat recovery must preserve pending finals, require the PR to deliver through the captured pendingFinalDeliveryContext instead of the heartbeat default target.
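Option 2 is not implemented in this PR. As a hedged illustration of what route-aware recovery could look like, the sketch below delivers a pending final only to its captured context and otherwise holds it. The field name pendingFinalDeliveryContext comes from the review text; its shape (channel, to, messageThreadId) and the function resolveRecovery are assumptions.

```typescript
// Assumed shape of a captured delivery route; the real context type is not shown here.
interface DeliveryTarget {
  channel: string;
  to: string;
  messageThreadId?: number;
}

interface PendingFinal {
  pendingFinalDeliveryText: string;
  pendingFinalDeliveryContext?: DeliveryTarget;
}

type RecoveryDecision =
  | { kind: "deliver"; target: DeliveryTarget; text: string }
  | { kind: "hold" };

// Deliver only to the route captured with the original task.
// Never fall back to the heartbeat default target: that fallback is the leak.
function resolveRecovery(pending: PendingFinal): RecoveryDecision {
  const context = pending.pendingFinalDeliveryContext;
  if (context === undefined) {
    return { kind: "hold" };
  }
  return { kind: "deliver", target: context, text: pending.pendingFinalDeliveryText };
}
```

Holding when no context was captured trades delayed delivery for the guarantee that text never reaches a channel it was not written for.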

Next step before merge
No repair lane is needed because no actionable code finding remains; maintainers need to accept the heartbeat delivery tradeoff and wait for required checks.

Security
Cleared: The diff only changes heartbeat/reply runtime logic and tests; I found no dependency, workflow, secret, or supply-chain regression in the patch.

Review details

Best possible solution:

Land this narrow suppression only if maintainers accept that heartbeats should never directly replay pending user finals; otherwise add route-aware recovery through the captured delivery context.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a source-clear path where heartbeat returns non-ack pending final text, and the PR includes sanitized live incident context plus after-fix Telegram terminal proof; I did not rerun the live Telegram scenario.

Is this the best way to solve the issue?

Yes. Removing direct heartbeat replay is the narrowest fix for wrong-target leakage, with route-aware durable recovery as the safer long-term alternative if maintainers need heartbeat recovery to deliver pending finals.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against da5fe990d8b3.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR supplies after-fix terminal proof from a real Telegram staging send showing the stale marker was not delivered and only the heartbeat result went out.

Label justifications:

  • P1: The PR targets a high-priority user-visible channel bug where stale final text can be replayed into the wrong Telegram context.
  • merge-risk: 🚨 message-delivery: Merging changes how heartbeat recovers pending final text and could suppress a pending final unless another recovery path handles it.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR supplies after-fix terminal proof from a real Telegram staging send showing the stale marker was not delivered and only the heartbeat result went out.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR supplies after-fix terminal proof from a real Telegram staging send showing the stale marker was not delivered and only the heartbeat result went out.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes visible Telegram heartbeat delivery behavior, so a short Telegram proof can directly show stale text is not replayed.
Evidence reviewed

PR surface:

Source -20, Tests +104. Total +84 across 5 files.

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 3 | 13 | 33 | -20 |
| Tests | 2 | 128 | 24 | +104 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 5 | 141 | 57 | +84 |

What I checked:

  • Current main replay behavior: Current main still returns non-ack pending final text directly from the heartbeat reply path after updating replay attempt fields, which is the behavior this PR targets. (src/auto-reply/reply/get-reply.ts:496, da5fe990d8b3)
  • PR suppresses direct heartbeat replay: At PR head, the non-ack replay branch is removed; heartbeat only clears ack-only pending state and otherwise continues into the normal heartbeat reply path. (src/auto-reply/reply/get-reply.ts:499, e1d6f32aef81)
  • Isolated heartbeat active-run guard: The PR adds an isReplyRunActive check for the isolated heartbeat session key before creating the isolated heartbeat run. (src/infra/heartbeat-runner.ts:1555, e1d6f32aef81)
  • Focused regression coverage: The PR updates get-reply fast-path tests for non-ack pending finals and adds heartbeat-runner cases for active reply runs, isolated active runs, and stale pending final suppression. (src/infra/heartbeat-runner.skips-busy-session-lane.test.ts:327, e1d6f32aef81)
  • Real Telegram proof supplied: The PR body and proof comment report a staging Telegram run where a seeded stale marker was not sent, only the heartbeat result was sent, and the pending final remained durable with no replay attempt. (e1d6f32aef81)
  • Telegram review standard applied: The Telegram maintainer note requires real Telegram proof for transport, topic, reply-context, and visible behavior changes; this shaped the proof assessment. (.agents/maintainer-notes/telegram.md:37, da5fe990d8b3)

Likely related people:

  • steipete: Local blame attributes the current pending-final replay and heartbeat deferral blocks in the shallow checkout to recent main history, GitHub path history shows recent session/reply work by this handle, and the live PR is assigned to this reviewer. (role: recent area contributor and assigned reviewer; confidence: high; commits: c71c49c46028, 15b1e99df310, e1d6f32aef81; files: src/auto-reply/reply/get-reply.ts, src/infra/heartbeat-runner.ts, src/infra/heartbeat-runner.skips-busy-session-lane.test.ts)
  • EronFan: The related active-reply heartbeat guard PR credits this handle for the original fix and regression test class that this PR preserves and extends. (role: adjacent source-fix author; confidence: medium; commits: dd5bd5d0357b; files: src/infra/heartbeat-runner.ts, src/infra/heartbeat-runner.skips-busy-session-lane.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@kesslerio

Contributor Author

Added the live Telegram reproduction summary to the PR body and linked this to #74257. I kept the reproduction sanitized because the original live trace included private deployment/customer data, but the important shape is captured: one topic gets the intended task reply, then a heartbeat/default route re-emits stale prior task output into another Telegram context in the same minute.

@kesslerio

kesslerio commented May 26, 2026

Contributor Author

I rechecked this PR against current origin/main on 2026-05-26. The branch still has unique patch content, but GitHub currently reports it as merge-conflicted, so a rebase/refresh is likely needed before maintainer review.

@clawsweeper re-review to refresh the current author-facing status and blockers, please.

@clawsweeper

clawsweeper Bot commented May 26, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

@kesslerio kesslerio requested review from a team as code owners May 26, 2026 19:27
@kesslerio

kesslerio commented May 26, 2026

Contributor Author

I have refreshed this branch against the current origin/main with a merge commit, resolving the PR conflicts without rewriting the branch history.

I have retained both sides of the conflicted heartbeat changes: the current main's pending-final sanitization coverage and this PR's stale final replay suppression. Additionally, I have preserved the newer reply-run registry skip path and wired the isolated heartbeat active-run guard through the current isReplyRunActive path.

I have verified the changes at head fb7eaba67a0ccb08feb6b483a821c9c4ae9abfdd using the following commands:

pnpm test src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
pnpm exec oxfmt --check --threads=1 src/auto-reply/get-reply-options.types.ts src/auto-reply/reply/get-reply.fast-path.test.ts src/auto-reply/reply/get-reply.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts src/infra/heartbeat-runner.test-utils.ts src/infra/heartbeat-runner.ts

Please review the changes.

@clawsweeper

@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord channel: nostr Channel integration: nostr app: web-ui App: web-ui extensions: diagnostics-otel Extension: diagnostics-otel extensions: memory-core Extension: memory-core cli CLI command changes scripts Repository scripts commands Command implementations docker Docker and sandbox tooling agents Agent runtime and tooling labels May 26, 2026
@kesslerio

kesslerio commented May 26, 2026

Contributor Author

The branch has been narrowed against the current origin/main and the unrelated generated/browser/runtime churn flagged by ClawSweeper has been removed. The PR diff is now limited to the heartbeat replay suppression and its focused tests.

Head now: 78e770c896334a7a75e42dcbf697d6b8c5870df1

Current PR diff files:

  • src/auto-reply/get-reply-options.types.ts
  • src/auto-reply/reply/get-reply.fast-path.test.ts
  • src/auto-reply/reply/get-reply.ts
  • src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
  • src/infra/heartbeat-runner.test-utils.ts
  • src/infra/heartbeat-runner.ts

Validation run locally:

/tmp/openclaw-pr76920-merge/node_modules/.bin/oxfmt --check --threads=1 \
  src/auto-reply/get-reply-options.types.ts \
  src/auto-reply/reply/get-reply.fast-path.test.ts \
  src/auto-reply/reply/get-reply.ts \
  src/infra/heartbeat-runner.skips-busy-session-lane.test.ts \
  src/infra/heartbeat-runner.test-utils.ts \
  src/infra/heartbeat-runner.ts
node scripts/test-projects.mjs \
  src/auto-reply/reply/get-reply.fast-path.test.ts \
  src/infra/heartbeat-runner.skips-busy-session-lane.test.ts

Results: formatter passed; focused Vitest passed 2 shards / 29 tests.

@clawsweeper re-review

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@kesslerio

Contributor Author

Updated this PR to address the latest ClawSweeper finding.

Head: c2443d2090a7bb46cdebd7e18e6c928b83868c5c7

What changed:

  • Restored the active-run compatibility coverage that ClawSweeper called out in src/infra/heartbeat-runner.skips-busy-session-lane.test.ts.
  • Re-added the same-agent scheduled heartbeat deferral case.
  • Re-added the unscoped active-key ignored case.
  • Re-added the immediate heartbeat wake bypass case.
  • Runtime code unchanged; this is test coverage only.
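One possible reading of the three restored cases, written as a decision function. This is an interpretation, not the tested behavior: the names shouldDeferHeartbeat and agentIdFromKey, the "scheduled"/"immediate" wake kinds, and the rule that a key is agent-scoped when it has the form agent:<id>:... (as in the sessionKey agent:main:main from the proof transcript) are all assumptions.

```typescript
type WakeKind = "scheduled" | "immediate";

// Assumption: scoped keys look like "agent:<agentId>:...". Anything else is unscoped.
function agentIdFromKey(key: string): string | undefined {
  const parts = key.split(":");
  return parts.length >= 2 && parts[0] === "agent" ? parts[1] : undefined;
}

// Hypothetical deferral rule covering the three restored cases.
function shouldDeferHeartbeat(agentId: string, wake: WakeKind, activeKeys: string[]): boolean {
  // Immediate heartbeat wakes bypass deferral entirely.
  if (wake === "immediate") return false;
  // Scheduled heartbeats defer only for active runs scoped to the same agent;
  // unscoped keys yield undefined and therefore never match.
  return activeKeys.some((key) => agentIdFromKey(key) === agentId);
}
```

For example, under these assumptions a scheduled heartbeat for agent main defers while agent:main:main is active, but not while only an unscoped key or another agent's key is active.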

Reviewer/bot feedback addressed:

  • ClawSweeper P3: “Restore the active-run compatibility cases”.

Proof:

  • public artifact or excerpt: this PR diff now includes the restored tests directly in src/infra/heartbeat-runner.skips-busy-session-lane.test.ts.
  • raw local audit source, if any: none needed.
  • proof limitation, if any: local Vitest and tsgo commands timed out during runner startup in the temporary worktree before producing assertion output; CI is the authoritative execution proof for this pushed head.

Validation:

pnpm exec oxfmt --write src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
pnpm exec oxfmt --check --threads=1 src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
git diff --check

timeout -k5s 90s pnpm vitest src/infra/heartbeat-runner.skips-busy-session-lane.test.ts -t "active reply run|immediate heartbeat wakes|unscoped active reply"
timeout -k5s 120s node scripts/run-vitest.mjs src/infra/heartbeat-runner.skips-busy-session-lane.test.ts -t "active reply run|immediate heartbeat wakes|unscoped active reply"
timeout -k5s 30s node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --noEmit --pretty false

Result:

  • formatter write/check passed
  • git diff --check passed
  • focused Vitest timed out during local runner startup with Rolldown timing warnings only
  • scripts/run-vitest.mjs timed out without test output
  • run-tsgo.mjs timed out before output
  • CI is running for the pushed head

Current state:

  • conflicts: none known before this push; CI will verify current merge state
  • CI: running
  • proof: supplied / sufficient from prior review; this update is a test-only feedback repair
  • rating target: platinum minimum, diamond preferred
  • current rating before this push: platinum hermit, but durable verdict still had needs-changes because of the restored-test finding
  • remaining blocker, if any: CI/re-review result

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 27, 2026
@kesslerio

Contributor Author

Refreshed this branch against current origin/main with a merge commit: b2a620dd5044e975aba215edd2edbbb0437814b1.

This was a CI cleanup refresh, not a behavior expansion. The effective PR diff is still limited to the six heartbeat/reply files, and the stale CI failures are addressed by carrying current main forward:

  • the previous Number.NaN oxlint failures are gone
  • the previous image-tool.test.ts inboundRoots type error is gone
  • GitHub should rerun the failed CI shards on the refreshed head

Validation run locally on the refreshed head:

pnpm exec oxfmt --check --threads=1 src/auto-reply/get-reply-options.types.ts src/auto-reply/reply/get-reply.fast-path.test.ts src/auto-reply/reply/get-reply.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts src/infra/heartbeat-runner.test-utils.ts src/infra/heartbeat-runner.ts
pnpm lint --threads=8
pnpm run lint:extensions:bundled
pnpm check:test-types
pnpm test src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts

All passed on b2a620dd5044e975aba215edd2edbbb0437814b1.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 27, 2026
@kesslerio

Contributor Author

Live Telegram heartbeat stale-final proof, head b2a620dd5044e975aba215edd2edbbb0437814b1

I added fresh staging proof for the heartbeat stale-final case using @kesslerClawBot in the real Telegram staging forum group.

Environment:

OpenClaw PR head: b2a620dd5044e975aba215edd2edbbb0437814b1
Gateway/proof worktree: /tmp/openclaw-pr81313
Staging group: -1003908474243
#General topic: 1
Bot: @kesslerClawBot

Proof shape: seed a temporary session store with a non-ack pending final marker, run runHeartbeatOnce, route heartbeat delivery to the staging Telegram topic, stub the heartbeat model reply to HEARTBEAT_OK, and use the real Telegram sender through the heartbeat outbound adapter.

Observed transcript:

head=b2a620dd5044e975aba215edd2edbbb0437814b1
sessionKey=agent:main:main
chat=-1003908474243
topic=1
staleMarker=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
before.pendingFinalDelivery=true
before.pendingFinalDeliveryText=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
getReplyFromConfig.isHeartbeat=true
getReplyFromConfig.suppressPendingFinalDeliveryReplay=true
[telegram] outbound send ok accountId=default chatId=-1003908474243 messageId=18 operation=sendMessage deliveryKind=text chunkCount=1
heartbeatResult={"status":"ran","durationMs":891}
sent[0].to=-1003908474243
sent[0].text=HEARTBEAT_OK
sent[0].messageThreadId=1
sent[0].messageId=18
sentCount=1
staleMarkerSent=false
after.pendingFinalDelivery=true
after.pendingFinalDeliveryText=PR81313_STALE_FINAL_MUST_NOT_SEND_1779920060790
after.pendingFinalDeliveryAttemptCount=0

What this proves:

  • Heartbeat called the reply path with suppressPendingFinalDeliveryReplay=true.
  • The only live Telegram send was HEARTBEAT_OK to the staging topic.
  • The stale pending-final marker was not sent.
  • The pending final remained durable with attempt count 0, so heartbeat did not consume or replay it.
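These four claims can be checked mechanically against a transcript in the key=value format shown above. The sketch below is one way to do that; parseTranscript and provesSuppression are hypothetical helpers written for illustration and are not part of the proof script used in this PR.

```typescript
// Parses "key=value" lines into a map, splitting on the first "=" only.
// Bracketed log lines such as "[telegram] outbound send ok ..." are skipped.
function parseTranscript(transcript: string): Map<string, string> {
  const fields = new Map<string, string>();
  for (const line of transcript.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("[")) continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue;
    fields.set(trimmed.slice(0, eq), trimmed.slice(eq + 1));
  }
  return fields;
}

// True only when all four listed claims hold for the parsed transcript.
function provesSuppression(fields: Map<string, string>): boolean {
  const marker = fields.get("staleMarker");
  return (
    marker !== undefined &&
    // 1. Heartbeat called the reply path with suppression on.
    fields.get("getReplyFromConfig.suppressPendingFinalDeliveryReplay") === "true" &&
    // 2. The only send was the heartbeat ack.
    fields.get("sentCount") === "1" &&
    fields.get("sent[0].text") === "HEARTBEAT_OK" &&
    // 3. The stale marker was not sent.
    fields.get("staleMarkerSent") === "false" &&
    // 4. The pending final stayed durable and was not attempted.
    fields.get("after.pendingFinalDelivery") === "true" &&
    fields.get("after.pendingFinalDeliveryText") === marker &&
    fields.get("after.pendingFinalDeliveryAttemptCount") === "0"
  );
}
```

Splitting on the first "=" keeps JSON-valued fields such as heartbeatResult intact as a single value.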

I also refreshed the PR body proof section with this transcript.

@kesslerio

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

@steipete

Contributor

Landing proof for head e1d6f32:

Behavior addressed: heartbeat runs no longer directly replay non-ack durable pending final text; ack-only heartbeat pending state still clears, and isolated heartbeat runs now skip when their reply run is active.

Real environment tested: local OpenClaw checkout on the PR branch plus GitHub CI for the PR head.

Exact steps or command run after this patch:

  • node scripts/run-vitest.mjs src/auto-reply/reply/get-reply.fast-path.test.ts src/infra/heartbeat-runner.skips-busy-session-lane.test.ts
  • git diff --check
  • /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local
  • /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main

Evidence after fix:

  • Local focused Vitest: 2 files, 31 tests passed.
  • Local diff whitespace check: clean.
  • Autoreview local: clean, no accepted/actionable findings.
  • Autoreview branch vs origin/main: clean, no accepted/actionable findings.
  • GitHub CI run 26543804437: success on head e1d6f32.
  • CodeQL run 26543804438: success.
  • CodeQL Critical Quality run 26543804441: success.
  • OpenGrep PR Diff run 26543804440 rerun job 78197443511: success.
  • Real behavior proof run 26544027357: success.

Observed result after fix: stale non-ack pending final text remains durable but is not returned as the heartbeat reply; heartbeat ack-only pending delivery is still cleared; heartbeat runner passes only normal heartbeat options and avoids replacing an active isolated heartbeat session.

What was not tested: no new live production Telegram incident replay was run during landing; the PR body retains the existing sanitized live/staging proof.

Labels

  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof.
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • proof: supplied: External PR includes structured after-fix real behavior proof.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • size: S
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • triage: refactor-only: Candidate: refactor/cleanup-only PR without maintainer context.

2 participants