Fix restart sentinel internal continuations #88161

Merged
joshavant merged 8 commits into main from fix/restart-sentinel-internal-continuation on May 30, 2026

joshavant commented May 29, 2026
Contributor

Summary

Fixes #87792.

This makes restart-sentinel continuations internal by default instead of letting synthetic restart turns auto-deliver final assistant output back to the originating channel. It also preserves current channel/thread/message context for CLI, MCP, and queued follow-up paths so an explicit post-restart continuation can still speak only when it deliberately uses the message tool.
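
As a rough illustration of what "preserving channel/thread/message context" can look like when a turn is handed to a spawned CLI or MCP process, the sketch below round-trips a route through environment variables. The `RouteContext` shape and the `EXAMPLE_*` variable names are hypothetical and are not taken from this PR.

```typescript
// Hypothetical sketch: the env variable names and the RouteContext shape are
// illustrative assumptions, not the identifiers used in this PR.
interface RouteContext {
  channel?: string | undefined;
  threadId?: string | undefined;
  messageId?: string | undefined;
}

// Serialize the current route so a spawned CLI/MCP process can see it.
// Only values that are actually known are written.
function toContextEnv(ctx: RouteContext): Record<string, string> {
  const env: Record<string, string> = {};
  if (ctx.channel) env["EXAMPLE_CURRENT_CHANNEL"] = ctx.channel;
  if (ctx.threadId) env["EXAMPLE_CURRENT_THREAD_ID"] = ctx.threadId;
  if (ctx.messageId) env["EXAMPLE_CURRENT_MESSAGE_ID"] = ctx.messageId;
  return env;
}

// Recover the route on the receiving side; missing values stay undefined.
function fromContextEnv(env: Record<string, string | undefined>): RouteContext {
  return {
    channel: env["EXAMPLE_CURRENT_CHANNEL"],
    threadId: env["EXAMPLE_CURRENT_THREAD_ID"],
    messageId: env["EXAMPLE_CURRENT_MESSAGE_ID"],
  };
}
```

With the route available on the receiving side, an explicit message-tool call can target the originating conversation without any automatic final delivery being involved.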

Verification

  • git diff --check
  • .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main: autoreview clean: no accepted/actionable findings reported
  • pnpm build
  • node scripts/run-vitest.mjs src/infra/restart-sentinel.test.ts src/gateway/server-restart-sentinel.test.ts src/agents/tools/gateway-tool.test.ts src/auto-reply/reply/commands-session-restart.test.ts src/gateway/server-methods/update.test.ts src/gateway/mcp-http.test.ts src/agents/cli-runner/prepare.test.ts src/gateway/tool-resolution.test.ts src/agents/cli-runner.spawn.test.ts src/auto-reply/reply/agent-runner-utils.test.ts src/auto-reply/reply/followup-runner.test.ts src/auto-reply/reply/get-reply-run.media-only.test.ts test/scripts/label-open-issues.test.ts test/scripts/gh-read.test.ts src/security/audit-sandbox-browser.test.ts: passed, 23 files / 541 tests
  • Blacksmith Testbox regression proof: provider blacksmith-testbox, focused build + regression suite passed, 23 files / 541 tests
  • Local live WhatsApp proof with Convex-leased whatsapp QA credential and mock-openai provider mode: passed; credential identifier redacted

Real Behavior Proof

Behavior addressed: Restart-sentinel agentTurn continuations no longer auto-deliver plain final assistant output into the originating channel. The restart notice itself still delivers to the live channel route.
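
The behavior described above can be pictured as a filter over outbound candidates. This is a minimal sketch with assumed types: only the `message_tool_only` mode name appears in this PR's review notes, while the `auto` mode, the `OutboundCandidate` shape, and the `deliverable` helper are hypothetical.

```typescript
// Illustrative types only; the real gateway types are not shown on this page.
type DeliveryMode = "auto" | "message_tool_only";

interface OutboundCandidate {
  kind: "restart_notice" | "assistant_final" | "message_tool_send";
  text: string;
}

// Decide which candidates may reach the originating channel.
// Plain final assistant output is dropped in message_tool_only mode;
// the restart notice and explicit message-tool sends still go out.
function deliverable(
  mode: DeliveryMode,
  items: OutboundCandidate[],
): OutboundCandidate[] {
  return items.filter((item) => {
    if (item.kind === "assistant_final") return mode === "auto";
    return true;
  });
}
```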

Real environment tested: Local source checkout with the real WhatsApp transport, a Convex-leased whatsapp QA credential, and mock-openai provider mode. Credential identifiers and live routing details are redacted from this PR body.

Exact steps or command run after this patch: Built the current branch, started the QA WhatsApp driver and SUT gateway, sent a real WhatsApp group seed turn, wrote a restart sentinel with an explicit agentTurn continuation marker, restarted the gateway via restartAfterStateMutation, then watched the live group for the restart notice and for absence of the continuation marker.

Evidence after fix: Local live proof passed with ISSUE_87792_LIVE_FIX_PROOF pass credential=<redacted> seedMs=5182 noticeMs=10379 and ISSUE_87792_LIVE_REGRESSION_REPRO pass continuationMarkerAbsentMs=30000.

Observed result after fix: The seed WhatsApp group turn replied, the restart notice was delivered after restart, and the explicit continuation final marker did not appear in the live WhatsApp group during the regression window.
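
The pass condition of this proof (notice present, continuation marker absent for the whole regression window) can be expressed as a small pure check over observed messages. The sketch below is an assumption about how such a check might be written, not the actual QA driver; all names are hypothetical, and the 30000 ms window in the usage note simply mirrors the `continuationMarkerAbsentMs=30000` value reported above.

```typescript
// Hypothetical shapes for messages seen in the live group during the proof.
interface ObservedMessage {
  atMs: number; // time the message was observed
  text: string;
}

interface ProofResult {
  noticeMs: number | null; // delay from restart to notice, or null if never seen
  continuationMarkerAbsent: boolean;
}

// Look only at messages inside [restartAtMs, restartAtMs + windowMs].
function evaluateRestartProof(
  messages: ObservedMessage[],
  opts: { restartAtMs: number; noticeText: string; marker: string; windowMs: number },
): ProofResult {
  const end = opts.restartAtMs + opts.windowMs;
  const inWindow = messages.filter(
    (m) => m.atMs >= opts.restartAtMs && m.atMs <= end,
  );
  const notice = inWindow.find((m) => m.text.includes(opts.noticeText));
  const leaked = inWindow.some((m) => m.text.includes(opts.marker));
  return {
    noticeMs: notice ? notice.atMs - opts.restartAtMs : null,
    continuationMarkerAbsent: !leaked,
  };
}
```

A run passes when `noticeMs` is not null and `continuationMarkerAbsent` is true for a window of, for example, 30000 ms.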

What was not tested: I did not run a live Telegram or Discord restart-sentinel path. Blacksmith could run the focused regression suite, but the exact custom live proof could not run there because the delegated Testbox path cannot forward local credential env; a workflow-backed WhatsApp live lane was blocked by a logged-out pooled WhatsApp session before scenario execution.

openclaw-barnacle Bot added the labels gateway (Gateway runtime), agents (Agent runtime and tooling), size: M, and maintainer (Maintainer-authored PR) on May 29, 2026

clawsweeper Bot commented May 29, 2026
Contributor

Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 9:59 PM ET / 01:59 UTC.

Summary
The branch makes restart-sentinel agent-turn continuations internal/message-tool-only, removes inferred success continuations, preserves channel/thread/message context for CLI and MCP loopback paths, and updates focused regression tests and prompt snapshots.

PR surface: Source +40, Tests +82. Total +122 across 36 files.

Reproducibility: yes. Source inspection of current main shows session-scoped restarts infer an agentTurn continuation and route final delivery back to the channel; Mantis also captured baseline Telegram behavior for the visible restart-continuation path.

Review metrics: 1 noteworthy metric.

  • Restart Continuation Defaults: 1 inferred default removed; 1 automatic final-delivery path disabled. This is the compatibility-sensitive part of the PR because existing restart/update flows may stop emitting visible success continuations by default.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster ✨ media proof bonus
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
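
The "weaker of the two" rule can be sketched as a minimum over an ordered rank list. The ordering below follows the rank legend later in this comment (with the non-applicable "off-meta tidepool" left out); the function name and the array representation are assumptions, not ClawSweeper's actual code.

```typescript
// Ranks from weakest to strongest, per the legend in this review comment.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is capped by whichever input ranks lower.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality)
    ? proof
    : patchQuality;
}
```

For this PR, `overallRank("diamond lobster", "platinum hermit")` yields `"platinum hermit"`, which matches the Overall line above.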

Rank-up moves:

  • [P2] Get explicit maintainer approval for the internal/message-tool-only restart-continuation semantics before merge.

Risk before merge

  • [P1] Existing restart or update workflows that relied on inferred post-restart success continuations will stop auto-posting unless they provide an explicit continuation and use the message tool for visible output.
  • [P1] The user-visible delivery behavior changes for channel-bound restart continuations: WhatsApp and Telegram have real proof, while Discord remains covered by focused tests rather than live transport proof.

Maintainer options:

  1. Approve Internal-Only Continuations (recommended)
    A maintainer can approve the intentional behavior change and merge with the documented expectation that visible post-restart output must use the message tool.
  2. Preserve A Compatibility Path
    If existing auto-posted success continuations are still a supported contract, require a compatibility-preserving default or explicit opt-in strict mode before merge.
  3. Wait For Broader Transport Proof
    If maintainers need live parity across all channel transports, pause merge until Discord restart-continuation behavior is proven outside focused tests.

Next step before merge

  • [P2] The remaining action is maintainer approval of the protected, compatibility-sensitive delivery semantics; there is no narrow automated repair left.

Security
Cleared: the diff fixes a message-leak class without adding dependencies, workflow permissions, package-resolution changes, or new secret handling; the new MCP context headers remain behind the existing loopback bearer check.
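
To illustrate what "behind the bearer check" means in practice, here is a minimal sketch in which context headers are only read after the bearer token has been validated. The header names (`x-example-*`), the `McpRequestContext` shape, and the function name are hypothetical; the real header names are not shown on this page.

```typescript
// Hypothetical request context carried by MCP loopback headers.
interface McpRequestContext {
  channel: string | undefined;
  threadId: string | undefined;
  messageId: string | undefined;
}

// Returns null for unauthenticated requests so context headers are never
// trusted (or even read) before the bearer check passes.
function readMcpContext(
  headers: Record<string, string | undefined>,
  expectedToken: string,
): McpRequestContext | null {
  // Plain comparison keeps the sketch short; production code would normally
  // use a constant-time comparison for secrets.
  if (headers["authorization"] !== `Bearer ${expectedToken}`) return null;
  return {
    channel: headers["x-example-channel"],
    threadId: headers["x-example-thread-id"],
    messageId: headers["x-example-message-id"],
  };
}
```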

Review details

Best possible solution:

Land the internal/message-tool-only restart-continuation behavior only after a maintainer explicitly approves the compatibility and message-delivery semantics for existing restart/update workflows.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection of current main shows session-scoped restarts infer an agentTurn continuation and route final delivery back to the channel; Mantis also captured baseline Telegram behavior for the visible restart-continuation path.

Is this the best way to solve the issue?

Yes, with maintainer approval. The patch addresses the implicated runtime path directly and keeps explicit visible follow-up available through the message tool, but the compatibility semantics are intentional enough to require human approval.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f870beac85ec.

Label changes

Label justifications:

  • P1: The PR fixes a real channel-bound restart workflow leak where synthetic restart turns can emit unintended outbound messages.
  • merge-risk: 🚨 compatibility: Merging changes the default behavior for existing restart/update continuations that previously inferred and auto-posted success replies.
  • merge-risk: 🚨 message-delivery: Merging changes whether restart-continuation assistant output is delivered automatically or only through explicit message-tool sends.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (recording): The PR body includes live WhatsApp proof with redacted credential details, and Mantis produced inspected Telegram Desktop recordings/contact sheets for the visible restart-continuation path.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes live WhatsApp proof with redacted credential details, and Mantis produced inspected Telegram Desktop recordings/contact sheets for the visible restart-continuation path.
  • proof: 🎥 video: Contributor real behavior proof includes video or recording evidence. The PR body includes live WhatsApp proof with redacted credential details, and Mantis produced inspected Telegram Desktop recordings/contact sheets for the visible restart-continuation path.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes visible Telegram restart-continuation behavior, and the existing Mantis Telegram Desktop proof is the right proof lane for that surface.
Evidence reviewed

PR surface:

Source +40, Tests +82. Total +122 across 36 files.

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 16 | 100 | 60 | +40 |
| Tests | 20 | 233 | 151 | +82 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 36 | 333 | 211 | +122 |

What I checked:

  • Repository policy applied: Root and scoped AGENTS.md guidance was read; root policy treats provider routing, session state, fallback behavior, and message delivery as compatibility-sensitive PR review surfaces. (AGENTS.md:20, f870beac85ec)
  • Current-main provenance for old default continuation: Current main still inferred an agentTurn continuation from any sessionKey via DEFAULT_RESTART_SUCCESS_CONTINUATION_MESSAGE before this PR changes that behavior. (src/infra/restart-sentinel.ts:173, 5f89fbe6699e)
  • Current-main provenance for old routed final delivery: Current main built restart continuation context with ExplicitDeliverRoute true and durable final delivery back to the route, which is the central behavior this PR changes. (src/gateway/server-restart-sentinel.ts:316, 5f89fbe6699e)
  • PR suppresses automatic final delivery: The PR sets ExplicitDeliverRoute false, forces sourceReplyDeliveryMode to message_tool_only, disables durable final delivery, and leaves visible follow-up to the message tool. (src/gateway/server-restart-sentinel.ts:321, dccbe0782bb8)
  • PR removes inferred success continuation: buildRestartSuccessContinuation now returns an agentTurn only for an explicit continuationMessage and otherwise returns null. (src/infra/restart-sentinel.ts:162, dccbe0782bb8)
  • PR preserves message-tool context through CLI/MCP: The branch forwards current channel, thread, and message ids into CLI bundle MCP env and scoped tool resolution so explicit message-tool continuations can target the originating route. (src/agents/cli-runner/prepare.ts:270, dccbe0782bb8)
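
The two central changes in the list above can be sketched together. The names `buildRestartSuccessContinuation`, `agentTurn`, `continuationMessage`, `ExplicitDeliverRoute`, `sourceReplyDeliveryMode`, and `message_tool_only` come from the review notes; everything else (parameter and return shapes, the `durableFinalDelivery` field name, the second helper) is an assumption made for illustration and may differ from the real source.

```typescript
// Assumed input shape; the real parameter type is not shown on this page.
interface RestartParams {
  sessionKey?: string | undefined;
  continuationMessage?: string | undefined;
}

interface AgentTurnContinuation {
  kind: "agentTurn";
  message: string;
}

// After the PR: no continuation is inferred from sessionKey alone.
// An agentTurn is produced only for an explicit continuationMessage.
function buildRestartSuccessContinuation(
  params: RestartParams,
): AgentTurnContinuation | null {
  if (!params.continuationMessage) return null;
  return { kind: "agentTurn", message: params.continuationMessage };
}

// Hypothetical helper showing the delivery flags the review describes:
// no explicit deliver route, message-tool-only replies, no durable final delivery.
function buildContinuationDeliveryContext() {
  return {
    ExplicitDeliverRoute: false,
    sourceReplyDeliveryMode: "message_tool_only" as const,
    durableFinalDelivery: false, // field name is an assumption
  };
}
```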

Likely related people:

  • keshavbotagent: Blame and log point the current restart-sentinel default continuation and routed final delivery code to commit 5f89fbe in the checked-out main history. (role: introduced current restart-sentinel behavior in checkout history; confidence: medium; commits: 5f89fbe6699e; files: src/infra/restart-sentinel.ts, src/gateway/server-restart-sentinel.ts, src/auto-reply/reply/agent-runner-utils.ts)
  • Peter Steinberger: Feature history shows repeated recent work in reply, gateway, CLI loopback, and followup routing seams adjacent to this patch. (role: recent adjacent owner; confidence: medium; commits: 43e6c923de, 3de09fbe74, 54648a9cf1; files: src/auto-reply/reply/followup-runner.ts, src/gateway/tool-resolution.ts, src/gateway/server-restart-sentinel.ts)
  • Gustavo Madeira Santana: Recent history includes restart-sentinel route preservation work that is directly adjacent to this PR's delivery-context changes. (role: recent restart-sentinel route contributor; confidence: medium; commits: 9b44929f28, 9c44f10026; files: src/gateway/server-restart-sentinel.ts, src/infra/restart-sentinel.ts)
  • Josh Avant: Beyond authoring this PR, prior merged history shows relevant reply-run channel/account SecretRef work on the same routing surface. (role: recent reply-run routing contributor; confidence: medium; commits: 731d4666d2, dccbe0782bb8; files: src/auto-reply/reply/followup-runner.ts, src/auto-reply/reply/agent-runner-utils.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added the following labels on May 29, 2026:

  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • merge-risk: 🚨 compatibility (🚨 May break existing users, config, migrations, defaults, or upgrade paths.)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)

joshavant requested a review from a team as a code owner on May 29, 2026 23:45
@openclaw-mantis

Mantis Telegram Desktop Proof

Summary: Mantis captured native Telegram Desktop before/after GIF evidence for restart continuation delivery.

Media (images omitted): native Telegram Desktop screenshots and proof GIFs for Main (baseline) and This PR (candidate), plus motion-trimmed clips.

Raw QA files: https://artifacts.openclaw.ai/mantis/telegram-desktop/pr-88161/run-26667359644-1/index.json

clawsweeper Bot added the label proof: 🎥 video (Contributor real behavior proof includes video or recording evidence.) on May 30, 2026
joshavant force-pushed the fix/restart-sentinel-internal-continuation branch from 9d3e8e5 to 8500b54 on May 30, 2026 00:29
@joshavant
Contributor Author

@clawsweeper re-review

clawsweeper Bot commented May 30, 2026
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

joshavant force-pushed the fix/restart-sentinel-internal-continuation branch from 8500b54 to dccbe07 on May 30, 2026 01:50
@joshavant
Contributor Author

@clawsweeper re-review

clawsweeper Bot commented May 30, 2026
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

joshavant merged commit 584fa32 into main on May 30, 2026
169 of 170 checks passed
joshavant deleted the fix/restart-sentinel-internal-continuation branch on May 30, 2026 02:06
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 30, 2026
* fix restart sentinel internal continuations

* update gateway prompt snapshots

* stabilize sandbox browser audit timer tests

* drive sandbox audit timeouts deterministically

* drive gh-read timeout tests deterministically

* drive label-open-issues timeout tests deterministically

* document deterministic timeout test timers

* test: preserve deterministic timer setup after rebase
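
Several of the commits above concern driving timeout tests deterministically. The diffs are not shown on this page, so the following is only a generic sketch of the idea using a hand-rolled manual clock; `ManualClock` and `startTimeout` are hypothetical names. The repository's tests run under Vitest, which also offers fake timers (`vi.useFakeTimers`, `vi.advanceTimersByTime`); whether these commits use them is not stated here.

```typescript
type TimerCallback = () => void;

// A manually advanced clock: nothing fires until the test says so,
// so timeout behavior does not depend on wall-clock scheduling.
class ManualClock {
  private now = 0;
  private timers: { at: number; fn: TimerCallback }[] = [];

  setTimeout(fn: TimerCallback, ms: number): void {
    this.timers.push({ at: this.now + ms, fn });
  }

  // Advance virtual time and run every timer that has become due, in order.
  // Timers scheduled by a callback wait for the next advance() call.
  advance(ms: number): void {
    this.now += ms;
    const due = this.timers
      .filter((t) => t.at <= this.now)
      .sort((a, b) => a.at - b.at);
    this.timers = this.timers.filter((t) => t.at > this.now);
    for (const t of due) t.fn();
  }
}

// Code under test takes the clock as a dependency instead of using globals.
function startTimeout(clock: ManualClock, ms: number, onTimeout: TimerCallback): void {
  clock.setTimeout(onTimeout, ms);
}
```

A test can then advance the clock to just before and just past the deadline and assert on both sides, with no real waiting and no flakiness from scheduler jitter.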
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026

Labels

  • agents (Agent runtime and tooling)
  • gateway (Gateway runtime)
  • maintainer (Maintainer-authored PR)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)
  • merge-risk: 🚨 compatibility (🚨 May break existing users, config, migrations, defaults, or upgrade paths.)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • proof: 🎥 video (Contributor real behavior proof includes video or recording evidence.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • size: L
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

Development

Successfully merging this pull request may close these issues.

Restart-sentinel turn on channel-bound session emits outbound reply to source chat

1 participant