Fix stale visible reply recovery #91840

Merged
joshavant merged 4 commits into main from fix-visible-reply-stale-recovery on Jun 10, 2026

@joshavant

Contributor

Summary

Fixes #90535.

  • Adds a bounded recovery retry for visible reply dispatch when the reply run registry is stuck behind stale active work.
  • Routes visible stale recovery through the existing diagnostic recovery coordinator so diagnostic session state is cleaned up consistently.
  • Preserves the diagnostics enable gate and keeps disabled-diagnostics behavior as wait-for-active, not recovery/abort.
  • Releases inbound dedupe only for busy admission before agent/model work starts; once processing begins, busy completion still commits dedupe to avoid duplicate work.
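The admission flow described in the first three bullets can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `AdmissionDeps`, `admitVisibleTurn`, `requestRecovery`, and `maxRecoveryAttempts` are hypothetical names, and the behavior once recovery attempts are exhausted is an assumption.

```typescript
// Sketch only: every name here is hypothetical and chosen for illustration.
type WaitResult = "idle" | "timed_out";

interface AdmissionDeps {
  diagnosticsEnabled: boolean;
  // Assumed to be the resolved stuck-session abort threshold.
  stuckAbortThresholdMs: number;
  // Upper bound on recovery requests for one inbound turn.
  maxRecoveryAttempts: number;
  // Resolves "idle" once no reply run is active; "timed_out" only when a finite timeout elapses.
  waitForIdle(timeoutMs?: number): Promise<WaitResult>;
  // Asks the recovery coordinator to clean up the stuck session.
  requestRecovery(): Promise<void>;
}

async function admitVisibleTurn(deps: AdmissionDeps): Promise<void> {
  if (deps.diagnosticsEnabled) {
    for (let attempt = 0; attempt < deps.maxRecoveryAttempts; attempt++) {
      // Bounded wait: give active work until the threshold to finish on its own.
      if ((await deps.waitForIdle(deps.stuckAbortThresholdMs)) === "idle") return;
      // Still blocked past the threshold: request recovery, then re-check admission.
      await deps.requestRecovery();
    }
  }
  // Diagnostics disabled, or attempts exhausted (the latter is an assumption):
  // fall back to plain wait-for-active with no recovery and no abort.
  await deps.waitForIdle();
}
```

Keeping the disabled-diagnostics branch as a plain unbounded wait mirrors the stated goal of preserving the existing gate rather than changing behavior for everyone.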

Verification

  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts src/logging/diagnostic.test.ts src/logging/diagnostic-stuck-session-recovery.runtime.test.ts src/logging/diagnostic-stuck-session-recovery.integration.test.ts
  • node --import tsx scripts/check-import-cycles.ts
  • pnpm -s tsgo:core:test
  • .agents/skills/autoreview/scripts/autoreview --mode local

Real behavior proof

Behavior addressed: A visible channel turn can wait behind stale replyRunRegistry work forever or return busy without retrying after stale recovery.

Real environment tested: Final patch was verified with focused local regression tests; AWS Crabbox live Telegram bot-to-bot proof was run earlier on this branch before the final coordinator/gating cleanup.

Exact steps or command run after this patch: node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts src/logging/diagnostic.test.ts src/logging/diagnostic-stuck-session-recovery.runtime.test.ts src/logging/diagnostic-stuck-session-recovery.integration.test.ts

Evidence after fix: Focused final-patch tests passed: 92 diagnostic tests and 6 stale visible recovery tests, including disabled diagnostics, active lane task, already-in-flight recovery, released-lane wait, stale abort retry, and inbound dedupe release coverage.

Observed result after fix: Stale visible dispatch retries after coordinator recovery when diagnostics are enabled; active recovery outcomes keep waiting; disabled diagnostics keep waiting without invoking recovery; pre-processing busy paths release inbound dedupe for retry.
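The dedupe rule (release on busy before processing starts, commit once processing has begun) can be illustrated with a small sketch. `InboundDedupe`, `settleBusy`, and the phase names are hypothetical; the real dedupe store and its API are not shown in this PR discussion.

```typescript
// Sketch only: names and shapes are assumptions, not the project's API.
type DispatchPhase = "admission" | "processing";

class InboundDedupe {
  private claimed = new Set<string>();

  // Returns false when the message id is already claimed (a duplicate delivery).
  claim(messageId: string): boolean {
    if (this.claimed.has(messageId)) return false;
    this.claimed.add(messageId);
    return true;
  }

  // Forget the claim so a later redelivery of the same message can be admitted.
  release(messageId: string): void {
    this.claimed.delete(messageId);
  }
}

// Called when a dispatch ends with a "busy" result.
function settleBusy(dedupe: InboundDedupe, messageId: string, phase: DispatchPhase): void {
  if (phase === "admission") {
    // No agent/model work has started, so a retry cannot duplicate anything.
    dedupe.release(messageId);
  }
  // In the "processing" phase the claim is kept (committed), so a redelivery is dropped.
}
```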

What was not tested: The live Telegram channel proof was not rerun after the final coordinator/gating cleanup. Earlier proof on this branch passed on AWS Crabbox run_c8e3fd6f041f / lease cbx_dd5487ad6a36: Telegram canary passed in 1018ms and telegram-reply-chain-exact-marker passed in 1783ms with 2/2 scenarios passing.

openclaw-barnacle Bot added the size: L and maintainer (Maintainer-authored PR) labels on Jun 10, 2026
@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 10, 2026, 3:48 AM ET / 07:48 UTC.

Summary
The PR adds diagnostic-coordinated stale visible reply recovery, retries reply admission after cleared stale work, adjusts busy-path inbound dedupe handling, and adds focused regression tests.

PR surface: Source +159, Tests +722. Total +881 across 7 files.

Reproducibility: yes. source-reproducible: current main sends visible turns into an unbounded waitForIdle when a reply operation remains active, and waitForIdle only times out for finite timeoutMs. I did not run a failing repro because this review is read-only.
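To make the failure mode concrete, here is a minimal sketch of a wait primitive that only arms a timer for a finite `timeoutMs`. `ReplyRunSlot` and its methods are hypothetical stand-ins; only the name `waitForIdle` and the finite-timeout behavior come from the review text.

```typescript
// Sketch only: a stand-in for the registry's wait behavior, not the real code.
type WaitOutcome = "idle" | "timed_out";

class ReplyRunSlot {
  private active = false;
  private waiters: Array<() => void> = [];

  begin(): void {
    this.active = true;
  }

  end(): void {
    this.active = false;
    // Wake everyone who was waiting for the slot to become idle.
    for (const wake of this.waiters.splice(0)) wake();
  }

  waitForIdle(timeoutMs?: number): Promise<WaitOutcome> {
    if (!this.active) return Promise.resolve("idle");
    return new Promise((resolve) => {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const onIdle = () => {
        if (timer !== undefined) clearTimeout(timer);
        resolve("idle");
      };
      this.waiters.push(onIdle);
      if (timeoutMs !== undefined && Number.isFinite(timeoutMs)) {
        timer = setTimeout(() => {
          this.waiters = this.waiters.filter((w) => w !== onIdle);
          resolve("timed_out");
        }, timeoutMs);
      }
      // Without a finite timeout, only end() can settle this promise.
      // If the active operation is stale and end() never runs, the caller waits forever.
    });
  }
}
```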

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Have a maintainer explicitly accept the session-state and message-delivery recovery risk before landing.

Risk before merge

  • [P1] The patch can force-clear an exact per-session ReplyOperation after diagnostic recovery; a wrong recovery classification would affect session ownership and visible reply delivery.
  • [P1] The live proof covers the packaged Telegram reply path after the runtime cleanup, while the actual stale replyRunRegistry recovery branch is proven by focused tests rather than a naturally wedged live session.

Maintainer options:

  1. Land after session-recovery review (recommended)
    Maintainers can accept the recovery behavior if they agree that diagnostics-gated force-clear of the unchanged ReplyOperation is the right boundary for stale visible replies.
  2. Ask for stale-lock live proof
    If higher assurance is needed, ask for a packaged proof or diagnostic log artifact that injects or naturally hits a stale replyRunRegistry lock and shows recovery.
  3. Pause for registry-owned design
    If visible dispatch should not invoke diagnostic recovery, pause this PR and move the fix into a narrower reply-run-registry timeout or reaper design.

Next step before merge

  • [P2] The PR is protected by the maintainer label and has deliberate session/message-delivery merge risk, so the next action is human maintainer review rather than an automated repair lane.

Security
Cleared: The diff only changes TypeScript runtime and tests under reply dispatch/logging, with no dependency, workflow, secret, package, or supply-chain surface changes.

Review details

Best possible solution:

A human maintainer should review and explicitly accept the bounded recovery semantics, then land this PR if the session/message-delivery tradeoff is acceptable.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current main sends visible turns into an unbounded waitForIdle when a reply operation remains active, and waitForIdle only times out for finite timeoutMs. I did not run a failing repro because this review is read-only.

Is this the best way to solve the issue?

Yes, this appears to be an acceptable narrow fix: it reuses the existing diagnostic recovery coordinator and preserves disabled-diagnostics behavior instead of adding a separate global registry reaper. The maintainer decision is whether that recovery boundary is acceptable for visible dispatch.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against fd7e1815006a.

Label changes

Label justifications:

  • P1: The PR targets a broken visible channel reply workflow where subsequent inbound messages can hang or be skipped behind stale session state.
  • merge-risk: 🚨 message-delivery: The changed admission/recovery path decides whether visible inbound messages wait, retry, or return busy.
  • merge-risk: 🚨 session-state: The diff can clear per-session reply registry state after diagnostic recovery and therefore changes session ownership behavior.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR has live Telegram Docker RTT proof for the branch runtime plus focused stale-recovery regression tests; the final head commit is test-only.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR has live Telegram Docker RTT proof for the branch runtime plus focused stale-recovery regression tests; the final head commit is test-only.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes visible channel reply delivery behavior, so Telegram-visible proof is useful and the PR already provides live Telegram lane evidence.

Evidence reviewed

PR surface:

Source +159, Tests +722. Total +881 across 7 files.

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 4 | 207 | 48 | +159 |
| Tests | 3 | 722 | 0 | +722 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 7 | 929 | 48 | +881 |

What I checked:

  • Repository policy read: Root AGENTS.md was read fully; no scoped AGENTS.md exists under the touched src/auto-reply or src/logging paths, and the Telegram maintainer note requires real Telegram proof for visible reply-context behavior. (AGENTS.md:1, fd7e1815006a)
  • Current-main stale wait path: Current main lets visible turns fall through to replyRunRegistry.waitForIdle with no timeout when a reply operation is already active; waitForIdle only installs a timer when timeoutMs is finite. (src/auto-reply/reply/reply-turn-admission.ts:65, fd7e1815006a)
  • PR recovery implementation: The PR gates stale visible recovery on enabled diagnostics, waits only to the resolved stuck-session abort threshold, requests diagnostic recovery, and force-clears only when the same ReplyOperation remains after a recovery outcome that says active work was cleared. (src/auto-reply/reply/dispatch-from-config.ts:1269, fc29c452bb65)
  • Regression coverage: The added stale-recovery test covers successful retry, pure stale registry reclaim, fresh-operation identity protection, active-work keep-waiting outcomes, failed recovery, disabled diagnostics, released lanes, and pre-processing dedupe release. (src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts:52, fc29c452bb65)
  • Coordinator return-path change: The diagnostic recovery coordinator now exposes requestStuckSessionRecoveryOutcome while preserving the existing fire-and-forget requestStuckSessionRecovery wrapper, giving visible dispatch an auditable recovery result. (src/logging/diagnostic-session-recovery-coordinator.ts:176, fc29c452bb65)
  • Real behavior proof: The PR discussion includes live AWS Crabbox Telegram Docker RTT proof for a tarball built from the branch at 6be4195, and the final fc29c45 commit only adds a regression test for failed recovery admission. (fc29c452bb65)
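The coordinator return-path change noted above (an awaitable outcome function plus a preserved fire-and-forget wrapper) follows a common pattern, sketched below. The two function names appear in the review; the signatures, the injected `recover` callback, the `RecoveryOutcome` shape, and the `"failed"` status are assumptions made for illustration.

```typescript
// Sketch only: signatures and outcome shape are assumed, not taken from the codebase.
interface RecoveryOutcome {
  status: string;
}

type Recover = (sessionKey: string) => Promise<RecoveryOutcome>;

// Awaitable variant: the caller can inspect the result and decide what to do next.
async function requestStuckSessionRecoveryOutcome(
  sessionKey: string,
  recover: Recover,
): Promise<RecoveryOutcome> {
  try {
    return await recover(sessionKey);
  } catch {
    // Surface failures as data so callers never see a rejected promise.
    return { status: "failed" };
  }
}

// Fire-and-forget wrapper: existing callers keep their behavior and ignore the result.
function requestStuckSessionRecovery(sessionKey: string, recover: Recover): void {
  void requestStuckSessionRecoveryOutcome(sessionKey, recover);
}
```

Layering the wrapper on top of the awaitable function keeps a single recovery code path while letting visible dispatch act on the result.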

Likely related people:

  • steipete: Peter Steinberger is the top shortlog contributor across the touched reply/logging files and authored the reply-dispatch hook surface that sits in this dispatch path. (role: feature-history owner; confidence: medium; commits: 82ce30b7895a; files: src/auto-reply/reply/dispatch-from-config.ts, src/plugins/hooks.ts)
  • vincentkoc: Vincent Koc recently changed inbound dedupe behavior after dispatch errors, which is adjacent to this PR's busy-path dedupe release change. (role: adjacent owner; confidence: medium; commits: 7c91d0dbc985, 5181e4f7c82b; files: src/auto-reply/reply/dispatch-from-config.ts)
  • shakkernerd: Current-main blame in this shallow checkout attributes the central reply admission and diagnostic recovery lines to Shakker's latest touch of the affected files. (role: recent area contributor; confidence: medium; commits: 21104cd52eb1; files: src/auto-reply/reply/dispatch-from-config.ts, src/auto-reply/reply/reply-turn-admission.ts, src/logging/diagnostic-session-recovery-coordinator.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added labels on Jun 10, 2026:

  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • merge-risk: 🚨 session-state (🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.)

@joshavant

Contributor Author

Addressed the stale reply-registry gap from the review in 6be4195bacac4ce22306589c3c670306334dd053.

What changed:

  • After visible stuck-session recovery reports that active work was cleared (aborted, released, or noop/no_active_work), dispatch now rechecks replyRunRegistry and force-clears only the exact same ReplyOperation object that blocked admission.
  • The identity guard intentionally avoids clearing a fresh operation that reused the same session key/session id while recovery was awaited.
  • Recovery outcomes that prove live work (active_reply_work, active_embedded_run, active_lane_task, already_in_flight) still keep waiting instead of clearing.
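The identity guard and outcome classification in the bullets above can be sketched as follows. The outcome strings are the ones listed in this comment; `ReplyRunRegistry`, `clearIfSame`, `afterRecovery`, and the choice to keep waiting on any unlisted outcome are hypothetical.

```typescript
// Sketch only: registry shape and helper names are assumptions for illustration.
interface ReplyOperation {
  sessionKey: string;
}

class ReplyRunRegistry {
  private ops = new Map<string, ReplyOperation>();

  set(op: ReplyOperation): void {
    this.ops.set(op.sessionKey, op);
  }

  get(sessionKey: string): ReplyOperation | undefined {
    return this.ops.get(sessionKey);
  }

  // Remove the entry only if it is still the exact object that blocked admission.
  clearIfSame(blocking: ReplyOperation): boolean {
    if (this.ops.get(blocking.sessionKey) !== blocking) return false;
    this.ops.delete(blocking.sessionKey);
    return true;
  }
}

// Outcomes that report active work as cleared, per the comment above.
const CLEARED_OUTCOMES = new Set(["aborted", "released", "no_active_work"]);

function afterRecovery(
  registry: ReplyRunRegistry,
  blocking: ReplyOperation,
  outcome: string,
): "retry" | "keep_waiting" {
  // Live-work outcomes (active_reply_work, active_embedded_run, active_lane_task,
  // already_in_flight) and anything unrecognized keep waiting (the latter is an assumption).
  if (!CLEARED_OUTCOMES.has(outcome)) return "keep_waiting";
  // Reference comparison protects a fresh operation that reused the same session key.
  registry.clearIfSame(blocking);
  return "retry";
}
```

Comparing by object reference rather than by session key is what keeps a newly started operation safe if it registered while recovery was being awaited.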

Proof run after the patch:

  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts src/auto-reply/reply/reply-run-registry.test.ts
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts src/logging/diagnostic.test.ts src/logging/diagnostic-stuck-session-recovery.runtime.test.ts src/logging/diagnostic-stuck-session-recovery.integration.test.ts
  • pnpm -s tsgo:core:test
  • node --import tsx scripts/check-import-cycles.ts
  • git diff --check
  • pnpm -s format:check src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts
  • node scripts/run-oxlint.mjs src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.stale-recovery.test.ts
  • .agents/skills/autoreview/scripts/autoreview --mode local -> clean, no accepted/actionable findings

CI on 6be4195bacac4ce22306589c3c670306334dd053: every check other than Real behavior proof is green; PR merge state is CLEAN.

@joshavant

Contributor Author

@clawsweeper re-review my crab

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

clawsweeper Bot changed labels on Jun 10, 2026.

Added:

  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
  • rating: 🌊 off-meta tidepool (PR readiness rating does not apply to this item.)

Removed:

  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

@joshavant

Contributor Author

Fresh live channel proof at current PR head:

  • Package under test: OpenClaw 2026.6.2 (6be4195) installed from a tarball built from this branch, not the published beta package.
  • Environment: AWS Crabbox run_c87dc9b4b4a7, lease cbx_efb1f68c1bcd, Docker Telegram RTT lane, exit 0, lease stopped.
  • Credential source: Convex broker telegram kind, role maintainer; no telegram-user credential was leased.
  • Result: package Telegram RTT Docker E2E passed (pr-91840-6be4195bac).
  • Artifact summary:
    • telegram-canary: pass, observed SUT message 20223, RTT 1338ms.
    • telegram-mentioned-message-reply: pass, 1/1 strict warm sample passed, observed SUT message 20225, RTT 1994ms, reply text exactly OPENCLAW_E2E_OK_1.

This covers the live channel path after the final cleanup commit: installable package -> gateway startup -> Telegram provider startup -> canary -> normal mentioned reply through the channel.

@joshavant

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

clawsweeper Bot changed labels on Jun 10, 2026.

Added:

  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)

Removed:

  • rating: 🌊 off-meta tidepool (PR readiness rating does not apply to this item.)


joshavant force-pushed the fix-visible-reply-stale-recovery branch from fc29c45 to 80e151f on June 10, 2026 07:49
joshavant merged commit cfdabfb into main on Jun 10, 2026
158 of 159 checks passed
joshavant deleted the fix-visible-reply-stale-recovery branch on June 10, 2026 07:56
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 11, 2026
* fix visible reply stale recovery

* fix visible recovery lint loop

* fix visible reply registry recovery

* test: cover failed visible recovery admission
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request Jun 12, 2026
…26.6.6) (#1040)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.5` → `2026.6.6` |

---

### Release Notes

<details>
<summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary>

### [`v2026.6.6`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202666)

[Compare Source](openclaw/openclaw@v2026.6.5...v2026.6.6)

##### Highlights

- Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender checks, deleted-agent ACP bypasses, loopback tools, Discord moderation, and Teams group actions; exec approvals now fail closed on timeout. ([#91529](openclaw/openclaw#91529), [#91618](openclaw/openclaw#91618), [#91615](openclaw/openclaw#91615), [#91619](openclaw/openclaw#91619), [#91741](openclaw/openclaw#91741), [#91745](openclaw/openclaw#91745), [#91746](openclaw/openclaw#91746), [#91748](openclaw/openclaw#91748), [#91749](openclaw/openclaw#91749), [#91750](openclaw/openclaw#91750), [#91751](openclaw/openclaw#91751), [#91752](openclaw/openclaw#91752), [#91763](openclaw/openclaw#91763), [#89938](openclaw/openclaw#89938)) Thanks [@joshavant](https://github.com/joshavant), [@pgondhi987](https://github.com/pgondhi987), [@mmaps](https://github.com/mmaps), [@eleqtrizit](https://github.com/eleqtrizit), [@shakkernerd](https://github.com/shakkernerd), and [@drobison00](https://github.com/drobison00).
- Telegram delivery is safer and more coherent: account-scoped topics route to the right agent, streamed text survives tool calls, `/compact` works on generic ingress, callback handling uses concrete APIs, draft chunking is shared, durable dispatch dedupe moved into the SDK, and unauthorized DM text stays out of cache and prompt context. ([#91189](openclaw/openclaw#91189), [#88682](openclaw/openclaw#88682), [#89588](openclaw/openclaw#89588), [#90212](openclaw/openclaw#90212), [#91876](openclaw/openclaw#91876), [#91874](openclaw/openclaw#91874), [#91904](openclaw/openclaw#91904), [#91478](openclaw/openclaw#91478), [#91915](openclaw/openclaw#91915)) Thanks [@codysai001](https://github.com/codysai001), [@alexzhu0](https://github.com/alexzhu0), [@joelnishanth](https://github.com/joelnishanth), [@snowzlm](https://github.com/snowzlm), [@obviyus](https://github.com/obviyus), and [@sallyom](https://github.com/sallyom).
- iMessage recovery and delivery now cover always-on inbound restart, durable echo markers, block streaming, idle approval discovery, hardened outbound transport, and actionable inbound startup diagnostics. ([#91335](openclaw/openclaw#91335), [#91449](openclaw/openclaw#91449), [#88969](openclaw/openclaw#88969), [#88530](openclaw/openclaw#88530), [#91783](openclaw/openclaw#91783), [#91785](openclaw/openclaw#91785)) Thanks [@omarshahine](https://github.com/omarshahine), [@jmissig](https://github.com/jmissig), and [@colmbrogan](https://github.com/colmbrogan).
- Browser and MCP connectivity gained existing-session CDP support, discovered WebSocket validation, default-profile `cdpUrl` handling, safer browser-output boundaries, Streamable HTTP loopback transport, corrected OAuth/SSE authorization handling, and broader schema compatibility. ([#91422](openclaw/openclaw#91422), [#89851](openclaw/openclaw#89851), [#91736](openclaw/openclaw#91736), [#91747](openclaw/openclaw#91747), [#91451](openclaw/openclaw#91451), [#80143](openclaw/openclaw#80143)) Thanks [@pgondhi987](https://github.com/pgondhi987), [@anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia), [@lifuyue](https://github.com/lifuyue), [@eleqtrizit](https://github.com/eleqtrizit), [@LiuwqGit](https://github.com/LiuwqGit), and [@HemantSudarshan](https://github.com/HemantSudarshan).
- Control UI startup and first-reply latency are lower through cached model metadata, removal of the startup catalog wait, lazy slash-command loading, and first-event tracing with slow-reply diagnostics. ([#91531](openclaw/openclaw#91531), [#91538](openclaw/openclaw#91538), [#91568](openclaw/openclaw#91568), [#91583](openclaw/openclaw#91583), [#91598](openclaw/openclaw#91598))
- Provider support expands with OpenRouter OAuth onboarding and Claude Fable 5 adaptive thinking, while Codex sessions keep correct compaction ownership, local models skip guardian review, dynamic tool progress normalizes cleanly, and Gemma 4 reasoning replay is preserved. ([#91830](openclaw/openclaw#91830), [#91882](openclaw/openclaw#91882), [#91590](openclaw/openclaw#91590), [#88630](openclaw/openclaw#88630), [#88768](openclaw/openclaw#88768), [#91696](openclaw/openclaw#91696)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@joshavant](https://github.com/joshavant), [@bdjben](https://github.com/bdjben), and [@Coder-Wangyankun](https://github.com/Coder-Wangyankun).

##### Changes

- CLI progress: emit Claude CLI commentary progress events and bridge inter-tool commentary into channel progress without exposing internal protocol scaffolding. ([#89834](openclaw/openclaw#89834), [#90883](openclaw/openclaw#90883)) Thanks [@anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia).
- Observability: allow trusted diagnostics channels to capture tool input/output content, add first-assistant-event traces, and warn on slow initial replies. ([#91256](openclaw/openclaw#91256), [#91568](openclaw/openclaw#91568), [#91583](openclaw/openclaw#91583)) Thanks [@amknight](https://github.com/amknight).
- Plugins/ClawHub: dogfood reusable package publishing, let dry runs skip publish approval, allow declared installed trusted hooks, report managed plugin version drift, and warn instead of failing on retired Skill Workshop configuration. ([#91574](openclaw/openclaw#91574), [#91591](openclaw/openclaw#91591), [#90004](openclaw/openclaw#90004), [#90927](openclaw/openclaw#90927), [#90838](openclaw/openclaw#90838)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@brokemac79](https://github.com/brokemac79), and [@lonexreb](https://github.com/lonexreb).
- Memory/providers: move the local llama.cpp runtime into its provider plugin, batch embeddings across files, persist the agent model catalog cache, and keep QMD JSON search one-shot while filtering stale REM recall previews. ([#91324](openclaw/openclaw#91324), [#89138](openclaw/openclaw#89138), [#90457](openclaw/openclaw#90457), [#91837](openclaw/openclaw#91837), [#91851](openclaw/openclaw#91851)) Thanks [@osolmaz](https://github.com/osolmaz), [@mushuiyu886](https://github.com/mushuiyu886), [@ai-hpc](https://github.com/ai-hpc), and [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- Channels/mobile: add the QQBot group mention toggle, improve iPad and iPhone control surfaces, and expose the active connection host in the TUI footer. ([#91423](openclaw/openclaw#91423), [#91557](openclaw/openclaw#91557), [#89909](openclaw/openclaw#89909)) Thanks [@cxyhhhhh](https://github.com/cxyhhhhh), [@Solvely-Colin](https://github.com/Solvely-Colin), and [@baskduf](https://github.com/baskduf).
- Performance: prewarm TUI runtime plugins, deduplicate plugin auto-enable fanout, trim dense text-delta snapshots, and reuse prepared startup model metadata. ([#90782](openclaw/openclaw#90782), [#89978](openclaw/openclaw#89978), [#91580](openclaw/openclaw#91580), [#91531](openclaw/openclaw#91531)) Thanks [@RomneyDa](https://github.com/RomneyDa) and [@ai-hpc](https://github.com/ai-hpc).

##### Fixes

- Agent/session recovery: drop stale approval follow-ups after session rebind, remove drained reply-queue items by identity, recover stale main and visible replies, preserve Codex context-engine compaction ownership, lower the default compaction timeout to 180 seconds while respecting explicit configuration, and keep provider-failure terminal lifecycle state correct. ([#85679](openclaw/openclaw#85679), [#91450](openclaw/openclaw#91450), [#91566](openclaw/openclaw#91566), [#91840](openclaw/openclaw#91840), [#91590](openclaw/openclaw#91590), [#91361](openclaw/openclaw#91361), [#91895](openclaw/openclaw#91895)) Thanks [@openperf](https://github.com/openperf), [@yetval](https://github.com/yetval), [@joshavant](https://github.com/joshavant), [@wangmiao0668000666](https://github.com/wangmiao0668000666), and [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- User-visible content boundaries: suppress Codex/Harmony protocol artifacts, neutralize browser and LanceDB memory media directives, redact transcript images, and preserve native `/compact` replies through source suppression. ([#89151](openclaw/openclaw#89151), [#91422](openclaw/openclaw#91422), [#91425](openclaw/openclaw#91425), [#91529](openclaw/openclaw#91529), [#90212](openclaw/openclaw#90212)) Thanks [@joelnishanth](https://github.com/joelnishanth), [@pgondhi987](https://github.com/pgondhi987), [@joshavant](https://github.com/joshavant), and [@snowzlm](https://github.com/snowzlm).
- Channel delivery: keep WhatsApp captured replies attached to the successor controller after restart, retry Feishu rate limits, preserve Mattermost thread replies, canonicalize LINE webhook paths, restore Discord reply hydration and runtime timeout exports, and show OpenAI Realtime WebRTC assistant transcripts. ([#85823](openclaw/openclaw#85823), [#89659](openclaw/openclaw#89659), [#91684](openclaw/openclaw#91684), [#91649](openclaw/openclaw#91649), [#90263](openclaw/openclaw#90263), [#91686](openclaw/openclaw#91686), [#90426](openclaw/openclaw#90426)) Thanks [@itsuzef](https://github.com/itsuzef), [@ladygege](https://github.com/ladygege), [@jacobtomlinson](https://github.com/jacobtomlinson), [@fuller-stack-dev](https://github.com/fuller-stack-dev), and [@shushushv](https://github.com/shushushv).
- Cron: cancel active task runs cleanly, preserve terminal timeout/cancel state, and recover no-deliver tool warnings instead of silently losing the outcome. ([#90666](openclaw/openclaw#90666), [#90678](openclaw/openclaw#90678)) Thanks [@ai-hpc](https://github.com/ai-hpc).
- Gateway/config/auth: share the approval runtime socket token, replace arrays explicitly in `config.patch`, skip the deleted-agent guard only for valid ACP harness sessions, surface headless LaunchAgent state, verify SQLite auth migration before cleanup, and arm QMD startup maintenance. ([#87105](openclaw/openclaw#87105), [#91551](openclaw/openclaw#91551), [#91219](openclaw/openclaw#91219), [#91614](openclaw/openclaw#91614), [#91740](openclaw/openclaw#91740), [#91978](openclaw/openclaw#91978)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev) and [@scotthuang](https://github.com/scotthuang).
- Providers/Codex: clarify quota errors, restore the Codex synthetic usage line, canonicalize Codex protocol assets, require API-key auth for realtime voice, normalize ACP model refs, preserve Gemma 4 `reasoning_content`, and avoid guardian review for local models. ([#91390](openclaw/openclaw#91390), [#91709](openclaw/openclaw#91709), [#91507](openclaw/openclaw#91507), [#91567](openclaw/openclaw#91567), [#88630](openclaw/openclaw#88630), [#91696](openclaw/openclaw#91696)) Thanks [@hxy91819](https://github.com/hxy91819), [@brokemac79](https://github.com/brokemac79), [@RomneyDa](https://github.com/RomneyDa), [@joshavant](https://github.com/joshavant), and [@Coder-Wangyankun](https://github.com/Coder-Wangyankun).
- Updates/builds: recover package Gateway restarts after refresh failure, expose plugin convergence repair, fall back to Corepack in PATH-less pnpm environments, seed the correct Docker store packages, and keep ClawHub dry-run and publish paths reusable. ([#91581](openclaw/openclaw#91581), [#91599](openclaw/openclaw#91599), [#91547](openclaw/openclaw#91547), [#91591](openclaw/openclaw#91591)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev), [@sallyom](https://github.com/sallyom), and [@Patrick-Erichsen](https://github.com/Patrick-Erichsen).
- UI: require explicit user intent before opening chat sessions and drain restored chat queues after session switches. ([#91480](openclaw/openclaw#91480)) Thanks [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- Android: avoid the `dataSync` foreground-service type for persistent nodes. ([#80082](openclaw/openclaw#80082)) Thanks [@davelutztx](https://github.com/davelutztx).
- Native hooks: bound relay lifetimes so abandoned native hook connections cannot linger indefinitely. ([#91550](openclaw/openclaw#91550)) Thanks [@joshavant](https://github.com/joshavant).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/1040

Labels

  • maintainer (Maintainer-authored PR)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)
  • merge-risk: 🚨 message-delivery (🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • merge-risk: 🚨 session-state (🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • size: L
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)

Development

Successfully merging this pull request may close these issues.

[Bug]: Stale replyRunRegistry lock causes indefinite inbound dispatch hang — no timeout on waitForIdle() for visible messages
