Fix stale main session startup recovery#91566

Merged
joshavant merged 1 commit into main from fix/stale-main-session-startup-recovery
Jun 9, 2026

Conversation

@joshavant

Contributor

Summary

  • mark startup-orphaned running main sessions for restart recovery before channels start
  • run restart-aborted main-session recovery after gateway startup methods are available
  • guard recovery against current-process active runs and duplicate-key rows from non-routable stores
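
The marking step and its two guards can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SessionRow`, `markStartupOrphans`, the `routable` flag, the `activeRunKeys` set, and the session keys are all hypothetical names chosen for the example; only the `running` status and the `abortedLastRun` marker are mentioned in this thread.

```typescript
// Hypothetical shapes for illustration only; the real OpenClaw types may differ.
type SessionStatus = "running" | "done" | "failed" | "timeout";

interface SessionRow {
  key: string;              // session key; the same key may appear in several stores
  storeId: string;          // which session store the row was read from
  routable: boolean;        // assumption: whether this store owns routing for the key
  status: SessionStatus;
  abortedLastRun?: boolean; // marker that restart recovery looks for
}

/**
 * Mark persisted "running" rows that have no active run in this process.
 * Returns the rows that were marked for restart recovery.
 */
function markStartupOrphans(
  rows: SessionRow[],
  activeRunKeys: ReadonlySet<string>,
): SessionRow[] {
  const marked: SessionRow[] = [];
  for (const row of rows) {
    if (row.status !== "running") continue;   // only persisted running rows qualify
    if (activeRunKeys.has(row.key)) continue; // guard: active run in the current process
    if (!row.routable) continue;              // guard: duplicate key from a non-routable store
    row.abortedLastRun = true;
    marked.push(row);
  }
  return marked;
}

const rows: SessionRow[] = [
  { key: "tg:group:1", storeId: "a", routable: true, status: "running" },
  { key: "tg:group:1", storeId: "b", routable: false, status: "running" },
  { key: "tg:dm:2", storeId: "a", routable: true, status: "running" },
  { key: "tg:dm:3", storeId: "a", routable: true, status: "done" },
];

// "tg:dm:2" still has an active run, so only the routable "tg:group:1" row is marked.
const marked = markStartupOrphans(rows, new Set(["tg:dm:2"]));
console.log(marked.map((r) => `${r.storeId}/${r.key}`));
```

In this sketch the guards are plain early exits, so a row is only touched when it is running, has no active run in the current process, and comes from a routable store.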

Fixes #90525.

Verification

  • .agents/skills/autoreview/scripts/autoreview --mode local
  • node scripts/run-vitest.mjs src/agents/main-session-restart-recovery.test.ts src/gateway/server-startup-post-attach.test.ts
  • AWS Crabbox live E2E: provider aws, run run_494e506ae17b, lease cbx_ec076ff48d6b, machine c7a.8xlarge, exit 0

Real behavior proof

Behavior addressed: Telegram group sessions could remain stuck after a hard gateway restart when the persisted main-session row was running but no embedded run was active.

Real environment tested: AWS Crabbox Linux worker using live Convex-leased Telegram credential and a mock OpenAI-compatible model server.

Exact steps or command run after this patch: Ran the Crabbox reproduction script from this branch; it built OpenClaw, started the gateway, sent a Telegram group message that hung the first model call, hard-killed the gateway, restarted it, and sent a follow-up Telegram message without sessions.reset.

Evidence after fix: Crabbox provider aws, run run_494e506ae17b, lease cbx_ec076ff48d6b, exit 0.

Observed result after fix: The stale Telegram main session was reproduced as running with hasActiveRun:false; after restart, automatic startup recovery cleared it to a terminal state and the follow-up Telegram turn completed with an outbound send.

What was not tested: Other live chat providers were not exercised in the E2E run; focused unit tests cover the generic recovery path, duplicate-key routing, active-run preservation, and startup scheduling.

@openclaw-barnacle openclaw-barnacle Bot added labels on Jun 9, 2026: gateway (Gateway runtime), agents (Agent runtime and tooling), size: L, maintainer (Maintainer-authored PR)
@clawsweeper

clawsweeper Bot commented Jun 9, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 8, 2026, 10:47 PM ET / 02:47 UTC.

Summary
The PR adds generic startup orphan marking/recovery for running main-session rows, moves restart-aborted recovery until gateway methods are available, and adds focused agent/gateway tests.
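
One way to picture the ordering described above is a gate that recovery waits on. The sketch below is an assumption-heavy illustration: `createGate`, `startGateway`, and the phase names are hypothetical, and the relative order of channel start and gateway-method availability is chosen arbitrarily here because this thread only states that marking happens before channels start and that recovery runs after gateway methods are available.

```typescript
// A tiny one-shot gate: `ready` resolves once `open()` has been called.
function createGate(): { ready: Promise<void>; open: () => void } {
  let open: () => void = () => {};
  const ready = new Promise<void>((resolve) => {
    open = resolve; // the executor runs synchronously, so `open` is set before return
  });
  return { ready, open };
}

// Hypothetical startup sequence; each push stands in for a real startup phase.
async function startGateway(log: string[]): Promise<void> {
  const methodsAvailable = createGate();

  log.push("mark-orphans"); // marking happens before channels start

  // Recovery is scheduled early but only runs once the gate opens.
  const recovery = methodsAvailable.ready.then(() => {
    log.push("recover-main-sessions");
  });

  log.push("start-channels");
  log.push("gateway-methods-available");
  methodsAvailable.open();

  await recovery;
}

const startupLog: string[] = [];
startGateway(startupLog).then(() => console.log(startupLog.join(" -> ")));
```

The point of the gate is that recovery cannot call gateway methods that are still unavailable during startup, regardless of where in the sequence it was scheduled.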

PR surface: Source +217, Tests +357. Total +574 across 4 files.

Reproducibility: yes. The linked issue and current source show a stale persisted running row can have hasActiveRun:false while marker-gated recovery skips it, and the PR body reports a live Crabbox restart reproduction after the patch.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
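
The "weaker of the two" rule can be written as a small sketch. The rank order below is inferred from the legend later in this comment, read from strongest to weakest; `RANKS` and `overallRank` are hypothetical names, not part of ClawSweeper.

```typescript
// Ranks from weakest to strongest, as inferred from the legend in this comment.
// "off-meta tidepool" is omitted because it means the rating does not apply.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is the weaker of the proof rank and the patch-quality rank.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

// This PR: proof is diamond lobster, patch quality is platinum hermit.
console.log(overallRank("diamond lobster", "platinum hermit")); // platinum hermit
```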

Rank-up moves:

  • [P2] Let required CI finish and have maintainers explicitly accept the session-state/message-delivery merge risk before landing.

Mantis proof suggestion
A live Telegram transcript would materially help show the stale running session is released and a follow-up message completes after restart. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify that after a hard gateway restart with a stale running main session, a follow-up Telegram message completes without sessions.reset.

Risk before merge

  • [P1] Merging this changes startup behavior from marker-only recovery to automatically marking persisted running main-session rows, so maintainers should consciously accept the session-state upgrade behavior.
  • [P1] The live proof exercises Telegram plus focused generic tests; other live chat providers were not exercised, so cross-channel delivery assurance relies on the generic recovery boundary.

Maintainer options:

  1. Land with session-state sign-off (recommended)
    If maintainers accept automatic startup recovery of stale running main-session rows, the PR has focused tests and live Telegram Crabbox proof to support landing after CI finishes.
  2. Require broader live transport proof
    If maintainers want more upgrade assurance, ask for one additional live provider or Telegram direct-session proof before landing.
  3. Pause for multi-process ownership rules
    If shared state across simultaneous gateway processes is a supported deployment, pause until the startup marker can distinguish another process's live owner.

Next step before merge

  • No automated repair lane is needed; maintainers should review the startup session-state risk and land or request broader live proof.

Security
Cleared: No concrete security or supply-chain concern found; the diff changes TypeScript session recovery and tests without workflow, dependency, secret, or code-execution surface changes.

Review details

Best possible solution:

Land the generic startup orphan recovery at the agents/gateway boundary once maintainers accept the session-state risk and required CI remains clean.

Do we have a high-confidence way to reproduce the issue?

Yes. The linked issue and current source show a stale persisted running row can have hasActiveRun:false while marker-gated recovery skips it, and the PR body reports a live Crabbox restart reproduction after the patch.

Is this the best way to solve the issue?

Yes. The generic main-session recovery boundary is the right layer; a Telegram-only reset path would fix one channel symptom while leaving the persisted session-state invariant broken.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 329fa44d23f4.

Label changes

Label changes:

  • add P1: The PR fixes a user-facing Telegram/session workflow where stale persisted running state can block replies until manual reset.
  • add merge-risk: 🚨 message-delivery: Recovery can resume or fail interrupted main sessions and affects whether follow-up channel messages are delivered after restart.
  • add merge-risk: 🚨 session-state: The patch changes startup recovery to mark and recover persisted running main-session rows, which directly mutates session state on upgrade/startup.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live Crabbox proof with a Telegram credential, hard gateway restart scenario, reproduced stale state, and observed successful follow-up delivery.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix live Crabbox proof with a Telegram credential, hard gateway restart scenario, reproduced stale state, and observed successful follow-up delivery.
  • add mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The bug is Telegram-visible because the user-facing proof path is a Telegram message after hard gateway restart, and a short Telegram recording could demonstrate the fixed reply delivery.

Label justifications:

  • P1: The PR fixes a user-facing Telegram/session workflow where stale persisted running state can block replies until manual reset.
  • merge-risk: 🚨 session-state: The patch changes startup recovery to mark and recover persisted running main-session rows, which directly mutates session state on upgrade/startup.
  • merge-risk: 🚨 message-delivery: Recovery can resume or fail interrupted main sessions and affects whether follow-up channel messages are delivered after restart.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix live Crabbox proof with a Telegram credential, hard gateway restart scenario, reproduced stale state, and observed successful follow-up delivery.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live Crabbox proof with a Telegram credential, hard gateway restart scenario, reproduced stale state, and observed successful follow-up delivery.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The bug is Telegram-visible because the user-facing proof path is a Telegram message after hard gateway restart, and a short Telegram recording could demonstrate the fixed reply delivery.
Evidence reviewed

PR surface:

Source +217, Tests +357. Total +574 across 4 files.

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 2 | 250 | 33 | +217 |
| Tests | 2 | 357 | 0 | +357 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 4 | 607 | 33 | +574 |

What I checked:

  • PR diff reviewed: The patch adds startup orphan marking before channels start, recovery after startup methods are available, active-run guards, duplicate-store routing checks, and regression coverage for the new recovery paths. (src/agents/main-session-restart-recovery.ts:217, 9c799c541dcf)
  • Current main recovery gap: Current main only recovers running main-session rows when abortedLastRun === true, which matches the linked issue's stale unmarked running plus hasActiveRun:false gap. (src/agents/main-session-restart-recovery.ts:525, 329fa44d23f4)
  • Current main startup ordering: Current main schedules main-session restart recovery as a post-ready sidecar, before the PR's move to schedule it after startup-unavailable gateway methods are cleared. (src/gateway/server-startup-post-attach.ts:950, 329fa44d23f4)
  • Active-run projection checked: sessions.list computes hasActiveRun independently from persisted session status, so current source can represent a persisted running row with no tracked active run. (src/gateway/server-methods/sessions.ts:982, 329fa44d23f4)
  • Linked issue evidence read: The linked reporter added redacted terminal evidence showing no active gateway work, Telegram health OK, a stale direct session row with status: running and hasActiveRun:false, and successful manual recovery with sessions.reset.
  • Real behavior proof reviewed: The PR body reports AWS Crabbox live E2E run run_494e506ae17b with a live Telegram credential and mock OpenAI-compatible model server; the stale running/no-active-run state was reproduced and a follow-up Telegram turn completed without sessions.reset. (9c799c541dcf)
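
The gap described in the checks above can be illustrated with a minimal sketch. It assumes, as the review notes say, that `hasActiveRun` is projected from tracked runs rather than from persisted status, and that recovery on main before this PR only considered rows with `abortedLastRun === true`. The type names, function names, and session keys are hypothetical.

```typescript
interface PersistedSession {
  key: string;
  status: "running" | "done" | "failed";
  abortedLastRun?: boolean;
}

interface ListedSession extends PersistedSession {
  hasActiveRun: boolean;
}

// Projection: hasActiveRun depends only on tracked runs, not on persisted status.
function listSessions(
  rows: PersistedSession[],
  activeRunKeys: ReadonlySet<string>,
): ListedSession[] {
  return rows.map((row) => ({ ...row, hasActiveRun: activeRunKeys.has(row.key) }));
}

// Marker-gated recovery: only running rows that already carry the marker qualify.
function markerGatedCandidates(rows: PersistedSession[]): PersistedSession[] {
  return rows.filter((row) => row.status === "running" && row.abortedLastRun === true);
}

const persisted: PersistedSession[] = [
  { key: "telegram:direct:42", status: "running" }, // hard kill: no marker written
  { key: "telegram:group:7", status: "running", abortedLastRun: true },
];

// After a hard restart, no runs are tracked in the new process.
const listed = listSessions(persisted, new Set<string>());
const stale = listed.filter((row) => row.status === "running" && !row.hasActiveRun);
const recoverable = markerGatedCandidates(persisted);

console.log(stale.length, recoverable.length); // 2 1
```

Both rows are stale, but only the marked one is picked up, which is the unmarked running plus `hasActiveRun:false` case from the linked issue.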

Likely related people:

  • pfrederiksen: Authored recent main-session restart recovery behavior for notifying chat when recovery fails, directly adjacent to this PR's recovery path. (role: feature-history contributor; confidence: medium; commits: cf61b876ec5a; files: src/agents/main-session-restart-recovery.ts)
  • samzong: Authored recent restart-recovery reply delivery work in the same main-session recovery module, making them relevant for recovery-delivery behavior. (role: recent area contributor; confidence: medium; commits: 4decdf6245a1; files: src/agents/main-session-restart-recovery.ts)
  • anyech: Authored topic-suffixed restart lock recovery, an adjacent recovery edge case in the same module. (role: feature-history contributor; confidence: medium; commits: 228e5a238c1d; files: src/agents/main-session-restart-recovery.ts)
  • vincentkoc: Local blame for the current checkout points to a grafted Vincent Koc release-validation commit, and GitHub path history shows recent gateway startup/session-method work by the same contributor. (role: recent gateway area contributor; confidence: medium; commits: 4b55a0e04d41, eb5d6c7294b6, c68291980880; files: src/gateway/server-startup-post-attach.ts, src/gateway/server-methods/sessions.ts)
  • steipete: GitHub path history shows repeated recent commits and merges around gateway startup, session docs, linting, and recovery-adjacent changes in these files. (role: recent area contributor; confidence: medium; commits: 27dde7a4d69b, dc23e924efce, a6ecc4bd89f4; files: src/agents/main-session-restart-recovery.ts, src/gateway/server-startup-post-attach.ts, src/gateway/server-methods/sessions.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added labels on Jun 9, 2026:

  • proof: sufficient (ClawSweeper judged the real behavior proof convincing.)
  • rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.)
  • status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.)
  • mantis: telegram-visible-proof (Mantis should capture Telegram visible proof.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • merge-risk: 🚨 message-delivery (May drop, duplicate, misroute, suppress, or wrongly target messages.)
  • merge-risk: 🚨 session-state (May lose, corrupt, stale, or mis-associate session, agent, or context state.)
@clawsweeper clawsweeper Bot temporarily deployed to qa-live-shared June 9, 2026 02:49 Inactive
@openclaw-mantis


Mantis Telegram Desktop Proof

Summary: Mantis captured native Telegram Desktop before/after GIF evidence with Convex-leased Telegram credentials.

[Images: native Telegram Desktop screenshots and proof GIFs for the baseline (main) and the candidate (this PR), plus motion-trimmed clips.]

Raw QA files: https://artifacts.openclaw.ai/mantis/telegram-desktop/pr-91566/run-27180628783-1/index.json

@joshavant

Contributor Author

Additional scale sanity check from Molty, a long-running high-volume OpenClaw agent:

  • openclaw sessions cleanup --all-agents --dry-run --json: 4 stores, 529 rows before / 503 after preview, 1.54s end-to-end.
  • Separate read-only session-store scan across all found sessions.json stores: 32 stores, 539 rows, ~5.8 MB total store data.
  • Direct store read/parse/filter timing: 18.7 ms total, including 7.8 ms actual file read time.
  • Current states: 253 done, 4 failed, 4 running, 1 timeout, 277 unknown/legacy.
  • SQLite aggregate checks: 5 queries in 24.2 ms.
  • The expensive dry-run cost was artifact discovery: ~9,981 files scanned, 4,832 removable, ~1.77 GB reclaimable.

Takeaway: for this real high-volume instance, startup-wide session reconciliation over persisted session rows appears cheap; the expensive operation is full cleanup/artifact traversal, which this PR does not add. This supports the current startup reconciliation approach as reasonable from a performance standpoint, while still avoiding any per-message hot-path scan.
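
For a sense of why a scan over persisted rows is cheap, here is a minimal sketch of a read-only state count. It is not the script used for the numbers above: `countStates` is a hypothetical helper, and the store layout (a JSON object keyed by session key, with an optional `status` field per row) is an assumption made for the example.

```typescript
type StateCounts = Record<string, number>;

// Count rows by status across several already-read session stores.
// Rows without a string status are grouped as "unknown/legacy".
function countStates(storeJson: string[]): StateCounts {
  const counts: StateCounts = {};
  for (const raw of storeJson) {
    const store = JSON.parse(raw) as Record<string, { status?: unknown }>;
    for (const row of Object.values(store)) {
      const state = typeof row.status === "string" ? row.status : "unknown/legacy";
      counts[state] = (counts[state] ?? 0) + 1;
    }
  }
  return counts;
}

// Two tiny in-memory stores standing in for sessions.json files.
const stores = [
  JSON.stringify({ a: { status: "done" }, b: { status: "running" }, c: {} }),
  JSON.stringify({ d: { status: "done" }, e: { status: "failed" } }),
];

const started = Date.now();
const stateCounts = countStates(stores);
const elapsedMs = Date.now() - started;

console.log(stateCounts, `${elapsedMs} ms`);
```

The work is a single parse and one pass per store with no file discovery or artifact traversal, which matches the observation that the expensive part of the dry run was scanning artifacts rather than reading rows.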

@joshavant joshavant merged commit e1978cf into main Jun 9, 2026
240 of 251 checks passed
@joshavant joshavant deleted the fix/stale-main-session-startup-recovery branch June 9, 2026 05:37
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 10, 2026
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request Jun 12, 2026
…26.6.6) (#1040)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.5` → `2026.6.6` |

---

### Release Notes

<details>
<summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary>

### [`v2026.6.6`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202666)

[Compare Source](openclaw/openclaw@v2026.6.5...v2026.6.6)

##### Highlights

- Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender checks, deleted-agent ACP bypasses, loopback tools, Discord moderation, and Teams group actions; exec approvals now fail closed on timeout. ([#&#8203;91529](openclaw/openclaw#91529), [#&#8203;91618](openclaw/openclaw#91618), [#&#8203;91615](openclaw/openclaw#91615), [#&#8203;91619](openclaw/openclaw#91619), [#&#8203;91741](openclaw/openclaw#91741), [#&#8203;91745](openclaw/openclaw#91745), [#&#8203;91746](openclaw/openclaw#91746), [#&#8203;91748](openclaw/openclaw#91748), [#&#8203;91749](openclaw/openclaw#91749), [#&#8203;91750](openclaw/openclaw#91750), [#&#8203;91751](openclaw/openclaw#91751), [#&#8203;91752](openclaw/openclaw#91752), [#&#8203;91763](openclaw/openclaw#91763), [#&#8203;89938](openclaw/openclaw#89938)) Thanks [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;mmaps](https://github.com/mmaps), [@&#8203;eleqtrizit](https://github.com/eleqtrizit), [@&#8203;shakkernerd](https://github.com/shakkernerd), and [@&#8203;drobison00](https://github.com/drobison00).
- Telegram delivery is safer and more coherent: account-scoped topics route to the right agent, streamed text survives tool calls, `/compact` works on generic ingress, callback handling uses concrete APIs, draft chunking is shared, durable dispatch dedupe moved into the SDK, and unauthorized DM text stays out of cache and prompt context. ([#&#8203;91189](openclaw/openclaw#91189), [#&#8203;88682](openclaw/openclaw#88682), [#&#8203;89588](openclaw/openclaw#89588), [#&#8203;90212](openclaw/openclaw#90212), [#&#8203;91876](openclaw/openclaw#91876), [#&#8203;91874](openclaw/openclaw#91874), [#&#8203;91904](openclaw/openclaw#91904), [#&#8203;91478](openclaw/openclaw#91478), [#&#8203;91915](openclaw/openclaw#91915)) Thanks [@&#8203;codysai001](https://github.com/codysai001), [@&#8203;alexzhu0](https://github.com/alexzhu0), [@&#8203;joelnishanth](https://github.com/joelnishanth), [@&#8203;snowzlm](https://github.com/snowzlm), [@&#8203;obviyus](https://github.com/obviyus), and [@&#8203;sallyom](https://github.com/sallyom).
- iMessage recovery and delivery now cover always-on inbound restart, durable echo markers, block streaming, idle approval discovery, hardened outbound transport, and actionable inbound startup diagnostics. ([#&#8203;91335](openclaw/openclaw#91335), [#&#8203;91449](openclaw/openclaw#91449), [#&#8203;88969](openclaw/openclaw#88969), [#&#8203;88530](openclaw/openclaw#88530), [#&#8203;91783](openclaw/openclaw#91783), [#&#8203;91785](openclaw/openclaw#91785)) Thanks [@&#8203;omarshahine](https://github.com/omarshahine), [@&#8203;jmissig](https://github.com/jmissig), and [@&#8203;colmbrogan](https://github.com/colmbrogan).
- Browser and MCP connectivity gained existing-session CDP support, discovered WebSocket validation, default-profile `cdpUrl` handling, safer browser-output boundaries, Streamable HTTP loopback transport, corrected OAuth/SSE authorization handling, and broader schema compatibility. ([#&#8203;91422](openclaw/openclaw#91422), [#&#8203;89851](openclaw/openclaw#89851), [#&#8203;91736](openclaw/openclaw#91736), [#&#8203;91747](openclaw/openclaw#91747), [#&#8203;91451](openclaw/openclaw#91451), [#&#8203;80143](openclaw/openclaw#80143)) Thanks [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia), [@&#8203;lifuyue](https://github.com/lifuyue), [@&#8203;eleqtrizit](https://github.com/eleqtrizit), [@&#8203;LiuwqGit](https://github.com/LiuwqGit), and [@&#8203;HemantSudarshan](https://github.com/HemantSudarshan).
- Control UI startup and first-reply latency are lower through cached model metadata, removal of the startup catalog wait, lazy slash-command loading, and first-event tracing with slow-reply diagnostics. ([#&#8203;91531](openclaw/openclaw#91531), [#&#8203;91538](openclaw/openclaw#91538), [#&#8203;91568](openclaw/openclaw#91568), [#&#8203;91583](openclaw/openclaw#91583), [#&#8203;91598](openclaw/openclaw#91598))
- Provider support expands with OpenRouter OAuth onboarding and Claude Fable 5 adaptive thinking, while Codex sessions keep correct compaction ownership, local models skip guardian review, dynamic tool progress normalizes cleanly, and Gemma 4 reasoning replay is preserved. ([#&#8203;91830](openclaw/openclaw#91830), [#&#8203;91882](openclaw/openclaw#91882), [#&#8203;91590](openclaw/openclaw#91590), [#&#8203;88630](openclaw/openclaw#88630), [#&#8203;88768](openclaw/openclaw#88768), [#&#8203;91696](openclaw/openclaw#91696)) Thanks [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;bdjben](https://github.com/bdjben), and [@&#8203;Coder-Wangyankun](https://github.com/Coder-Wangyankun).

##### Changes

- CLI progress: emit Claude CLI commentary progress events and bridge inter-tool commentary into channel progress without exposing internal protocol scaffolding. ([#&#8203;89834](openclaw/openclaw#89834), [#&#8203;90883](openclaw/openclaw#90883)) Thanks [@&#8203;anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia).
- Observability: allow trusted diagnostics channels to capture tool input/output content, add first-assistant-event traces, and warn on slow initial replies. ([#&#8203;91256](openclaw/openclaw#91256), [#&#8203;91568](openclaw/openclaw#91568), [#&#8203;91583](openclaw/openclaw#91583)) Thanks [@&#8203;amknight](https://github.com/amknight).
- Plugins/ClawHub: dogfood reusable package publishing, let dry runs skip publish approval, allow declared installed trusted hooks, report managed plugin version drift, and warn instead of failing on retired Skill Workshop configuration. ([#&#8203;91574](openclaw/openclaw#91574), [#&#8203;91591](openclaw/openclaw#91591), [#&#8203;90004](openclaw/openclaw#90004), [#&#8203;90927](openclaw/openclaw#90927), [#&#8203;90838](openclaw/openclaw#90838)) Thanks [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@&#8203;brokemac79](https://github.com/brokemac79), and [@&#8203;lonexreb](https://github.com/lonexreb).
- Memory/providers: move the local llama.cpp runtime into its provider plugin, batch embeddings across files, persist the agent model catalog cache, and keep QMD JSON search one-shot while filtering stale REM recall previews. ([#&#8203;91324](openclaw/openclaw#91324), [#&#8203;89138](openclaw/openclaw#89138), [#&#8203;90457](openclaw/openclaw#90457), [#&#8203;91837](openclaw/openclaw#91837), [#&#8203;91851](openclaw/openclaw#91851)) Thanks [@&#8203;osolmaz](https://github.com/osolmaz), [@&#8203;mushuiyu886](https://github.com/mushuiyu886), [@&#8203;ai-hpc](https://github.com/ai-hpc), and [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- Channels/mobile: add the QQBot group mention toggle, improve iPad and iPhone control surfaces, and expose the active connection host in the TUI footer. ([#&#8203;91423](openclaw/openclaw#91423), [#&#8203;91557](openclaw/openclaw#91557), [#&#8203;89909](openclaw/openclaw#89909)) Thanks [@&#8203;cxyhhhhh](https://github.com/cxyhhhhh), [@&#8203;Solvely-Colin](https://github.com/Solvely-Colin), and [@&#8203;baskduf](https://github.com/baskduf).
- Performance: prewarm TUI runtime plugins, deduplicate plugin auto-enable fanout, trim dense text-delta snapshots, and reuse prepared startup model metadata. ([#&#8203;90782](openclaw/openclaw#90782), [#&#8203;89978](openclaw/openclaw#89978), [#&#8203;91580](openclaw/openclaw#91580), [#&#8203;91531](openclaw/openclaw#91531)) Thanks [@&#8203;RomneyDa](https://github.com/RomneyDa) and [@&#8203;ai-hpc](https://github.com/ai-hpc).

##### Fixes

- Agent/session recovery: drop stale approval follow-ups after session rebind, remove drained reply-queue items by identity, recover stale main and visible replies, preserve Codex context-engine compaction ownership, lower the default compaction timeout to 180 seconds while respecting explicit configuration, and keep provider-failure terminal lifecycle state correct. ([#&#8203;85679](openclaw/openclaw#85679), [#&#8203;91450](openclaw/openclaw#91450), [#&#8203;91566](openclaw/openclaw#91566), [#&#8203;91840](openclaw/openclaw#91840), [#&#8203;91590](openclaw/openclaw#91590), [#&#8203;91361](openclaw/openclaw#91361), [#&#8203;91895](openclaw/openclaw#91895)) Thanks [@&#8203;openperf](https://github.com/openperf), [@&#8203;yetval](https://github.com/yetval), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;wangmiao0668000666](https://github.com/wangmiao0668000666), and [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- User-visible content boundaries: suppress Codex/Harmony protocol artifacts, neutralize browser and LanceDB memory media directives, redact transcript images, and preserve native `/compact` replies through source suppression. ([#&#8203;89151](openclaw/openclaw#89151), [#&#8203;91422](openclaw/openclaw#91422), [#&#8203;91425](openclaw/openclaw#91425), [#&#8203;91529](openclaw/openclaw#91529), [#&#8203;90212](openclaw/openclaw#90212)) Thanks [@&#8203;joelnishanth](https://github.com/joelnishanth), [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;joshavant](https://github.com/joshavant), and [@&#8203;snowzlm](https://github.com/snowzlm).
- Channel delivery: keep WhatsApp captured replies attached to the successor controller after restart, retry Feishu rate limits, preserve Mattermost thread replies, canonicalize LINE webhook paths, restore Discord reply hydration and runtime timeout exports, and show OpenAI Realtime WebRTC assistant transcripts. ([#&#8203;85823](openclaw/openclaw#85823), [#&#8203;89659](openclaw/openclaw#89659), [#&#8203;91684](openclaw/openclaw#91684), [#&#8203;91649](openclaw/openclaw#91649), [#&#8203;90263](openclaw/openclaw#90263), [#&#8203;91686](openclaw/openclaw#91686), [#&#8203;90426](openclaw/openclaw#90426)) Thanks [@&#8203;itsuzef](https://github.com/itsuzef), [@&#8203;ladygege](https://github.com/ladygege), [@&#8203;jacobtomlinson](https://github.com/jacobtomlinson), [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev), and [@&#8203;shushushv](https://github.com/shushushv).
- Cron: cancel active task runs cleanly, preserve terminal timeout/cancel state, and recover no-deliver tool warnings instead of silently losing the outcome. ([#&#8203;90666](openclaw/openclaw#90666), [#&#8203;90678](openclaw/openclaw#90678)) Thanks [@&#8203;ai-hpc](https://github.com/ai-hpc).
- Gateway/config/auth: share the approval runtime socket token, replace arrays explicitly in `config.patch`, skip the deleted-agent guard only for valid ACP harness sessions, surface headless LaunchAgent state, verify SQLite auth migration before cleanup, and arm QMD startup maintenance. ([#&#8203;87105](openclaw/openclaw#87105), [#&#8203;91551](openclaw/openclaw#91551), [#&#8203;91219](openclaw/openclaw#91219), [#&#8203;91614](openclaw/openclaw#91614), [#&#8203;91740](openclaw/openclaw#91740), [#&#8203;91978](openclaw/openclaw#91978)) Thanks [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev) and [@&#8203;scotthuang](https://github.com/scotthuang).
- Providers/Codex: clarify quota errors, restore the Codex synthetic usage line, canonicalize Codex protocol assets, require API-key auth for realtime voice, normalize ACP model refs, preserve Gemma 4 `reasoning_content`, and avoid guardian review for local models. ([#&#8203;91390](openclaw/openclaw#91390), [#&#8203;91709](openclaw/openclaw#91709), [#&#8203;91507](openclaw/openclaw#91507), [#&#8203;91567](openclaw/openclaw#91567), [#&#8203;88630](openclaw/openclaw#88630), [#&#8203;91696](openclaw/openclaw#91696)) Thanks [@&#8203;hxy91819](https://github.com/hxy91819), [@&#8203;brokemac79](https://github.com/brokemac79), [@&#8203;RomneyDa](https://github.com/RomneyDa), [@&#8203;joshavant](https://github.com/joshavant), and [@&#8203;Coder-Wangyankun](https://github.com/Coder-Wangyankun).
- Updates/builds: recover package Gateway restarts after refresh failure, expose plugin convergence repair, fall back to Corepack in PATH-less pnpm environments, seed the correct Docker store packages, and keep ClawHub dry-run and publish paths reusable. ([#&#8203;91581](openclaw/openclaw#91581), [#&#8203;91599](openclaw/openclaw#91599), [#&#8203;91547](openclaw/openclaw#91547), [#&#8203;91591](openclaw/openclaw#91591)) Thanks [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev), [@&#8203;sallyom](https://github.com/sallyom), and [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen).
- UI: require explicit user intent before opening chat sessions and drain restored chat queues after session switches. ([#&#8203;91480](openclaw/openclaw#91480)) Thanks [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- Android: avoid the `dataSync` foreground-service type for persistent nodes. ([#&#8203;80082](openclaw/openclaw#80082)) Thanks [@&#8203;davelutztx](https://github.com/davelutztx).
- Native hooks: bound relay lifetimes so abandoned native hook connections cannot linger indefinitely. ([#&#8203;91550](openclaw/openclaw#91550)) Thanks [@&#8203;joshavant](https://github.com/joshavant).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19-->

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/1040

Development

Successfully merging this pull request may close these issues.

Telegram direct session remains running after restart with hasActiveRun=false

1 participant