
fix(codex): recover raw missing-thread compaction failures #87738

Merged
steipete merged 1 commit into openclaw:main from pfrederiksen:fix-codex-unstructured-thread-not-found
May 28, 2026

Conversation

@pfrederiksen

@pfrederiksen pfrederiksen commented May 28, 2026

Contributor

Summary

Fixes #87736.

This is a narrow follow-up to #86211 / #86602. The existing recovery path handles structured Codex native harness binding failures when failure.reason is missing_thread_binding or stale_thread_binding. A live regression on OpenClaw 2026.5.27 still surfaced the raw preflight error:

Preflight compaction required but failed: thread not found: <codex-thread-id>

This patch classifies the raw reason text for the same missing/stale binding shapes as recoverable too, so preflight compaction continues into the existing recovery path instead of surfacing a generic Telegram failure.

User-visible problem

A Telegram group inbound was accepted, but dispatch failed before a normal assistant turn. The user saw the generic channel failure message rather than automatic session recovery. The session later rotated/recovered and subsequent dispatches worked, which matches the same operational class as #86211.

All issue/PR evidence is sanitized: no private chat ids, message ids, session ids, bot handles, or raw Codex thread ids.

Implementation

  • Adds a shared native harness recovery classifier in src/agents/harness/compaction-recovery.ts.
  • Classifies both structured binding reasons and raw missing-thread reason strings as recoverable.
  • Uses the shared classifier from auto-reply preflight compaction, CLI compaction, and queued embedded compaction so sibling Codex recovery paths handle the same raw result shape.
  • Adds regression coverage for:
    • auto-reply preflight { ok:false, compacted:false, reason:"thread not found: <codex-thread-id>" }
    • CLI native compaction fallback when the harness returns a raw top-level thread not found reason with no failure.reason.
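
A minimal sketch of what such a shared classifier could look like follows. The helper name isRecoverableNativeHarnessBindingReason appears in the review of this PR; the result type, the regex patterns, and the isRecoverableCompactionResult wrapper are assumptions for illustration and may differ from the code in src/agents/harness/compaction-recovery.ts.

```typescript
// Hypothetical result shape, modelled on the examples quoted in this PR.
type CompactionResult = {
  ok: boolean;
  compacted: boolean;
  reason?: string;
  failure?: { reason?: string };
};

// Structured binding reasons named in the PR description.
const STRUCTURED_BINDING_REASONS = new Set([
  "missing_thread_binding",
  "stale_thread_binding",
]);

// Raw reason text named in the review; the exact patterns are an assumption.
const RAW_BINDING_REASON_PATTERNS = [
  /^thread not found\b/i,
  /^no thread binding\b/i,
];

function isRecoverableNativeHarnessBindingReason(
  reason: string | undefined,
): boolean {
  if (!reason) return false;
  const normalized = reason.trim();
  if (STRUCTURED_BINDING_REASONS.has(normalized)) return true;
  return RAW_BINDING_REASON_PATTERNS.some((pattern) => pattern.test(normalized));
}

// Hypothetical wrapper: checks the structured field first, then the raw
// top-level reason, so both result shapes reach the same recovery path.
function isRecoverableCompactionResult(result: CompactionResult): boolean {
  if (result.ok) return false;
  return (
    isRecoverableNativeHarnessBindingReason(result.failure?.reason) ||
    isRecoverableNativeHarnessBindingReason(result.reason)
  );
}
```

Anchoring the raw patterns at the start of the string keeps the match narrow: an unrelated error that merely mentions a missing thread somewhere in its message would still fail closed under this sketch.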

Real behavior proof

  • Behavior addressed: Raw missing-thread preflight compaction results should be treated as recoverable, matching the structured missing_thread_binding / stale_thread_binding recovery path from fix(codex): recover stale preflight bindings #86602.
  • Real environment tested: Local OpenClaw source checkout for this PR branch on Linux with Node 22.22.2, using the real runPreflightCompactionIfNeeded runtime helper from src/auto-reply/reply/agent-runner-memory.ts.
  • Exact steps or command run after this patch: Ran a local Node/tsx runtime invocation that created a redacted Telegram-group session entry, invoked runPreflightCompactionIfNeeded, and returned a raw compaction result shaped as { ok:false, compacted:false, reason:"thread not found: <codex-thread-id>" }.
  • Evidence after fix:
{
  "proof": "raw thread-not-found preflight compaction is recoverable",
  "returnedSameSessionEntry": true,
  "compactCalls": 1,
  "incrementCalls": 0,
  "threw": false
}
  • Observed result after fix: The runtime helper returned the active session entry and did not throw Preflight compaction required but failed: thread not found: <codex-thread-id>.
  • What was not tested: Live Telegram delivery against a production gateway running this branch was not tested; the proof covers the runtime recovery branch directly with sanitized local session state.
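
The proof shape above can be approximated with a self-contained sketch. Everything in it is hypothetical: runPreflightSketch stands in for the real runPreflightCompactionIfNeeded (whose signature is not shown in this PR), the calls are synchronous for brevity, and incrementCalls is assumed to count compaction-counter updates.

```typescript
type SessionEntry = { sessionKey: string };
type CompactionResult = {
  ok: boolean;
  compacted: boolean;
  reason?: string;
  failure?: { reason?: string };
};
type PreflightDeps = {
  compact: () => CompactionResult;
  incrementCompactionCount: () => void;
};

// Assumed stand-in for the shared recovery classifier.
function isRecoverableBindingFailure(result: CompactionResult): boolean {
  const reason = (result.failure?.reason ?? result.reason ?? "").trim();
  return (
    reason === "missing_thread_binding" ||
    reason === "stale_thread_binding" ||
    /^thread not found\b/i.test(reason)
  );
}

function runPreflightSketch(entry: SessionEntry, deps: PreflightDeps): SessionEntry {
  const result = deps.compact();
  if (result.ok) {
    if (result.compacted) deps.incrementCompactionCount();
    return entry;
  }
  if (isRecoverableBindingFailure(result)) {
    // Recoverable: hand the unchanged entry on to the existing recovery path.
    return entry;
  }
  // Anything else still fails closed with the preflight error.
  throw new Error(
    `Preflight compaction required but failed: ${result.reason ?? "unknown"}`,
  );
}

const entry: SessionEntry = { sessionKey: "<redacted-session>" };
let compactCalls = 0;
let incrementCalls = 0;
let threw = false;
let returned: SessionEntry | undefined;

try {
  returned = runPreflightSketch(entry, {
    compact: () => {
      compactCalls += 1;
      return {
        ok: false,
        compacted: false,
        reason: "thread not found: <codex-thread-id>",
      };
    },
    incrementCompactionCount: () => {
      incrementCalls += 1;
    },
  });
} catch {
  threw = true;
}

const sketchProof = {
  returnedSameSessionEntry: returned === entry,
  compactCalls,
  incrementCalls,
  threw,
};
console.log(JSON.stringify(sketchProof));
```

The sketch only illustrates the branch being exercised; it does not reproduce the session store, Telegram routing, or the rotation that the real recovery path performs afterwards.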

Validation

Passed on rebased head 24ea4a100b2e0877c4eec97e7e3eca2ebbce4cce:

node --import tsx --input-type=module -e "await import('./src/agents/harness/compaction-recovery.ts'); console.log('import-ok')"
pnpm exec oxlint src/agents/harness/compaction-recovery.ts src/auto-reply/reply/agent-runner-memory.ts src/auto-reply/reply/agent-runner-memory.test.ts src/agents/command/cli-compaction.ts src/agents/command/cli-compaction.test.ts src/agents/embedded-agent-runner/compact.queued.ts
pnpm exec oxfmt --check src/agents/harness/compaction-recovery.ts src/auto-reply/reply/agent-runner-memory.ts src/auto-reply/reply/agent-runner-memory.test.ts src/agents/command/cli-compaction.ts src/agents/command/cli-compaction.test.ts src/agents/embedded-agent-runner/compact.queued.ts
git diff --check

Attempted but not completed locally:

node scripts/run-vitest.mjs src/auto-reply/reply/agent-runner-memory.test.ts -t 'unstructured thread-not-found' --run

The Vitest process stayed CPU-bound without reporter output in this fresh worktree even with a timeout. I stopped treating the local wrapper as authoritative and pushed the branch for CI to exercise the included regression tests.

@openclaw-barnacle openclaw-barnacle Bot added size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 28, 2026
@clawsweeper

clawsweeper Bot commented May 28, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 4:04 PM ET / 20:04 UTC.

Summary
The PR adds a shared native harness compaction recovery classifier, applies it to auto-reply, CLI, and queued embedded compaction, and adds raw missing-thread regression coverage.

PR surface: Source +5, Tests +131. Total +136 across 6 files.

Reproducibility: yes. Current main only recovers structured failure.reason binding failures, so a raw top-level thread not found compaction result reaches the preflight throw path; I did not execute tests because this review is read-only.

Review metrics: 2 noteworthy metrics.

  • Recovery paths updated: 1 helper added, 3 call sites switched. The prior concern was one-sided recovery; applying the same classifier to auto-reply, CLI, and queued embedded compaction reduces sibling-surface drift.
  • Regression cases added: 2 focused tests added. The new coverage targets the raw preflight failure and the sibling CLI raw fallback shape that current main did not cover.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Wait for the remaining relevant check run to finish and have a maintainer accept the raw reason-string recovery scope.

Risk before merge

  • [P1] The PR intentionally changes raw top-level thread not found / no thread binding native harness compaction reasons from fail-closed errors into recovery or fallback, so maintainers should accept that as the session-state contract before merge.
  • [P1] Live Telegram delivery was not exercised on this branch; the supplied proof covers the runtime recovery helper and CI-covered regression paths with sanitized local session state.

Maintainer options:

  1. Accept Bounded Raw-Recovery Scope (recommended)
    Merge once required checks are green if maintainers are comfortable treating raw native harness thread not found and no thread binding reasons as stale binding state.
  2. Tighten The Classifier First
    Restrict raw-string recovery to the Codex/native-harness producer or add focused proof that unrelated context-engine/plugin errors with similar text still fail closed.
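
Option 2 could be sketched roughly as follows. The producer field and its values are hypothetical; nothing in this PR shows that compaction failures carry a producer tag, so treat this as one possible shape rather than the project's API.

```typescript
// Hypothetical: tag each failure with the component that produced it.
type FailureProducer = "codex-native-harness" | "context-engine" | "plugin";

type ProducedFailure = {
  producer: FailureProducer;
  reason: string;
};

// Raw reason text named in the review; the exact pattern is an assumption.
const RAW_BINDING_REASON = /^(thread not found|no thread binding)\b/i;

function isRecoverableForProducer(failure: ProducedFailure): boolean {
  // Only the native harness may have its raw text treated as stale binding state.
  if (failure.producer !== "codex-native-harness") return false;
  return RAW_BINDING_REASON.test(failure.reason.trim());
}

const sampleFailures: ProducedFailure[] = [
  { producer: "codex-native-harness", reason: "thread not found: <codex-thread-id>" },
  { producer: "plugin", reason: "thread not found: <plugin-thread-id>" },
  { producer: "context-engine", reason: "no thread binding" },
];

for (const failure of sampleFailures) {
  console.log(failure.producer, isRecoverableForProducer(failure));
}
```

Under this sketch, similar text from a plugin or the context engine stays fail-closed, which is the property the second option asks for.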

Next step before merge

  • [P2] Human review should accept the session-state recovery contract and final check status; there is no narrow automated repair left from this review.

Security
Cleared: The diff adds a small TypeScript classifier and tests, with no dependency, workflow, credential, package, or secret-handling changes.

Review details

Best possible solution:

Land the shared classifier once maintainers accept bounded raw missing-thread recovery and required checks are green, while keeping unrelated compaction failures fail-closed.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main only recovers structured failure.reason binding failures, so a raw top-level thread not found compaction result reaches the preflight throw path; I did not execute tests because this review is read-only.

Is this the best way to solve the issue?

Yes, with maintainer acceptance of the raw reason-string boundary. Sharing one classifier across auto-reply, CLI, and queued compaction is narrower than keeping three divergent recovery checks.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 409356fc666e.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live output from a local runtime invocation of the real preflight helper showing the raw missing-thread result no longer throws; live Telegram delivery remains explicitly untested.

Label justifications:

  • P1: The PR targets a user-facing regression where accepted Telegram group messages can fail before an assistant turn because Codex compaction sees stale session state.
  • merge-risk: 🚨 session-state: Merging changes which stale Codex thread/session binding failures recover instead of surfacing as compaction errors.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix live output from a local runtime invocation of the real preflight helper showing the raw missing-thread result no longer throws; live Telegram delivery remains explicitly untested.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live output from a local runtime invocation of the real preflight helper showing the raw missing-thread result no longer throws; live Telegram delivery remains explicitly untested.
Evidence reviewed

PR surface:

Source +5, Tests +131. Total +136 across 6 files.

View PR surface stats
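| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 4 | 29 | 24 | +5 |
| Tests | 2 | 131 | 0 | +131 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 6 | 160 | 24 | +136 |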

What I checked:

  • Root and scoped policy read: Root AGENTS.md and src/agents/AGENTS.md were read; the review applied the session-state compatibility and sibling-surface guidance to this compaction recovery PR. (AGENTS.md:19, 409356fc666e)
  • Current main fails raw top-level reasons: On current main, auto-reply preflight recovery only treats structured failure.reason values as recoverable, so a raw top-level reason: "thread not found: ..." reaches the preflight failure throw. (src/auto-reply/reply/agent-runner-memory.ts:132, 409356fc666e)
  • Sibling CLI path was structured-only on main: The CLI native compaction fallback path on main also only recognized structured missing/stale binding failures before this PR. (src/agents/command/cli-compaction.ts:177, 409356fc666e)
  • Queued embedded compaction was structured-only on main: The queued embedded compaction fallback path only fell back after structured missing_thread_binding or stale_thread_binding results, matching the sibling gap called out in the earlier review. (src/agents/embedded-agent-runner/compact.queued.ts:53, 409356fc666e)
  • PR adds shared raw-reason classifier: The PR adds isRecoverableNativeHarnessBindingReason and classifies the existing structured codes plus raw thread not found / no thread binding reason text, then uses the helper from all three touched compaction paths. (src/agents/harness/compaction-recovery.ts:3, 24ea4a100b2e)
  • PR adds focused regression coverage: The diff adds auto-reply coverage for an unstructured raw preflight compaction failure and CLI coverage for a raw top-level missing-thread native compaction result with no structured failure reason. (src/auto-reply/reply/agent-runner-memory.test.ts:929, 24ea4a100b2e)
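
The CLI fallback behaviour checked above can be illustrated with a small sketch. The function names, the synchronous signatures, and the result type are assumptions for illustration; the real logic lives in src/agents/command/cli-compaction.ts and is not reproduced here.

```typescript
type CompactResult = {
  ok: boolean;
  compacted: boolean;
  reason?: string;
  failure?: { reason?: string };
};

const STRUCTURED_REASONS = ["missing_thread_binding", "stale_thread_binding"];

// Assumed stand-in for the shared classifier: looks at both the structured
// failure.reason and the raw top-level reason.
function isStaleBinding(result: CompactResult): boolean {
  const reasons = [result.failure?.reason, result.reason];
  return reasons.some(
    (reason) =>
      reason !== undefined &&
      (STRUCTURED_REASONS.includes(reason) || /^thread not found\b/i.test(reason)),
  );
}

function compactWithFallback(
  nativeCompact: () => CompactResult,
  contextEngineCompact: () => CompactResult,
): { result: CompactResult; usedFallback: boolean } {
  const native = nativeCompact();
  if (native.ok || !isStaleBinding(native)) {
    // Success, or a failure that must stay visible to the caller.
    return { result: native, usedFallback: false };
  }
  // Stale or missing binding: fall back to context-engine compaction.
  return { result: contextEngineCompact(), usedFallback: true };
}

// Raw top-level reason with no failure.reason, as in the CLI regression case.
const rawOutcome = compactWithFallback(
  () => ({ ok: false, compacted: false, reason: "thread not found: <codex-thread-id>" }),
  () => ({ ok: true, compacted: true }),
);
console.log(rawOutcome.usedFallback, rawOutcome.result.ok);
```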

Likely related people:

  • steipete: Merged commit 9b9d897 introduced the structured stale/missing binding recovery lineage, and current blame/log history for the central compaction files points to Peter Steinberger as the main recent contributor in this checked-out history. (role: prior recovery path author and recent area contributor; confidence: high; commits: 9b9d8970b0ce, f0bfa650dc90; files: src/auto-reply/reply/agent-runner-memory.ts, src/agents/command/cli-compaction.ts, src/agents/embedded-agent-runner/compact.queued.ts)
  • pfrederiksen: The prior merged recovery commit records Paul Frederiksen as co-author, and the related open report and this PR provide the sanitized reproduction/proof for the raw missing-thread follow-up. (role: prior recovery co-author and regression reporter; confidence: medium; commits: 9b9d8970b0ce, 24ea4a100b2e; files: src/auto-reply/reply/agent-runner-memory.test.ts, src/agents/command/cli-compaction.test.ts, src/agents/harness/compaction-recovery.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 28, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. labels May 28, 2026
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 7fbf91f to 4d20fef Compare May 28, 2026 18:03
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 4d20fef to 972bc11 Compare May 28, 2026 18:12
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 972bc11 to 354bb1d Compare May 28, 2026 18:36
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 28, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 354bb1d to 79904ce Compare May 28, 2026 19:07
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 79904ce to d85878a Compare May 28, 2026 19:45
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@pfrederiksen

Contributor Author

@clawsweeper re-review

Addressed the P1/P2 review items on the latest head d85878a:

  • CLI sibling path is now covered: the recoverable native harness binding classifier is shared by auto-reply preflight, CLI compaction, and queued embedded compaction.
  • Added CLI regression coverage for a raw top-level reason: "thread not found: ..." result with no failure.reason, asserting context-engine fallback still runs.
  • PR body validation/proof notes were updated for the current scoped diff.
  • Rebased onto fresh origin/main; PR is mergeable.

Local gates passed:

node --import tsx --input-type=module -e "await import('./src/agents/harness/compaction-recovery.ts'); console.log('import-ok')"
pnpm exec oxlint src/agents/harness/compaction-recovery.ts src/auto-reply/reply/agent-runner-memory.ts src/auto-reply/reply/agent-runner-memory.test.ts src/agents/command/cli-compaction.ts src/agents/command/cli-compaction.test.ts src/agents/embedded-agent-runner/compact.queued.ts
pnpm exec oxfmt --check src/agents/harness/compaction-recovery.ts src/auto-reply/reply/agent-runner-memory.ts src/auto-reply/reply/agent-runner-memory.test.ts src/agents/command/cli-compaction.ts src/agents/command/cli-compaction.test.ts src/agents/embedded-agent-runner/compact.queued.ts
git diff --check

@clawsweeper

clawsweeper Bot commented May 28, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.


@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from d85878a to 24d0bf8 Compare May 28, 2026 19:51
@pfrederiksen

Contributor Author

@clawsweeper re-review

Follow-up for the new head 24d0bf87651bb090936e7524297b70c8e47e0dbb:

  • Kept the P1/P2 compaction fixes from the prior head.
  • Addressed the current CI / check-test-types failure by changing src/infra/bonjour-discovery.test.ts to use the exported GatewayBonjourBeacon type instead of its narrower local BeaconRecord alias.
  • Re-ran local fast gates for all touched files: import sanity for the shared recovery helper, oxlint, oxfmt --check, and git diff --check.

The local tsgo lane itself stayed CPU-bound in this worktree, so CI is the authoritative proof for check-test-types on this head.

@clawsweeper

clawsweeper Bot commented May 28, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.


@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from 24d0bf8 to d081799 Compare May 28, 2026 19:55
@pfrederiksen pfrederiksen force-pushed the fix-codex-unstructured-thread-not-found branch from d081799 to 24ea4a1 Compare May 28, 2026 19:56
@pfrederiksen

Contributor Author

@clawsweeper re-review

New head 24ea4a100b2e0877c4eec97e7e3eca2ebbce4cce is rebased onto fresh origin/main.

  • The prior upstream bonjour-discovery.test.ts check-test-types failure is now fixed in base, so the PR is back to the scoped 6 compaction files.
  • P1/P2 fixes remain: shared native harness binding recovery helper is used by auto-reply, CLI compaction, and queued embedded compaction; CLI raw top-level thread not found fallback regression is included.
  • Local fast gates passed again: import sanity, oxlint, oxfmt --check, and git diff --check on touched files.

@clawsweeper

clawsweeper Bot commented May 28, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.


@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026
@steipete steipete self-assigned this May 28, 2026
@steipete

Contributor

Landing verification for PR #87738.

Behavior addressed: Codex stale/missing native thread binding failures can surface as raw thread not found: <id> compaction failures; preflight and CLI compaction now treat that binding class as recoverable instead of surfacing a reply failure.
Real environment tested: GitHub CI on PR head 24ea4a100b2e0877c4eec97e7e3eca2ebbce4cce, plus local source review against OpenClaw and sibling Codex app-server source.
Exact steps or command run after this patch: git status -sb; git diff --check origin/main...refs/remotes/pr/87738; gitcrawl gh pr checks 87738 --repo openclaw/openclaw --watch=false; source-read Codex thread/compact/start and load_thread path.
Evidence after fix: CI passed check-lint, check-test-types, checks-node-agentic-agents, checks-node-agentic-cli, checks-node-auto-reply-reply-agent-runner, security lanes, and Real behavior proof. Codex app-server source maps missing compact thread IDs to thread not found: {thread_id}, matching the newly covered raw failure path.
Observed result after fix: no blocking findings; PR is MERGEABLE / CLEAN and ready to land.
What was not tested: no additional local Vitest run beyond the passing PR CI lanes.

@steipete steipete merged commit e69855e into openclaw:main May 28, 2026
120 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 29, 2026
…87738)

Recover Codex compaction paths when a stale app-server thread binding returns an unstructured `thread not found` failure. The raw missing-thread response now shares the same recovery behavior as structured missing/stale binding failures for preflight, queued compaction, and CLI fallback.

Fixes openclaw#87736.

Co-authored-by: Paul Frederiksen <paul@paulfrederiksen.com>
steipete pushed a commit that referenced this pull request May 29, 2026
Recover Codex compaction paths when a stale app-server thread binding returns an unstructured `thread not found` failure. The raw missing-thread response now shares the same recovery behavior as structured missing/stale binding failures for preflight, queued compaction, and CLI fallback.

Fixes #87736.

Co-authored-by: Paul Frederiksen <paul@paulfrederiksen.com>
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 31, 2026
…026.5.28) (#759)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.5.27` → `2026.5.28` |

---

### Release Notes

<details>
<summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary>

### [`v2026.5.28`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#2026528)

[Compare Source](openclaw/openclaw@v2026.5.27...v2026.5.28)

##### Highlights

- Agent and Codex runtime recovery is steadier: subagents keep cwd/workspace separation, hook context stays prompt-local, session locks release on timeout abort while live OpenClaw locks survive cleanup, stale restart continuations are avoided, and Codex app-server/helper failures no longer tear down shared runtime state. ([#87218](openclaw/openclaw#87218), [#86875](openclaw/openclaw#86875), [#87409](openclaw/openclaw#87409), [#87399](openclaw/openclaw#87399), [#87375](openclaw/openclaw#87375), [#88129](openclaw/openclaw#88129))
- Channel delivery and session identity got safer across outbound plugin hooks, Matrix room ids, iMessage reactions/approvals, Slack final replies, Discord recovered tool warnings, runtime-config message actions, WhatsApp profile auth roots, Telegram polling, and Microsoft Teams service URL trust checks. ([#73706](openclaw/openclaw#73706), [#75670](openclaw/openclaw#75670), [#87366](openclaw/openclaw#87366), [#87451](openclaw/openclaw#87451), [#87334](openclaw/openclaw#87334), [#84535](openclaw/openclaw#84535), [#82492](openclaw/openclaw#82492), [#83304](openclaw/openclaw#83304), [#87160](openclaw/openclaw#87160))
- Mobile and chat surfaces got a broader refresh: the iOS Pro UI, hosted push relay default, realtime Talk tab playback, Gateway chat transport, onboarding, Talk permissions, WebChat reconnect delivery, and session picker behavior now preserve more state across reconnects and empty searches. ([#87367](openclaw/openclaw#87367), [#87531](openclaw/openclaw#87531), [#87682](openclaw/openclaw#87682), [#88096](openclaw/openclaw#88096), [#88105](openclaw/openclaw#88105)) Thanks [@ngutman](https://github.com/ngutman) and [@BunsDev](https://github.com/BunsDev).
- Browser, channel, and automation inputs are stricter: Browser tool timeouts, viewport/tab indices, Gateway ports, cron retry handling, Discord component ids, schema array refs, Telegram callback pages, and channel progress callbacks now reject malformed values earlier and preserve the intended delivery context. ([#82887](openclaw/openclaw#82887))
- Provider, media, and document coverage expands with Claude Opus 4.8, Fal Krea image schemas, NVIDIA featured models, MiniMax streaming music responses, encrypted PDF extraction, voice model catalogs, GitHub Copilot agent runtime support, and a Codex Supervisor plugin path for delegated Codex workflows. ([#87845](openclaw/openclaw#87845), [#87890](openclaw/openclaw#87890), [#80775](openclaw/openclaw#80775), [#84764](openclaw/openclaw#84764), [#87751](openclaw/openclaw#87751), [#87794](openclaw/openclaw#87794))
- CLI, auth, doctor, and provider paths fail faster and recover more clearly: malformed numeric/version options are rejected, workspace dotenv provider credentials are ignored, heartbeat defaults, OAuth/token lifetimes, and local service startup requests are bounded, agent auth health labels are clearer, legacy `api_key` auth profiles migrate to canonical form, and restart guidance is actionable. ([#87398](openclaw/openclaw#87398), [#86281](openclaw/openclaw#86281), [#87361](openclaw/openclaw#87361), [#88133](openclaw/openclaw#88133), [#83655](openclaw/openclaw#83655), [#87559](openclaw/openclaw#87559), [#88088](openclaw/openclaw#88088), [#85924](openclaw/openclaw#85924)) Thanks [@vincentkoc](https://github.com/vincentkoc) and [@giodl73-repo](https://github.com/giodl73-repo).
- Plugin and Gateway hot paths do less repeated work while preserving cache correctness for install records, config JSON parsing, tool search catalogs, session stores, manifest model rows, auto-enabled plugin config, browser tokens, viewer assets, and release-split external plugin packages. ([#86699](openclaw/openclaw#86699))
- Release, QA, and E2E validation now bound more log, artifact, harness, and cross-OS waits so failing lanes produce proof instead of hanging or false-greening.

##### Changes

- Status: show active subagent details in status output.
- Diffs: split the default language pack and expand default Diffs language coverage while keeping the host floor aligned. ([#87370](openclaw/openclaw#87370), [#87372](openclaw/openclaw#87372)) Thanks [@RomneyDa](https://github.com/RomneyDa).
- ClawHub: add plugin display names plus skill verification and trust surfaces. ([#87354](openclaw/openclaw#87354), [#86699](openclaw/openclaw#86699)) Thanks [@thewilloftheshadow](https://github.com/thewilloftheshadow) and [@Patrick-Erichsen](https://github.com/Patrick-Erichsen).
- iOS: refresh the dev app with Pro Command, Chat, Agents, Settings, hosted push relay defaults, and realtime Talk playback wired to gateway sessions, diagnostics, chat, and realtime Talk. ([#87367](openclaw/openclaw#87367), [#88096](openclaw/openclaw#88096), [#88105](openclaw/openclaw#88105)) Thanks [@Solvely-Colin](https://github.com/Solvely-Colin) and [@ngutman](https://github.com/ngutman).
- Docs: clarify Codex computer-use setup, paste-token stdin auth setup, macOS gateway sleep troubleshooting, native Codex hook relay recovery, container model auth, install deployment cards, device-token admin gating, CLI setup flow compatibility, Notte cloud browser CDP setup, and backport targets. ([#87313](openclaw/openclaw#87313), [#63050](openclaw/openclaw#63050), [#87685](openclaw/openclaw#87685)) Thanks [@bdjben](https://github.com/bdjben), [@liaoandi](https://github.com/liaoandi), and [@thewilloftheshadow](https://github.com/thewilloftheshadow).
- PDF/tools: use ClawPDF for PDF extraction, support encrypted PDF extraction, and surface MCP structured content in agent tool results. ([#87670](openclaw/openclaw#87670), [#87751](openclaw/openclaw#87751))
- Providers: add Claude Opus 4.8 support, Fal Krea image model schemas, NVIDIA featured model catalogs, MiniMax streaming music responses, and provider-backed voice model catalogs. ([#87845](openclaw/openclaw#87845), [#87890](openclaw/openclaw#87890), [#80775](openclaw/openclaw#80775), [#84764](openclaw/openclaw#84764), [#87794](openclaw/openclaw#87794)) Thanks [@eleqtrizit](https://github.com/eleqtrizit) and [@vincentkoc](https://github.com/vincentkoc).
- Codex/GitHub: add the GitHub Copilot agent runtime and the Codex Supervisor plugin package.
- Plugins: externalize GitHub Copilot and Tokenjuice as official install-on-demand plugins with npm and ClawHub publish metadata.
- Workboard: add agent coordination tools for tracking and handing off active agent work.
- Discord: show commentary in progress drafts so live Discord runs expose useful in-progress context. ([#85200](openclaw/openclaw#85200))
- Plugin SDK: add a reply payload sending hook for plugins that need to deliver channel-owned replies and flatten package types for SDK declarations. ([#82823](openclaw/openclaw#82823), [#87165](openclaw/openclaw#87165)) Thanks [@piersonr](https://github.com/piersonr) and [@RomneyDa](https://github.com/RomneyDa).
- Policy: add policy comparison, ingress-channel conformance, and sandbox-posture conformance checks. ([#85572](openclaw/openclaw#85572), [#85744](openclaw/openclaw#85744), [#86768](openclaw/openclaw#86768))

##### Fixes

- Agents: fall back to local config pruning when the optional `agents delete` Gateway probe cannot authenticate, so offline installs can still delete agents without removing shared workspaces.
- Tighten phone-control mutation authorization \[AI]. ([#&#8203;87150](openclaw/openclaw#87150)) Thanks [@&#8203;pgondhi987](https://github.com/pgondhi987).
- Clarify directive persistence authorization policy \[AI]. ([#&#8203;86369](openclaw/openclaw#86369)) Thanks [@&#8203;pgondhi987](https://github.com/pgondhi987).
- Agents/Codex: keep spawned agent cwd/workspace state separated, forward ACP spawn attachments, keep hook context prompt-local, release session locks on timeout abort and runtime teardown without deleting live OpenClaw-owned locks during cleanup, avoid session event queue self-wait, clean up exec abort listeners, stream assistant deltas incrementally, recover raw missing-thread compaction failures, preserve rotated compaction session identity, keep compaction-timeout snapshots continuable, preserve shared app-server state across startup or helper failures, keep native hook relay alive across restarts and prune stale bridge files, close native hook relay replacement races, keep Claude live tool progress visible for watchdog recovery, suppress abandoned requester completion handoff, route workspace memory through tools, resolve Codex runtime models first, report quarantined dynamic tools, format `skills` command output, bind node auto-review to prepared plans, retry Claude CLI transcript probes, and bound compaction/steering retries. 
([#&#8203;87218](openclaw/openclaw#87218), [#&#8203;86875](openclaw/openclaw#86875), [#&#8203;86123](openclaw/openclaw#86123), [#&#8203;88129](openclaw/openclaw#88129), [#&#8203;87399](openclaw/openclaw#87399), [#&#8203;87375](openclaw/openclaw#87375), [#&#8203;72574](openclaw/openclaw#72574), [#&#8203;87383](openclaw/openclaw#87383), [#&#8203;87400](openclaw/openclaw#87400), [#&#8203;83022](openclaw/openclaw#83022), [#&#8203;87671](openclaw/openclaw#87671), [#&#8203;87738](openclaw/openclaw#87738), [#&#8203;87747](openclaw/openclaw#87747), [#&#8203;87706](openclaw/openclaw#87706), [#&#8203;87546](openclaw/openclaw#87546), [#&#8203;87541](openclaw/openclaw#87541), [#&#8203;81048](openclaw/openclaw#81048)) Thanks [@&#8203;mbelinky](https://github.com/mbelinky), [@&#8203;Alix-007](https://github.com/Alix-007), [@&#8203;luoyanglang](https://github.com/luoyanglang), [@&#8203;yetval](https://github.com/yetval), [@&#8203;sjf](https://github.com/sjf), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;benjamin1492](https://github.com/benjamin1492), [@&#8203;c19354837](https://github.com/c19354837), [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev), [@&#8203;pfrederiksen](https://github.com/pfrederiksen), and [@&#8203;dodge1218](https://github.com/dodge1218).
- Codex Supervisor: keep real-home app-server MCP session listing on the loaded state path, bound stored history scans, and close WebSocket probes cleanly.
- Channels: thread canonical session keys into outbound hooks, preserve Matrix room-id case, keep fallback tool warnings mention-inert, retain delivered Slack final replies during late cleanup, continue iMessage polling after denied reactions, suppress duplicate native exec approvals, resolve Gateway message actions against the active runtime config, preserve Telegram SecretRef prompt config and polling keepalives, preserve WhatsApp profile auth roots, QR display, document filenames, and plugin hook config, suppress Discord recovered tool warnings, preserve the Discord voice outbound helper, cap Discord/Signal/Zalo channel request and container timeouts, and block untrusted Teams service URLs while keeping TeamsSDK patterns aligned. ([#&#8203;73706](openclaw/openclaw#73706), [#&#8203;75670](openclaw/openclaw#75670), [#&#8203;87366](openclaw/openclaw#87366), [#&#8203;87451](openclaw/openclaw#87451), [#&#8203;87465](openclaw/openclaw#87465), [#&#8203;87334](openclaw/openclaw#87334), [#&#8203;84535](openclaw/openclaw#84535), [#&#8203;76262](openclaw/openclaw#76262), [#&#8203;83304](openclaw/openclaw#83304), [#&#8203;82492](openclaw/openclaw#82492), [#&#8203;87581](openclaw/openclaw#87581), [#&#8203;77114](openclaw/openclaw#77114), [#&#8203;86426](openclaw/openclaw#86426), [#&#8203;85529](openclaw/openclaw#85529), [#&#8203;87160](openclaw/openclaw#87160)) Thanks [@&#8203;zeroaltitude](https://github.com/zeroaltitude), [@&#8203;lukeboyett](https://github.com/lukeboyett), [@&#8203;jarvis-mns1](https://github.com/jarvis-mns1), [@&#8203;xiaotian](https://github.com/xiaotian), [@&#8203;funmerlin](https://github.com/funmerlin), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;eleqtrizit](https://github.com/eleqtrizit), [@&#8203;heyitsaamir](https://github.com/heyitsaamir), [@&#8203;amittell](https://github.com/amittell), [@&#8203;lidge-jun](https://github.com/lidge-jun), [@&#8203;liorb-mountapps](https://github.com/liorb-mountapps), 
[@&#8203;masatohoshino](https://github.com/masatohoshino), [@&#8203;bladin](https://github.com/bladin), and [@&#8203;giodl73-repo](https://github.com/giodl73-repo).
- CLI/auth/doctor/providers: reject malformed numeric/timeout/subcommand-version inputs, ignore workspace dotenv provider credentials, wait for respawn child shutdown, bound heartbeat defaults plus Codex, GitHub Copilot, OpenAI, Anthropic, Google, Feishu, LM Studio, MiniMax, Xiaomi TTS, and local-provider OAuth/token/model requests, harden Codex auth probes, label auth health by agent, preserve explicit agentRuntime pins during Codex model migration, warm provider auth off the main thread, honor Codex response timeouts, stop migrating current Claude Haiku 4.5 profiles to Sonnet, bound local service startup, resolve GPT-5.5 without cached catalog, migrate legacy memory auto-provider config, rewrite non-canonical `api_key` auth profiles, and make doctor restart follow-ups actionable. ([#&#8203;87398](openclaw/openclaw#87398), [#&#8203;86281](openclaw/openclaw#86281), [#&#8203;87361](openclaw/openclaw#87361), [#&#8203;88133](openclaw/openclaw#88133), [#&#8203;83655](openclaw/openclaw#83655), [#&#8203;87559](openclaw/openclaw#87559), [#&#8203;87719](openclaw/openclaw#87719), [#&#8203;88088](openclaw/openclaw#88088), [#&#8203;85924](openclaw/openclaw#85924), [#&#8203;84362](openclaw/openclaw#84362)) Thanks [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@&#8203;samzong](https://github.com/samzong), [@&#8203;giodl73-repo](https://github.com/giodl73-repo), [@&#8203;alkor2000](https://github.com/alkor2000), [@&#8203;mmaps](https://github.com/mmaps), [@&#8203;nxmxbbd](https://github.com/nxmxbbd), and [@&#8203;vincentkoc](https://github.com/vincentkoc).
- Gateway/security/session state: expire browser tokens after auth rotation, scope assistant idempotency dedupe, drain probe client closes, avoid stale restart continuation reuse, preserve retry-after fallbacks and stale rate-limit cooldown probes, bound webchat image and artifact transcript scans, include seconds in inbound metadata timestamps, clear completed session active runs, clear stale chat stream buffers, and evict current plugin-state namespaces at row caps. ([#&#8203;87810](openclaw/openclaw#87810), [#&#8203;87833](openclaw/openclaw#87833), [#&#8203;75089](openclaw/openclaw#75089)) Thanks [@&#8203;joshavant](https://github.com/joshavant) and [@&#8203;litang9](https://github.com/litang9).
- Config/parsing/network: reject partial numeric parsing, parse provider/Discord retry headers and dates strictly, honor IPv6 and bare IPv6 `no_proxy` entries, preserve empty plugin allowlists, canonicalize secret target array indexes, and reject malformed media content lengths, inspected TCP ports, marketplace content lengths, cron epochs, sandbox stat fields, unsafe duration values, empty config path segments, noncanonical schema array refs, unsafe Telegram callback pages, and invalid Teams attachment-fetch DNS targets. ([#&#8203;87883](openclaw/openclaw#87883)) Thanks [@&#8203;zhangguiping-xydt](https://github.com/zhangguiping-xydt).
- Browser/input hardening: reject invalid tab indexes, excessive viewport resizes, explicit zero CDP ports, malformed geolocation options, unsafe screenshot or permission-grant timeouts, loose response-body limits, invalid cookie expiries, and non-finite Browser tool delays/timeouts.
- Cron/automation: retry recurring jobs after transient model rate limits before waiting for the next scheduled slot, and preflight model fallbacks before skipping scheduled work. ([#&#8203;82887](openclaw/openclaw#82887)) Thanks [@&#8203;chen-zhang-cs-code](https://github.com/chen-zhang-cs-code).
- Auto-reply/directives: respect provider and relayed channel metadata during directive persistence so channel-originated decisions keep their intended context. ([#&#8203;87683](openclaw/openclaw#87683))
- WhatsApp: resolve the auth directory from the active profile so profile-scoped WhatsApp installs do not drift to the wrong credential root. ([#&#8203;82492](openclaw/openclaw#82492)) Thanks [@&#8203;lidge-jun](https://github.com/lidge-jun).
- Gateway/session state: clear completed session active runs, avoid cold-loading providers for MCP inventory, cache single-session child indexes, cap handshake timers, and bound preauth, auth-guard, media, transcript, readiness, and port options.
- Channels/replies: preserve channel-owned progress callbacks when verbose output is off, keep group-room progress suppression intact, prefer external session delivery context, escape Discord component id delimiters, force final TUI chat repaints, show Slack reasoning previews, and normalize Discord/Matrix/Mattermost channel numeric options. ([#&#8203;87476](openclaw/openclaw#87476), [#&#8203;87423](openclaw/openclaw#87423))
- Agents/tool args: harden smart-quoted argument repair for edit arrays and exact escaped arguments so model-produced tool calls recover without corrupting valid input. ([#&#8203;86611](openclaw/openclaw#86611)) Thanks [@&#8203;ferminquant](https://github.com/ferminquant).
- Providers/agents: preserve seeded Anthropic signatures, preserve signed thinking payloads, concatenate signature-delta chunks, preserve DeepSeek `reasoning_content` replay across tier suffixes, apply OpenRouter strict9 ids to Mistral routes, promote Ollama plain-text tool calls, load NVIDIA featured model catalogs, stream MiniMax music generation responses, and recover empty preflight compaction. ([#&#8203;87593](openclaw/openclaw#87593), [#&#8203;87493](openclaw/openclaw#87493), [#&#8203;80775](openclaw/openclaw#80775), [#&#8203;84764](openclaw/openclaw#84764)) Thanks [@&#8203;Pluviobyte](https://github.com/Pluviobyte) and [@&#8203;eleqtrizit](https://github.com/eleqtrizit).
- Media/images: skip CLI image cache refs when resolving generated images, allow trusted generated HTML attachments, and bound generated video downloads so stale refs and slow providers fail cleanly. ([#&#8203;87523](openclaw/openclaw#87523), [#&#8203;87982](openclaw/openclaw#87982))
- File transfer: handle late tar stdin pipe errors after archive validation or unpacking has already settled.
- Performance: trust install-record caches between reloads, prefer native JSON parsing, reuse unchanged tool-search catalogs, reuse gateway session and plugin metadata paths, skip unchanged store serialization, patch single-entry session writes, add precomputed session patch writers, reduce store clone allocations, cache manifest model catalog rows and auto-enabled plugin config, avoid full session snapshots for entry reads, defer configured Slack full startup, prefer bundled plugin dist entries, and slim current metadata identity caches. ([#&#8203;87760](openclaw/openclaw#87760))
- Docker/release/QA: package runtime workspace templates, stream cross-OS served artifacts, preserve sparse Crabbox run artifacts, isolate npm plugin installs per package, reject incompatible package plugin API installs, drop the leftover root Sharp dependency from package manifests after the Rastermill migration, bound OpenClaw instance logs, plugin gauntlet relay logs, MCP channel buffers, kitchen-sink scans, agent-turn assertions, QA-Lab credential broker calls, QA Matrix substrate requests, and release scenario logs, and keep release/google live guards current. ([#&#8203;87647](openclaw/openclaw#87647), [#&#8203;87477](openclaw/openclaw#87477)) Thanks [@&#8203;rohitjavvadi](https://github.com/rohitjavvadi) and [@&#8203;vincentkoc](https://github.com/vincentkoc).
- Release/CI: bound manual git fetches, ClawHub verifier responses, ClawHub owner metadata, dependency-guard error bodies, Parallels limits, startup/test/memory budget parsing, and diffs viewer build warnings so release lanes fail with useful proof instead of hanging. ([#&#8203;87839](openclaw/openclaw#87839))

</details>

---


This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/759
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
…87738)

Recover Codex compaction paths when a stale app-server thread binding returns an unstructured `thread not found` failure. The raw missing-thread response now shares the same recovery behavior as structured missing/stale binding failures for preflight, queued compaction, and CLI fallback.
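
The classification described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/agents/harness/compaction-recovery.ts`: the `CompactionResult` shape, the `isRecoverableBindingFailure` name, and the regular expression are all assumptions modeled on the failure text quoted in the PR summary.

```typescript
// Hypothetical result shape, modeled on the raw failure quoted in the PR summary:
// { ok: false, compacted: false, reason: "thread not found: <codex-thread-id>" }
interface CompactionResult {
  ok: boolean;
  compacted: boolean;
  reason?: string;
  failure?: { reason?: string };
}

// Structured binding reasons named in the PR summary.
const STRUCTURED_BINDING_REASONS: ReadonlySet<string> = new Set([
  "missing_thread_binding",
  "stale_thread_binding",
]);

// Assumed pattern for the unstructured reason text; the real classifier
// may recognize additional shapes.
const RAW_MISSING_THREAD = /\bthread not found\b/i;

// Returns true when a failed compaction should continue into session
// recovery instead of surfacing as a generic failure.
function isRecoverableBindingFailure(result: CompactionResult): boolean {
  if (result.ok) {
    return false;
  }
  const structured = result.failure?.reason;
  if (structured !== undefined && STRUCTURED_BINDING_REASONS.has(structured)) {
    return true;
  }
  return result.reason !== undefined && RAW_MISSING_THREAD.test(result.reason);
}
```

Keeping the check in one function is what would let preflight, queued compaction, and CLI fallback agree on which failures are recoverable, rather than each path matching reason strings on its own.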

Fixes openclaw#87736.

Co-authored-by: Paul Frederiksen <paul@paulfrederiksen.com>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
…87738)


Labels

- `agents` Agent runtime and tooling
- `merge-risk: 🚨 session-state` 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.
- `P1` High-priority user-facing bug, regression, or broken workflow.
- `proof: sufficient` ClawSweeper judged the real behavior proof convincing.
- `proof: supplied` External PR includes structured after-fix real behavior proof.
- `rating: 🐚 platinum hermit` Good normal PR readiness with ordinary maintainer review expected.
- `size: S`
- `status: 👀 ready for maintainer look` ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

Regression: preflight compaction still surfaces missing Codex thread failure after #86602

2 participants