fix(codex): prevent false completion stalls during native streams by keshavbotagent · Pull Request #87781 · openclaw/openclaw

keshavbotagent · 2026-05-28T21:05:13Z

Summary

Keep raw assistant progress on the documented turn/completed terminal guard; progress text is still not treated as a final answer.
Count forwarded native Codex response stream deltas, including response.custom_tool_call_input.delta, as active turn activity and attempt progress while a turn is already open.
Guard unscoped native response.*.delta notifications so they only refresh a turn when the shared app-server client has one active lease; concurrent leased turns no longer keep each other's watchdogs alive.
Keep the post-tool raw-assistant fallback timeout conservative at 5 * 60_000 ms (300000 ms) for app-server/native paths that do not forward stream deltas, and align the runtime docs and plugin metadata with that value.
Preserve recovery behavior for actual completion stalls: replay only replay-safe attempts; for side-effecting/non-replay-safe stalls retire the stuck app-server client, clear stale native thread binding only when it still points at the timed-out thread, release the lane, and surface a recoverable incomplete-turn result.
After the one replay-safe stdio retry is exhausted, a repeated turn_completion_idle_timeout now falls through to the normal timeout payload path instead of rejecting the embedded run.

Why this shape

The failure is not that raw assistant progress needs to become deliverable final text. The failure is that native Codex can spend a long time actively streaming a large tool input/patch after a raw assistant progress item has completed, before any completed tool-call notification exists. Without counting those response.*.delta events as activity, OpenClaw can falsely fire turn.completion_idle_timeout while the model is still working.

This keeps the lifecycle semantics strict: final delivery still requires turn/completed; active native stream deltas only keep the watchdog from killing a live turn. The latest review fixes also avoid cross-talk for unscoped deltas and keep retry-exhausted completion-idle failures on the timeout-result path.
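
The activity rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in run-attempt.ts: the `NativeNotification` and `TurnWatch` shapes, the `recordNativeDelta` name, and the `activeLeaseCount` parameter are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real OpenClaw types are not shown in this thread.
interface NativeNotification {
  method: string; // e.g. "response.custom_tool_call_input.delta"
  turnId?: string; // absent for unscoped notifications
}

interface TurnWatch {
  turnId: string;
  lastActivityAt: number;
}

// Matches native stream deltas of the form response.<something>.delta.
const NATIVE_DELTA = /^response\..+\.delta$/;

/** Returns true when the notification refreshed the watch. */
function recordNativeDelta(
  watch: TurnWatch,
  note: NativeNotification,
  activeLeaseCount: number,
  now: number,
): boolean {
  if (!NATIVE_DELTA.test(note.method)) return false;

  // Scoped deltas count only for their own turn.
  const scopedToThisTurn = note.turnId === watch.turnId;
  // Unscoped deltas count only when exactly one lease could own them.
  const unscopedButUnambiguous =
    note.turnId === undefined && activeLeaseCount === 1;

  if (!scopedToThisTurn && !unscopedButUnambiguous) return false;
  watch.lastActivityAt = now;
  return true;
}
```

Refreshing `lastActivityAt` only postpones the watchdog; nothing in this sketch marks the turn as finished, which matches the requirement that final delivery still waits for turn/completed.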

Real behavior proof

Behavior addressed: Codex-backed Telegram turns could be falsely aborted while native Codex was still streaming response deltas for a large tool input/patch after raw assistant progress, producing Codex stopped before confirming the turn was complete. Some work may already have been performed; verify the current state before retrying.
Real environment tested: Keshav's live OpenClaw gateway and Telegram group flow on 2026-05-29 IST.
Exact steps or command run after this patch: Keshav built and restarted the gateway from this PR branch after the native-delta runtime fix, then reattempted the Vietnam webpage edit flow that had repeatedly failed with the completion-idle watchdog. The latest pushed head 2aa61b24ea is rebased onto origin/main 040eba1cdc; the final commit on top is test-only stabilization after the runtime proof.
Evidence after fix: Copied live output from the Telegram operator thread:

2026-05-29 21:46 IST, Keshav G:
I built and restarted server
Then reattempted the vietnam webpage and yay it worked this time without errors

2026-05-29 23:41 IST, Keshav G:
I have build and restarted
Now do the tests, ci and submit latest proof from latest Heads

Observed result after fix: The previously repeated Vietnam webpage edit flow completed without the prior turn.completion_idle_timeout / Codex stopped before confirming the turn was complete error after the gateway was built and restarted from the PR branch.
What was not tested: The bot did not personally restart the live gateway; Keshav performed the build/restart. Latest GitHub Actions were still running when this proof was added, and local validation is listed below.

Review feedback addressed

Latest head: 2aa61b24ea on origin/main base 040eba1cdc.

Plugin config metadata and both Codex harness docs now agree with the runtime default: postToolRawAssistantCompletionIdleTimeoutMs defaults to 300000 ms.
Timeout cleanup now clears the persisted Codex app-server binding only if the stored binding still points at the timed-out thread, preserving transient-turn bindings.
Unscoped native response deltas now count as activity only when the shared app-server client reports a single active lease; scoped deltas for the active turn still count directly.
Retry-exhausted replay-safe stdio turn_completion_idle_timeout failures now surface the timeout payload instead of throwing the prompt error after the one retry has already been used.
Added focused regression coverage for binding-preserving cleanup, concurrent-lease unscoped deltas, native response delta activity, and retry-exhausted completion-idle timeout handling.
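
The binding-preserving cleanup above amounts to a compare-before-delete step. A minimal sketch, assuming an in-memory `BindingStore` keyed by a session key; the store, the `ThreadBinding` shape, and the `clearBindingIfStale` name are hypothetical stand-ins for the persisted Codex app-server binding:

```typescript
interface ThreadBinding {
  threadId: string;
}

// Hypothetical in-memory store standing in for the persisted binding.
class BindingStore {
  private bindings = new Map<string, ThreadBinding>();

  get(sessionKey: string): ThreadBinding | undefined {
    return this.bindings.get(sessionKey);
  }

  set(sessionKey: string, binding: ThreadBinding): void {
    this.bindings.set(sessionKey, binding);
  }

  delete(sessionKey: string): void {
    this.bindings.delete(sessionKey);
  }
}

/**
 * Clears the binding only when it still points at the timed-out thread.
 * Returns true when a binding was removed.
 */
function clearBindingIfStale(
  store: BindingStore,
  sessionKey: string,
  timedOutThreadId: string,
): boolean {
  const current = store.get(sessionKey);
  if (!current || current.threadId !== timedOutThreadId) {
    // Missing, or already rebound to a different thread: leave it alone.
    return false;
  }
  store.delete(sessionKey);
  return true;
}
```

If a newer turn has already rebound the session to a different thread, the timed-out turn's cleanup becomes a no-op, so the newer binding survives.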

Tests / validation

git diff --check origin/main...HEAD passed on head 2aa61b24ea.
Targeted oxfmt --check passed on changed TypeScript/plugin files on head 2aa61b24ea.
Focused Vitest passed on head 2aa61b24ea: 6 files / 82 tests.

Focused Vitest command:

./node_modules/.bin/vitest run \
  extensions/codex/src/app-server/attempt-results.test.ts \
  extensions/codex/src/app-server/attempt-timeouts.test.ts \
  extensions/codex/src/app-server/run-attempt.turn-watches.test.ts \
  extensions/codex/src/app-server/session-binding.test.ts \
  src/agents/embedded-agent-runner/run.codex-app-server-recovery.test.ts \
  --reporter=dot

Related issue

Related to [Bug]: Codex-backed Telegram turns repeatedly time out waiting for turn/completed on 2026.5.27 #87744. This PR addresses the Codex-backed Telegram completion-idle family; the live proof narrows one concrete false-timeout cause to native response delta streams not being counted as turn activity.

clawsweeper · 2026-05-28T21:07:33Z

Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 6:37 PM ET / 22:37 UTC.

Summary
This PR changes Codex app-server turn watching and recovery so native response deltas refresh activity, completion-idle stalls retry or surface safely, matching-thread binding cleanup is used, and Codex docs/metadata/tests reflect the 300000 ms post-tool guard.

PR surface: Source +154, Tests +481, Docs +10. Total +645 across 21 files.

Reproducibility: yes. The related issue and PR discussion give a high-confidence log/live-flow path: Codex-backed Telegram turns reach raw assistant or item completion, then idle waiting for turn/completed and surface the completion-idle fallback; I did not rerun it locally.

Review metrics: 1 noteworthy metric.

Config/default surface: 1 changed default surface. The Codex plugin's postToolRawAssistantCompletionIdleTimeoutMs unset behavior now floors to 300000 ms, which affects existing operator timing without a config edit.
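
For illustration, the unset-default behavior might be resolved along these lines. The constant and function names are hypothetical, and this sketch covers only the unset or invalid case: it passes explicit positive values through unchanged, because the thread does not show how the runtime treats configured values below 300000 ms.

```typescript
// 5 minutes, matching the 5 * 60_000 value quoted in the PR summary.
const DEFAULT_POST_TOOL_COMPLETION_IDLE_TIMEOUT_MS = 5 * 60_000;

/**
 * Resolves the post-tool raw-assistant completion idle timeout.
 * Assumption: unset or invalid values fall back to the default;
 * explicit positive values are used as given.
 */
function resolvePostToolCompletionIdleTimeoutMs(configured?: number): number {
  if (
    configured === undefined ||
    !Number.isFinite(configured) ||
    configured <= 0
  ) {
    return DEFAULT_POST_TOOL_COMPLETION_IDLE_TIMEOUT_MS;
  }
  return Math.floor(configured);
}
```

An operator who relied on the previous unset behavior would therefore wait up to five minutes before a stuck post-tool turn surfaces, which is the upgrade-sensitive part flagged here.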

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until stronger real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Post redacted current-head live Telegram or gateway proof for 115f649.
[P2] Have a maintainer explicitly accept the longer timeout default and Codex session retry/cleanup behavior before merge.

Proof guidance:

[P1] Needs stronger real behavior proof before merge: The PR includes copied live Telegram output, but it names older head 2aa61b2 while current head 115f649 has later runtime commits, so exact-head real behavior proof is still needed; redact private IDs, paths, phone numbers, endpoints, and credentials before posting. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Mantis proof suggestion
A live Telegram lane would materially prove the user-visible Codex-backed completion path that unit tests cannot show. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram live: verify a Codex-backed Telegram turn on head 115f6493 completes a large native tool-input stream without turn_completion_idle_timeout.

Risk before merge

[P1] The live proof in the PR body is not for current head 115f649, and later runtime commits changed the completion-idle retry and watchdog semantics.
[P2] Changing the unset post-tool raw-assistant timeout to a 300000 ms floor can make existing Codex setups wait longer before surfacing a stuck turn.
[P1] Retry, client retirement, and thread-binding cleanup affect Codex session continuity and availability when native app-server turns stall.

Maintainer options:

Prove current head before merge (recommended): Rebuild/restart from 115f649 and post redacted live Telegram or gateway output showing the large native response-delta flow completes without the completion-idle timeout.
Accept the longer timeout default: Maintainers can intentionally accept the 300000 ms unset default after deciding that slower failure surfacing is preferable to false aborts for existing Codex users.
Pause if live proof cannot be produced: If current-head live proof is not feasible, keep this paused or split to a narrower low-risk diagnostics/recovery change before landing runtime behavior changes.

Next step before merge

[P1] Needs human maintainer handling for exact-head live proof and upgrade-sensitive Codex runtime/session semantics; I did not find a narrow code defect for an automated repair PR.

Security
Cleared: No concrete security or supply-chain regression was found; the diff does not add dependencies, secrets handling, workflow permissions, or downloaded code execution.

Review details

Best possible solution:

Land only after current head has redacted live Telegram/gateway proof and maintainers explicitly accept the longer timeout default plus retry/session cleanup semantics; keep narrower unsafe-stall diagnostics in #87793 if still needed.

Do we have a high-confidence way to reproduce the issue?

Yes. The related issue and PR discussion give a high-confidence log/live-flow path: Codex-backed Telegram turns reach raw assistant or item completion, then idle waiting for turn/completed and surface the completion-idle fallback; I did not rerun it locally.

Is this the best way to solve the issue?

Unclear until current-head proof is posted. The implementation follows the right Codex app-server boundary and has focused tests, but the changed default, retry, and session cleanup semantics need maintainer acceptance before this is the best merge path.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 63a06e312dfe.

Label changes

Label justifications:

P1: The PR addresses a current Codex-backed Telegram workflow regression that can leave real users with partial side effects and no final answer.
merge-risk: 🚨 compatibility: The PR changes default timeout behavior for an existing Codex plugin config field, affecting upgrades and existing unset configurations.
merge-risk: 🚨 session-state: The PR changes retry and binding cleanup behavior for Codex native threads, which can affect whether sessions resume or start fresh after stalls.
merge-risk: 🚨 availability: The PR changes watchdog, retry, and client-retirement paths that decide whether Codex turns stall, recover, or release the OpenClaw lane.
rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs stronger real behavior proof before merge: The PR includes copied live Telegram output, but it names older head 2aa61b2 while current head 115f649 has later runtime commits, so exact-head real behavior proof is still needed; redact private IDs, paths, phone numbers, endpoints, and credentials before posting. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes Telegram-visible Codex completion behavior and is well suited to a short live Telegram proof showing the turn completes instead of timing out.

Evidence reviewed

PR surface:

Source +154, Tests +481, Docs +10. Total +645 across 21 files.

Area	Files	Added	Removed	Net
Source	13	182	28	+154
Tests	6	515	34	+481
Docs	2	56	46	+10
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	21	753	108	+645

Acceptance criteria:

[P1] Build and restart the gateway from head 115f649, then rerun the affected Codex-backed Telegram flow and capture redacted live output.
[P1] Run the focused Codex app-server and embedded-runner Vitest files listed in the PR body after any further runtime edits.
[P1] Keep git diff --check origin/main...HEAD and targeted oxfmt checks green after any proof-refresh commits.

What I checked:

Repository policy applied: Root and scoped AGENTS.md guidance made this a compatibility and real Telegram proof review because the PR changes plugin config/default metadata, Codex runtime fallback behavior, and Telegram-visible Codex turn outcomes. (AGENTS.md:15, 63a06e312dfe)
Native delta activity path: The PR head treats native response stream deltas as activity only when scoped to the active turn or attributable to a single shared-client lease. (extensions/codex/src/app-server/run-attempt.ts:1286, 115f649391fa)
Post-tool timeout default changed: The runtime now defaults the post-tool raw assistant completion guard to at least 300000 ms, and plugin metadata exposes that default. (extensions/codex/src/app-server/attempt-timeouts.ts:6, 115f649391fa)
Recovery retry boundary: The retry helper retries Codex app-server completion-idle failures only for stdio, replay-safe, completion-watch timeouts and rejects progress or terminal watchdog stalls. (src/agents/embedded-agent-runner/run/codex-app-server-recovery.ts:17, 115f649391fa)
Focused regression coverage: The PR head includes tests for native delta activity, concurrent-lease unscoped delta rejection, and timeout recovery behavior. (extensions/codex/src/app-server/run-attempt.turn-watches.test.ts:1339, 115f649391fa)
Proof is stale relative to current head: The PR body's live proof names head 2aa61b2, while the current PR head is 115f649 and includes later runtime commits such as retry-only completion watchdog handling. (115f649391fa)
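
The recovery retry boundary noted in the list above can be read as a single predicate. A rough sketch under stated assumptions: the `StallInfo` shape, the `"other"` transport label, and the `retriesUsed` counter are hypothetical and are not taken from codex-app-server-recovery.ts.

```typescript
type Transport = "stdio" | "other"; // "other" is an illustrative placeholder
type WatchKind = "completion" | "progress" | "terminal";

// Hypothetical summary of a stalled attempt.
interface StallInfo {
  transport: Transport;
  replaySafe: boolean; // no side effects were observed, so replay is safe
  watchKind: WatchKind; // which watchdog fired
  retriesUsed: number; // retries already spent on this run
}

const MAX_COMPLETION_STALL_RETRIES = 1;

/** True only for stdio, replay-safe, completion-watch stalls with a retry left. */
function shouldRetryCompletionStall(stall: StallInfo): boolean {
  return (
    stall.transport === "stdio" &&
    stall.replaySafe &&
    stall.watchKind === "completion" &&
    stall.retriesUsed < MAX_COMPLETION_STALL_RETRIES
  );
}
```

Every case that fails this predicate, including progress and terminal watchdog stalls and an exhausted retry budget, is left to the surfacing paths rather than replayed.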

Likely related people:

@steipete: Current-main blame and recent history place the Codex app-server turn-watch/recovery implementation and related lifecycle work primarily in Peter Steinberger's recent commits. (role: recent area contributor; confidence: high; commits: e6b011823ebb, 659bcc5e5b59, 545490c5920d; files: extensions/codex/src/app-server/run-attempt.ts, extensions/codex/src/app-server/attempt-turn-watches.ts, src/agents/embedded-agent-runner/run.ts)
@vincentkoc: Recent Codex app-server auth/startup routing commits touch adjacent runtime continuity and binding behavior that this PR also affects. (role: adjacent owner; confidence: medium; commits: f1cc8f0cfc7c, 859eb0666282; files: extensions/codex/src/app-server/run-attempt.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

keshavbotagent · 2026-05-28T21:24:57Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a59d953faf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

keshavbotagent · 2026-05-28T21:41:14Z

Post-fix live proof has been added to the PR body.

Redacted summary:

Gateway was rebuilt/restarted on this branch and observed running with service version 2026.5.28.
Runtime checkout was fix/codex-raw-assistant-midturn-release at a59d953faf.
Built dist contains the PR behavior: postToolRawAssistantCompletionIdleTimeoutMs default normalization to 60000, the post-tool completion idle watch, and structured turn_completion_idle_timeout failure metadata.
A fresh isolated Codex validation turn completed successfully after two read-only tool calls: model.completed and session.ended status=success at 2026-05-28T21:39:08.429Z.
Delivered result: LIVE_PROOF_OK branch=fix/codex-raw-assistant-midturn-release head=a59d953faf.
No turn.completion_idle_timeout or retryable app-server failure occurred in that post-restart proof.

Exact chat/session/thread/turn/run/tool-call IDs, auth profile details, local usernames, and local paths are intentionally omitted.

keshavbotagent · 2026-05-28T21:46:29Z

Post-fix recurrence update, redacted:

After rebuild/restart on fix/codex-raw-assistant-midturn-release at a59d953faf, a separate live side-effecting run still hit the same app-server terminal-signal failure:

2026-05-28T21:37:04.144Z: live run started for a multi-file itinerary/data-model edit.
The run completed many tool calls/results, so it was not replay-safe/no-output.
Codex emitted assistant progress text: I'm changing the data model now....
2026-05-28T21:40:22.199Z: OpenClaw recorded turn.completion_idle_timeout with idleMs=60001, timeoutMs=60000, lastActivityReason=notification:rawResponseItem/completed, lastNotificationMethod=rawResponseItem/completed, lastNotificationItemType=message, lastNotificationItemRole=assistant.
The run ended as session.ended status=error with codex app-server turn idle timed out waiting for turn/completed.

This confirms the current PR scope: it keeps completion semantics safe and handles replay-safe recovery, but it does not solve non-replay-safe post-tool assistant-progress stalls with side effects. Those need a separate recovery strategy or an upstream app-server fix. Exact chat/session/thread/turn/run/tool-call IDs, usernames, and local paths omitted.

keshavbotagent · 2026-05-28T22:08:20Z

Implemented the next fix after the post-restart recurrence.

The recurrence was a side-effecting/progress timeout, so auto-replay remains intentionally blocked. New commit 06703bc1a6 changes that path to:

clear the stale Codex native thread binding when the native turn times out;
retire the stuck app-server client and release the OpenClaw lane;
classify assistant-progress/tool-activity completion-idle stalls as recoverable timeout outcomes;
return the recoverable timeout payload instead of throwing the raw codex app-server turn idle timed out waiting for turn/completed prompt error.
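
As a sketch of how those four steps might be ordered, assuming a small injected dependency surface: `StallRecoveryDeps`, `RecoverableTimeout`, and the `"recoverable_timeout"` tag are hypothetical names for this example, and the message string is the user-facing text quoted in the PR body.

```typescript
// Hypothetical dependency surface; the real OpenClaw functions are not shown here.
interface StallRecoveryDeps {
  clearBindingIfStale(timedOutThreadId: string): void;
  retireClient(): void;
  releaseLane(): void;
}

interface RecoverableTimeout {
  kind: "recoverable_timeout"; // illustrative tag, not the real payload shape
  message: string;
}

/** Handles a side-effecting completion-idle stall without replaying the turn. */
function recoverSideEffectingStall(
  timedOutThreadId: string,
  deps: StallRecoveryDeps,
): RecoverableTimeout {
  // 1. Drop the stale native thread binding so the next turn starts clean.
  deps.clearBindingIfStale(timedOutThreadId);
  try {
    // 2. Retire the stuck app-server client.
    deps.retireClient();
  } finally {
    // 3. Release the lane even if retiring the client throws.
    deps.releaseLane();
  }
  // 4. Return a recoverable payload instead of throwing the raw prompt error.
  return {
    kind: "recoverable_timeout",
    message:
      "Codex stopped before confirming the turn was complete. " +
      "Some work may already have been performed; verify the current state before retrying.",
  };
}
```

The `try`/`finally` is a design choice of this sketch: it keeps a failure while retiring the client from leaving the lane held.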

Validation run:

git diff --check
pnpm exec oxfmt --check on changed TypeScript files
node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/attempt-results.test.ts extensions/codex/src/app-server/run-attempt.turn-watches.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/embedded-agent-runner/run.codex-app-server-recovery.test.ts
pnpm tsgo:core
pnpm tsgo:extensions

Remaining live validation: rebuild/restart on head 06703bc1a6 and confirm the same side-effecting stall now returns the recoverable message and does not poison the next native thread binding.

keshavbotagent · 2026-05-29T04:56:36Z

@clawsweeper re-review
@codex re-review

clawsweeper · 2026-05-29T04:56:40Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26618900651
Updated: 2026-05-29T05:03:34.058Z

chatgpt-codex-connector · 2026-05-29T05:00:54Z

Codex Review: Didn't find any major issues. Breezy!

chatgpt-codex-connector · 2026-05-29T18:46:19Z

Codex Review: Didn't find any major issues. 👍

keshavbotagent · 2026-05-29T19:16:05Z

@clawsweeper hatch

keshavbotagent · 2026-05-29T20:39:14Z

@clawsweeper review
@codex review

clawsweeper · 2026-05-29T20:39:17Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Superseded
Detail: A newer re-review for this item started before this run finished, so GitHub cancelled this older run. Check the latest ClawSweeper run for the current result.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26661032427
Updated: 2026-05-29T20:46:41.082Z

chatgpt-codex-connector · 2026-05-29T20:43:56Z

Codex Review: Didn't find any major issues. Swish!

github-actions · 2026-05-29T20:46:49Z

Dependency graph guard cleared

This PR no longer has blocked dependency graph changes. A future dependency graph change requires a fresh /allow-dependencies-change comment after the guard blocks that new head SHA.

Current SHA: c005ba9b702b389efb88b2bdfce0cacebc16b75f

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c005ba9b70

chatgpt-codex-connector · 2026-05-29T22:51:33Z

+            shouldSurfaceCodexCompletionTimeout =
+              attempt.codexAppServerFailure?.kind === "turn_completion_idle_timeout" &&
+              attempt.timedOut;


Gate timeout surfacing on completion watchdogs

When the Codex app-server watch fires for the progress or terminal watchdog, run-attempt.ts still reports the broad kind: "turn_completion_idle_timeout" and only distinguishes the actual watchdog in turnWatchTimeoutKind. This condition ignores that narrower field, so a turn_progress_idle_timeout or turn_terminal_idle_timeout now bypasses the normal prompt-error/failover path and is returned as the recoverable completion-timeout payload even though the retry resolver explicitly rejects non-completion watchdogs. Please also require turnWatchTimeoutKind === "completion" here (or handle the legacy-absent case deliberately).
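
A minimal sketch of the requested gate, assuming only the two field names quoted in this comment (`kind` and `turnWatchTimeoutKind`). The surrounding `Attempt` and `CodexAppServerFailure` interfaces are reduced to what the example needs and are not the real types.

```typescript
// Reduced, hypothetical shapes for illustration only.
interface CodexAppServerFailure {
  kind: string;
  turnWatchTimeoutKind?: "completion" | "progress" | "terminal";
}

interface Attempt {
  timedOut: boolean;
  codexAppServerFailure?: CodexAppServerFailure;
}

/**
 * Surfaces the recoverable completion-timeout payload only when the
 * completion watchdog fired. In this sketch a missing
 * turnWatchTimeoutKind (the legacy-absent case) is treated as
 * "not completion" and stays on the normal prompt-error path.
 */
function shouldSurfaceCodexCompletionTimeout(attempt: Attempt): boolean {
  const failure = attempt.codexAppServerFailure;
  return (
    attempt.timedOut &&
    failure?.kind === "turn_completion_idle_timeout" &&
    failure?.turnWatchTimeoutKind === "completion"
  );
}
```

Whether the legacy-absent case should surface or fall through is the deliberate choice this comment asks for; the sketch picks the stricter option.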

steipete · 2026-05-29T22:52:39Z

Behavior addressed: Codex app-server replay-safe completion-idle stalls retry once; non-completion turn-watch timeouts no longer replay; retry-exhausted/non-stdio completion timeouts surface user timeout payloads; native response deltas no longer disarm the post-tool completion guard incorrectly.
Real environment tested: local macOS plus Blacksmith Testbox.
Exact steps or command run after this patch:

node scripts/run-vitest.mjs src/infra/net/http-connect-tunnel.test.ts extensions/codex/src/app-server/run-attempt.turn-watches.test.ts src/agents/embedded-agent-runner/run.codex-app-server-recovery.test.ts
node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.turn-watches.test.ts src/agents/embedded-agent-runner/run.codex-app-server-recovery.test.ts extensions/codex-supervisor/src/supervisor.test.ts extensions/codex-supervisor/src/plugin-tools.test.ts
node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.extensions.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions-test.tsbuildinfo
node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
node scripts/run-oxlint.mjs extensions/codex/src/app-server/run-attempt.ts extensions/codex/src/app-server/run-attempt.turn-watches.test.ts src/agents/embedded-agent-runner/run.codex-app-server-recovery.test.ts src/agents/embedded-agent-runner/run/codex-app-server-recovery.ts src/agents/embedded-agent-runner/run/types.ts
/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main
pnpm check:changed (Blacksmith Testbox tbx_01ksty84e7wby9y3ydsw5tmnjc)
Evidence after fix: focused tests passed (4 files/78 tests and 5 files/102 tests); final autoreview clean; final Testbox changed gate exit 0; GitHub exact-head checks green.
Observed result after fix: completion-idle stalls recover once only when replay-safe and from stdio; progress/terminal watchdog timeouts surface without replay; scoped native deltas keep post-tool guard semantics; oversized CONNECT timeout test no longer leaks unhandled rejection.
What was not tested: live upstream Codex app-server future native raw response delta stream; covered by local harness/unit tests.

keshavbotagent mentioned this pull request May 28, 2026

fix(codex): recover replay-safe app-server stalls #87773

Closed

openclaw-barnacle Bot added labels on May 28, 2026: docs (Improvements or additions to documentation); agents (Agent runtime and tooling); extensions: codex; size: S; triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup).

keshavbotagent force-pushed the fix/codex-raw-assistant-midturn-release branch from 2575552 to a59d953 on May 28, 2026 21:19

chatgpt-codex-connector Bot reviewed May 28, 2026

Comment thread extensions/codex/src/app-server/attempt-timeouts.ts Outdated

clawsweeper Bot added the label rating: 🦪 silver shellfish (Thin PR readiness signal; proof, validation, or implementation needs work) and removed the label rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns) on May 28, 2026

keshavbotagent changed the title from "fix(codex): recover replay-safe app-server stalls" to "fix(codex): recover app-server completion stalls" on May 28, 2026

slatem mentioned this pull request May 28, 2026

fix(codex): clarify unsafe app-server completion stalls #87793

Open

RomneyDa added the dependency-guard-backfill label May 29, 2026

RomneyDa removed the dependency-guard-backfill label May 29, 2026

clawsweeper Bot added the label rating: 🌊 off-meta tidepool (PR readiness rating does not apply to this item) and removed the label rating: 🦪 silver shellfish (Thin PR readiness signal; proof, validation, or implementation needs work) on May 29, 2026

openclaw-barnacle Bot added the label proof: supplied (External PR includes structured after-fix real behavior proof) on May 29, 2026

steipete self-assigned this May 29, 2026

keshavbotagent and others added 13 commits May 30, 2026 00:36

fix(codex): recover replay-safe app-server stalls

f434af9

fix(codex): surface recoverable app-server stalls

3eb0a1f

fix(codex): avoid false post-progress stalls

7a8dfea

fix(codex): count native stream deltas as activity

beb36e4

fix(codex): address completion stall review feedback

5f35ccb

fix(codex): surface retry-exhausted completion timeouts

282b462

test(codex): stabilize completion timeout coverage

0aefb98

fix(codex): recover replay-safe completion timeouts

2d0ff13

fix(codex): surface side-effect completion timeouts

1c1b7be

fix(codex): keep scoped delta completion guard

98a8e84

test(codex-supervisor): type websocket message coercion

bb75f35

fix(codex): retry only completion watchdog stalls

1df74a3

test(infra): handle oversized tunnel timeout rejection

c005ba9

chatgpt-codex-connector Bot reviewed May 29, 2026

Applied-AI-Solutions-hub mentioned this pull request May 30, 2026

[Bug]: Codex OAuth compaction falls back to direct OpenAI API and fails without OPENAI_API_KEY #86820

Closed

steipete mentioned this pull request May 30, 2026

Codex missing-terminal fallback leaks into Discord channel on 2026.5.27 despite #87079 #87725

Closed

This was referenced Jun 3, 2026

test(codex): pin completion-idle timeout thread reset #90027

Merged

Codex app-server turn idle timeout is surfaced as user interruption #89974

Open

3 participants
