fix(codex): ignore account updates for turn liveness #79667
Codex review: needs maintainer review before merge.
Summary
Reproducibility: yes. Source inspection shows current main refreshes liveness before current-turn filtering, and the PR body includes live Telegram/OpenAI repro evidence for the lost-completion plus account-update path; I did not run the live path in this read-only review.
Real behavior proof
Next step before merge
Security
Review details
Best possible solution: Land this focused fix after maintainer approval and merge gates if it is intended to close the linked timeout report; otherwise adjust the closing reference and keep broader timeout policy work tracked separately.
Do we have a high-confidence way to reproduce the issue? Yes. Source inspection shows current main refreshes liveness before current-turn filtering, and the PR body includes live Telegram/OpenAI repro evidence for the lost-completion plus account-update path; I did not run the live path in this read-only review.
Is this the best way to solve the issue? Yes for the stated PR scope. Moving the liveness refresh behind
Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 0d277e9533af.
Cherry-pick from upstream 5fdef4c. Fixes codex app-server completion liveness by ignoring account updates during a turn. Conflicts were resolved by taking the upstream versions (pre-existing conflict markers in our fork are preserved by this cherry-pick).
6 upstream cherry-picks applied:
- fix(telegram): mirror outbound replies to session transcript
- fix(gateway): harden macOS update restart lifecycle
- fix(telegram): harden command menu cache keys
- Reduce Telegram command menu CPU work
- fix(status): show codex usage for codex harness
- fix(codex): ignore account updates for turn liveness (openclaw#79667)

Skipped: imessage, whatsapp, slack fixes (deleted in fork)
* fix codex app-server completion liveness
* docs changelog codex liveness fix
Summary
`turn/completed` was lost, because unrelated account/rate-limit notifications refreshed active-turn liveness.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
- `turn/completed` while account/rate-limit updates continue.
- `blackhole-after-message-tool-completed-and-inject-account`, sent live Telegram agent messages, and observed the CLI JSON result plus gateway/proxy logs.
- `OC78756_FIXED_MAIN_RERUN` returned `status: timeout`, `payloads: []`, `durationMs: 66765`; gateway logged `codex app-server turn idle timed out waiting for completion` while the proxy continued account update injections.
Root Cause (if applicable)
- `run-attempt.ts` refreshed turn-completion activity for every app-server notification, including account/rate-limit updates unrelated to the current turn. Separately, the embedded runner's timeout fallback did not account for already-visible messaging-tool delivery evidence.
- `turn/completed` condition, and no runner test covered timeout suppression after messaging-tool delivery.
Regression Test Plan (if applicable)
- `extensions/codex/src/app-server/run-attempt.test.ts`; `src/agents/pi-embedded-runner/run.overflow-compaction.loop.test.ts`
User-visible / Behavior Changes
Codex app-server runs that already visibly delivered a message now stop cleanly on completion-idle timeout when the completion signal is lost, without extending liveness from background account updates or sending a duplicate generic timeout reply.
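The behavior described above can be illustrated with a minimal sketch. It is not the code in `run-attempt.ts` or the embedded runner; the notification shape, the `TurnLivenessTracker` class, the `shouldSendTimeoutReply` helper, and the `turnId` field are all hypothetical names chosen for illustration. The sketch assumes that notifications belonging to a turn carry that turn's id and that account or rate-limit updates do not.

```typescript
// Hypothetical notification shape; the real app-server protocol types may differ.
interface AppServerNotification {
  method: string; // e.g. "turn/completed" or an account/rate-limit update
  params?: { turnId?: string };
}

// Tracks the last activity attributable to one turn (illustrative only).
class TurnLivenessTracker {
  private readonly turnId: string;
  private readonly idleTimeoutMs: number;
  private lastActivityMs: number;

  constructor(turnId: string, idleTimeoutMs: number, startMs: number) {
    this.turnId = turnId;
    this.idleTimeoutMs = idleTimeoutMs;
    this.lastActivityMs = startMs;
  }

  // Filter first, then refresh: a notification that does not belong to the
  // current turn never extends liveness. Returns true when it was counted.
  observe(notification: AppServerNotification, nowMs: number): boolean {
    if (notification.params?.turnId !== this.turnId) {
      return false;
    }
    this.lastActivityMs = nowMs;
    return true;
  }

  // True once no current-turn activity has been seen for the idle window.
  isIdle(nowMs: number): boolean {
    return nowMs - this.lastActivityMs >= this.idleTimeoutMs;
  }
}

// On idle timeout, skip the generic timeout reply when a messaging tool has
// already delivered something visible to the user (assumed boolean input).
function shouldSendTimeoutReply(
  timedOut: boolean,
  messagingToolDelivered: boolean,
): boolean {
  return timedOut && !messagingToolDelivered;
}
```

With this ordering, a stream of background account updates cannot keep a turn alive after its completion signal is lost, and a run that already delivered a message ends without a second, generic timeout reply.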
Diagram (if applicable)
Security Impact (required)
Yes, explain risk + mitigation: N/A
Repro + Verification
Environment
- gpt-5.5
Steps
- `message.action` call.
- `turn/completed`, while injecting account/rate-limit updates.
Expected
Actual
- `OC78756_FIXED_MAIN_RERUN`, `status: timeout`, `payloads: []`, `durationMs: 66765`, gateway log `codex app-server turn idle timed out waiting for completion`.
- `OC78756_REL57_FIXED_RERUN`, `status: timeout`, `payloads: []`, `durationMs: 71715`, gateway log `codex app-server turn idle timed out waiting for completion`.
Evidence
Human Verification (required)
Verification commands run:
Review Conversations
Compatibility / Migration
Risks and Mitigations