Any Single Engine Crash Leaves Orphaned Threads That Cause a Persistent Crash/Resume Loop #16271

### What version of the Codex App are you using (From “About Codex” dialog)?

26.325 (internal engine: 0.118.0-alpha.2)

### What subscription do you have?

ChatGPT Pro

### What platform is your computer?

Windows 11 Pro 10.0.26200

### What issue are you seeing?

When the Codex engine process exits unexpectedly (for any reason — MCP failure, personality error, OOM, etc.), the Desktop App correctly detects the crash, auto-restarts the engine, and attempts to resume the interrupted conversation thread. However, the resume logic has a cascading failure mode that makes recovery impossible without manual intervention:

1. The interrupted thread's rollout file is never written, because the crash happened before the file was flushed.
2. On restart, the App sends thread/resume, which succeeds (the thread goes to running).
3. The App then sends a second thread/resume with config overrides. Because the thread is already running, the overrides are silently dropped with WARN thread/resume overrides ignored for running thread.
4. The App attempts to thread/archive the thread, which fails: "no rollout found for thread id <id>".
5. Approximately 90 seconds later, the engine crashes again with exit code 3221225786 (STATUS_CONTROL_C_EXIT / 0xC000013A).
6. Return to step 1. The loop is infinite and requires manual state database deletion to break.
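
For anyone cross-checking the exit code in the loop above, here is a minimal sketch that converts the decimal value from the log into its NTSTATUS hex form. The lookup table is a small illustrative subset of well-known NTSTATUS values, not an exhaustive list, and the helper name `describe_exit_code` is made up for this example.

```python
# A few well-known NTSTATUS values (illustrative subset, not exhaustive).
KNOWN_NTSTATUS = {
    0xC000013A: "STATUS_CONTROL_C_EXIT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(code: int) -> str:
    """Return 'decimal = 0xHEX (NAME)' for a 32-bit Windows exit code."""
    # Some tools report the same value as a signed 32-bit integer; normalise it.
    code &= 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(code, "unknown")
    return f"{code} = 0x{code:08X} ({name})"


print(describe_exit_code(3221225786))
# prints: 3221225786 = 0xC000013A (STATUS_CONTROL_C_EXIT)
```

The same helper accepts the signed form (-1073741510), which is how some process monitors display this status.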



Relevant log sequence (timestamps from %LOCALAPPDATA%\Packages\OpenAI.Codex_*\LocalCache\Local\Codex\Logs\)

19:28:09  thread/resume → success (latestTurnStatus=interrupted → running)
19:29:09  thread/resume again → WARN: "thread/resume overrides ignored for running thread
          019d4035-...: config overrides were provided and ignored while running;
          developerInstructions override was provided and ignored while running"
19:29:13  thread/archive → ERROR: "no rollout found for thread id 019d4035-..."
19:30:48  app_server_connection.closed code=3221225786
          → fatal_error_broadcasted
          → cause=start_process → reconnecting
          (loop repeats)
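
To make the timing in the excerpt above easier to read, the following sketch computes the gap between consecutive events. The timestamps and labels are transcribed by hand from the log lines in this report; the function name is hypothetical, and the calculation assumes all events fall on the same day.

```python
from datetime import datetime

# (time, label) pairs transcribed from the log excerpt in this report.
EVENTS = [
    ("19:28:09", "thread/resume success"),
    ("19:29:09", "thread/resume overrides ignored"),
    ("19:29:13", "thread/archive failed"),
    ("19:30:48", "app_server_connection.closed"),
]


def gaps_in_seconds(events):
    """Return (label, seconds since previous event) for every event after the first."""
    times = [datetime.strptime(t, "%H:%M:%S") for t, _ in events]
    return [
        (events[i][1], int((times[i] - times[i - 1]).total_seconds()))
        for i in range(1, len(events))
    ]


for label, gap in gaps_in_seconds(EVENTS):
    print(f"+{gap:>3}s  {label}")
```

For this excerpt the gaps are 60 s, 4 s, and 95 s, so the connection closes about a minute and a half after the failed thread/archive call.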

### What steps can reproduce the bug?

During our session, the initial crash that started the loop was caused by one of three separate triggers, each of which is fatal on its own (see related issues):

- WARN codex_protocol::openai_models: Model personality requested but model_messages is missing for model=gpt-5.4 personality=friendly → crash
- WARN rmcp::transport::worker: worker quit with fatal: Transport channel closed, when AuthRequired from plugin-injected Stripe MCP → crash
- WARN codex_core::mcp_connection_manager: Failed to list resource templates for MCP server 'playwright': Mcp error: -32601: Method not found → crash

Any of these triggers leaves a thread in interrupted state with no rollout file, which then trips the resume loop.

### What is the expected behavior?

The engine should gracefully handle a thread that has status=interrupted but no corresponding rollout file, either by marking it as unrecoverable and skipping it, or by creating an empty rollout as a placeholder. The second thread/resume call with overrides on an already-running thread should not cause instability.
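
A rough sketch of the "mark as unrecoverable and skip" option, written in Python purely for illustration. The `ThreadRecord` shape, the `unrecoverable` status value, and the file names are all hypothetical; the engine's real state schema and rollout layout are not shown in this report.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ThreadRecord:
    # Hypothetical record shape, not the engine's actual schema.
    thread_id: str
    status: str                   # e.g. "interrupted", "running", "unrecoverable"
    rollout_path: Optional[Path]  # None when no rollout was ever recorded


def plan_recovery(threads):
    """Split interrupted threads into resumable ones and orphans to skip."""
    resumable, orphaned = [], []
    for t in threads:
        if t.status != "interrupted":
            continue  # only interrupted threads are candidates for resume
        if t.rollout_path is not None and t.rollout_path.is_file():
            resumable.append(t)
        else:
            # No rollout on disk: flag the thread so it is not resumed again.
            t.status = "unrecoverable"
            orphaned.append(t)
    return resumable, orphaned


# Small demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "rollout-a.jsonl"  # hypothetical file name
    present.write_text("{}\n")
    threads = [
        ThreadRecord("a", "interrupted", present),
        ThreadRecord("b", "interrupted", Path(tmp) / "missing.jsonl"),
        ThreadRecord("c", "interrupted", None),
    ]
    resumable, orphaned = plan_recovery(threads)
    print([t.thread_id for t in resumable])  # ['a']
    print([t.thread_id for t in orphaned])   # ['b', 'c']
```

The point of checking before resuming is that a thread with no rollout never reaches the running state, so the later thread/archive call cannot hit the "no rollout found" error.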

### Additional information

The crash loop is unrecoverable until the user manually deletes state_5.sqlite*, session_index.jsonl, and sessions/.
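
For anyone applying this workaround, a more cautious variant is to move the state aside instead of deleting it. The sketch below assumes the Codex app is fully closed first and that you know which directory holds these files; the report does not state that location, so it is passed in as a parameter. The function name and backup folder name are made up for this example.

```python
import shutil
import tempfile
import time
from pathlib import Path

# Names taken from this report. Their on-disk location is NOT stated there,
# so the caller must supply the state directory.
STATE_PATTERNS = ["state_5.sqlite*", "session_index.jsonl", "sessions"]


def quarantine_state(state_dir: Path, backup_root: Path) -> list:
    """Move matching state files and directories into a timestamped backup folder."""
    backup_dir = backup_root / f"codex-state-backup-{time.strftime('%Y%m%d-%H%M%S')}"
    moved = []
    for pattern in STATE_PATTERNS:
        # sorted() materialises the matches before anything is moved.
        for item in sorted(state_dir.glob(pattern)):
            backup_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(item), str(backup_dir / item.name))
            moved.append(item.name)
    return moved


# Demonstration against a throwaway directory, not a real Codex install.
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "state"
    (state / "sessions").mkdir(parents=True)
    for name in ["state_5.sqlite", "state_5.sqlite-wal", "session_index.jsonl", "config.toml"]:
        (state / name).write_text("")
    moved = quarantine_state(state, Path(tmp) / "backups")
    remaining = sorted(p.name for p in state.iterdir())
    print(moved)
    print(remaining)  # ['config.toml']
```

Moving rather than deleting keeps the orphaned thread data available for debugging, and unrelated files in the same directory (here the placeholder `config.toml`) are left untouched.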
