Summary
On codex-cli 0.119.0-alpha.11, switching into a freshly spawned subagent thread from the TUI can terminate the parent TUI back to the shell with:
Error: skills/list failed in TUI
This is distinct from the stale-busy-spinner bug in #16904. In this case the TUI actually exits, the parent rollout file stays empty, and the spawned child thread continues to exist independently.
Environment
- @openai/codex 0.119.0-alpha.11
- local source audited at e169c915824307eef6c175b7f28fb381da853ef0
- Linux in tmux
- happened on April 6, 2026 at about 15:26:29Z (18:26:29 Europe/Moscow)
Repro shape
I hit this while working in a tmux window named copilot fix.
High-level sequence:
- Run a normal Codex TUI session in tmux.
- Spawn a clean subagent / switch into that agent thread from the parent TUI.
- The UI briefly shows the spawned agent.
- The parent TUI exits to the shell with Error: skills/list failed in TUI.
The specific user action was a request to run a review in a clean subagent without forked history, but the important part seems to be subagent spawn -> thread/session switch -> startup skills refresh.
Actual behavior
- The pane returned to a shell prompt.
- The visible last error in the pane was:
Error: skills/list failed in TUI
- The parent thread's rollout file was created but remained empty:
~/.codex/sessions/2026/04/06/rollout-2026-04-06T15-42-28-019d62d0-d99e-7132-81d5-cf06a8fad414.jsonl (size=0)
- The parent shell snapshot file also remained empty:
~/.codex/shell_snapshots/019d62d0-d99e-7132-81d5-cf06a8fad414.tmp-1775479348072805930 (size=0)
- The spawned child thread did exist and got its own rollout:
  - child thread id 019d6367-056d-7ef2-a8f5-9117c84e6c38
Runtime evidence
From ~/.codex/log/codex-tui.log around the crash window for parent thread 019d62d0-d99e-7132-81d5-cf06a8fad414:
- repeated rollout recorder failures before and during spawn:
failed to record rollout items: failed to queue rollout items: channel closed
- then successful subagent/session init for child thread 019d6367-056d-7ef2-a8f5-9117c84e6c38
So by the time the TUI switched/attached, the parent already had a broken rollout writer, and then the TUI surfaced skills/list failed in TUI before exiting.
Code-path audit
I did a local source review of 0.119.0-alpha.11. The most important finding is that skills/list is treated as fatal in a path where it should almost certainly be degradable.
1. Thread/session switch triggers a skills refresh automatically
When a session is configured, the chat widget immediately submits list_skills(force_reload = true):
codex-rs/tui/src/chatwidget.rs:2009
That SessionConfigured path is reached during thread replacement / attach flows such as:
codex-rs/tui/src/app.rs:3319
codex-rs/tui/src/app.rs:3328
codex-rs/tui/src/app.rs:3056
So switching to a spawned subagent thread can trigger a fresh skills/list during the attach/configure path.
2. skills/list RPC failures bubble as hard errors out of the event loop
The RPC wrapper itself uses:
codex-rs/tui/src/app_server_session.rs:618
.wrap_err("skills/list failed in TUI")
Then AppCommandView::ListSkills uses await?:
codex-rs/tui/src/app.rs:2322
And both AppEvent::CodexOp and AppEvent::SubmitThreadOp also use await?:
codex-rs/tui/src/app.rs:4268
codex-rs/tui/src/app.rs:4271
The main loop breaks on any error returned by handle_event:
codex-rs/tui/src/app.rs:3863
codex-rs/tui/src/app.rs:3925
That makes a skills/list failure terminate the entire TUI instead of surfacing a non-fatal UI error and continuing.
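The propagation described above can be sketched in a few lines. This is only an illustration of the shape, not the codex-rs code: Event, list_skills, handle_event and run are hypothetical stand-ins, and the real handlers are async and use a richer error type.

```rust
// Hypothetical stand-ins; none of these names come from codex-rs.
#[derive(Debug, Clone, Copy)]
enum Event {
    ListSkills,
    Draw,
}

// Stand-in for the RPC wrapper: always fails, mimicking the wrapped error.
fn list_skills() -> Result<(), String> {
    Err("skills/list failed in TUI".to_string())
}

// Mirrors the `await?` shape: any RPC error becomes the handler's error.
fn handle_event(event: Event) -> Result<(), String> {
    match event {
        Event::ListSkills => list_skills()?,
        Event::Draw => {}
    }
    Ok(())
}

// Mirrors a main loop that stops on the first handler error.
// Returns how many events were handled before the loop ended, plus the error.
fn run(events: &[Event]) -> (usize, Option<String>) {
    let mut handled = 0;
    for &event in events {
        if let Err(err) = handle_event(event) {
            return (handled, Some(err));
        }
        handled += 1;
    }
    (handled, None)
}
```

With the events Draw, ListSkills, Draw, the loop ends at the second event and the trailing Draw is never processed, which is the same shape as the TUI exiting on a failed refresh.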
3. There is a separate fatal-disconnect path as well
If the app-server event stream disconnects, the adapter explicitly requests fatal exit:
codex-rs/tui/src/app/app_server_adapter.rs:149
codex-rs/tui/src/app.rs:4264
I did not capture a direct app-server event stream disconnected log line for this exact repro, so I cannot prove that path fired here. But the code means the TUI currently has no graceful recovery for an app-server-side failure while switching threads.
4. Rollout recorder breakage is visible before the exit
The rollout recorder error comes from:
codex-rs/rollout/src/recorder.rs:504
codex-rs/core/src/codex.rs:3744
This matches the runtime log evidence that the parent session's rollout writer was already broken before the visible skills/list failed in TUI exit.
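For context on what "channel closed" usually implies: with an mpsc-style channel, every send fails once the receiving half is gone, so each later record attempt reports the same error. The sketch below uses std::sync::mpsc purely for illustration. Recorder and new_recorder are hypothetical names, and I have not confirmed which channel type the real recorder uses.

```rust
use std::sync::mpsc;

// Hypothetical recorder handle: holds only the sending half of a channel.
struct Recorder {
    tx: mpsc::Sender<String>,
}

impl Recorder {
    // Queues one item; fails if the receiving half has been dropped.
    fn record(&self, item: &str) -> Result<(), String> {
        self.tx
            .send(item.to_string())
            .map_err(|_| "failed to queue rollout items: channel closed".to_string())
    }
}

// Builds a recorder and returns the receiving half, which a writer task would own.
fn new_recorder() -> (Recorder, mpsc::Receiver<String>) {
    let (tx, rx) = mpsc::channel();
    (Recorder { tx }, rx)
}
```

If the real recorder behaves like this, then repeated "channel closed" lines would point at the writer side having gone away earlier, rather than at the individual record calls.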
Why this seems buggy
skills/list is not a critical user turn operation. It is a startup / refresh convenience RPC. Failing to refresh skills should not crash the whole TUI, especially during thread switches or subagent attach.
At minimum, this path should degrade to an inline error or warning and leave the session alive.
Expected behavior
- Switching to a spawned subagent thread should not exit the parent TUI.
- If skills/list fails, the UI should continue running and display a recoverable error.
- If the rollout recorder channel is already dead, that state should be surfaced clearly and should not cascade into a blank parent rollout plus fatal TUI exit.
Test gap
I found tests around SessionConfigured handling and thread attach behavior, but I did not find a regression test for:
- SessionConfigured -> list_skills(force_reload=true) failing during thread switch / attach
- TUI surviving that failure without exiting
Suggested fix direction
Two likely fixes, both useful:
- Make skills/list failures non-fatal in the SessionConfigured / refresh path.
- Investigate why the parent rollout recorder channel can already be closed during the same turn, since that appears to leave the parent thread with an empty rollout file and may be a precursor to the crash.
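A rough sketch of what the first fix could look like, assuming the refresh result can be handled where it arrives instead of being propagated with `?`. Ui and refresh_skills are hypothetical names for illustration and do not come from codex-rs.

```rust
// Hypothetical stand-ins; none of these names come from codex-rs.
#[derive(Default)]
struct Ui {
    warnings: Vec<String>,
    skills: Vec<String>,
}

// Degraded handling: a failed refresh becomes a visible warning and the
// previously loaded skills are kept, so the caller never sees an error.
fn refresh_skills(ui: &mut Ui, result: Result<Vec<String>, String>) {
    match result {
        Ok(skills) => ui.skills = skills,
        Err(err) => ui
            .warnings
            .push(format!("could not refresh skills: {}", err)),
    }
}
```

Because refresh_skills returns nothing, a failed refresh cannot end the event loop. A regression test for the gap above could assert the same two things: the warning is recorded and the session keeps its earlier skills.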