[Bug]: Codex dynamic tool calls can leave sessions stuck as blocked_tool_call #83474

### Bug type

Crash / hang

### Beta release blocker

No

### Summary

Codex app-server dynamic tool calls can leave an OpenClaw session stuck in `blocked_tool_call` even after the tool request has returned a successful result to the UI/transcript.

I checked for existing reports before filing. This looks related to the orphaned tool-call family in #42112 and the failed-tool-call hang family in #8288, but I did not find an exact duplicate for this `item/tool/call` response path. Here the dynamic tool request returns and the UI shows the bash result as completed, yet the gateway diagnostics keep reporting `activeTool=bash` with `recovery=none`.

### Steps to reproduce

1. Run OpenClaw with the bundled Codex harness / `@openclaw/codex`.
2. Start a Codex-backed agent turn that executes a dynamic `bash` tool request.
3. In the observed case, the tool command was:

```bash
/bin/bash -lc 'PYTHONDONTWRITEBYTECODE=1 .venv/bin/python scripts/validate_public_install.py --dev && PYTHONDONTWRITEBYTECODE=1 timeout 600 .venv/bin/python scripts/run_security_contract_validation.py --include-pytest'
```

The command ran in:

```text
<redacted-local-worktree-path>
```

4. The tool UI/result reports: `No output — tool completed successfully.`
5. Do not restart the gateway immediately. Watch the active session diagnostics.

### Expected behavior

After the dynamic tool response returns to Codex/OpenClaw, OpenClaw should clear the active tool bookkeeping and emit a terminal `tool.execution.completed` or `tool.execution.error` diagnostic for that `toolCallId`. The session should then either continue to the next model event or hit the normal post-tool completion guard, but it should not remain classified as an active blocked tool call.

### Actual behavior

The session remains in `state=processing` and the gateway repeatedly classifies it as `blocked_tool_call` with `activeTool=bash`, even though the tool response has already been surfaced as completed.

Relevant gateway log excerpt from May 18, 2026:

```text
2026-05-18T07:52:04.249+02:00 [diagnostic] stalled session: sessionId=<redacted-session-id> sessionKey=<redacted-session-key> state=processing age=143s queueDepth=0 reason=blocked_tool_call classification=blocked_tool_call activeWorkKind=tool_call lastProgress=codex_app_server:notification:thread/tokenUsage/updated lastProgressAge=141s activeTool=bash activeToolCallId=<redacted-tool-call-id> activeToolAge=148s recovery=none
2026-05-18T07:57:04.291+02:00 [diagnostic] stalled session: sessionId=<redacted-session-id> sessionKey=<redacted-session-key> state=processing age=443s queueDepth=0 reason=blocked_tool_call classification=blocked_tool_call activeWorkKind=tool_call lastProgress=codex_app_server:notification:thread/tokenUsage/updated lastProgressAge=441s activeTool=bash activeToolCallId=<redacted-tool-call-id> activeToolAge=448s recovery=none
2026-05-18T07:57:27.172+02:00 [gateway] draining 2 active task(s) and 1 active embedded run(s) before restart with timeout 300000ms
2026-05-18T07:57:57.185+02:00 [gateway] still draining 2 active task(s) and 1 active embedded run(s) before restart
openclaw-gateway.service: State 'stop-sigterm' timed out. Killing.
```

An older affected turn from the same session family eventually hit the long terminal idle timeout instead:

```text
turn.terminal_idle_timeout ... idleMs=1800001 timeoutMs=1800000 lastActivityReason=notification:thread/tokenUsage/updated
```
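For triage, the stalled-session lines above can be scanned mechanically. The following is a minimal sketch, not part of OpenClaw: the helper names `parseDiagnosticFields` and `isStuckDynamicTool` are hypothetical, and the only assumption about the log format is the space-separated `key=value` layout visible in the excerpt.

```typescript
// Hypothetical triage helpers; not OpenClaw APIs. Field names come from the
// log excerpt in this report.
type Fields = Record<string, string>;

// Collect the space-separated key=value pairs of one diagnostic line.
function parseDiagnosticFields(line: string): Fields {
  const fields: Fields = {};
  const pattern = /(\w+)=(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    fields[match[1]] = match[2];
  }
  return fields;
}

// True when a line shows the pattern described in this report: a stalled
// session classified as blocked_tool_call, with an active tool and no recovery.
function isStuckDynamicTool(line: string): boolean {
  if (!line.includes("stalled session:")) return false;
  const f = parseDiagnosticFields(line);
  return (
    f["classification"] === "blocked_tool_call" &&
    "activeTool" in f &&
    f["recovery"] === "none"
  );
}
```

Applied to the two `stalled session` lines in the excerpt, `isStuckDynamicTool` returns `true`; the gateway drain lines do not contain `stalled session:` and are skipped.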

### OpenClaw version

Observed on installed OpenClaw / `@openclaw/codex` 2026.5.12. The affected source path still existed on current upstream `main` when checked on May 18, 2026, before preparing the linked PR.

### OS

Linux x64, systemd user service. Host evidence was collected on kernel `6.19.14+kali-amd64`.

### Install method

npm-managed OpenClaw install with user `openclaw-gateway.service`.

### Model

Codex-backed OpenAI model routing. The affected session was using the OpenAI Codex provider override with `gpt-5.4`/Codex harness routing.

### Provider / routing chain

OpenClaw gateway -> bundled `@openclaw/codex` -> Codex app-server / ChatGPT Codex transport.

### Additional provider/model setup details

No API-key provider path is required to hit the observed issue; this was the Codex app-server dynamic tool bridge path.

### Logs/screenshots/evidence

Evidence above includes:

- The exact dynamic bash command and redacted working-directory evidence.
- UI/tool-result state: `No output — tool completed successfully.`
- Gateway diagnostics repeatedly reporting `blocked_tool_call activeTool=bash recovery=none` with session and tool-call identifiers redacted.
- Gateway restart drain timing out because the embedded run remained active.
- A prior affected turn reaching `turn.terminal_idle_timeout` after 30 minutes.

### Impact/severity

High for Codex-backed local agents. A successfully completed dynamic shell command can still leave the OpenClaw session unusable until timeout or gateway restart. Restart may also hang during drain and require systemd to kill the service.

### Additional information

I prepared a PR that makes the dynamic `item/tool/call` request boundary emit terminal tool diagnostics and clear active dynamic tool bookkeeping in `finally`, so the gateway does not keep a completed dynamic tool as active.
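The shape of that change can be illustrated with a minimal sketch. This is not the PR's code: `ActiveToolRegistry`, `handleDynamicToolCall`, and the `emit` callback are hypothetical names chosen for illustration, and only the two diagnostic event names are taken from this report.

```typescript
// Hypothetical names throughout; this is an illustration, not OpenClaw code.
type ToolDiagnostic = {
  event: "tool.execution.completed" | "tool.execution.error";
  toolCallId: string;
  toolName: string;
};

// Minimal stand-in for the gateway's active-tool bookkeeping.
class ActiveToolRegistry {
  private readonly active = new Map<string, string>(); // toolCallId -> toolName
  start(toolCallId: string, toolName: string): void {
    this.active.set(toolCallId, toolName);
  }
  clear(toolCallId: string): void {
    this.active.delete(toolCallId);
  }
  has(toolCallId: string): boolean {
    return this.active.has(toolCallId);
  }
}

// Wraps one dynamic tool request so that bookkeeping is always cleared and
// exactly one terminal diagnostic is emitted, whether `run` resolves or throws.
async function handleDynamicToolCall<T>(
  registry: ActiveToolRegistry,
  emit: (diagnostic: ToolDiagnostic) => void,
  toolCallId: string,
  toolName: string,
  run: () => Promise<T>,
): Promise<T> {
  registry.start(toolCallId, toolName);
  let event: ToolDiagnostic["event"] = "tool.execution.error";
  try {
    const result = await run();
    event = "tool.execution.completed";
    return result;
  } finally {
    // Clear first so a failing emit cannot leave the tool marked active.
    registry.clear(toolCallId);
    emit({ event, toolCallId, toolName });
  }
}
```

Defaulting `event` to the error variant means any path that does not reach the success assignment still produces a terminal diagnostic for that `toolCallId`.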

Sensitive values in the evidence were redacted after filing while preserving the observed event order, diagnostic classification, active tool name, timings, and restart-drain behavior.
