[Bug]: openai-codex/gpt-5.5 hangs silently on 2026.4.25 — no tokens streamed, no error logged, legacy OAuth path #72966

@ped-ro

Bug type

Regression (worked before, now fails)

Summary

On OpenClaw 2026.4.25, agent runs targeting openai-codex/gpt-5.5 hang on the first turn until something aborts them. The model never streams a token, no error is logged, and the only way to recover is an external abort (stop button or external timeout). The trajectory artifact shows assistantTexts: [] and itemLifecycle.startedCount: 0, meaning the run never starts producing output.

The legacy openai-codex/gpt-5.4 path on the same OAuth profile, same host, same gateway works fine. Direct codex exec --model gpt-5.5 on the same OAuth profile also works fine (~4s round-trip). Direct POST /v1/responses against gpt-5.5 with API-key auth also works (~1.9s, returns gpt-5.5-2026-04-23).

So upstream is healthy and OAuth is healthy. The hang is inside the OpenClaw openai-codex-responses adapter for gpt-5.5 specifically.

Environment

  • OpenClaw: 2026.4.25 (aa36ee6) — installed via npm global
  • Host: macOS 25.4.0 / arm64 (Apple Silicon M1)
  • Node: v25.9.0
  • codex CLI: v0.118.0
  • Provider: openai-codex (legacy OAuth path, not the new bundled codex/* harness)
  • Auth: ChatGPT/Codex OAuth (Plus subscription)
  • Channel: webchat (also reproduced via openclaw agent --agent main --message ...)

Config (relevant excerpt)

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai-codex/gpt-5.5",
        "fallbacks": []
      },
      "models": {
        "openai-codex/gpt-5.4": {},
        "openai-codex/gpt-5.5": {}
      },
      "timeoutSeconds": 300,
      "thinkingDefault": "medium"
    }
  },
  "auth": {
    "profiles": {
      "openai-codex:<redacted>": {
        "provider": "openai-codex",
        "mode": "oauth"
      }
    }
  }
}

Steps to reproduce

  1. On OpenClaw 2026.4.25, with an OAuth-backed openai-codex profile that works for gpt-5.4, set agents.defaults.model.primary to openai-codex/gpt-5.5.
  2. Hot-reload picks up the change ([reload] config hot reload applied).
  3. Send any prompt via webchat or openclaw agent.
  4. The run hangs: no tokens stream, and gateway.log / gateway.err.log show no errors other than the external abort surfaced ~5 min later by the embedded run timeout (timeoutSeconds: 300).

Expected behavior

gpt-5.5 should respond on the same OAuth profile that works in the codex CLI. Or, if some adapter-level incompatibility exists, it should surface a clear error rather than silently hang.

Actual behavior — trajectory artifact

From ~/.openclaw/agents/main/sessions/<runId>.trajectory.jsonl:

{
  "type": "model.completed",
  "provider": "openai-codex",
  "modelId": "gpt-5.5",
  "modelApi": "openai-codex-responses",
  "data": {
    "aborted": true,
    "externalAbort": true,
    "timedOut": true,
    "idleTimedOut": false,
    "compactionCount": 0,
    "promptError": "This operation was aborted | This operation was aborted",
    "promptErrorSource": "prompt",
    "assistantTexts": [],
    "messagesSnapshot": []
  }
}

Followed by:

{
  "type": "trace.artifacts",
  "data": {
    "finalStatus": "error",
    "aborted": true,
    "externalAbort": true,
    "timedOut": true,
    "itemLifecycle": {
      "startedCount": 0,
      "completedCount": 0,
      "activeCount": 0
    },
    "toolMetas": [],
    "didSendViaMessagingTool": false
  }
}

startedCount: 0 is the key signal — the run never even started producing items.
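
To check other sessions for the same signature, a minimal sketch along these lines could scan a trajectory file. It assumes the file holds one JSON object per line (the excerpts above are pretty-printed for readability), and the function name is hypothetical, not part of OpenClaw.

```python
import json
from typing import Iterable, Iterator


def find_silent_aborts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield trace.artifacts records for runs aborted before any item started.

    Assumption: each line is one complete JSON object (JSONL).
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict) or record.get("type") != "trace.artifacts":
            continue
        data = record.get("data") or {}
        lifecycle = data.get("itemLifecycle") or {}
        # The signature from this report: aborted, with zero started items.
        if data.get("aborted") and lifecycle.get("startedCount") == 0:
            yield record
```

Usage would be something like `with open(path) as f: hits = list(find_silent_aborts(f))`, where `path` points at a `<runId>.trajectory.jsonl` file.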

Gateway log evidence

After the gateway restart at 09:18 PDT on 2026-04-27:

2026-04-27T09:18:11 [gateway] agent model: openai-codex/gpt-5.5
...
2026-04-27T10:16:38 [agent/embedded] embedded run failover decision:
  runId=7a85c138-... stage=assistant decision=surface_error
  reason=timeout from=openai-codex/gpt-5.5 profile=sha256:2ed3747bab55

Zero auth errors in this window: grep -iE "401|403|oauth|expired|refresh" returns nothing for the openai-codex provider.
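
For anyone who wants to repeat that check without grep, a rough Python equivalent is sketched below. The pattern is the one quoted above. Restricting to lines that mention the provider name is my assumption about how to scope the search to openai-codex, and the function name is hypothetical.

```python
import re
from typing import Iterable, List

# Same alternation as the grep -iE pattern above, case-insensitive.
AUTH_PATTERN = re.compile(r"401|403|oauth|expired|refresh", re.IGNORECASE)


def auth_error_lines(lines: Iterable[str], provider: str = "openai-codex") -> List[str]:
    """Return log lines that mention the provider and match the auth pattern.

    Assumption: a relevant line contains the provider name verbatim.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if provider in line and AUTH_PATTERN.search(line)
    ]
```

An empty list for the post-restart window corresponds to the "returns nothing" result reported here.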

A/B isolation evidence

Three control probes compared against the failing path, on the same machine and with the same OAuth profile where applicable:

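| Path                            | Model   | Auth    | Result           | Time       |
|---------------------------------|---------|---------|------------------|------------|
| OpenClaw 2026.4.25 embedded run | gpt-5.5 | OAuth   | ❌ hangs forever | abort only |
| OpenClaw 2026.4.25 embedded run | gpt-5.4 | OAuth   | ✅ works         | normal     |
| codex exec --model gpt-5.5 ...  | gpt-5.5 | OAuth   | ✅ works         | ~4s        |
| curl POST /v1/responses         | gpt-5.5 | API key | ✅ works         | ~1.9s      |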

This isolates the failure to OpenClaw 2026.4.25's openai-codex-responses adapter, specifically when the model is gpt-5.5. Same adapter on gpt-5.4 works fine.
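
To make the comparison repeatable, each probe can be run under a hard timeout so that a hang is recorded as a measurable result instead of blocking the shell. The following is a minimal sketch. `ProbeResult` and `time_probe` are hypothetical helper names, and the command passed in is whatever probe you want to time.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProbeResult:
    ok: bool          # process exited with status 0
    timed_out: bool   # process was killed after exceeding timeout_s
    seconds: float    # wall-clock duration of the probe


def time_probe(cmd: Sequence[str], timeout_s: float = 60.0) -> ProbeResult:
    """Run a command, measure wall-clock time, and report hangs as timeouts."""
    start = time.monotonic()
    try:
        completed = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return ProbeResult(False, True, time.monotonic() - start)
    return ProbeResult(completed.returncode == 0, False, time.monotonic() - start)


if __name__ == "__main__":
    # Placeholder command; substitute the real probe invocation.
    print(time_probe([sys.executable, "-c", "print('ok')"], timeout_s=30))
```

For the CLI row in the table, the command list would start with `["codex", "exec", "--model", "gpt-5.5", ...]`, with the remaining arguments being whatever prompt you use.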

Workaround

Setting agents.defaults.model.primary = openai-codex/gpt-5.4 fully restores agent function. No re-auth needed.
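
If you prefer to script the switch, here is a small sketch that operates on the parsed config. It only touches the agents.defaults.model.primary key shown in the excerpt above. The helper name is hypothetical, and loading and saving the file is left out because this report does not state where the config file lives.

```python
import copy


def with_primary_model(config: dict, model: str) -> dict:
    """Return a copy of the config with agents.defaults.model.primary set to `model`."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    model_cfg = (
        updated.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("model", {})
    )
    model_cfg["primary"] = model
    return updated
```

For example, `with_primary_model(cfg, "openai-codex/gpt-5.4")` yields the workaround config, and the same call with `"openai-codex/gpt-5.5"` restores the failing setup for retesting.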

Suggested fix direction

Per #64744's analysis, CodexAppServerClient.initialize() and related request-path entry points lack timeoutMs protection. Even if the underlying gpt-5.5 routing bug has its own root cause, adding the timeout would at minimum surface a real error instead of a silent infinite hang.
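
To illustrate the shape of that guard, here is a language-agnostic sketch written in Python's asyncio. It is not OpenClaw code: `with_timeout`, `StepTimeoutError`, and the stand-in coroutine are hypothetical, and in a Node codebase the equivalent would be a promise raced against a timer. The point is only that a bounded await turns a silent hang into a named error.

```python
import asyncio


class StepTimeoutError(TimeoutError):
    """Raised when a guarded step does not finish within its deadline."""


async def with_timeout(awaitable, timeout_ms: int, label: str):
    """Await `awaitable`, raising a labelled error if it exceeds timeout_ms."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_ms / 1000)
    except asyncio.TimeoutError as exc:
        raise StepTimeoutError(
            f"{label} did not complete within {timeout_ms} ms"
        ) from exc


async def _hung_initialize():
    # Stand-in for a request that never gets a response.
    await asyncio.Event().wait()


if __name__ == "__main__":
    try:
        asyncio.run(with_timeout(_hung_initialize(), 200, "initialize"))
    except StepTimeoutError as err:
        print(f"surfaced error: {err}")
```

With a guard like this around the initialize call and the first request, the trajectory would carry a specific timeout message rather than only the generic "This operation was aborted" from the outer 300 s run timeout.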
