[Bug]: v2026.5.19 Codex unusable on headless VPS: openai-codex auth binding failure and codex provider Cloudflare 403 #84893

@Han-HanqingDong

Summary

After upgrading a headless Ubuntu VPS from OpenClaw 2026.5.12 to 2026.5.19, Codex became unusable in two different ways depending on the model route:

  1. Existing openai-codex/gpt-5.5 route fails before inference with an auth binding error:
     No API key found for provider "openai-codex".
  2. Switching to the new codex/gpt-5.5 catalog route gets past auth resolution, but the actual request fails with a Cloudflare HTML 403 challenge from chatgpt.com/backend-api/responses:
     provider=codex api=openai-codex-responses model=gpt-5.5 status=403
     Authentication failed with an HTML 403 response from the provider.

This was tested on a clean, reversible upgrade with backups. Rolling back to 2026.5.12 immediately restored openai-codex/gpt-5.5 with the same OAuth profile, so this does not appear to be an expired OAuth token or account access issue.

Environment

  • Host type: headless VPS
  • OS: Ubuntu 24.04.4 LTS
  • Deployment: user systemd service openclaw-gateway.service
  • Node entrypoint: /usr/bin/node .../openclaw/dist/index.js gateway --port 18789
  • Working version before upgrade: OpenClaw 2026.5.12 (f066dd2)
  • Upgraded version tested: OpenClaw 2026.5.19 (a185ca2)
  • Global package after upgrade: openclaw@2026.5.19
  • Managed packages after upgrade:
    • openclaw@2026.5.19
    • @openclaw/codex@2026.5.19
    • @openai/codex@0.131.0
  • OAuth status according to openclaw models status --json: openai-codex status ok
  • No fallback model configured during verification

Baseline Before Upgrade

On 2026.5.12, the gateway was healthy and openai-codex/gpt-5.5 worked:

openclaw --version
OpenClaw 2026.5.12 (f066dd2)

systemctl --user is-active openclaw-gateway.service
active

Effective model state:

{
  "defaultModel": "openai-codex/gpt-5.5",
  "resolvedDefault": "openai-codex/gpt-5.5",
  "fallbacks": [],
  "allowed": ["openai-codex/gpt-5.5"],
  "missing": [],
  "oauth": [{ "provider": "openai-codex", "status": "ok" }]
}

Live agent check succeeded:

openclaw agent --agent main --session-id vultr-rollback512-after519-1779352869 \
  --message 'Reply exactly: OK' --thinking off --timeout 240 --json

Relevant result:

{
  "status": "ok",
  "result": {
    "payloads": [{ "text": "OK" }],
    "meta": {
      "executionTrace": {
        "winnerProvider": "openai-codex",
        "winnerModel": "gpt-5.5",
        "fallbackUsed": false,
        "runner": "embedded"
      }
    }
  }
}
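
The pass criteria for this live check can be made explicit. Below is a minimal Python sketch, assuming the `--json` output has the shape shown above. `check_agent_result` is a hypothetical helper written for this report, not part of OpenClaw.

```python
import json


def check_agent_result(raw: str, provider: str, model: str) -> list[str]:
    """Return a list of problems; an empty list means the check passed.

    Assumes the JSON shape shown in this report (status, result.payloads,
    result.meta.executionTrace). Hypothetical helper, not an OpenClaw API.
    """
    data = json.loads(raw)
    problems = []

    if data.get("status") != "ok":
        problems.append(f"status is {data.get('status')!r}, expected 'ok'")

    result = data.get("result", {})
    texts = [p.get("text") for p in result.get("payloads", [])]
    if texts != ["OK"]:
        problems.append(f"unexpected payload texts: {texts!r}")

    trace = result.get("meta", {}).get("executionTrace", {})
    if trace.get("winnerProvider") != provider:
        problems.append(f"winnerProvider is {trace.get('winnerProvider')!r}")
    if trace.get("winnerModel") != model:
        problems.append(f"winnerModel is {trace.get('winnerModel')!r}")
    # A fallback must not mask a failure of the route under test.
    if trace.get("fallbackUsed") is not False:
        problems.append("fallbackUsed is not false")

    return problems


# Sample copied from the relevant result above.
sample = """
{
  "status": "ok",
  "result": {
    "payloads": [{ "text": "OK" }],
    "meta": {
      "executionTrace": {
        "winnerProvider": "openai-codex",
        "winnerModel": "gpt-5.5",
        "fallbackUsed": false,
        "runner": "embedded"
      }
    }
  }
}
"""

print(check_agent_result(sample, "openai-codex", "gpt-5.5"))
```

Requiring `fallbackUsed` to be exactly `false` means a missing trace also counts as a failure rather than a silent pass.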

Upgrade Procedure

The upgrade was performed with backups of openclaw.json, agent state, the user systemd unit, and package manifests.

The commands used were essentially:

npm install -g openclaw@latest --no-audit --no-fund
npm install --prefix "$HOME/.openclaw/npm" openclaw@latest @openclaw/codex@latest --no-audit --no-fund --omit=dev
systemctl --user daemon-reload
systemctl --user restart openclaw-gateway.service

After upgrade:

openclaw --version
OpenClaw 2026.5.19 (a185ca2)

systemctl --user is-active openclaw-gateway.service
active

Failure Mode 1: Existing openai-codex/gpt-5.5 Route

After upgrade, keeping the existing default model openai-codex/gpt-5.5, models status still reported OAuth as present and OK:

{
  "defaultModel": "openai-codex/gpt-5.5",
  "resolvedDefault": "openai-codex/gpt-5.5",
  "fallbacks": [],
  "allowed": ["openai-codex/gpt-5.5"],
  "missing": [],
  "oauth": [{ "provider": "openai-codex", "status": "ok" }]
}

But a live agent call failed with:

Error: No API key found for provider "openai-codex". Auth store: /home/<user>/.openclaw/agents/main/agent/auth-profiles.json (agentDir: /home/<user>/.openclaw/agents/main/agent). Configure auth for this agent (openclaw agents add <id>) or copy only portable static auth profiles from the main agentDir.

This resembles the auth-profile / embedded-path work referenced in PR #84752, but in this environment it reproduces on 2026.5.19 after upgrading from a working 2026.5.12 state.

Failure Mode 2: New codex/gpt-5.5 Route

openclaw models list on 5.19 showed both the old and new routes:

openai-codex/gpt-5.5    text        195k   no  yes   default,configured
codex/gpt-5.5           text+image  266k   no  yes

I then changed the default model and allowlist to the new route:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "codex/gpt-5.5",
        "fallbacks": []
      },
      "models": {
        "codex/gpt-5.5": {}
      }
    }
  }
}
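
For reproducibility, the same change can be written as a small transformation on the parsed config. This is a sketch only: it assumes the fragment above is the complete shape of the `model` and `models` keys, and `switch_primary_model` is a hypothetical helper, not an OpenClaw command.

```python
import copy


def switch_primary_model(config: dict, model_ref: str) -> dict:
    """Return a copy of config with model_ref as the only primary/allowed model.

    Hypothetical helper: key names follow the fragment in this report.
    The input dict is left untouched so the old config can serve as a backup.
    """
    updated = copy.deepcopy(config)
    defaults = updated.setdefault("agents", {}).setdefault("defaults", {})
    # Keep fallbacks empty so another provider cannot mask a failure.
    defaults["model"] = {"primary": model_ref, "fallbacks": []}
    # The allowlist contains only the route under test.
    defaults["models"] = {model_ref: {}}
    return updated


before = {
    "agents": {
        "defaults": {
            "model": {"primary": "openai-codex/gpt-5.5", "fallbacks": []},
            "models": {"openai-codex/gpt-5.5": {}},
        }
    }
}
after = switch_primary_model(before, "codex/gpt-5.5")
print(after["agents"]["defaults"]["model"]["primary"])
```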

After restarting the gateway, models status showed:

{
  "defaultModel": "codex/gpt-5.5",
  "resolvedDefault": "codex/gpt-5.5",
  "fallbacks": [],
  "allowed": ["codex/gpt-5.5"],
  "missing": ["codex"],
  "oauth": [{ "provider": "openai-codex", "status": "ok" }]
}
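
Note that in this output the default model's provider (codex) is listed under missing, while the only OAuth entry belongs to openai-codex. A minimal Python sketch of that cross-check follows. It assumes "missing" lists providers without usable auth, which is my reading of the output rather than documented behavior, and `auth_gaps` is a hypothetical helper.

```python
def provider_of(model_ref: str) -> str:
    """Return the provider part of a 'provider/model' reference."""
    return model_ref.split("/", 1)[0]


def auth_gaps(status: dict) -> list[str]:
    """List mismatches between the resolved default model and reported auth.

    Assumption: 'missing' names providers that lack usable auth.
    """
    provider = provider_of(status["resolvedDefault"])
    ok_oauth = {
        entry["provider"]
        for entry in status.get("oauth", [])
        if entry.get("status") == "ok"
    }
    gaps = []
    if provider in status.get("missing", []):
        gaps.append(f"provider {provider!r} is listed under 'missing'")
    if provider not in ok_oauth:
        gaps.append(f"no OAuth entry with status 'ok' for provider {provider!r}")
    return gaps


# Status reported on 5.19 with the existing route (Failure Mode 1).
status_old_route = {
    "resolvedDefault": "openai-codex/gpt-5.5",
    "missing": [],
    "oauth": [{"provider": "openai-codex", "status": "ok"}],
}
# Status reported on 5.19 with the new route (Failure Mode 2).
status_new_route = {
    "resolvedDefault": "codex/gpt-5.5",
    "missing": ["codex"],
    "oauth": [{"provider": "openai-codex", "status": "ok"}],
}

print(auth_gaps(status_old_route))
print(auth_gaps(status_new_route))
```

The old route shows no gaps in the status output and still fails at request time, so a clean models status alone is not evidence that auth will bind on 5.19.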

A live agent call then reached the transport layer but failed with a Cloudflare HTML challenge:

[openai-transport] [responses] error provider=codex api=openai-codex-responses model=gpt-5.5 name=Error status=403
message=403 <html> ...

The preserved HTML contained a Cloudflare challenge page for chatgpt.com and /backend-api/responses.

The gateway eventually surfaced:

Authentication failed with an HTML 403 response from the provider. Re-authenticate and verify your provider account access.

However, re-auth does not look like the root cause here because the same OAuth profile works again after rollback to 5.12.
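
Telling an HTML challenge page apart from a genuine API auth error would make the surfaced message less misleading. The following Python sketch shows one possible heuristic. The function name, the category labels, and the marker strings are my own assumptions, not OpenClaw's transport logic.

```python
import json


def classify_403(content_type: str, body: str) -> str:
    """Roughly classify the body of an HTTP 403 response.

    Returns 'html-challenge', 'api-error', or 'unknown'.
    Heuristic only: it inspects the Content-Type header and the first
    256 characters of the body for HTML markers.
    """
    head = body.lstrip()[:256].lower()
    if (
        "text/html" in content_type.lower()
        or "<html" in head
        or "<!doctype html" in head
    ):
        # An HTML page from an API endpoint suggests an edge/proxy block,
        # not a credential rejected by the API itself.
        return "html-challenge"
    try:
        json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return "unknown"
    return "api-error"


print(classify_403("text/html; charset=UTF-8", "<html> ..."))
print(classify_403("application/json", '{"error": "forbidden"}'))
```

With a classification like this, an HTML 403 could be reported as a network or edge block rather than prompting re-authentication.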

Attempted Fix from #62142

Issue #62142 was closed after the reporter found that openai-codex had been mapped to the wrong transport (anthropic-messages) and fixed it by setting both provider-level and model-level API values to openai-codex-responses.

I tried the same workaround explicitly on 5.19:

{
  "models": {
    "mode": "merge",
    "providers": {
      "openai-codex": {
        "api": "openai-codex-responses",
        "models": [
          {
            "id": "gpt-5.5",
            "name": "gpt-5.5",
            "contextWindow": 272000,
            "maxTokens": 128000,
            "input": ["text", "image"],
            "api": "openai-codex-responses"
          },
          {
            "id": "gpt-5.4",
            "name": "gpt-5.4",
            "contextWindow": 272000,
            "maxTokens": 128000,
            "input": ["text", "image"],
            "api": "openai-codex-responses"
          }
        ]
      }
    }
  }
}

Then I switched the default back to openai-codex/gpt-5.5 and restarted the gateway.

Result: this did not fix 5.19 in this environment. The request again failed before inference with:

No API key found for provider "openai-codex".

So #62142's transport mapping fix is not sufficient for this 5.19 headless VPS case.

Rollback Result

I rolled back to the known-good 5.12 state:

npm install -g openclaw@2026.5.12 --no-audit --no-fund
npm install --prefix "$HOME/.openclaw/npm" openclaw@2026.5.12 @openclaw/codex@2026.5.12 --no-audit --no-fund --omit=dev
systemctl --user restart openclaw-gateway.service

Final state:

OpenClaw 2026.5.12 (f066dd2)
openclaw@2026.5.12
@openclaw/codex@2026.5.12
@openai/codex@0.131.0
systemd service: active

The same live agent test succeeded again:

{
  "status": "ok",
  "payloads": [{ "text": "OK" }],
  "executionTrace": {
    "winnerProvider": "openai-codex",
    "winnerModel": "gpt-5.5",
    "fallbackUsed": false,
    "runner": "embedded"
  }
}

Expected Behavior

After upgrading from a working 2026.5.12 Codex setup to 2026.5.19, one of the supported Codex routes should continue to work without requiring fallback to a different provider:

  • either existing openai-codex/gpt-5.5 should continue resolving the existing OAuth profile, or
  • the new codex/gpt-5.5 route should bind to the existing OAuth profile and avoid Cloudflare 403 on headless VPS requests.

Actual Behavior

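On 5.19:

  • openai-codex/gpt-5.5 fails before inference with No API key found for provider "openai-codex", even though models status reports the OAuth profile as ok.
  • codex/gpt-5.5 gets past auth resolution but fails with a Cloudflare HTML 403 from chatgpt.com/backend-api/responses.
  • Rolling back to 5.12 restores openai-codex/gpt-5.5 with the same OAuth profile.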

Notes

  • The host is a headless VPS, which may be relevant to the Cloudflare response.
  • The brave plugin warning was present before and does not appear related.
  • No secrets are included here; paths and account details have been sanitized.
  • I intentionally kept fallbacks: [] during verification so that success could not be masked by another provider.
