[Bug] Docs and onboarding silently route users to the PI path instead of the Codex app-server runtime, causing chatgpt.com 403 in Cloudflare-sensitive environments

# Draft: New issue to file against openclaw/openclaw

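> Purpose: file as a standalone issue. The subject is that docs/onboarding steer users onto a path that does not work.
> Broader in scope than #67670 (not limited to the Cloudflare 403 symptom).
>
> Copy the text between the `---` markers as the issue body. The title goes at the very top.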

**Title**: `[Bug] Docs and onboarding silently route users to the PI path instead of the Codex app-server runtime, causing chatgpt.com 403 in Cloudflare-sensitive environments`

---

## Summary

Following the canonical onboarding instructions in
`docs/providers/openai.md` and `docs/plugins/codex-harness.md` (as of
2026.5.12) lands the user on the **PI runtime path** — OpenClaw's
internal Node `fetch` (undici) directly calling
`chatgpt.com/backend-api`. This path fails with HTTP 403
`cf-mitigated: challenge` from any egress IP that hasn't been
whitelisted by OpenAI's Cloudflare config, which in practice means
every user in mainland China and many users behind commodity
VPN/proxy egress nodes.

There is a working alternative path (**Codex app-server runtime**:
`codex/gpt-*` with `agentRuntime.id="codex"`, plus the `@openclaw/codex`
plugin) that bypasses the issue entirely; OpenAI evidently
whitelists the official codex CLI's TLS profile. Unfortunately,
nothing in the docs or in the `openclaw models auth login` /
onboarding flow guides users toward this path. The user typically
discovers it only after several hours of debugging and reading
source code.

This is a docs↔implementation drift / onboarding completeness bug,
distinct from #67670 (which proposes adding TLS-fingerprint
emulation to the PI path).

## Reproduction (no Cloudflare workaround required)

Tested on OpenClaw 2026.5.12, Linux x86_64, Node 25, behind a
mihomo HTTP proxy whose egress IP is in Singapore (AWS).

1. Fresh install, follow `docs/providers/openai.md` Step 2 / 3:
   ```bash
   openclaw models auth login --provider openai-codex --device-code
   # (device-code flag is documented but not implemented in CLI yet,
   #  see drift item A below; if user falls back to plain login,
   #  the OAuth flow completes successfully)
   openclaw config set agents.defaults.model.primary openai-codex/gpt-5.5
   openclaw gateway restart
   ```
2. Send a turn:
   ```bash
   openclaw agent --agent main --message "Reply PONG" --json
   ```

Expected (per docs Step 3 "OpenAI agent turns select the native
Codex app-server runtime automatically"):
- `agentHarnessId: "codex"`
- Real codex app-server subprocess visible in `ps`
- 200 OK from chatgpt.com

Actual:
- `agentHarnessId: "pi"`
- No codex subprocess (`ps -ef | grep codex` empty during the turn)
- `[openai-transport] [responses] error provider=codex
  api=openai-codex-responses model=gpt-5.5 status=403 message=403
  <html>...cf-mitigated: challenge`
- Turn falls through to whatever fallback chain is configured
  (often another provider that happens to succeed by accident,
  masking the failure)
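
The two signals in the Actual list can be checked mechanically. The sketch below assumes the `--json` output contains a literal `"agentHarnessId": "<value>"` pair on one line (the exact JSON shape is not confirmed here) and that the transport log line has been captured as text. `harness_id` and `is_cf_challenge` are hypothetical helper names, not OpenClaw commands.

```bash
# Hypothetical helpers; neither is part of the openclaw CLI.

# Print the first agentHarnessId value found in JSON text on stdin.
# Assumes key and value sit on one line; sed is used to avoid a jq dependency.
harness_id() {
  sed -n 's/.*"agentHarnessId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed (exit 0) only if the text on stdin carries both the 403 status
# and the Cloudflare challenge marker seen in the log line above.
is_cf_challenge() {
  _log="$(cat)"
  printf '%s\n' "$_log" | grep -q 'status=403' &&
    printf '%s\n' "$_log" | grep -q 'cf-mitigated: challenge'
}
```

Under those assumptions, piping the repro command's output through `harness_id` should print `pi` on the failing path and `codex` on the working one, which makes it harder for a silent fallback to another provider to hide the failure.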

## Diagnosis

The `openai-codex/*` namespace exposed by the bundled `openai`
plugin (`buildOpenAICodexProviderPlugin` →
`https://chatgpt.com/backend-api/codex`) is a **PI-direct provider**
implemented in OpenClaw's own Node `fetch` stack. It has the same
authentication source (the `openai-codex:*` profile in
`auth-profiles.json`) as the Codex app-server path, but a
completely separate transport. The `openclaw models auth login
--provider openai-codex` command registers the OAuth profile but
does **not**:

- install `@openclaw/codex`
- mirror credentials to a per-agent
  `~/.openclaw/agents/<id>/agent/codex-home/auth.json`
- update `agents.defaults.model` to a value that triggers the
  Codex app-server runtime

So the post-login default state is "OAuth in place, but the only
ready provider is `openai-codex/*` which is PI-direct".
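
One way to see this post-login state is to look for the on-disk artifacts named in this issue. This is a minimal sketch, assuming the state directory is `~/.openclaw` and the agent id is `main` as in the repro. `codex_prereqs` is a hypothetical helper. It inspects only the two paths quoted in this issue and says nothing about the `agents.defaults.model` setting.

```bash
# Hypothetical helper: report which Codex app-server prerequisites exist
# under an OpenClaw state directory (assumed to be ~/.openclaw).
codex_prereqs() {
  _state_dir="$1"   # e.g. "$HOME/.openclaw"
  _agent_id="$2"    # e.g. "main"
  _plugin_dir="$_state_dir/npm/node_modules/@openclaw/codex"
  _auth_file="$_state_dir/agents/$_agent_id/agent/codex-home/auth.json"

  if [ -d "$_plugin_dir" ]; then
    echo "plugin: present"
  else
    echo "plugin: missing"
  fi

  if [ -f "$_auth_file" ]; then
    echo "codex-home auth: present"
  else
    echo "codex-home auth: missing"
  fi
}
```

Running `codex_prereqs "$HOME/.openclaw" main` right after `models auth login` would be expected to report both items as missing, per the diagnosis above.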

## Documentation ↔ implementation drift checklist

Cataloguing what I tripped over while diagnosing. Some are docs
errors, some are missing implementation pieces:

| # | Drift | Evidence |
|---|---|---|
| A | `openclaw models auth login --provider openai-codex --device-code` documented but `--device-code` not recognized by the 2026.5.12 CLI | `Error: openclaw does not recognize option "--device-code"` |
| B | `config set agents.defaults.model.primary openai/gpt-5.5` rejected with `Model override "X" is not allowed for agent "Y"` until the user separately writes `agents.defaults.models["openai/gpt-5.5"] = {}`. Not documented anywhere | Reproducible on any fresh install |
| C | `docs/plugins/codex-harness.md:170` claims `agentRuntime.id: "codex"` is *optional* "for normal OpenAI auto mode", implying the system picks Codex automatically. In practice the auto-mode picker always selects PI unless `agentRuntime.id` is set explicitly | See actual log signature in repro above |
| D | Two parallel auth stores exist (`auth-profiles.json` for PI provider + per-agent `codex-home/auth.json` for codex app-server). Docs don't explain which one each runtime path reads | Has to be discovered from source: `extensions/codex/src/app-server/transport-stdio.ts` (CODEX_HOME=per-agent) vs `extensions/openai/src/codex-provider.ts` (auth-profiles.json) |
| E | `openclaw models list` shows `codex/gpt-*` (from `@openclaw/codex` plugin) but not `openai/gpt-*`, while docs prescribe `openai/gpt-*`. No documented alias mechanism | The user can only set `openai/gpt-*` as primary if they manually add it to `agents.defaults.models`, but then no provider serves it |
| F | `docs/providers/openai.md` Step 3: "OpenClaw installs or repairs the bundled Codex plugin when this route is chosen". No such auto-install happens during `models auth login` | `~/.openclaw/npm/node_modules/@openclaw/codex` remains absent after `models auth login --provider openai-codex` |
| G | Schema rejects `contextWindow` / `input` keys under `agents.defaults.models[X]`, so users cannot correct stale model metadata registered by a plugin (e.g. the codex plugin currently registers `codex/gpt-5.5` with `contextWindow=200000`, whereas OpenAI's actual gpt-5.5 limit is 272k) | `config validate` error: `Unrecognized keys: "contextWindow", "input"` |
| H | `openclaw models auth login --provider openai-codex` doesn't `chmod 600` the per-agent `auth.json` it would write (if it wrote one); users who mirror credentials manually have to remember to do it | Minor, but a hardening gap |

Items A, C, D, E, and F together explain why even a careful user who reads
the docs end-to-end still ends up on the PI path.
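
For Drift H, users who mirror credentials by hand can at least verify the file mode afterwards. This is a small sketch using only POSIX `find`. `is_mode_600` is a hypothetical helper name, and the path in the usage note assumes the `main` agent from the repro.

```bash
# Hypothetical helper: succeed only if the path is a regular file whose
# permission bits are exactly 600. `find -perm 600` is an exact-mode match.
is_mode_600() {
  [ -f "$1" ] && [ -n "$(find "$1" -prune -perm 600 -print)" ]
}
```

For example, `is_mode_600 ~/.openclaw/agents/main/agent/codex-home/auth.json || echo "tighten permissions"` flags a credentials file that is readable by group or others.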

## Why this matters now

The PI path **happens to work** when the user's egress IP is in
OpenAI's accepted set: ranges with many ChatGPT accounts (residential
ISPs in the US/EU) generally pass. The path fails for:

- **Any user in mainland China** routing through any commercial
  proxy (the egress IPs are routinely challenged; #67670 is one of
  several reports)
- **Cloud-hosted gateways** (e.g. AWS / GCP / DigitalOcean egress
  is often challenged)
- **Any user whose proxy IP rotates** into a Cloudflare-blocked
  range later (silently breaks production)

The `codex/* + agentRuntime.id="codex"` path is robust against all
of these because it uses the codex CLI's whitelisted client. Users
deserve to land on it by default.

## Proposed fixes (high-level, open to direction)

In rough order of leverage:

1. **Consolidate `openclaw models auth login --provider openai-codex`
   into a one-shot Codex onboarding command** that, in addition to
   the current OAuth profile write:
   - Installs `@openclaw/codex` if not already present (matches the
     docs Step 3 promise).
   - Mirrors the OAuth credentials to
     `~/.openclaw/agents/<default-agent>/agent/codex-home/auth.json`
     (chmod 600).
   - In interactive mode, offers to set
     `agents.defaults.model.primary = codex/gpt-5.5` with
     `agentRuntime.id: "codex"` in the allow-list; in non-interactive
     mode, prints the suggested setting instead.
   - Prints a one-line verification: "Run `openclaw agent ... PONG`
     and look for `agentHarnessId: codex`".
2. **Implement the device-code flag (Drift A)** in `openclaw models
   auth login` — it's already in docs.
3. **Make `auto mode` actually pick Codex when it's available** (Drift
   C), or update the docs to say "auto mode does *not* automatically
   choose codex runtime; explicit `agentRuntime.id: "codex"` is
   required".
4. **Improve the `Model override "X" is not allowed for agent "Y"`
   error message** to point at the `agents.defaults.models`
   allow-list (Drift B).
5. **Deprecate `openai-codex/*` as a directly-configurable provider
   ref** (or at least emit a warning at gateway startup), since it's
   the PI path, which most users don't want.

I'm happy to file a PR for Fix #1 / #4: I have a working
reproduction and the relevant source code annotated. For the other
items I'll wait for maintainer direction.

## Related

- #67670 — same 403 symptom but proposes TLS-fingerprint
  emulation as the fix. With this issue's path fix, no fingerprint
  emulation is needed.
- #67717 — narrow 403 error classification fix. Helpful but does
  not change the underlying path selection.
- #22144 — `cf_clearance` cookie loss after `codex login` in
  app-server mode (separate; cookie store sharing between processes)

## Environment

- OpenClaw 2026.5.12 (`f066dd2`)
- Linux 6.8.0 x86_64, Node 25.2.0
- Codex CLI 0.130.0 (via `@openclaw/codex` plugin)
- Egress: mihomo HTTP proxy → AWS Singapore


[Bug] Docs and onboarding silently route users to the PI path instead of the Codex app-server runtime, causing chatgpt.com 403 in Cloudflare-sensitive environments #82978
