Skip to content

Add Codex (ChatGPT Pro/Plus) as inference provider#45

Closed
wakamex wants to merge 10 commits into
NousResearch:mainfrom
wakamex:feat/codex-provider
Closed

Add Codex (ChatGPT Pro/Plus) as inference provider#45
wakamex wants to merge 10 commits into
NousResearch:mainfrom
wakamex:feat/codex-provider

Conversation

@wakamex

@wakamex wakamex commented Feb 26, 2026

Copy link
Copy Markdown

Honestly, after trying this out with Codex, it does a really bad job of tool calls, so I probably won't be using it. But I thought I'd share 😶

  • Added "codex" as a provider option (hermes chat --provider codex or hermes model -> Codex)
  • Reuses credentials from ~/.codex/auth.json (written by codex login) -- no extra setup needed
  • Custom httpx transport translates between chat completions format (used by hermes) and responses API format (required by Codex endpoint)
  • OAuth token refresh handled automatically (401 -> refresh -> retry)
  • 20 tests covering auth, transport, and model resolution

Implementation

The Codex endpoint at chatgpt.com/backend-api/codex/responses only speaks the responses API with stream=true, store=false. The OpenAI SDK sends chat completions format. CodexTransport (an httpx.BaseTransport) sits between the SDK and the network:

  1. Intercepts outgoing chat completions requests
  2. Converts messages to responses API input items (system -> instructions, tool_calls -> function_call items, tool results -> function_call_output items)
  3. Rewrites URL to the Codex endpoint with clean headers (SDK headers confuse Cloudflare)
  4. Consumes the SSE stream and reassembles a chat completions JSON response

The agent loop still calls client.chat.completions.create() and gets back a normal ChatCompletion.

Design decisions

  • Transport layer rather than native responses API: avoids forking the entire agent loop, keeps the diff contained to new files
  • Store transport instance (not httpx.Client) on the agent: transports survive client.close(), so interrupt recovery just wraps a fresh Client around the same transport
  • Model list and default read from ~/.codex/models_cache.json and ~/.codex/config.toml written by the Codex CLI

Limitations and brittleness

  • The transport hand-rolls chat completions <-> responses API translation. By comparison, opencode uses the Vercel AI SDK's @ai-sdk/openai provider which handles this natively with full coverage of edge cases (streaming lifecycle, error mapping, content part types, rate limit headers, etc). Our translation covers the core path but may break on less common response shapes.
  • Codex models are weaker at tool use than comparable models via OpenRouter (observed during testing, not a code issue).
  • Transport consumes the full SSE stream before returning (no streaming to the user) -- acceptable since hermes already buffers responses.
  • No Codex-specific error mapping (rate limits, content policy) -- errors pass through as-is.
  • Auxiliary clients (context compression, web extraction, vision) don't route through Codex -- they use their own resolution chain (OpenRouter -> Nous -> custom). When codex is the only provider, these features degrade gracefully to disabled.

Possible follow-ups

  • Register codex as a proper entry in PROVIDER_REGISTRY instead of the current special-case wiring in cli.py and auth.py, so it participates in hermes login, hermes status, etc. like other providers
  • Port translation logic to use the openai SDK's native responses API client once hermes supports it, removing the need for a custom transport entirely
  • Map Codex-specific error codes to user-friendly messages
  • Add streaming pass-through if hermes adds streaming support
  • Support codex login directly from hermes setup instead of requiring the Codex CLI

Tested

  • python -m pytest tests/test_codex_auth.py tests/test_codex_transport.py -v (20 tests pass)
  • Manual: hermes chat --provider codex with valid ~/.codex credentials
  • Manual: multi-turn conversation with tool calls
  • Manual: subagent delegation with codex provider

🤖 Generated with Claude Code

wakamex and others added 10 commits February 25, 2026 21:09
Add support for using OpenAI Codex models via ~/.codex/auth.json
credentials (both API key and ChatGPT OAuth modes).

Architecture: uses a custom httpx transport (inspired by opencode) that
transparently rewrites URLs and injects auth headers, so the agent's
conversation loop has zero codex-specific branching.

New files:
- agent/codex_auth.py: credential reading, JWT parsing, token refresh
- agent/codex_models.py: model resolution from ~/.codex config/cache
- agent/codex_transport.py: httpx transport for URL rewriting
- tests/test_codex_auth.py: 17 tests covering auth + model resolution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_has_any_provider_configured() was missing the codex credential check,
causing the setup prompt to appear even when ~/.codex/auth.json exists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Codex endpoint requires responses API format (input, instructions,
stream=true, store=false) but hermes uses chat.completions.create().
The transport now converts the request body and reassembles the SSE
stream response back into a chat completions JSON response.

Also builds clean headers to avoid Cloudflare 403s from stale Host
and SDK-specific headers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The responses API uses a different message format than chat completions:
- assistant tool_calls → separate function_call items with call_id
- tool results → function_call_output items
- No finish_reason, reasoning, or other hermes-specific fields

Verified against the Vercel AI SDK's conversion logic used by opencode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set CODEX_DEBUG=1 to see each request to the Codex endpoint printed
to stderr with model, token counts, and latency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs from code review:

1. Multiple parallel tool calls got wrong arguments assigned because
   the done handler iterated tool_calls without matching by item_id.
   Fixed with an explicit item_id → index map.

2. Codex api_key mode read OPENAI_BASE_URL, which could route Codex
   credentials to a user's custom endpoint. Hardcoded to api.openai.com.

Added tests for multi-tool-call and text-only SSE reconstruction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move TestResponsesToCompletion to tests/test_codex_transport.py for
  discoverability. Add TestMessageConversion for format translation.
- Fix httpx.Client lifecycle: on interrupt, the old transport was closed
  but the rebuild reused stale _client_kwargs. Now rebuilds a fresh
  transport from current credentials.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The api_key mode is redundant — hermes already manages OpenAI API keys
natively. Only the chatgpt (OAuth) mode is codex-specific.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Child AIAgent instances were created without use_codex_auth, so they
hit api.openai.com with no API key instead of routing through the
CodexTransport. Forward the flag from the parent agent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@teknium1

Copy link
Copy Markdown
Contributor

Closing as #43 is more comprehensive to start from for me

sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026
BUG-1 (medium): _validate_profile_name() used re.match() with a $ anchor.
re.match() with $ is truthy for 'name\n' because match() allows trailing
content after the $ in multiline mode. Changed to re.fullmatch() which
requires the entire string to match — trailing newlines now correctly rejected.

BUG-2 (medium/defense-in-depth): create_profile_api() validated 'name' via
_validate_profile_name() but passed clone_from directly to hermes_cli and
_create_profile_fallback() without validation. Added clone_from validation
inside create_profile_api() (skipping 'default' which is a valid clone source).
routes.py already validates it at the HTTP layer; this adds API-layer defense.

BUG-3 (low): When hermes_cli is not importable (the exact Docker case this PR
targets), list_profiles_api() also returns only the stub default dict and
can't find the newly created profile by name. The fallback return was a
2-key dict {name, path} — incomplete vs the 9-key schema everywhere else.
Expanded to the full profile dict with all fields so API clients get
consistent data regardless of hermes_cli availability.

OBS-4 (low/TOCTOU): _create_profile_fallback() checked profile_dir.exists()
then called mkdir(exist_ok=True). If a concurrent request created the dir
between those two calls, mkdir silently succeeded — defeating the
FileExistsError guard. Changed to mkdir(exist_ok=False) so the OS raises
FileExistsError atomically if the dir appears in the race window.

Tests: 423 passed, 0 failed.
sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026
…-docker-fallback

fix: profile creation fallback for Docker (NousResearch#44)
@alfred-consciousrepo

Copy link
Copy Markdown

Hierarchy updated: #45 is now a direct sub-issue under #42 (4848 and 4852). #44 closed as requested.

Prep started on appraisal scheduling with WCCU. Insurance reimbursement strategic document draft created (see alfred-memory/thoughts/insurance-reimbursement-strategic-draft.md). Coordination draft for Olga on IGS MO LLC bank account and 743 N Euclid offer also prepared for review.

Next: user to review drafts and provide approval or details before any external outreach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants