Add Codex (ChatGPT Pro/Plus) as inference provider by wakamex · Pull Request #45 · NousResearch/hermes-agent

wakamex · 2026-02-26T02:36:03Z

Honestly, after trying this out with Codex, it does a really bad job of tool calls, so I probably won't be using it. But I thought I'd share 😶

Added "codex" as a provider option (hermes chat --provider codex or hermes model -> Codex)
Reuses credentials from ~/.codex/auth.json (written by codex login) -- no extra setup needed
Custom httpx transport translates between chat completions format (used by hermes) and responses API format (required by Codex endpoint)
OAuth token refresh handled automatically (401 -> refresh -> retry)
20 tests covering auth, transport, and model resolution

Implementation

The Codex endpoint at chatgpt.com/backend-api/codex/responses only speaks the responses API with stream=true, store=false. The OpenAI SDK sends chat completions format. CodexTransport (an httpx.BaseTransport) sits between the SDK and the network:

Intercepts outgoing chat completions requests
Converts messages to responses API input items (system -> instructions, tool_calls -> function_call items, tool results -> function_call_output items)
Rewrites URL to the Codex endpoint with clean headers (SDK headers confuse Cloudflare)
Consumes the SSE stream and reassembles a chat completions JSON response

The agent loop still calls client.chat.completions.create() and gets back a normal ChatCompletion.

Design decisions

Transport layer rather than native responses API: avoids forking the entire agent loop, keeps the diff contained to new files
Store transport instance (not httpx.Client) on the agent: transports survive client.close(), so interrupt recovery just wraps a fresh Client around the same transport
Model list and default read from ~/.codex/models_cache.json and ~/.codex/config.toml written by the Codex CLI

Limitations and brittleness

The transport hand-rolls chat completions <-> responses API translation. By comparison, opencode uses the Vercel AI SDK's @ai-sdk/openai provider which handles this natively with full coverage of edge cases (streaming lifecycle, error mapping, content part types, rate limit headers, etc). Our translation covers the core path but may break on less common response shapes.
Codex models are weaker at tool use than comparable models via OpenRouter (observed during testing, not a code issue).
Transport consumes the full SSE stream before returning (no streaming to the user) -- acceptable since hermes already buffers responses.
No Codex-specific error mapping (rate limits, content policy) -- errors pass through as-is.
Auxiliary clients (context compression, web extraction, vision) don't route through Codex -- they use their own resolution chain (OpenRouter -> Nous -> custom). When codex is the only provider, these features degrade gracefully to disabled.

Possible follow-ups

Register codex as a proper entry in PROVIDER_REGISTRY instead of the current special-case wiring in cli.py and auth.py, so it participates in hermes login, hermes status, etc. like other providers
Port translation logic to use the openai SDK's native responses API client once hermes supports it, removing the need for a custom transport entirely
Map Codex-specific error codes to user-friendly messages
Add streaming pass-through if hermes adds streaming support
Support codex login directly from hermes setup instead of requiring the Codex CLI

Tested

python -m pytest tests/test_codex_auth.py tests/test_codex_transport.py -v (20 tests pass)
Manual: hermes chat --provider codex with valid ~/.codex credentials
Manual: multi-turn conversation with tool calls
Manual: subagent delegation with codex provider

🤖 Generated with Claude Code

Add support for using OpenAI Codex models via ~/.codex/auth.json credentials (both API key and ChatGPT OAuth modes). Architecture: uses a custom httpx transport (inspired by opencode) that transparently rewrites URLs and injects auth headers, so the agent's conversation loop has zero codex-specific branching. New files: - agent/codex_auth.py: credential reading, JWT parsing, token refresh - agent/codex_models.py: model resolution from ~/.codex config/cache - agent/codex_transport.py: httpx transport for URL rewriting - tests/test_codex_auth.py: 17 tests covering auth + model resolution Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

_has_any_provider_configured() was missing the codex credential check, causing the setup prompt to appear even when ~/.codex/auth.json exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The Codex endpoint requires responses API format (input, instructions, stream=true, store=false) but hermes uses chat.completions.create(). The transport now converts the request body and reassembles the SSE stream response back into a chat completions JSON response. Also builds clean headers to avoid Cloudflare 403s from stale Host and SDK-specific headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The responses API uses a different message format than chat completions: - assistant tool_calls → separate function_call items with call_id - tool results → function_call_output items - No finish_reason, reasoning, or other hermes-specific fields Verified against the Vercel AI SDK's conversion logic used by opencode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Set CODEX_DEBUG=1 to see each request to the Codex endpoint printed to stderr with model, token counts, and latency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two bugs from code review: 1. Multiple parallel tool calls got wrong arguments assigned because the done handler iterated tool_calls without matching by item_id. Fixed with an explicit item_id → index map. 2. Codex api_key mode read OPENAI_BASE_URL, which could route Codex credentials to a user's custom endpoint. Hardcoded to api.openai.com. Added tests for multi-tool-call and text-only SSE reconstruction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Move TestResponsesToCompletion to tests/test_codex_transport.py for discoverability. Add TestMessageConversion for format translation. - Fix httpx.Client lifecycle: on interrupt, the old transport was closed but the rebuild reused stale _client_kwargs. Now rebuilds a fresh transport from current credentials. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The api_key mode is redundant — hermes already manages OpenAI API keys natively. Only the chatgpt (OAuth) mode is codex-specific. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Child AIAgent instances were created without use_codex_auth, so they hit api.openai.com with no API key instead of routing through the CodexTransport. Forward the flag from the parent agent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

teknium1 · 2026-02-28T07:39:20Z

Closing as #43 is more comprehensive to start from for me

BUG-1 (medium): _validate_profile_name() used re.match() with a $ anchor. re.match() with $ is truthy for 'name\n' because match() allows trailing content after the $ in multiline mode. Changed to re.fullmatch() which requires the entire string to match — trailing newlines now correctly rejected. BUG-2 (medium/defense-in-depth): create_profile_api() validated 'name' via _validate_profile_name() but passed clone_from directly to hermes_cli and _create_profile_fallback() without validation. Added clone_from validation inside create_profile_api() (skipping 'default' which is a valid clone source). routes.py already validates it at the HTTP layer; this adds API-layer defense. BUG-3 (low): When hermes_cli is not importable (the exact Docker case this PR targets), list_profiles_api() also returns only the stub default dict and can't find the newly created profile by name. The fallback return was a 2-key dict {name, path} — incomplete vs the 9-key schema everywhere else. Expanded to the full profile dict with all fields so API clients get consistent data regardless of hermes_cli availability. OBS-4 (low/TOCTOU): _create_profile_fallback() checked profile_dir.exists() then called mkdir(exist_ok=True). If a concurrent request created the dir between those two calls, mkdir silently succeeded — defeating the FileExistsError guard. Changed to mkdir(exist_ok=False) so the OS raises FileExistsError atomically if the dir appears in the race window. Tests: 423 passed, 0 failed.

…-docker-fallback fix: profile creation fallback for Docker (NousResearch#44)

alfred-consciousrepo · 2026-05-18T19:14:01Z

Hierarchy updated: #45 is now a direct sub-issue under #42 (4848 and 4852). #44 closed as requested.

Prep started on appraisal scheduling with WCCU. Insurance reimbursement strategic document draft created (see alfred-memory/thoughts/insurance-reimbursement-strategic-draft.md). Coordination draft for Olga on IGS MO LLC bank account and 743 N Euclid offer also prepared for review.

Next: user to review drafts and provide approval or details before any external outreach.

wakamex and others added 10 commits February 25, 2026 21:09

fix: check codex credentials in provider detection

2f7679e

_has_any_provider_configured() was missing the codex credential check, causing the setup prompt to appear even when ~/.codex/auth.json exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add CODEX_DEBUG env var for live request tracing

e960052

Set CODEX_DEBUG=1 to see each request to the Codex endpoint printed to stderr with model, token counts, and latency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: remove CODEX_DEBUG tracing

38626c9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: remove api_key auth mode from codex provider

2342337

The api_key mode is redundant — hermes already manages OpenAI API keys natively. Only the chatgpt (OAuth) mode is codex-specific. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

wakamex mentioned this pull request Feb 26, 2026

Enable ChatGPT subscription Codex support end-to-end #43

Merged

teknium1 closed this Feb 28, 2026

sudo-yf pushed a commit to sudo-yf/hermes-agent that referenced this pull request Apr 5, 2026

Merge pull request NousResearch#45 from nesquena/fix/profile-creation…

c488031

…-docker-fallback fix: profile creation fallback for Docker (NousResearch#44)

zhangtobybot-a11y mentioned this pull request Apr 11, 2026

Rl capabilities && File Operator Tools #15

Merged

iiai-lab mentioned this pull request May 8, 2026

feat(api-server): resumable chat-completion stream with grace window iiai-lab/hermes-agent#6

Open

4 tasks

ixhxpns mentioned this pull request May 13, 2026

Harden GitHub bounty cron scouting #24480

Open

OmarB97 mentioned this pull request May 31, 2026

fix(agent): bound local dflash stream hangs #35701

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Codex (ChatGPT Pro/Plus) as inference provider#45

Add Codex (ChatGPT Pro/Plus) as inference provider#45
wakamex wants to merge 10 commits into
NousResearch:mainfrom
wakamex:feat/codex-provider

wakamex commented Feb 26, 2026 •

edited

Loading

Uh oh!

teknium1 commented Feb 28, 2026

Uh oh!

alfred-consciousrepo commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

wakamex commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Implementation

Design decisions

Limitations and brittleness

Possible follow-ups

Tested

Uh oh!

teknium1 commented Feb 28, 2026

Uh oh!

alfred-consciousrepo commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wakamex commented Feb 26, 2026 •

edited

Loading