
openai-codex/gpt-5.5 still unstable in Hermes v0.14.0: subagents almost always hit APIConnectionError/TTFB timeout while Codex CLI works #33075

@yangguangjin


Summary

openai-codex / gpt-5.5 is still highly unstable in Hermes Agent on the current v0.14.0 release, while the official Codex CLI remains usable on the same Windows machine, same network, and same ChatGPT/Codex login family.

This is a fresh late-May reproduction, not just a follow-up to the old April thread. The older report (#13834) has been open since April and the problem still severely affects real usage near June.

The failure is especially severe with Hermes subagents/delegation:

  • Main Hermes agent: roughly 50% chance of hitting this failure family in normal usage.
  • Hermes subagents: nearly 100% chance when several subagents are running concurrently.
  • Official Codex CLI on the same host/network/login remains usable for normal interaction.

This makes Hermes delegation almost unusable with openai-codex / gpt-5.5.

Environment

  • OS: Windows 10, Git Bash/MSYS terminal
    • MINGW64_NT-10.0-26200
  • Hermes Agent: v0.14.0 (2026.5.16)
  • Hermes project path shown by version output: C:\Users\yangg\AppData\Local\hermes\hermes-agent
  • Python shown by Hermes: 3.11.13
  • OpenAI SDK shown by Hermes: 2.24.0
  • Official Codex CLI: codex-cli 0.134.0
  • GitHub CLI: gh version 2.92.0

Sanitized Hermes auth status:

openai-codex (1 credentials):
  #1  openai-codex-oauth-1 oauth   device_code ←

Sanitized relevant Hermes config:

model:
  default: gpt-5.5
  provider: openai-codex
  base_url: https://chatgpt.com/backend-api/codex
  context_length: 1000000

delegation:
  provider: openai-codex
  model: gpt-5.5
  context_length: 1000000
  api_mode: codex_responses
  max_iterations: 50
  child_timeout_seconds: 3600
  max_concurrent_children: 10
  max_spawn_depth: 1
  orchestrator_enabled: true

agent:
  api_max_retries: 10
  reasoning_effort: xhigh

Expected behavior

If official Codex CLI can complete normal prompts on the same machine/network/account family, Hermes openai-codex should also be able to complete normal main-agent and subagent requests reliably, or at least fail in a structured way that does not make delegation unusable.

In particular:

  • Subagents should not almost always fail before producing a response.
  • The retry loop should not amplify connection/TTFB failures into repeated stalls and eventual 429s.
  • Hermes' Codex transport should behave closely enough to the official Codex CLI that the same host/network/login does not show a massive reliability gap.

Actual behavior

Hermes frequently fails against:

https://chatgpt.com/backend-api/codex

Observed failure modes in the same run:

  • APIConnectionError after ~16-22 seconds.
  • No first byte from provider in 45s / TTFB watchdog timeout.
  • TimeoutError: Codex stream produced no bytes within 45s.
  • HTTP 429 rate limit after many concurrent retries.

The 429 looks like a secondary amplification effect: many subagent requests fail or stall, retry concurrently, and then hit rate limiting. The primary reliability problem appears to occur before any 429: Codex requests frequently fail or stall before the first byte arrives.

Fresh sanitized log sample

[subagent-0] API call failed (attempt 1/10): APIConnectionError
[subagent-0] Provider: openai-codex  Model: gpt-5.5
[subagent-0] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-0] Error: Connection error.
[subagent-0] Elapsed: 16.24s  Context: 10 msgs, ~8,607 tokens
[subagent-0] Retrying in 2.5s (attempt 1/10)...

[subagent-3] API call failed (attempt 1/10): APIConnectionError
[subagent-3] Provider: openai-codex  Model: gpt-5.5
[subagent-3] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-3] Error: Connection error.
[subagent-3] Elapsed: 16.90s  Context: 18 msgs, ~45,777 tokens
[subagent-3] Retrying in 2.8s (attempt 1/10)...

[subagent-3] API call failed (attempt 1/10): APIConnectionError
[subagent-3] Provider: openai-codex  Model: gpt-5.5
[subagent-3] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-3] Error: Connection error.
[subagent-3] Elapsed: 22.57s  Context: 24 msgs, ~49,033 tokens
[subagent-3] Retrying in 2.7s (attempt 1/10)...

[subagent-1] API call failed (attempt 1/10): APIConnectionError
[subagent-1] Provider: openai-codex  Model: gpt-5.5
[subagent-1] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-1] Error: Connection error.
[subagent-1] Elapsed: 16.38s  Context: 52 msgs, ~88,056 tokens
[subagent-1] Retrying in 2.9s (attempt 1/10)...

[subagent-3] No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
[subagent-3] API call failed (attempt 1/10): TimeoutError
[subagent-3] Provider: openai-codex  Model: gpt-5.5
[subagent-3] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-3] Error: Codex stream produced no bytes within 45s (TTFB threshold: 45s)
[subagent-3] Elapsed: 47.18s  Context: 32 msgs, ~59,753 tokens
[subagent-3] Retrying in 2.8s (attempt 1/10)...

[subagent-0] API call failed (attempt 1/10): APIConnectionError
[subagent-0] Provider: openai-codex  Model: gpt-5.5
[subagent-0] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-0] Error: Connection error.
[subagent-0] Elapsed: 20.66s  Context: 67 msgs, ~109,001 tokens
[subagent-0] Retrying in 2.4s (attempt 1/10)...

[subagent-3] No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
[subagent-3] API call failed (attempt 1/10): TimeoutError
[subagent-3] Provider: openai-codex  Model: gpt-5.5
[subagent-3] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-3] Error: Codex stream produced no bytes within 45s (TTFB threshold: 45s)
[subagent-3] Elapsed: 47.20s  Context: 42 msgs, ~68,758 tokens
[subagent-3] Retrying in 2.3s (attempt 1/10)...

[subagent-2] No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
[subagent-2] API call failed (attempt 1/10): TimeoutError
[subagent-2] Provider: openai-codex  Model: gpt-5.5
[subagent-2] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-2] Error: Codex stream produced no bytes within 45s (TTFB threshold: 45s)
[subagent-2] Elapsed: 47.16s  Context: 70 msgs, ~98,074 tokens
[subagent-2] Retrying in 2.0s (attempt 1/10)...

[subagent-1] API call failed (attempt 1/10): APIConnectionError
[subagent-1] Provider: openai-codex  Model: gpt-5.5
[subagent-1] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-1] Error: Connection error.
[subagent-1] Elapsed: 17.62s  Context: 76 msgs, ~144,687 tokens
[subagent-1] Retrying in 2.8s (attempt 1/10)...

[subagent-0] API call failed (attempt 1/10): RateLimitError [HTTP 429]
[subagent-0] Provider: openai-codex  Model: gpt-5.5
[subagent-0] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-0] Error: HTTP 429: Error code: 429 - {'detail': 'Rate limit exceeded'}
[subagent-0] Details: {'detail': 'Rate limit exceeded'}
[subagent-0] Elapsed: 3.23s  Context: 88 msgs, ~136,681 tokens
[subagent-0] Rate limited. Waiting 1.0s (attempt 2/10)...

[subagent-1] No first byte from provider in 45s (codex stream, model: gpt-5.5). Reconnecting.
[subagent-1] API call failed (attempt 1/10): TimeoutError
[subagent-1] Provider: openai-codex  Model: gpt-5.5
[subagent-1] Endpoint: https://chatgpt.com/backend-api/codex
[subagent-1] Error: Codex stream produced no bytes within 45s (TTFB threshold: 45s)
[subagent-1] Elapsed: 47.03s  Context: 100 msgs, ~180,332 tokens
[subagent-1] Retrying in 2.5s (attempt 1/10)...
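The excerpt above contains 11 failure entries: 6 APIConnectionError, 4 TimeoutError, and 1 RateLimitError. To tally longer logs the same way, I use something like the following sketch. It assumes only the line format shown in this excerpt; `tally_failures` and `FAILURE_RE` are my own names, not part of Hermes.

```python
import re
from collections import Counter

# Matches lines such as:
#   [subagent-3] API call failed (attempt 1/10): APIConnectionError
# The format is taken from the log excerpt in this issue; other Hermes
# versions may log differently.
FAILURE_RE = re.compile(
    r"^\[(?P<agent>[^\]]+)\] API call failed \(attempt \d+/\d+\): (?P<error>\w+)"
)


def tally_failures(log_text: str) -> Counter:
    """Count failures per (agent, error class) pair."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            counts[(match["agent"], match["error"])] += 1
    return counts


sample = """\
[subagent-0] API call failed (attempt 1/10): APIConnectionError
[subagent-0] Error: Connection error.
[subagent-3] API call failed (attempt 1/10): TimeoutError
[subagent-0] API call failed (attempt 1/10): RateLimitError [HTTP 429]
"""

if __name__ == "__main__":
    for (agent, error), n in sorted(tally_failures(sample).items()):
        print(f"{agent:12} {error:20} {n}")
```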

Why this seems Hermes-specific or Hermes-amplified

The official Codex CLI remains usable on the same host/network/account family, but Hermes' openai-codex OAuth path becomes unreliable, especially under delegation/concurrency.

The old April report (#13834) already described the same general gap: official Codex CLI works while Hermes fails against the Codex backend. This fresh reproduction shows the issue is still present on the current v0.14.0 release and is severe enough to break subagent workflows.

Impact

This is not a minor intermittent warning. It severely affects normal Hermes usage:

  • Main agent randomly becomes unreliable.
  • Subagent/delegation workflows are almost unusable with Codex OAuth.
  • Retrying many concurrent failed subagent calls can amplify into 429s.
  • Long stalls and retries make the user experience very poor.

Related issues / context

This new issue is filed because the older April report (#13834) is still unresolved in practice near June, and the failure still reproduces on the current Hermes v0.14.0 release.

Questions / possible areas to inspect

  1. Are Hermes subagents creating independent Codex clients/transports in a way that differs from the main agent and/or official Codex CLI?
  2. Is Hermes' openai-codex path missing a concurrency limiter or smarter backoff policy for delegation?
  3. Does the Codex Responses transport differ from official Codex CLI in connection reuse, websocket/SSE handling, headers, session affinity, Cloudflare/browser-like behavior, or request payload shape?
  4. Could Hermes mirror the official Codex CLI runtime path more closely for primary and concurrent agent calls?
  5. Can Hermes detect this silent/TTFB failure family earlier and prevent concurrent retries from self-amplifying into 429s?
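To make questions 2 and 5 concrete, here is a minimal sketch of a shared gate that caps in-flight provider calls across all subagents. It is not based on Hermes internals: `SharedGate`, `fake_call`, and `demo` are hypothetical, and it assumes an asyncio-based call path. If children run in threads, a `threading.BoundedSemaphore` would play the same role.

```python
import asyncio


class SharedGate:
    """Caps in-flight provider calls across all subagents (hypothetical)."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for diagnostics

    async def run(self, make_call):
        # Callers beyond the limit wait here instead of hitting the backend.
        async with self._sem:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await make_call()
            finally:
                self.in_flight -= 1


async def fake_call() -> str:
    await asyncio.sleep(0.01)  # stand-in for a real provider request
    return "ok"


async def demo(children: int = 10, limit: int = 3) -> int:
    gate = SharedGate(limit)
    await asyncio.gather(*(gate.run(fake_call) for _ in range(children)))
    return gate.peak


if __name__ == "__main__":
    print("peak in-flight calls:", asyncio.run(demo()))
```

With `max_concurrent_children: 10` as in my config, a gate like this would let ten children exist while only a few talk to the backend at once. Retries would have to pass through the same gate for it to prevent the 429 amplification.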

I can run additional diagnostics if maintainers have a recommended way to compare Hermes' Codex OAuth transport against official Codex CLI behavior without exposing tokens.
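Before sharing any further transport-level logs, I would pass them through a redaction step along these lines. This is a rough sketch with hypothetical names (`redact`, `_PATTERNS`); the patterns cover only `Authorization` headers and `*_token` fields and are not an exhaustive list of what Hermes or the Codex backend may log.

```python
import re

# Each entry is (pattern, replacement). Both patterns keep the key and
# replace only the secret value.
_PATTERNS = [
    # Authorization: Bearer abc...   or   "Authorization": "Bearer abc..."
    (re.compile(r'(?i)(authorization"?\s*[:=]\s*"?)(bearer\s+)?[^\s",}]+'),
     r"\1\2<redacted>"),
    # "access_token": "abc..."   or   refresh_token=abc...
    (re.compile(r'(?i)("?(?:access|refresh|id)_token"?\s*[:=]\s*"?)[^\s",}]+'),
     r"\1<redacted>"),
]


def redact(text: str) -> str:
    """Mask token-like values while leaving the rest of the text intact."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(redact("Authorization: Bearer abc.def-ghi"))
    print(redact('{"access_token": "abc123", "elapsed": 16.24}'))
```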


Labels

  • P3: Low (cosmetic, nice to have)
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • provider/copilot: GitHub Copilot (ACP + Chat)
  • tool/delegate: Subagent delegation
  • type/bug: Something isn't working
