fix(agent): tolerate large Codex stream prefill by sanghyuk-seo-nexcube · Pull Request #33383 · NousResearch/hermes-agent

sanghyuk-seo-nexcube · 2026-05-27T17:54:32Z

What does this PR do?

Large openai-codex subscription requests can spend longer than the fixed time-to-first-byte (TTFB) cutoff in backend admission or prompt prefill before the first SSE event is emitted. Hermes treated that as a no-first-byte stream failure and reconnected, so long-context Codex turns were retried even though the backend could still have completed successfully.

This updates the Codex Responses watchdog policy to distinguish three cases:

small requests with no first SSE event still fail fast and reconnect
large openai-codex requests wait for backend prefill instead of being killed by the first-byte watchdog
streams that emit at least one SSE event and then go idle are handled by a separate event-idle watchdog
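As a rough illustration of the three cases above, the decision can be sketched as a pure function. This is not the code in agent/chat_completion_helpers.py: the names `Action` and `decide` and the constants are hypothetical. The 12 s cutoff mirrors the log line quoted under Screenshots / Logs; the idle cutoff and the "large" boundary are assumed values.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    RECONNECT = "reconnect"


# Illustrative values only; the PR text does not state the real thresholds.
SMALL_TTFB_SECONDS = 12.0       # matches the "No first byte ... in 12s" log line
EVENT_IDLE_SECONDS = 60.0       # assumed post-first-event idle cutoff
LARGE_CONTEXT_TOKENS = 100_000  # assumed boundary for a "large" request


def decide(events_seen: int, seconds_quiet: float, est_tokens: int) -> Action:
    """Pick an action for a stream that has been quiet for `seconds_quiet`."""
    if events_seen > 0:
        # Case 3: at least one SSE event arrived, then the stream went idle.
        return Action.RECONNECT if seconds_quiet > EVENT_IDLE_SECONDS else Action.WAIT
    if est_tokens >= LARGE_CONTEXT_TOKENS:
        # Case 2: large request with no first event yet; assume backend prefill.
        return Action.WAIT
    # Case 1: small request with no first event; fail fast.
    return Action.RECONNECT if seconds_quiet > SMALL_TTFB_SECONDS else Action.WAIT
```

In this sketch a large request with no first event waits without limit; the PR describes scaled stale thresholds, so the real policy presumably still bounds that wait somewhere.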

How this differs from recent Codex stream fixes

This complements the existing Codex stream fixes rather than replacing them:

#32963 (fix(agent): recover Codex Responses streams with null output) recovered from response.completed.output = null crashes after stream events had already been collected.
#33042 (refactor(codex): drop SDK responses.stream() helper; consume events directly) removed the SDK responses.stream() helper and now consumes raw responses.create(stream=True) events directly, making that null-output parser failure structurally impossible.
#33133 (fix(codex): update silent-hang workaround hint + wire into TTFB watchdog) improved the user-facing silent-hang workaround hint and wired that hint into the TTFB watchdog path.

Those fixes address stream consumption once events arrive, or improve the message shown when the TTFB watchdog kills a request. This PR addresses the separate pre-first-event case where hosted Codex accepts a large request but spends a long time in admission/prompt prefill before emitting the first SSE event.
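To make the pre-first-event versus post-first-event distinction concrete, here is a minimal sketch of a consumer that applies one timeout before the first event and a different one between later events. It is an assumption-laden illustration, not Hermes code: `watch_stream`, `StreamStalled`, and `fake_stream` are hypothetical names, and a plain iterable stands in for the SSE stream.

```python
import queue
import threading
import time
from typing import Iterable, Iterator, List, Optional


class StreamStalled(Exception):
    """Raised when a watchdog gives up on the stream."""


_DONE = object()  # sentinel marking normal end of stream


def watch_stream(
    events: Iterable[str],
    first_event_timeout: Optional[float],  # None disables the first-event watchdog
    idle_timeout: float,
) -> Iterator[str]:
    q: queue.Queue = queue.Queue()

    def pump() -> None:
        # Read the (possibly slow) stream on a worker thread.
        try:
            for ev in events:
                q.put(ev)
            q.put(_DONE)
        except Exception as exc:  # forward stream errors to the consumer
            q.put(exc)

    threading.Thread(target=pump, daemon=True).start()

    timeout, phase = first_event_timeout, "first-event"
    while True:
        try:
            item = q.get(timeout=timeout)
        except queue.Empty:
            raise StreamStalled(f"{phase} watchdog fired after {timeout}s") from None
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item
        # After the first event, switch to the event-idle watchdog.
        timeout, phase = idle_timeout, "event-idle"


def fake_stream(delays: List[float]) -> Iterator[str]:
    """Hypothetical stand-in for an SSE stream: sleep, then emit an event."""
    for i, delay in enumerate(delays):
        time.sleep(delay)
        yield f"event-{i}"
```

Passing `first_event_timeout=None` models the large-request case, where a slow prefill is tolerated, while `idle_timeout` still catches a stream that starts and then stalls.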

Related Issue

Addresses the no-first-byte / large-context TTFB portion of #33075.
Related: #22986, #7069, #32963, #33042, #33133

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✅ Tests (adding or improving test coverage)

Changes Made

Updated agent/chat_completion_helpers.py to detect the hosted openai-codex Responses backend in the streaming watchdog path.
Scaled stale and idle stream watchdog thresholds by estimated request context size.
Disabled strict no-first-byte TTFB reconnects for large hosted Codex requests by default, while preserving HERMES_CODEX_TTFB_STRICT=1 for strict behavior.
Kept fast reconnect behavior for small no-byte stalls and capped stale TTFB env values for small hosted Codex requests.
Added a post-first-event idle watchdog so real stream stalls are still retried.
Expanded tests/agent/test_codex_ttfb_watchdog.py with regression coverage for small no-byte stalls, first-event-then-idle stalls, large-context prefill delay, and strict-mode behavior.
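The threshold scaling and the strict-mode override listed above might look roughly like the following. Only the environment variable HERMES_CODEX_TTFB_STRICT comes from this PR; the function names, the linear scaling formula, and every constant are assumptions made for illustration, since the PR text does not give the actual formula.

```python
import os
from typing import Mapping, Optional

# All constants below are assumed values, not taken from the PR.
BASE_TTFB_SECONDS = 12.0        # small-request first-byte cutoff
LARGE_CONTEXT_TOKENS = 100_000  # boundary for a "large" request
SECONDS_PER_10K_TOKENS = 3.0    # extra allowance per 10k estimated tokens
MAX_SCALED_SECONDS = 600.0      # upper bound on any scaled threshold


def scaled_timeout(base: float, est_tokens: int) -> float:
    """Grow a base threshold linearly with estimated context size, up to a cap."""
    extra = (est_tokens / 10_000) * SECONDS_PER_10K_TOKENS
    return min(base + extra, MAX_SCALED_SECONDS)


def first_byte_timeout(
    est_tokens: int, env: Optional[Mapping[str, str]] = None
) -> Optional[float]:
    """Return the no-first-byte cutoff in seconds, or None to disable it."""
    env = os.environ if env is None else env
    strict = env.get("HERMES_CODEX_TTFB_STRICT") == "1"
    if est_tokens >= LARGE_CONTEXT_TOKENS and not strict:
        # Large request, default mode: do not reconnect on missing first byte.
        return None
    # Small request, or strict mode: keep a finite (scaled) cutoff.
    return scaled_timeout(BASE_TTFB_SECONDS, est_tokens)
```

Whether strict mode uses the scaled cutoff or the unscaled base value is not stated in the PR; the sketch picks the scaled one arbitrarily.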

How to Test

scripts/run_tests.sh tests/agent/test_codex_ttfb_watchdog.py
scripts/run_tests.sh tests/agent/test_codex_ttfb_watchdog.py tests/agent/test_auxiliary_client.py tests/run_agent/test_run_agent_codex_responses.py
Manual verification on macOS: long openai-codex turns with large estimated input context no longer reconnect solely because the hosted backend takes longer than the small-request TTFB cutoff before the first SSE event.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run relevant tests and all tests pass
I've added tests for my changes
I've tested on my platform: macOS

Documentation & Housekeeping

Documentation update N/A
cli-config.yaml.example update N/A
CONTRIBUTING.md or AGENTS.md update N/A
Cross-platform impact considered: no file I/O, terminal, process management, or platform-specific behavior changed
Tool descriptions/schemas update N/A

Screenshots / Logs

Before this fix, large hosted Codex requests could show repeated reconnects like:

No first byte from provider in 12s (codex stream, model: gpt-5.5). Reconnecting.

After this fix, large hosted Codex requests wait for backend prefill; small no-byte stalls and post-first-event idle stalls are still retried.

teknium1 · 2026-05-27T18:20:08Z

Salvaged onto current main via #33390 (merged as 3476509). Your authorship is preserved in git log. Thanks @sanghyuk-seo-nexcube — the three-case watchdog policy (small-no-event / large-no-event / event-then-idle) is exactly the right shape, and it directly addresses user reports like CRUSADER's gpt-5.5 long-context TTFB false-positives.

#AI commit#

fix(agent): tolerate large codex stream prefill

5c58fb1

sanghyuk-seo-nexcube force-pushed the fix/codex-large-context-ttfb branch from a2d2ceb to 5c58fb1 on May 27, 2026 18:07

teknium1 mentioned this pull request May 27, 2026

fix(agent): tolerate large codex stream prefill #33390

Merged

alt-glitch added the labels type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), codex, and P3 (Low: cosmetic, nice to have) May 27, 2026

teknium1 added a commit that referenced this pull request May 27, 2026

chore(release): map sanghyuk-seo-nexcube for #33383 salvage

3476509

teknium1 closed this May 27, 2026

zqchris mentioned this pull request May 28, 2026

fix(codex): activity watchdog for codex_responses stale detector #32131

Closed

3 tasks

mathias3 pushed a commit to mathias3/hermes-agent that referenced this pull request May 28, 2026

chore(release): map sanghyuk-seo-nexcube for NousResearch#33383 salvage

a4f3301

Bryce-huang pushed a commit to wbkunlun/hermes-agent that referenced this pull request May 29, 2026

chore(release): map sanghyuk-seo-nexcube for NousResearch#33383 salvage

af8dfc7

#AI commit#

mosaiq-systems pushed a commit to mosaiq-systems/hermes-agent that referenced this pull request May 29, 2026

chore(release): map sanghyuk-seo-nexcube for NousResearch#33383 salvage

2dadc2e

KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026

chore(release): map sanghyuk-seo-nexcube for NousResearch#33383 salvage

19a95db

gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026

chore(release): map sanghyuk-seo-nexcube for NousResearch#33383 salvage

3fcfc4b

sanghyuk-seo-nexcube wants to merge 1 commit into NousResearch:main from sanghyuk-seo-nexcube:fix/codex-large-context-ttfb
