
fix(run_agent): tolerate missing response.created on custom codex streams #8134

Open
ZhangYiqun018 wants to merge 1 commit into NousResearch:main from ZhangYiqun018:fix/custom-codex-missing-response-created

Conversation

@ZhangYiqun018 ZhangYiqun018 commented Apr 12, 2026


What does this PR do?

Fixes a custom-endpoint compatibility bug in the Codex Responses streaming path.

Some relays configured with provider: custom and api_mode: codex_responses emit a valid, usable SSE stream that starts with response.in_progress instead of response.created. curl -N can still read such a stream, but the OpenAI SDK's strict Responses stream state machine rejects it and raises:

Expected to have received response.created before response.in_progress
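
To make the failure mode concrete, here is a minimal sketch of a strict ordering check of the kind described above. It is not the OpenAI SDK's code: `StreamOrderError` and `strict_consume` are hypothetical names, and events are reduced to their type strings.

```python
class StreamOrderError(RuntimeError):
    """Hypothetical stand-in for the ordering error raised by a strict parser."""


def strict_consume(event_types):
    """Accept events only once response.created has been seen."""
    seen_created = False
    accepted = []
    for event_type in event_types:
        if event_type == "response.created":
            seen_created = True
        elif not seen_created:
            # Any other event arriving first is treated as a protocol violation.
            raise StreamOrderError(
                f"Expected to have received response.created before {event_type}"
            )
        accepted.append(event_type)
    return accepted
```

Under this model, a stream that opens with response.in_progress fails on its very first event, even though every later event would have been usable.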

This PR keeps the strict responses.stream(...) path as the default, but adds a narrow compatibility fallback for custom providers only:

  • retry once on the normal stream path
  • if the same missing-response.created ordering error persists, fall back to Hermes' existing responses.create(stream=True) parser
  • tolerate fallback streams that begin with response.in_progress
  • preserve live text/reasoning deltas and output backfill while using the fallback

The behavior for native Codex / non-custom providers stays unchanged.
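
The retry-then-fallback policy listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code in run_agent.py: `run_with_fallback`, `strict_call`, and `fallback_call` are hypothetical names, `RuntimeError` stands in for whatever exception type the SDK actually raises, and the error is matched by message text only for the sake of the example.

```python
MISSING_CREATED_MARKER = "Expected to have received response.created"


def is_missing_created_error(exc):
    """Return True if the exception looks like the missing-response.created error."""
    return MISSING_CREATED_MARKER in str(exc)


def run_with_fallback(strict_call, fallback_call, provider, max_strict_attempts=2):
    """Try the strict path (initial attempt plus one retry), then fall back.

    The fallback is only considered for provider == "custom"; every other
    provider, and every other error, propagates unchanged.
    """
    for _ in range(max_strict_attempts):
        try:
            return strict_call()
        except RuntimeError as exc:
            if provider != "custom" or not is_missing_created_error(exc):
                raise
    # The same ordering error persisted across the retry: use the tolerant parser.
    return fallback_call()
```

Keeping the provider check inside the exception handler means the strict path stays the default for everyone, and only the one known ordering error on custom endpoints is ever downgraded.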

Related Issue

Fixes #8133

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Scoped missing-response.created detection to custom Codex Responses providers in run_agent.py
  • Reused the existing responses.create(stream=True) fallback instead of broadening the main SDK path for every provider
  • Extended the fallback parser to handle streams that start with response.in_progress
  • Added fallback-side delta/reasoning callback delivery so live streaming still works during compatibility fallback
  • Added regression tests for:
    • custom provider retries + fallback when response.created is missing
    • non-custom provider still raising on the same SDK error
    • fallback parser consuming a stream that begins with response.in_progress
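
As a rough picture of what a tolerant fallback parser with live callbacks might look like, consider the sketch below. It is an assumption-laden illustration rather than Hermes' parser: `parse_fallback_stream` and the callback parameters are hypothetical, events are modeled as plain dicts, and the delta event names are assumed to follow the Responses API streaming event types. Output backfill is not modeled here.

```python
def parse_fallback_stream(events, on_text_delta=None, on_reasoning_delta=None):
    """Consume Responses-style events without requiring response.created first."""
    text_parts = []
    reasoning_parts = []
    completed = False
    for event in events:
        kind = event.get("type")
        if kind in ("response.created", "response.in_progress"):
            # Lifecycle markers carry no output; either one may open the stream.
            continue
        if kind == "response.output_text.delta":
            delta = event.get("delta", "")
            text_parts.append(delta)
            if on_text_delta is not None:
                on_text_delta(delta)  # deliver text to the caller as it arrives
        elif kind == "response.reasoning_summary_text.delta":
            delta = event.get("delta", "")
            reasoning_parts.append(delta)
            if on_reasoning_delta is not None:
                on_reasoning_delta(delta)
        elif kind == "response.completed":
            completed = True
    return {
        "text": "".join(text_parts),
        "reasoning": "".join(reasoning_parts),
        "completed": completed,
    }
```

A regression test in the spirit of the third bullet could feed this function a list whose first element is a response.in_progress event and assert that the accumulated text and the callback deliveries match.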

How to Test

  1. Configure Hermes with a custom Responses endpoint:
    model:
      provider: custom
      default: gpt-5.4
      base_url: https://<custom-endpoint>/openai
      api_key: <key>
      api_mode: codex_responses
  2. Reproduce against a relay that omits the initial response.created event and starts its SSE stream with response.in_progress.
  3. Confirm Hermes no longer fails with "Expected to have received response.created before response.in_progress" and instead completes normally.
  4. Run the targeted regression tests:
    .venv/bin/pytest tests/run_agent/test_run_agent_codex_responses.py -q
    .venv/bin/pytest tests/run_agent/test_streaming.py -q
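
For step 2, if no such relay is at hand, the stream shape can be simulated locally. The sketch below builds an SSE body that opens with response.in_progress and parses it back; `build_relay_sse` and `parse_sse` are hypothetical helpers, and the event payloads are simplified assumptions rather than a capture from a real relay.

```python
import json


def build_relay_sse(events):
    """Serialize events as SSE frames: an event line, a data line, a blank line."""
    return "".join(
        f"event: {event['type']}\ndata: {json.dumps(event)}\n\n" for event in events
    )


def parse_sse(body):
    """Parse an SSE body back into a list of JSON payloads, one per frame."""
    parsed = []
    for frame in body.split("\n\n"):
        data_lines = [
            line[len("data:"):].strip()
            for line in frame.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            parsed.append(json.loads("\n".join(data_lines)))
    return parsed


# A relay-like stream that omits response.created entirely.
relay_events = [
    {"type": "response.in_progress"},
    {"type": "response.output_text.delta", "delta": "hi"},
    {"type": "response.completed"},
]
relay_body = build_relay_sse(relay_events)
```

Serving `relay_body` from a local test server with a text/event-stream content type would approximate the relay behavior described in step 2.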

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 26.4

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

Observed failing SDK error before this patch:

Expected to have received response.created before response.in_progress

@alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on Apr 28, 2026
@alt-glitch

Collaborator

Fixes #8133. Related to #14634 (similar stream ordering issue with codex.rate_limits).



Development

Successfully merging this pull request may close these issues.

Custom codex_responses streams fail when response.created is omitted
