
fix(auxiliary): recover Codex stream drops #21761

Closed
xdeepsnyx wants to merge 1 commit into NousResearch:main from xdeepsnyx:fix/codex-aux-stream-recovery


@xdeepsnyx


Summary

  • Add recovery for Codex auxiliary Responses streams when the chunked stream drops mid-response.
  • Retry responses.stream() once, then fall back to responses.create(stream=True) because the Codex backend requires streaming.
  • Preserve streamed output backfill and timeout/interruption behavior.
  • Add regression coverage for transient transport failure and fallback streaming recovery.
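
The recovery order described above (one retry of the primary stream, then a single fallback stream) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `collect_with_recovery`, `is_transient_stream_error`, and the marker strings are hypothetical names, and the two callables stand in for `responses.stream()` and `responses.create(stream=True)`.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical marker substrings; the real adapter's list may differ.
TRANSIENT_MARKERS = ("incomplete chunked read", "peer closed connection")


def is_transient_stream_error(exc: Exception) -> bool:
    """Return True when the error message looks like a dropped stream."""
    message = str(exc).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def collect_with_recovery(
    primary: Callable[[], Iterable[str]],
    fallback: Callable[[], Iterable[str]],
    max_primary_attempts: int = 2,
) -> List[str]:
    """Run the primary stream (initial attempt plus one retry by default),
    then try the fallback stream once if every attempt dropped."""
    last_error: Optional[Exception] = None
    for _ in range(max_primary_attempts):
        try:
            # list() drains the stream; chunks from a dropped attempt are
            # discarded because the exception fires before list() returns.
            return list(primary())
        except Exception as exc:
            if not is_transient_stream_error(exc):
                raise  # non-transient errors are not retried
            last_error = exc
    try:
        return list(fallback())
    except Exception as exc:
        # Keep the earlier transport error attached for debugging.
        raise exc from last_error
```

Both paths stay streaming here, which mirrors the constraint that the Codex backend requires streaming; only the way the stream is opened changes between attempts.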

Why

Codex auxiliary clients are used by side tasks such as context compression. The main Codex chat path already has stream recovery, but the auxiliary adapter could fail on transient "incomplete chunked read" or peer-closed stream errors, forcing callers into their fallback behavior.

Test plan

  • ./venv/bin/python -m pytest tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryStreamRecovery -q -o 'addopts='
  • ./venv/bin/python -m pytest tests/agent/test_auxiliary_client.py tests/agent/test_auxiliary_transport_autodetect.py tests/agent/test_auxiliary_main_first.py -q -o 'addopts='

Codex auxiliary calls must use streaming Responses API, but the adapter could fail compression on transient chunked stream drops. Add retry plus create(stream=True) fallback, preserve streamed output recovery, and cover the behavior with regression tests.
@teknium1

Contributor

This looks implemented on current main by the later Codex streaming refactor plus the shared auxiliary transient-retry path. This is an automated hermes-sweeper review.

Evidence:

  • agent/auxiliary_client.py:824 now uses responses.create(stream=True) directly for Codex auxiliary calls and consumes the raw event stream via _consume_codex_event_stream, so the old responses.stream() helper path this PR patched is no longer in the auxiliary call path.
  • agent/codex_runtime.py:330 documents the reason for that structural fix: avoid SDK reconstruction from terminal response.output and assemble content from streamed response.output_item.done / delta events instead.
  • agent/auxiliary_client.py:5173 retries transient auxiliary transport errors once on the same provider before existing fallback handling; _is_connection_error includes incomplete chunked read, peer closed connection, premature end, EOF, and protocol errors.
  • tests/agent/test_auxiliary_client.py:1824 covers the incomplete-chunked-read/peer-closed retry behavior and fallback escalation after a second transient failure.
  • The relevant landed commits are cb38ce28cbd22278c30973eb4af5260c46543a7f and 02a4d66951984e7e4a656ac0d5162a7a6a1ee8ad.
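
The structural fix in the second bullet (assembling content from streamed events rather than from the terminal `response.output`) might look roughly like the sketch below. It is not the repository's `_consume_codex_event_stream`; `assemble_text` is a hypothetical helper, and events are modeled as plain dicts whose field names (`type`, `delta`, `item`, `content`, `text`) are assumed to follow the Responses API streaming event shapes.

```python
from typing import Any, Dict, Iterable, List


def assemble_text(events: Iterable[Dict[str, Any]]) -> str:
    """Build the final text from streamed events instead of trusting the
    terminal response object, which may carry an empty output list."""
    done_texts: List[str] = []   # text from completed message items
    delta_parts: List[str] = []  # incremental text, used as a backfill

    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            delta_parts.append(event.get("delta", ""))
        elif kind == "response.output_item.done":
            item = event.get("item") or {}
            if item.get("type") != "message":
                continue  # skip non-message items such as tool calls
            for part in item.get("content") or []:
                if part.get("type") == "output_text":
                    done_texts.append(part.get("text", ""))
        # Other event types, including the terminal one, are ignored here.

    # Prefer completed items; fall back to accumulated deltas otherwise.
    return "".join(done_texts) if done_texts else "".join(delta_parts)
```

Preferring completed items over deltas is a design choice of this sketch: a finished item carries the whole text in one place, while the deltas remain available if the stream ends before any item completes.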

@teknium1 closed this on Jun 11, 2026
@teknium1 added the sweeper:implemented-on-main label on Jun 11, 2026

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium, degraded but workaround exists
  • sweeper:implemented-on-main: Sweeper: behavior already present on current main
  • type/bug: Something isn't working
