Skip to content

fix(agent): recover Codex streams with null final output#32884

Closed
serejaris wants to merge 3 commits into
NousResearch:mainfrom
serejaris:fix/codex-null-output-stream
Closed

fix(agent): recover Codex streams with null final output#32884
serejaris wants to merge 3 commits into
NousResearch:mainfrom
serejaris:fix/codex-null-output-stream

Conversation

@serejaris

@serejaris serejaris commented May 27, 2026

Copy link
Copy Markdown

What does this PR do?

Fixes a Codex Responses streaming crash where the ChatGPT Codex backend can emit valid streamed output and then trip openai-python final parsing with:

TypeError: 'NoneType' object is not iterable

Observed repro shape: openai-codex / gpt-5.5 on https://chatgpt.com/backend-api/codex streamed response.output_text.delta chunks spelling the expected answer, then openai-python raised while parsing a final response whose output was null.

This PR preserves already-received streamed content instead of treating that SDK finalization failure as a fatal empty response. It handles both observed SDK failure points: during stream iteration and during stream.get_final_response().

Related Issue

Related to #21444, #19981, and #22986.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✅ Tests (adding or improving test coverage)

Changes Made

  • agent/codex_runtime.py
    • Adds a narrow recovery helper for Codex streamed responses when openai-python hits the null-final-output parser path after Hermes already collected usable streamed output.
    • Recovers from response.output_item.done items first.
    • Synthesizes text from streamed deltas only when no function/tool call was observed.
    • Handles the null-output TypeError both from stream iteration and from stream.get_final_response().
    • Preserves failure behavior when there is no streamed output, when a tool-call turn only has text deltas, or when the TypeError comes from Hermes-side callback/processing code.
  • tests/run_agent/test_run_agent_codex_responses.py
    • Adds regression coverage for recovered streamed deltas, recovered output items, finalization-time null output, no recovery without streamed output, no text synthesis after tool-call events, and no masking of unrelated callback TypeErrors.

How to Test

  1. Targeted canonical runner:
scripts/run_tests.sh tests/run_agent/test_run_agent_codex_responses.py -- -q

Result in CT216 after the test follow-up commit: 73 tests passed.

  1. Direct pytest before the follow-up commit:
venv/bin/python -m pytest tests/run_agent/test_run_agent_codex_responses.py -q

Result in CT216: 70 passed, 1 warning.

  1. Live validation against ChatGPT Codex backend:
hermes chat -Q --provider openai-codex -m gpt-5.5 -q "Reply exactly: LATEST_GPT55_FINAL_OK"

Result in CT216 after the earlier code follow-up commit:

LATEST_GPT55_FINAL_OK
  1. Partial full-suite run before the follow-up commit:
scripts/run_tests.sh -j 4

Stopped after confirming early progress because the full suite is 1205 files / 26435 tests in this CT. Progress before stop: 669 passed, 0 failed.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(agent): ...)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: Debian/Linux LXC on Proxmox, Python 3.11.15

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) — N/A; this is provider stream parsing logic
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Screenshots / Logs

Live validation output after the follow-up commit:

session_id: 20260527_003011_c8f528
LATEST_GPT55_FINAL_OK

@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder provider/openai OpenAI / Codex Responses API labels May 27, 2026
@serejaris

Copy link
Copy Markdown
Author

Thanks, this was a real remaining path. I cherry-picked your follow-up as 9a9d46f, preserving authorship. Re-ran targeted canonical validation: scripts/run_tests.sh tests/run_agent/test_run_agent_codex_responses.py -- -q -> 71 passed. Live check after the follow-up: hermes chat -Q --provider openai-codex -m gpt-5.5 -q \"Reply exactly: LATEST_GPT55_FINAL_OK\" -> LATEST_GPT55_FINAL_OK.

@xxglbs

xxglbs commented May 27, 2026

Copy link
Copy Markdown

Linux 修复命令,直接终端运行即可,然后重启gateway

`
cd /home/admin/.hermes/hermes-agent

0. 备份

cp venv/lib/python3.11/site-packages/openai/lib/_parsing/_responses.py
/tmp/_responses.py.bak.$(date +%Y%m%d-%H%M%S)

cp agent/transports/codex.py
/tmp/codex.py.bak.$(date +%Y%m%d-%H%M%S)

1. Patch OpenAI SDK:给 response.output 加空值保护

python3 - <<'PY'
from pathlib import Path

p = Path("venv/lib/python3.11/site-packages/openai/lib/_parsing/_responses.py")
src = p.read_text()

old = "for output in response.output:"
new = "for output in (response.output or []):"

if new in src:
print("✓ SDK patch already applied")
elif old in src:
p.write_text(src.replace(old, new, 1))
print("✓ SDK patch applied")
else:
raise SystemExit("✗ SDK patch failed: target line not found")
PY

2. Patch Hermes transport:避免 tools=None 传进 SDK

python3 - <<'PY'
from pathlib import Path

p = Path("agent/transports/codex.py")
src = p.read_text()

old = (
' "tools": response_tools,\n'
' "store": False,\n'
' }\n'
' if response_tools:\n'
' kwargs["tool_choice"] = "auto"\n'
' kwargs["parallel_tool_calls"] = True'
)

new = (
' "store": False,\n'
' }\n'
' if response_tools:\n'
' kwargs["tools"] = response_tools\n'
' kwargs["tool_choice"] = "auto"\n'
' kwargs["parallel_tool_calls"] = True'
)

if new in src:
print("✓ Transport patch already applied")
elif old in src:
p.write_text(src.replace(old, new, 1))
print("✓ Transport patch applied")
else:
print("⚠ Transport patch skipped: source differs or already patched")
PY

3. 验证关键行

grep -n "for output in (response.output or [])"
venv/lib/python3.11/site-packages/openai/lib/_parsing/_responses.py

grep -n '"tools": response_tools' agent/transports/codex.py || true
`

@mikeytag

Copy link
Copy Markdown

Thank you @serejaris ! I applied this patch with 2 Hermes installs and I'm back up and running with Hermes on codex again.

@iqdoctor

Copy link
Copy Markdown
Contributor

For the SDK-level root cause isolated here, I opened an upstream OpenAI Python SDK follow-up:

openai/openai-python#3315

The Hermes-side canonical fix in this PR recovers usable streamed output when the final Codex Responses parse hits response.output=None. The upstream SDK invariant is narrower: openai-python should not leak raw TypeError: 'NoneType' object is not iterable when parser/accessor code sees a Responses object with missing/null output.

Terminology note: this is the OpenAI Python SDK (openai-python) layer rather than the OpenAI Agents SDK layer.

@iqdoctor

Copy link
Copy Markdown
Contributor

Non-blocking suggestion: it may be worth adding one regression for the post-SDK-fix path where stream.get_final_response() succeeds but returns output=[] or output=None after text deltas were already streamed.

The implementation already handles _out is None or empty, so this would mostly document compatibility with a future openai-python fix that stops raising TypeError and instead exposes an empty final output.

This should keep #32884 explicitly covered for the SDK-layer follow-up in openai/openai-python#3315.

@serejaris

Copy link
Copy Markdown
Author

Added the regression in 21503a9: test_run_codex_stream_recovers_when_final_response_output_is_empty_after_deltas covers both output=None and output=[] after streamed text deltas when get_final_response() succeeds.

Validation: scripts/run_tests.sh tests/run_agent/test_run_agent_codex_responses.py -- -q -> 73 passed.

@teknium1

Copy link
Copy Markdown
Contributor

Closing as duplicate — the Codex null-output fix has been merged via #32963 (cherry-picked from @carltonawong's PR #32890, the one Gille reviewed). Thanks for jumping on the outage so quickly; appreciate the help. Closes #11179.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists provider/openai OpenAI / Codex Responses API type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants