fix(agent): recover Responses streams with null output by richard950825-sys · Pull Request #11182 · NousResearch/hermes-agent

richard950825-sys · 2026-04-16T18:29:49Z

What does this PR do?

Fixes a Responses streaming compatibility crash where an OpenAI-compatible provider streams valid response.output_item.done events, then sends a terminal response whose response.output is null.

Existing recovery handled response.output == [], but output=None can make the OpenAI SDK raise inside stream.get_final_response() before Hermes reaches that backfill logic. This PR recovers from the already-streamed output events in both the main agent stream path and the auxiliary Codex/Responses adapter.

Related Issue

Fixes #11179

Type of Change

?? Bug fix (non-breaking change that fixes an issue)
? New feature (non-breaking change that adds functionality)
?? Security fix
?? Documentation update
? Tests (adding or improving test coverage)
?? Refactor (no behavior change)
?? New skill (bundled or hub)

Changes Made

run_agent.py
- Adds a shared Responses stream backfill helper for missing/null/empty terminal output.
- Captures terminal response.completed / response.incomplete / response.failed objects from stream events before falling back to SDK final parsing.
- Recovers from the specific SDK TypeError("'NoneType' object is not iterable") final-parse failure only when streamed output items or text deltas are available.
- Applies the same backfill behavior to responses.create(stream=True) fallback events.
agent/auxiliary_client.py
- Adds equivalent null-output recovery for the Codex auxiliary chat-completions shim used by compression, memory flushes, vision, session search, and web extraction paths.
tests/run_agent/test_run_agent_codex_responses.py
- Adds a fake-stream regression where response.completed.response.output is None but a prior response.output_item.done exists.
tests/agent/test_auxiliary_client.py
- Adds the same regression for the auxiliary client path.

How to Test

Run targeted regressions:

uv run --extra dev pytest tests/run_agent/test_run_agent_codex_responses.py::test_run_codex_stream_completed_output_none_backfills_output_items tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryClientStreamRecovery::test_stream_completed_output_none_backfills_output_items -q

Run the related test files:

uv run --extra dev pytest tests/run_agent/test_run_agent_codex_responses.py tests/agent/test_auxiliary_client.py -q

Optional syntax check:

uv run --extra dev python -m py_compile run_agent.py agent/auxiliary_client.py tests/run_agent/test_run_agent_codex_responses.py tests/agent/test_auxiliary_client.py

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: Alibaba Cloud Linux 3 (OpenAnolis Edition)

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) ? or N/A
I've updated cli-config.yaml.example if I added/changed config keys ? or N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows ? or N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide ? or N/A
I've updated tool descriptions/schemas if I changed tool behavior ? or N/A

For New Skills

N/A

Screenshots / Logs

Targeted regressions:

..                                                                       [100%]
2 passed, 2 warnings in 10.54s

Related test files:

........................................................................ [ 50%]
........................................................................ [100%]
144 passed, 142 warnings in 13.72s

This reverts commit cd4f2fa.

…tput-backfill

JeremyDev87 · 2026-05-27T00:34:34Z

Review — independent read of current head

Head reviewed: e4cdf2e4a01a79c92f4ba72c4ff1c631dfd6bb9d
Scope: static review of the diff (4 files, +305/−99) against issue [Bug]: Responses stream crashes when terminal response.output is null #11179. I did not check out the branch or run the test suite locally; test results below are from the PR description, not re-executed.
Mergeability: branch currently reports CONFLICTING / DIRTY against main — a rebase/merge of main is needed before this can land.
CI: no checks reported on the branch.

The fix is well-targeted and the issue write-up is clear. A few things worth addressing:

High (please verify) — main stream path now bypasses `get_final_response()` for all providers

Previously _run_codex_stream only handled response.incomplete/response.failed in-loop and always fell through to stream.get_final_response(). It now returns inside the loop on the first response.completed carrying a non-None response:

elif event_type in {"response.completed", "response.incomplete", "response.failed"}:
    resp_obj = getattr(event, "response", None)
    ...
    if resp_obj is not None:
        return self._backfill_responses_stream_output(resp_obj, ...)

This removes the SDK's get_final_response() / parse_response() post-processing from the success path, not just the null-output case — so the blast radius is every request through this path. Could you confirm nothing downstream relied on SDK final-parse behavior the raw terminal-event response doesn't already provide (e.g. structured-output/text_format parsing, typed-object normalization, usage accounting)? The auxiliary and create_stream_fallback paths already returned the event response directly, so there's precedent, but this is new behavior for the main path.

Medium — the `get_final_response()` TypeError-recovery branch isn't covered by the new tests

Both new regressions emit a response.completed event whose response is a non-None object with output=None. That takes the in-loop return path and never calls get_final_response() — the final_error=TypeError(...) on the fake stream only acts as a guard ("terminal event should be used before final parsing"). So the central try/except recovery:

except TypeError as exc:
    if "NoneType" not in err_text or "not iterable" not in err_text:
        raise
    ...
    final_response = SimpleNamespace(status="completed", model=self.model, output=[], usage=None)

has no test that actually triggers it. Suggest adding a case where the terminal event's response is None (or absent) and get_final_response() raises the TypeError, so the recovery path is exercised end-to-end.

Low

Brittle error matching: recovery keys off "NoneType" ... "not iterable" substrings in the exception message, which can shift across Python/SDK versions. Reasonable as a narrowing guard, but a comment noting the dependency would help future maintainers.
Duplicate helpers: _backfill_responses_output (auxiliary_client.py) and _backfill_responses_stream_output (run_agent.py) are nearly identical; only the latter handles model/usage. Fine across modules for now, but they'll drift.
Type hint: model: str = None should be Optional[str] = None.

Nice

Adding has_tool_calls detection in _run_codex_create_stream_fallback is a real correctness improvement — the old fallback could collapse a function_call response into a synthesized plain-text message, and the shared helper now guards that with elif collected_text_deltas and not has_tool_calls.

Conclusion: not a formal approval — merge conflict must be resolved, and I'd like the High item confirmed and the Medium test gap closed. None of these are hard blockers to the core fix's correctness for the documented null-output scenario.

The chatgpt.com/backend-api/codex backend can emit a terminal event (response.completed/failed/incomplete) whose `output` is null. The openai SDK (2.24.0) then crashes with "'NoneType' object is not iterable" at lib/_parsing/_responses.py:61 (`for output in response.output`). A captured traceback proved the crash happens INSIDE the event loop — accumulate_event() -> parse_response() during `for event in stream` — i.e. before get_final_response() is ever reached. run_codex_stream only caught httpx/RuntimeError, so the TypeError escaped, was classified upstream as a non-retryable local error, and the raw "'NoneType' object is not iterable" was surfaced to the user (e.g. the Telegram gateway). Fix (defense in depth, both Codex streaming code paths): - run_codex_stream (agent/codex_runtime.py): wrap the stream-event loop with an `except TypeError` that recovers from the output items already yielded via response.output_item.done (or from streamed text deltas), or returns an empty-output response so validate_response() routes to retry/fallback. Also guard get_final_response() and extend the backfill to treat a null `output`. - _CodexCompletionsAdapter (agent/auxiliary_client.py): the auxiliary client has a parallel Codex stream loop used for summaries/titles/iteration-limit recaps with the same vulnerability — mirror the same recovery there. Adds regression tests for both files covering the iteration-crash path, the get_final_response crash path, and the nothing-collected case. Verified in production: under a degraded codex backend returning null output on every call, the gateway recovered 9 consecutive crashes (4-8 output items each) with zero raw errors surfaced, instead of dying on every turn. Upstream: NousResearch#11179 (PR NousResearch#11182). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

teknium1 · 2026-05-27T02:39:18Z

Superseded by #32963 (cherry-picked from @carltonawong's PR #32890, which ported your fix shape onto current main and added iterator-time regression coverage for the failure shape that broke the Codex backend today). Thanks for the original fix @richard950825-sys — your work is what the salvage was based on. Closes #11179.

richard950825-sys added 4 commits April 17, 2026 02:28

fix(agent): recover Responses streams with null output

c2fad73

Checkpoint session search timeout handling

cd4f2fa

Revert "Checkpoint session search timeout handling"

c124403

This reverts commit cd4f2fa.

Merge remote-tracking branch 'origin/main' into fix/responses-null-ou…

e4cdf2e

…tput-backfill

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 25, 2026

alt-glitch mentioned this pull request May 27, 2026

Fix Codex stream None output recovery #32883

Closed

carltonawong mentioned this pull request May 27, 2026

fix(agent): recover Codex Responses streams with null output #32890

Closed

bananawalnut mentioned this pull request May 27, 2026

[Bug]: Codex Responses stream aborts on terminal null output #32894

Closed

bender-the-bot1k mentioned this pull request May 27, 2026

fix: recover Codex streams with null output #32910

Closed

19 tasks

clarkchen mentioned this pull request May 27, 2026

fix(agent): recover Responses streams with null output #32913

Closed

This was referenced May 27, 2026

fix(codex): recover from null terminal stream output #32888

Closed

[Bug]: Responses stream crashes when terminal response.output is null #11179

Closed

QuarkAssistant mentioned this pull request May 27, 2026

[Bug]: Error: 'NoneType' object is not iterable #32892

Closed

1 task

teknium1 closed this May 27, 2026

315zuicai mentioned this pull request May 27, 2026

test(codex): cover null Responses stream output recovery #33003

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): recover Responses streams with null output#11182

fix(agent): recover Responses streams with null output#11182
richard950825-sys wants to merge 4 commits into
NousResearch:mainfrom
richard950825-sys:fix/responses-null-output-backfill

richard950825-sys commented Apr 16, 2026

Uh oh!

JeremyDev87 commented May 27, 2026

Uh oh!

teknium1 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

richard950825-sys commented Apr 16, 2026

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Checklist

Code

Documentation & Housekeeping

For New Skills

Screenshots / Logs

Uh oh!

JeremyDev87 commented May 27, 2026

Review — independent read of current head

High (please verify) — main stream path now bypasses get_final_response() for all providers

Medium — the get_final_response() TypeError-recovery branch isn't covered by the new tests

Low

Nice

Uh oh!

teknium1 commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

High (please verify) — main stream path now bypasses `get_final_response()` for all providers

Medium — the `get_final_response()` TypeError-recovery branch isn't covered by the new tests