[Bug]: Codex Responses stream completes with empty output after tool-call events, forcing fallback #5732

@SoWizard

Bug Description

Hermes intermittently treats the primary openai-codex response as empty or malformed and switches to the fallback model, even though the Codex stream reached response.completed.

This is reproducible on the gateway path, including Telegram and Discord. The key detail from the latest diagnostics is that the stream can emit tool-call events and then end with response.completed whose final response object has output=[].

Observed on April 7, 2026, after upgrading the Hermes runtime to openai==2.24.0 to fix the separate proxies initialization bug.

Steps to Reproduce

  1. Configure Hermes v0.7.0 with primary model gpt-5.4 via provider openai-codex and fallback model moonshotai/kimi-k2.5 via openrouter.
  2. Run Hermes gateway with Telegram or Discord enabled.
  3. Send ordinary user prompts through the gateway.
  4. Occasionally Hermes logs "Codex response.output is empty ..." and switches to the fallback model.
  5. In the fresh instrumented case below, the Codex stream emitted tool-call events and then response.completed with response_output_len: 0.

Expected Behavior

Hermes should receive a usable final response object from the Codex Responses stream, or be able to reconstruct the final response from streamed events when the provider or SDK returns an empty final output.

Actual Behavior

Hermes receives a final response with status=completed but response.output=[] and no recoverable streamed assistant text. Hermes then immediately falls back to the configured secondary model.

Affected Component

Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Discord

Operating System

macOS 13.6 (Darwin 22.6.0 x86_64)

Python Version

3.11.15

Hermes Version

Hermes Agent v0.7.0 (2026.4.3)

Relevant Logs / Traceback

2026-04-07 13:32:16,510 INFO gateway.run: inbound message: platform=telegram user=redacted chat=redacted msg='ok what about now'
2026-04-07 13:32:31,338 WARNING root: Codex response.output is empty (status=completed, incomplete_details=None, streamed_chars=0, output_text='', model=gpt-5.4). api_mode=codex_responses provider=openai-codex
2026-04-07 13:32:31,339 WARNING root: Recent Codex stream events before empty output: {'type': 'response.function_call_arguments.delta', ...} | {'type': 'response.function_call_arguments.done', ...} | {'type': 'response.output_item.done', 'output_index': 1, 'item_type': 'function_call', 'item_status': 'completed'} | {'type': 'response.completed', 'response_status': 'completed', 'response_output_len': 0}
2026-04-07 13:32:31,866 INFO root: Fallback activated: gpt-5.4 -> moonshotai/kimi-k2.5 (openrouter)
2026-04-07 13:32:50,003 INFO gateway.run: response ready: platform=telegram chat=redacted time=33.5s api_calls=1 response=95 chars

Root Cause Analysis (optional)

Hermes currently relies on the final object returned from active_client.responses.stream(...).get_final_response() in run_agent.py.

The diagnostics suggest a failure mode where the stream emits function-call events and then terminates with:

  • response.output_item.done for a completed function_call
  • response.completed
  • final response object with output=[]

So the immediate Hermes-visible bug is that a completed Codex stream can arrive with no usable final output, which causes Hermes to classify the response as malformed and fall back.
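
The signature described above can be checked mechanically. The following is a minimal sketch, not Hermes code: the function name is hypothetical, and it assumes events are available as plain dicts in the summarized shape shown in the diagnostic log line (keys `type`, `item_type`, `item_status`).

```python
from typing import Any


def is_empty_tool_call_completion(
    final_status: str,
    final_output: list[Any],
    recent_events: list[dict[str, Any]],
) -> bool:
    """Return True when a completed stream ended with output=[] after a
    completed function_call item was streamed.

    Assumption: events use the summarized dict shape from the log above,
    not the SDK's typed event objects.
    """
    # A non-empty final output, or a non-completed status, is a different case.
    if final_status != "completed" or final_output:
        return False
    # The stream must actually have terminated with response.completed.
    if not recent_events or recent_events[-1].get("type") != "response.completed":
        return False
    # At least one function_call item must have finished before completion.
    return any(
        e.get("type") == "response.output_item.done"
        and e.get("item_type") == "function_call"
        and e.get("item_status") == "completed"
        for e in recent_events
    )
```

A check like this would let the empty-output warning distinguish the tool-call-only case from a stream that produced nothing at all.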

This may be upstream in the OpenAI/Codex Responses streaming implementation, but Hermes likely needs a more defensive recovery path for tool-call-only or partially materialized final responses.

Proposed Fix (optional)

  • Reconstruct a synthetic final response from streamed tool-call events when response.output comes back empty.
  • Preserve enough streamed event state to rebuild response.output items, not just streamed assistant text deltas.
  • Treat response.completed plus prior response.output_item.done events as recoverable rather than immediately malformed.
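
A rough sketch of the recovery idea in the list above, under stated assumptions: `OutputItemCollector` is a hypothetical helper, not an existing Hermes or OpenAI SDK class, and events are modeled as plain dicts carrying `type`, `output_index`, and `item`. A real integration would have to adapt this to the SDK's typed event objects.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class OutputItemCollector:
    """Remembers completed output items seen on the stream, keyed by output_index."""

    items: dict[int, Any] = field(default_factory=dict)

    def observe(self, event: dict[str, Any]) -> None:
        # Only response.output_item.done is treated as a finished item;
        # deltas and other event types are ignored here.
        if event.get("type") == "response.output_item.done":
            self.items[event["output_index"]] = event["item"]

    def recovered_output(self, final_output: list[Any]) -> list[Any]:
        # Prefer the provider's final output whenever it is non-empty.
        if final_output:
            return final_output
        # Otherwise rebuild the output list from streamed items, in index order.
        return [self.items[i] for i in sorted(self.items)]
```

Usage would be to call `observe` for every streamed event, then pass the final response's `output` through `recovered_output` before deciding whether the response is malformed. If the result is still empty, falling back remains the right behavior.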

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Labels

  • sweeper:implemented-on-main (Sweeper: behavior already present on current main)
  • type/bug (Something isn't working)
