Bug Description
Hermes intermittently treats the primary openai-codex response as empty or malformed and switches to the fallback model, even though the Codex stream reached response.completed.
This is reproducible on the gateway path, including Telegram and Discord. The key detail from the latest diagnostics is that the stream can emit tool-call events and then end with response.completed whose final response object has output=[].
Observed on April 7, 2026 after upgrading Hermes runtime to openai==2.24.0 to fix the separate proxies initialization bug.
Steps to Reproduce
- Configure Hermes v0.7.0 with primary model gpt-5.4 via provider openai-codex and fallback model moonshotai/kimi-k2.5 via openrouter.
- Run Hermes gateway with Telegram or Discord enabled.
- Send ordinary user prompts through the gateway.
- Occasionally Hermes logs "Codex response.output is empty ..." and switches to the fallback model.
- In the fresh instrumented case below, the Codex stream emitted tool-call events and then response.completed with response_output_len: 0.
Expected Behavior
Hermes should receive a usable final response object from the Codex Responses stream, or be able to reconstruct the final response from streamed events when the provider or SDK returns an empty final output.
Actual Behavior
Hermes receives a final response with status=completed but response.output=[] and no recoverable streamed assistant text. Hermes then immediately falls back to the configured secondary model.
Affected Component
Agent Core (conversation loop, context compression, memory)
Messaging Platform (if gateway-related)
Discord
Operating System
macOS 13.6 (Darwin 22.6.0 x86_64)
Python Version
3.11.15
Hermes Version
Hermes Agent v0.7.0 (2026.4.3)
Relevant Logs / Traceback
2026-04-07 13:32:16,510 INFO gateway.run: inbound message: platform=telegram user=redacted chat=redacted msg='ok what about now'
2026-04-07 13:32:31,338 WARNING root: Codex response.output is empty (status=completed, incomplete_details=None, streamed_chars=0, output_text='', model=gpt-5.4). api_mode=codex_responses provider=openai-codex
2026-04-07 13:32:31,339 WARNING root: Recent Codex stream events before empty output: {'type': 'response.function_call_arguments.delta', ...} | {'type': 'response.function_call_arguments.done', ...} | {'type': 'response.output_item.done', 'output_index': 1, 'item_type': 'function_call', 'item_status': 'completed'} | {'type': 'response.completed', 'response_status': 'completed', 'response_output_len': 0}
2026-04-07 13:32:31,866 INFO root: Fallback activated: gpt-5.4 -> moonshotai/kimi-k2.5 (openrouter)
2026-04-07 13:32:50,003 INFO gateway.run: response ready: platform=telegram chat=redacted time=33.5s api_calls=1 response=95 chars
Root Cause Analysis (optional)
Hermes currently relies on the final object returned from active_client.responses.stream(...).get_final_response() in run_agent.py.
The diagnostics suggest a failure mode where the stream emits function-call events and then terminates with:
- response.output_item.done for a completed function_call
- response.completed
- final response object with output=[]
So the immediate Hermes-visible bug is that a completed Codex stream can arrive with no usable final output, which causes Hermes to classify the response as malformed and fall back.
This may be upstream in the OpenAI/Codex Responses streaming implementation, but Hermes likely needs a more defensive recovery path for tool-call-only or partially materialized final responses.
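As a rough illustration of the failure signature described above, the check below decides whether an empty final output still looks recoverable. It is only a sketch: the event dictionaries copy the summarized shape from the diagnostic log line (type, item_status, and so on), and the function name is_recoverable_empty_output is hypothetical, not an existing Hermes or SDK API.

```python
from typing import Any, Mapping, Sequence


def is_recoverable_empty_output(
    final_output: Sequence[Any],
    recent_events: Sequence[Mapping[str, Any]],
) -> bool:
    """Return True when an empty final output still looks recoverable.

    Assumes events are dicts shaped like the summarized diagnostics in the
    log above; this is not the SDK's typed event model.
    """
    if final_output:
        return False  # The final response already has items; nothing to recover.
    if not recent_events or recent_events[-1].get("type") != "response.completed":
        return False  # The stream did not end with response.completed.
    # Recoverable if at least one output item finished before completion.
    return any(
        event.get("type") == "response.output_item.done"
        and event.get("item_status") == "completed"
        for event in recent_events[:-1]
    )


# Event sequence modeled on the logged case.
events = [
    {"type": "response.function_call_arguments.done"},
    {"type": "response.output_item.done", "output_index": 1,
     "item_type": "function_call", "item_status": "completed"},
    {"type": "response.completed", "response_status": "completed",
     "response_output_len": 0},
]
print(is_recoverable_empty_output([], events))
```

A check like this would run before the fallback decision, so that only responses with no completed output items at all are treated as malformed.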
Proposed Fix (optional)
- Reconstruct a synthetic final response from streamed tool-call events when response.output comes back empty.
- Preserve enough streamed event state to rebuild response.output items, not just streamed assistant text deltas.
- Treat response.completed plus prior response.output_item.done events as recoverable rather than immediately malformed.
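One possible shape for the recovery path proposed above is sketched here. It assumes each response.output_item.done event has been captured as a plain dict carrying the completed item under "item" and its position under "output_index"; the OpenAI SDK delivers typed event objects, so a real implementation would need to convert them first. The helper names (collect_done_items, recover_output) and the sample tool-call values are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Mapping, Sequence


def collect_done_items(events: Iterable[Mapping[str, Any]]) -> List[Dict[str, Any]]:
    """Gather completed output items from streamed events, ordered by index."""
    done: Dict[int, Dict[str, Any]] = {}
    for event in events:
        if event.get("type") != "response.output_item.done":
            continue
        item = event.get("item")
        if item is None:
            continue  # Nothing usable was captured for this event.
        # A later event for the same index replaces the earlier one.
        done[event.get("output_index", len(done))] = dict(item)
    return [done[index] for index in sorted(done)]


def recover_output(
    final_output: Sequence[Any],
    events: Iterable[Mapping[str, Any]],
) -> List[Any]:
    """Prefer the final output; rebuild it from events only when it is empty."""
    if final_output:
        return list(final_output)
    return collect_done_items(events)


# Hypothetical captured stream: one tool call, then an empty completion.
captured = [
    {"type": "response.function_call_arguments.done", "output_index": 1},
    {"type": "response.output_item.done", "output_index": 1,
     "item": {"type": "function_call", "name": "lookup",
              "arguments": "{}", "status": "completed"}},
    {"type": "response.completed"},
]
rebuilt = recover_output([], captured)
print(rebuilt)
```

If the rebuilt list is still empty, falling back to the secondary model remains the appropriate behavior.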
Are you willing to submit a PR for this?