fix(codex): treat reasoning-only responses as incomplete, not stop#2070
Merged
Conversation
added 2 commits
March 19, 2026 10:00
When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages
…I input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands #11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint.
angelburgosrosado
pushed a commit
to angelburgosrosado/hermes-agent
that referenced
this pull request
Apr 27, 2026
…arch#2070) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…arch#2070) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>
olympus-terminal
pushed a commit
to olympus-terminal/hermes-agent
that referenced
this pull request
May 16, 2026
…arch#2070) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…arch#2070) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
…arch#2070) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands NousResearch#11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the bug Nester reported in Discord where Codex responses containing only reasoning/thinking blocks (no visible content) would trigger the empty-content retry loop, burning 3 retries and failing with
Max retries (3) for empty content exceeded.Root Cause
_normalize_codex_response()was settingfinish_reason='stop'for responses that contained only reasoning items (encrypted thinking state) with no message text. This is incorrect — the model is still thinking and needs another turn.Changes (2 commits)
Commit 1: Core fix
run_agent.py— two fixes:_normalize_codex_response: Added a new branch — whenreasoning_items_rawis non-empty butfinal_textis empty (and no tool calls), setfinish_reason='incomplete'instead of'stop'. This routes the response to the Codex continuation path.codex_reasoning_itemswhen deciding whether to preserve an interim message.Commit 2: Replay path hardening (found via research)
After researching how OpenCode, Clawdbot/KiloCode, and OpenHands handle reasoning-only Responses API responses, found 2 additional bugs:
_chat_messages_to_responses_input: Reasoning-only interim messages were converted to API input with the reasoning item as the LAST item — no following item. The Responses API requires a following item after each reasoning item (missing_following_itemerror, as OpenHands discovered in their feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro #11406). Now emits an empty assistant message as the required following item.Duplicate detection: Two consecutive reasoning-only incomplete messages with different
codex_reasoning_itemsbut identical empty content/reasoning were treated as duplicates, silently dropping the second response's encrypted state. Fixed by includingcodex_reasoning_itemsin the comparison.Comparison with other agents:
downgradeOpenAIReasoningBlocks). Defensive but loses reasoning continuity.missing_following_itemandinvalid_encrypted_contenterrors. Their fix: treat reasoning items as output-only artifacts.Tests (8 new)