fix: backfill codex stream output from output_item.done events #5689
Merged
Salvages the core fix from PR #5673 (egerev) onto current main.

The chatgpt.com/backend-api/codex endpoint streams valid output items via response.output_item.done events, but the OpenAI SDK's get_final_response() returns an empty output list. This caused every Codex response to be rejected as invalid.

Fix: collect output_item.done events during streaming and backfill response.output when get_final_response() returns empty. Falls back to synthesizing from text deltas when no done events were received.

Also moves the synthesis logic from the validation loop (too late, from #5681) into _run_codex_stream() (before the response leaves the streaming function), and simplifies the validation to just log diagnostics since recovery now happens upstream.

Co-authored-by: Egor <egerev@users.noreply.github.com>
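The backfill described above can be illustrated with a minimal sketch. It is not the code from this PR: the function name backfill_output, the use of SimpleNamespace stand-ins for SDK objects, the response.output_text.delta event name, and the shape of the synthesized message are all assumptions made for illustration. Only the response.output_item.done event name and the "backfill when output is empty, else fall back to text deltas" order come from the description.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def backfill_output(events: Iterable[Any], final_response: Any) -> Any:
    """Fill final_response.output from streamed events when it is empty.

    Hypothetical helper: `events` stands in for the stream events and
    `final_response` for whatever get_final_response() returned.
    """
    done_items: List[Any] = []
    text_parts: List[str] = []

    for event in events:
        event_type = getattr(event, "type", "")
        if event_type == "response.output_item.done":
            # Completed output item; keep it for a possible backfill.
            item = getattr(event, "item", None)
            if item is not None:
                done_items.append(item)
        elif event_type == "response.output_text.delta":
            # Assumed delta event name; used only for the fallback path.
            text_parts.append(getattr(event, "delta", "") or "")

    if getattr(final_response, "output", None):
        # Output is already populated, so nothing needs repairing.
        return final_response

    if done_items:
        final_response.output = done_items
    elif text_parts:
        # Fallback: synthesize one assistant message from the text deltas.
        # The message shape here is an assumption, not the real SDK type.
        final_response.output = [
            SimpleNamespace(
                type="message",
                role="assistant",
                content=[
                    SimpleNamespace(type="output_text", text="".join(text_parts))
                ],
            )
        ]
    return final_response
```

Doing this repair inside the streaming function, as the description says _run_codex_stream() now does, means later validation only ever sees a populated response and can limit itself to logging.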
This was referenced Apr 7, 2026
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request on Apr 12, 2026
This was referenced Apr 27, 2026
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request on Apr 27, 2026
atlas243 pushed a commit to atlas243/hermes-agent that referenced this pull request on Apr 28, 2026
…tomizations

Brings in 30+ commits of upstream Hermes changes (including the codex output[] backfill fix from NousResearch#5689 / commit 0e336b0) AND closes the loop on the branch-safe update flow that left this branch silently behind upstream for 11+ days.

Symptom that triggered this work: every gateway turn was failing with "Invalid API response (attempt 1/3): response.output is empty" on gpt-5.4 via openai-codex. The fix landed upstream 2026-04-06; without this merge it never reached the customizations branch because ``hermes update`` only updated origin/main and switched back to blaize-customizations without merging main into it.

== Conflict resolution highlights ==

run_agent.py: kept main's _touch_activity(desc) API + main's codex backfill in _run_codex_stream; preserved HEAD's _reasoning_deltas_fired reset and the public touch_activity() wrapper for delegate_tool / gateway/run.py callers (now delegates to _touch_activity for description sync). Guarded the cached-agent touch_activity() reset with hasattr() so test mocks don't break.

hermes_cli/config.py: bumped _config_version 18 → 19 and added HEAD's progress-aware-timeout migration as a new 18 → 19 step (idempotent via "if 'timeout' not in config" guard, so users on either v12 or v18 land in a correct state).

hermes_cli/main.py: kept HEAD's branch-safety guards (should_restore_original_branch, should_auto_restart_gateway) and swapped in main's improved multi-profile gateway restart logic (supports_systemd_services, find_gateway_pids, retry-on-die).

gateway/run.py: kept HEAD's per-channel overrides + two-threshold progress-aware timeout monitor (CLAUDE.md documents this as the intentional design); added main's _notify_long_running periodic "Still working" notifications and main's service_tier / request_overrides plumbing on cached agent reuse.

tools/cronjob_tools.py: restored both 'reason' (HEAD) and 'script' (main) schema entries that the auto-merger had collided. Restored both timeout_seconds (HEAD) and script (main) function args.

hermes_cli/commands.py: kept HEAD's priority_skills reordering for Telegram menus while taking main's _collect_gateway_skill_entries refactor (priority_skills now applied as a post-processing step on the helper's output). Kept both new CommandDefs (restart-gateway from HEAD, debug from main).

cron/scheduler.py: took main's inactivity-based timeout structure but restored HEAD's per-job timeout_seconds lookup (job.get("timeout_seconds")) so per-job overrides still work.

gateway/platforms/telegram.py: kept HEAD's _menu_config_mtime AND main's _model_picker_state, _approval_state, plus all of main's new helper methods.

== Update flow fix (prevents future drift) ==

hermes_cli/main.py cmd_update: after restoring the working tree to the customizations branch, run ``git merge --no-edit origin/main`` into the customizations branch so it actually catches up to main. On clean merge, log success and proceed. On conflict, ``git merge --abort`` so the working tree stays clean, surface the conflict to the user, and force should_auto_restart_gateway = False. Applies in both the up-to-date-already path and the new-commits-pulled path.

Adds two regression tests in TestUpdateMergesMainIntoCustomizations:
- test_clean_merge_runs_after_branch_restore: verifies the merge is invoked when on the customizations branch
- test_conflict_aborts_merge_and_blocks_auto_restart: verifies merge --abort runs on conflict and launchd_restart is skipped

== Test fixes ==

tests/hermes_cli/test_config.py: bumped expected config version 18 → 19
tests/tools/test_browser_camofox_state.py: same bump

== Pre-existing upstream test failures (unrelated to this merge) ==

Verified failing on clean origin/main:
- test_wsl_with_systemd: macOS lacks systemctl
- test_concurrent_inserts_settle_at_cap: ~70s slow concurrent test
- test_file_staleness::test_warning_when_file_modified_externally
- test_file_staleness::test_patch_warns_on_stale_file (macOS treats /var/folders as a sensitive system path)
- test_transcription::test_explicit_local_no_cloud_fallback
- test_transcription::test_local_nothing_available

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
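The update-flow fix in the commit message above (merge origin/main, abort on conflict, block the auto-restart) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in hermes_cli/main.py: the names run_git and merge_main_into_branch and the injectable runner are hypothetical, and only the two git commands and the should_auto_restart_gateway flag come from the commit message.

```python
import subprocess
from typing import Callable, List

# A runner takes git arguments and returns the exit code (hypothetical type).
Runner = Callable[[List[str]], int]


def run_git(args: List[str]) -> int:
    """Run a git command in the current directory and return its exit code."""
    return subprocess.run(["git", *args], check=False).returncode


def merge_main_into_branch(run: Runner = run_git) -> bool:
    """Merge origin/main into the checked-out branch.

    Returns True on a clean merge. On failure the merge is aborted so the
    working tree stays clean, and False is returned.
    """
    if run(["merge", "--no-edit", "origin/main"]) == 0:
        print("Merged origin/main into the current branch.")
        return True
    # The merge did not complete: undo the partial merge and tell the user.
    run(["merge", "--abort"])
    print("Merge from origin/main failed; resolve conflicts manually.")
    return False
```

A caller would then do something like `if not merge_main_into_branch(): should_auto_restart_gateway = False`, so a gateway is never restarted on top of a branch that failed to catch up. Passing the runner as a parameter is a design choice of this sketch that lets tests substitute a fake instead of invoking git.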
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request on May 14, 2026
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request on May 16, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request on Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request on Jun 10, 2026
Salvages the core fix from PR #5673 (egerev) onto current main.
Problem
The chatgpt.com/backend-api/codex endpoint streams valid output items via response.output_item.done events, but the OpenAI SDK's get_final_response() returns an empty output list. This caused every Codex response to be rejected as invalid with "response.output is empty".
Fix
- Collect response.output_item.done events during streaming
- After get_final_response(), backfill response.output from collected items when empty
- Move the synthesis logic into _run_codex_stream() (before the response leaves the streaming function)
Credit
Core approach from PR #5673 by @egerev. Closes #5673.
Test plan
python -m pytest tests/test_run_agent_codex_responses.py -n0 -q (33 passed)