fix(codex): surface actionable hint when stale-call detector fires on known silent-reject pattern #32016
Merged
… known silent-reject pattern

The ChatGPT Codex backend (chatgpt.com/backend-api/codex) has historically silently dropped certain model requests: the connection is accepted but no stream events are emitted and no error is raised. PR #31967 lowered the implicit stale-call default from 300s to 90s so fallbacks kick in faster, but users still see an opaque "No response from provider for 90s (non-streaming, ...)" message that gives no path forward.

This patch adds a narrow heuristic (gpt-5.5 family on the Codex backend via codex_responses api_mode) that replaces the generic timeout message with actionable text naming the gpt-5.4-codex workaround and pointing at #21444 for symptom history.

Changes:
- run_agent.py: new ``AIAgent._codex_silent_hang_hint(model=...)`` method. Returns ``None`` for any request that does not match all three guards (codex_responses api_mode, openai-codex provider or chatgpt.com Codex base URL, gpt-5.5-family model name with word-boundary regex anchoring to avoid false positives on e.g. ``gpt-5.50``).
- agent/chat_completion_helpers.py: the non-stream stale-call site consults the hint via ``getattr(...)`` so the call site stays robust if the helper is ever removed or stubbed in tests. The hint is appended to both the ``_emit_status`` warning and the ``TimeoutError`` message so the user sees it in their terminal AND it lands in any retry-loop diagnostics.
- tests/run_agent/test_codex_silent_hang_hint.py: 10 regression tests covering positive cases (bare gpt-5.5, vendor-prefixed openai/gpt-5.5, gpt-5.5-codex SKU, model=None fallback to self.model) and negative cases (gpt-5.4-codex workaround, gpt-5.50 false-positive guard, non-codex api_mode, non-codex provider, empty/None model, unrelated models on Codex).

Does NOT fix the backend-side issue (that's an upstream OpenAI/ChatGPT problem we cannot patch from here). Only converts an opaque timeout into text that names the workaround so users do not have to dig through logs or wait for a forum post to learn what to do.

Closes #22046
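The three guards described in the commit message can be sketched roughly as follows. This is an illustration only, not the code merged in `run_agent.py`: the `AgentSketch` class, the exact regex, the base-URL substring check, and the hint wording are all assumptions made for the example.

```python
import re
from typing import Optional

# Assumed pattern: word boundaries keep "gpt-5.50" from matching "gpt-5.5".
_GPT_5_5_FAMILY = re.compile(r"\bgpt-5\.5\b", re.IGNORECASE)

# Assumed check: a substring match on the Codex backend path.
CODEX_BASE_URL_FRAGMENT = "chatgpt.com/backend-api/codex"


class AgentSketch:
    """Hypothetical stand-in for AIAgent; holds only the fields the guards read."""

    def __init__(self, api_mode: str, provider: str, base_url: str, model: str) -> None:
        self.api_mode = api_mode
        self.provider = provider
        self.base_url = base_url
        self.model = model

    def _codex_silent_hang_hint(self, model: Optional[str] = None) -> Optional[str]:
        # Guard 1: only the codex_responses api_mode qualifies.
        if self.api_mode != "codex_responses":
            return None
        # Guard 2: openai-codex provider OR the ChatGPT Codex base URL.
        on_codex = (
            self.provider == "openai-codex"
            or CODEX_BASE_URL_FRAGMENT in (self.base_url or "")
        )
        if not on_codex:
            return None
        # Guard 3: gpt-5.5 family; model=None falls back to self.model.
        name = self.model if model is None else model
        if not name or not _GPT_5_5_FAMILY.search(name):
            return None
        # Illustrative wording; the merged hint text may differ.
        return (
            "Known Codex silent-reject pattern for the gpt-5.5 family. "
            "Consider switching to gpt-5.4-codex; see #21444 for symptom history."
        )
```

Returning `None` on any failed guard keeps the helper safe to call unconditionally: a caller only has to test whether a hint came back.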
Contributor
🔎 Lint report:
| Rule | Count |
|---|---|
| invalid-argument-type | 10 |
| unresolved-attribute | 3 |
| unresolved-import | 1 |
First entries
run_agent.py:955: [unresolved-attribute] unresolved-attribute: Object of type `Self@_codex_silent_hang_hint` has no attribute `provider`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `dict[str, Any]`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float | None`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[str]`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `str`, found `str | bool`
run_agent.py:952: [unresolved-attribute] unresolved-attribute: Object of type `Self@_codex_silent_hang_hint` has no attribute `api_mode`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `IterationBudget`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `bool`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:13: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
run_agent.py:963: [unresolved-attribute] unresolved-attribute: Object of type `Self@_codex_silent_hang_hint` has no attribute `model`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[str] | None`, found `str | bool`
tests/run_agent/test_codex_silent_hang_hint.py:29: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[dict[str, Any]]`, found `str | bool`
✅ Fixed issues: none
Unchanged: 4909 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
robintong pushed a commit to robintong/hermes-agent that referenced this pull request May 27, 2026
Replaying a batch of prior codex_reasoning_items into the Responses input hangs the chatgpt.com/backend-api/codex backend on a cold prompt cache: the SSE stream emits no first event and the call dies at the stale timeout (issue NousResearch#21444 / NousResearch#11179 family).

Isolated 2026-05-27 on a 198-message resume: full reasoning replay -> 80s+ hang; drop reasoning -> ~2s; summary-only (encrypted_content stripped, items kept) still hangs -> so it is the reasoning items in the input, not just the encrypted blob. Upstream only mitigated (PR NousResearch#31967 fails faster; PR NousResearch#32016 hints at gpt-5.4-codex, unavailable on a ChatGPT account).

Gate the codex reasoning replay behind _codex_replay_reasoning_enabled() (default off; HERMES_CODEX_REPLAY_REASONING=1 restores it). Mirrors the existing is_xai_responses strip. The model re-reasons from visible history each turn; summaries remain in the saved session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
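An environment-variable gate of the kind this commit describes might look like the sketch below. Only the variable name `HERMES_CODEX_REPLAY_REASONING` and the "default off, `1` restores it" behavior come from the commit message; the function names, the `environ` parameter, and the assumption that replayed reasoning items are dicts with `"type": "reasoning"` are hypothetical.

```python
import os
from typing import Any, Dict, List, Mapping, Optional


def codex_replay_reasoning_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the opt-in variable is set to "1" (default: off)."""
    env = os.environ if environ is None else environ
    return env.get("HERMES_CODEX_REPLAY_REASONING", "").strip() == "1"


def filter_replay_items(
    items: List[Dict[str, Any]],
    environ: Optional[Mapping[str, str]] = None,
) -> List[Dict[str, Any]]:
    """Drop reasoning items from the replayed input unless the gate is on.

    Assumes each input item is a dict whose "type" field is "reasoning"
    for reasoning items; other items pass through unchanged.
    """
    if codex_replay_reasoning_enabled(environ):
        return list(items)
    return [item for item in items if item.get("type") != "reasoning"]
```

Passing `environ` explicitly is a choice made for the sketch so the gate can be exercised without mutating the process environment.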
Summary
When the stale-call detector fires on a known Codex silent-reject pattern (gpt-5.5 family on chatgpt.com/backend-api/codex), the user now sees actionable text instead of "Aborting call." The text names the gpt-5.4-codex workaround and points at #21444 for symptom history.

Companion to PR #31967 (which lowered the implicit stale-call default from 300s to 90s). Together: fallbacks kick in faster, and when they do, the user knows what to do about it.
Closes #22046.
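How a stale-call site might fold the hint into its message can be sketched as below. This is not the code in `agent/chat_completion_helpers.py`: the function names, the message format, and the `emit_status` callback are assumptions; only the `getattr(...)` lookup and the idea of appending the hint to both the status warning and the `TimeoutError` come from the PR text.

```python
from typing import Callable, Optional


def build_stale_message(agent: object, elapsed_s: float, model: Optional[str]) -> str:
    """Compose the timeout text, appending the hint when the agent offers one."""
    base = f"No response from provider for {int(elapsed_s)}s (non-streaming)."
    # getattr keeps this working if the helper is removed or stubbed out.
    hint_fn = getattr(agent, "_codex_silent_hang_hint", None)
    hint = hint_fn(model=model) if callable(hint_fn) else None
    return f"{base} {hint}" if hint else base


def abort_stale_call(
    agent: object,
    elapsed_s: float,
    model: Optional[str],
    emit_status: Callable[[str], None] = print,
) -> None:
    """Send the same text to the terminal warning and to the raised error."""
    message = build_stale_message(agent, elapsed_s, model)
    emit_status(f"warning: {message}")
    raise TimeoutError(message)
```

Building the message once and reusing it means the terminal warning and any retry-loop diagnostics that capture the exception carry identical text.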
Salvage notes
PR #22046 (@Tranquil-Flow) was based on a branch ~2 weeks stale against main, so a direct cherry-pick would have reverted unrelated fixes (#29507 stranger-thread close protection, natural-ending emoji/caret, xAI disambiguator #29344, redact-sensitive-text import). Only the substantive contribution, the `_codex_silent_hang_hint` helper plus the call-site hook, was salvaged onto current main. Authorship is preserved on the commit.

Wording was updated: the original PR linked openai/codex#19654, which was closed on May 1, 2026. The new hint text instead points at hermes-agent#21444 for symptom history and recommends the gpt-5.4-codex workaround generally rather than referencing a now-closed upstream issue.

Changes
- `run_agent.py`: new `AIAgent._codex_silent_hang_hint(model=...)` method. Returns `None` unless all three guards match:
  - `api_mode == "codex_responses"`
  - provider is `openai-codex` OR base URL is `chatgpt.com/backend-api/codex`
  - `gpt-5.5` family via word-boundary regex (guards against a false positive on `gpt-5.50`)
- `agent/chat_completion_helpers.py`: the non-stream stale-call site consults the hint via `getattr(...)` for robustness. The hint is appended to both `_emit_status` (terminal warning) and the `TimeoutError` message (retry-loop diagnostics).
- `tests/run_agent/test_codex_silent_hang_hint.py`: 10 regression tests covering 4 positive + 6 negative cases.

Validation
Infographic