fix(auxiliary): universal retry when any provider rejects temperature #15627
Merged
Universal reactive fix for 'HTTP 400: Unsupported parameter: temperature' across all providers/models, not just Codex Responses.

The same backend can accept temperature for some models and reject it for others (e.g. gpt-5.4 accepts but gpt-5.5 rejects on the same OpenAI endpoint; similar patterns on Copilot, OpenRouter reasoning routes, and Anthropic Opus 4.7+ via OAI-compat). An allow/deny-list by model name does not scale.

call_llm / async_call_llm now detect the concrete 'unsupported parameter: temperature' 400 and transparently retry once without temperature. Kimi's server-managed omission and Opus 4.7+'s proactive strip stay in place; this is the safety net for everything else.

Changes:
- agent/auxiliary_client.py: add _is_unsupported_temperature_error helper; wire into both sync and async call_llm paths before the existing max_tokens/payment/auth retry ladder
- tests/agent/test_unsupported_temperature_retry.py: 19 tests covering detector phrasings, sync + async retry, no-retry-without-temperature, and non-temperature 400s not triggering the retry

Builds on PR #15620 (codex_responses fallback) which stripped temperature up front for that one api_mode. This PR closes the gap for every other provider/model combo via reactive retry.

Credit: retry approach and detector originate from @BlueBirdBack's PR #15578.

Co-authored-by: BlueBirdBack <BlueBirdBack@users.noreply.github.com>
Summary
Auxiliary LLM calls now survive `HTTP 400: Unsupported parameter: temperature` from any provider, not just Codex Responses. Follow-up to #15620, which stripped temperature up front for `codex_responses` only; this PR closes the gap for every other provider/model combo via a reactive retry.
Why this isn't a provider-by-provider allowlist: the same backend can accept temperature for some models and reject it for others (gpt-5.4 accepts / gpt-5.5 rejects on the same OpenAI endpoint; similar patterns on Copilot reasoning models, OpenRouter reasoning routes, and Anthropic Opus 4.7+ via OAI-compat). React to the concrete error rather than maintain a list.
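A minimal sketch of the reactive approach described above, for illustration only. It is not the code in `agent/auxiliary_client.py`: the names `FakeBadRequest`, `looks_like_unsupported_temperature`, `call_with_temperature_retry`, and `fake_send` are hypothetical, and the detector here matches a single phrasing, whereas the PR's helper is described as covering several provider phrasings.

```python
class FakeBadRequest(Exception):
    """Hypothetical stand-in for a provider SDK's HTTP 400 exception."""

    def __init__(self, message, status_code=400):
        super().__init__(message)
        self.status_code = status_code


def looks_like_unsupported_temperature(exc):
    """Return True when exc looks like a 400 rejecting the temperature parameter.

    Assumption: the exception exposes `status_code` and a readable message.
    """
    if getattr(exc, "status_code", None) != 400:
        return False
    text = str(exc).lower()
    return "unsupported parameter" in text and "temperature" in text


def call_with_temperature_retry(send, **kwargs):
    """Call send(**kwargs); on the marker error, retry once without temperature."""
    try:
        return send(**kwargs)
    except Exception as exc:
        # Only retry when temperature was actually sent and the error names it.
        if "temperature" not in kwargs or not looks_like_unsupported_temperature(exc):
            raise
        retry_kwargs = {k: v for k, v in kwargs.items() if k != "temperature"}
        return send(**retry_kwargs)


def fake_send(**kwargs):
    """Hypothetical backend that rejects temperature, mimicking the reported 400."""
    if "temperature" in kwargs:
        raise FakeBadRequest("HTTP 400: Unsupported parameter: temperature")
    return {"ok": True, "sent": sorted(kwargs)}


result = call_with_temperature_retry(fake_send, model="example-model", temperature=0.2)
```

In this sketch an unrelated 400, or a call that never sent `temperature`, is re-raised unchanged, which mirrors the "no silent strip" behaviour the tests in this PR are said to cover.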
Changes
- `agent/auxiliary_client.py`: add `_is_unsupported_temperature_error` helper; retry once without `temperature` when `call_llm` / `async_call_llm` see the marker error, before the existing max_tokens / payment / auth retry ladder.
- `tests/agent/test_unsupported_temperature_retry.py`: 19 tests covering detector phrasings across providers, sync + async retry, no retry when `temperature` wasn't sent, and unrelated 400s not triggering a silent strip.
- `scripts/release.py`: AUTHOR_MAP entry for ash@users.noreply.github.com → ash.

Validation
- `tests/agent/test_unsupported_temperature_retry.py`
- `tests/agent/test_auxiliary_client.py` + `tests/run_agent/test_flush_memories_codex.py`

Credit
Retry approach and detector originate from @BlueBirdBack's PR #15578 (authored by @ash). Commit authorship preserved; #15578 will be closed as superseded.