fix(core): distinguish fallback chains and fix maxAttempts for auto vs explicit model selection #26163
Hi @adamfweidman, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this. We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines.
Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed. Thank you for your understanding and for being a part of our community!
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses critical discrepancies in model routing and retry behavior. By decoupling auto-routing logic from explicit model selections and enabling dynamic resolution of retry limits, the changes ensure that models maintain their intended attempt thresholds even when falling back mid-turn. This improves the reliability of both automated and user-specified model interactions.
Size Change: +2.99 kB (+0.01%) Total Size: 33.9 MB
Code Review
This pull request implements enhanced retry logic for auto-selected models, introducing a configurable maximum attempt count (defaulting to 3) and 'sticky' retry behavior for transient failures. The changes span the model availability service, policy configurations, and the core retry utility, which now dynamically retrieves the maximum allowed attempts from the active policy context. Additionally, error classification is updated to support name-based matching for cross-realm errors, and environment stubbing is added to tests for consistency. Review feedback highlights an opportunity to reduce redundant logic in policyHelpers.ts and suggests a performance optimization in the retry loop to avoid repeated expensive policy resolutions.
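The dynamic lookup mentioned in this review can be illustrated with a minimal sketch. It is not the code from this PR: `Policy`, `retryWithDynamicLimit`, `getActivePolicy`, and `tryFallback` are hypothetical names, backoff delays are omitted so the loop stays synchronous, and the limits (3 for Pro in the auto chain, 10 for Flash) are taken from the PR description. The point is only that the attempt limit is read on every iteration, so a model that becomes active mid-turn gets its own budget.

```typescript
// Hypothetical types and names for illustration; not the gemini-cli implementation.
type Policy = { model: string; maxAttempts: number };

function retryWithDynamicLimit<T>(
  call: (model: string) => T,
  getActivePolicy: () => Policy,
  tryFallback: () => boolean,
): T {
  let attemptsOnModel = 0;
  for (;;) {
    // Re-read the policy on every iteration so that a mid-turn fallback
    // picks up the new model's own limit instead of the original one.
    const policy = getActivePolicy();
    attemptsOnModel++;
    try {
      return call(policy.model);
    } catch (err) {
      if (attemptsOnModel < policy.maxAttempts) continue; // backoff delay omitted
      if (!tryFallback()) throw err; // nothing left to fall back to
      attemptsOnModel = 0; // fresh budget for the fallback model
    }
  }
}

// Demo: an assumed two-step chain with the limits quoted in the PR description.
const pro: Policy = { model: "pro", maxAttempts: 3 };
const flash: Policy = { model: "flash", maxAttempts: 10 };
let activePolicy: Policy = pro;
const callLog: string[] = [];

const result = retryWithDynamicLimit(
  (model) => {
    callLog.push(model);
    const flashCalls = callLog.filter((m) => m === "flash").length;
    // Simulated backend: "pro" always fails, "flash" succeeds on its 5th call.
    if (model === "flash" && flashCalls === 5) return `ok from ${model}`;
    throw new Error("429");
  },
  () => activePolicy,
  () => {
    if (activePolicy === flash) return false;
    activePolicy = flash;
    return true;
  },
);

console.log(result, callLog.length);
```

In this demo the call succeeds on Flash's fifth attempt. Had the limit been captured once before the loop, Flash would have been cut off after three attempts, which is the inheritance problem the PR describes. The review's performance note would apply here too if `getActivePolicy` were expensive to evaluate.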
Summary
This PR addresses discrepancies between explicit model selections and auto-routing, specifically targeting the `maxAttempts` fall-through bug and ensuring consistent behavior for fallback chains.
Details
Previously, the routing logic forced any explicit selection of a Gemini 3 model into the auto-routing chains, limiting it to 3 attempts and silently downgrading to Flash on failure. Furthermore, a bug existed in `geminiChat.ts` where `maxAttempts` was resolved statically at the start of a turn, causing a model like Flash (which should have 10 attempts) to inherit Pro's 3-attempt limit if it fell back mid-turn.
This PR introduces the following changes:
- `policyCatalog.ts` and `defaultModelConfigs.ts`: cleanly separate auto-routing logic from explicit selections using an `isAutoSelection` flag.
- `auto-preview` and `auto-default` chains for dynamic model routing.
- `AUTO_ROUTING_OVERRIDES` constant in `policyCatalog.ts`.
- `policyHelpers.ts`: the condition `preferredModel && isAutoModel(preferredModel, config)` evaluates the auto selection logic cleanly.
- `geminiChat.ts`: dynamically resolve `currentMaxAttempts` on each iteration of `retryWithBackoff` using the existing `getAvailabilityContext` callback, fixing the fallback limit inheritance bug.
Related Issues
How to Validate
- `gemini-cli-core` tests (`npm test -w @google/gemini-cli-core -- src/core/geminiChat.test.ts src/availability/policyHelpers.test.ts src/availability/policyCatalog.test.ts src/utils/retry.test.ts`): ensure strict parity between dynamic and legacy chains holds, and that retries evaluate dynamically.
- Explicit model (`--model gemini-3-pro-preview`): mock a persistent 429 error to verify it retries 10 times and prompts the user.
- `--model auto`: mock a persistent 429 error to verify it silently falls back to Flash after 3 attempts.
Pre-Merge Checklist