Description:
When the google-gemini-cli provider is used with a configured fallback chain (e.g., gemini-3-pro -> gemini-2.5-flash), a 429 Quota Exhausted error on the primary model causes the provider to mark the entire Auth Profile as "cooldown/unavailable."
This prevents the fallback models from running, even though they likely have independent quotas or are cheaper/unlimited. The agent fails immediately with "No available auth profile" for every remaining model in the chain.
Expected Behavior:
A 429 error on gemini-3-pro should only mark gemini-3-pro as unavailable. The provider should proceed to try gemini-2.5-flash using the same Auth Profile (since quotas are often model-specific).
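To make the expected behavior concrete, here is a minimal sketch of cooldown tracking keyed by (auth profile, model) instead of by auth profile alone. It is not taken from the provider's code: `CooldownTracker`, `pickAvailableModel`, the `oauth-default` profile ID, and the 60-second cooldown are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical sketch: these names do not come from the provider's codebase.
// Cooldowns are stored per (profile, model) pair, so a 429 on one model
// leaves the other models on the same auth profile usable.
class CooldownTracker {
  // Maps "profileId::model" to the timestamp (ms) at which the cooldown ends.
  private readonly cooldownUntil = new Map<string, number>();

  private static key(profileId: string, model: string): string {
    return `${profileId}::${model}`;
  }

  // Record a rate-limit hit for one model only.
  markRateLimited(
    profileId: string,
    model: string,
    cooldownMs: number,
    now: number = Date.now(),
  ): void {
    this.cooldownUntil.set(CooldownTracker.key(profileId, model), now + cooldownMs);
  }

  // A pair is available if it was never rate limited or its cooldown has expired.
  isAvailable(profileId: string, model: string, now: number = Date.now()): boolean {
    const expiry = this.cooldownUntil.get(CooldownTracker.key(profileId, model));
    return expiry === undefined || now >= expiry;
  }
}

// Walk the chain in order and return the first model not in cooldown.
function pickAvailableModel(
  tracker: CooldownTracker,
  profileId: string,
  chain: readonly string[],
  now: number = Date.now(),
): string | undefined {
  return chain.find((model) => tracker.isAvailable(profileId, model, now));
}

// Usage: the primary model returns 429 at t=0 and gets an assumed 60 s cooldown.
const tracker = new CooldownTracker();
const chain = ["gemini-3-pro-preview", "gemini-2.5-pro", "gemini-2.5-flash"];
tracker.markRateLimited("oauth-default", "gemini-3-pro-preview", 60_000, 0);

// At t=1 s the primary is skipped and the first fallback is selected.
const next = pickAvailableModel(tracker, "oauth-default", chain, 1_000);
// next === "gemini-2.5-pro"
```

With this keying, the single OAuth profile stays eligible for gemini-2.5-pro and gemini-2.5-flash while gemini-3-pro-preview cools down. A profile-wide cooldown would still make sense for errors that really are profile-scoped, such as an expired or revoked token.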
Actual Behavior:
The first model hits 429. Every subsequent model in the chain is then rejected without being tried, and the system logs:
google-gemini-cli/gemini-2.5-pro: No available auth profile for google-gemini-cli (all in cooldown or unavailable).
Result: Total agent failure despite valid fallbacks.
Logs:
google-gemini-cli/gemini-3-pro-preview: Cloud Code Assist API error (429): You have exhausted your capacity on this model. (rate_limit)
| google-gemini-cli/gemini-2.5-pro: No available auth profile for google-gemini-cli (all in cooldown or unavailable). (rate_limit)
| google-gemini-cli/gemini-2.5-flash: No available auth profile for google-gemini-cli (all in cooldown or unavailable). (rate_limit).
Environment:
• Provider: google-gemini-cli
• Auth Mode: OAuth (Single Profile)
• Agent Config: Primary + Fallbacks list