fix(core): Improve API error retry logic#9763

Merged
SandyTao520 merged 1 commit into
mainfrom
st/fix-retry
Sep 25, 2025

@SandyTao520

Contributor

TLDR

This pull request refactors the API error handling and retry mechanism within the core chat functionality. Previously, we relied on string-matching error messages to decide whether to retry an API call. This update switches to using the structured ApiError from the GenAI SDK, checking the HTTP status code directly. This makes our retry logic more robust and reliable, ensuring we correctly retry on transient errors like rate limits (429) and server errors (5xx) while failing fast on client-side issues like bad requests (400).

Dive Deeper

The previous implementation of our retry logic was brittle because it parsed the text of an error message to find status codes. This approach can easily break if the upstream SDK changes its error message formatting.
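To illustrate the brittleness, here is a minimal sketch that contrasts the two approaches. The old implementation is not shown in this PR description, so `isRetryableByMessage`, its regular expression, and the sample messages are all hypothetical; `isRetryableByStatus` is likewise an illustrative name.

```typescript
// Hypothetical message-based check: looks for a retryable status code
// anywhere in the error text. Not the actual previous implementation.
function isRetryableByMessage(error: Error): boolean {
  return /\b(429|5\d{2})\b/.test(error.message);
}

// Status-based check: decides from the numeric code alone.
function isRetryableByStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

// Made-up messages showing how text matching can go wrong.
const rateLimited = new Error('got status: 429 Too Many Requests');
const reworded = new Error('Resource exhausted, please slow down'); // still a 429
const badRequest = new Error('Batch of 500 items is too large'); // really a 400

console.log(isRetryableByMessage(rateLimited)); // matches the "429" in the text
console.log(isRetryableByMessage(reworded)); // misses: no code in the text
console.log(isRetryableByMessage(badRequest)); // false positive on "500"
console.log(isRetryableByStatus(400), isRetryableByStatus(429));
```

The reworded message and the "500 items" message are the two failure modes: a retryable error that is missed, and a client error that gets retried.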

This change improves our resilience by:

  1. Type-Safe Error Checking: We now check whether an error is an instance of ApiError (via instanceof) from @google/genai.
  2. Using Status Codes: We base retry decisions on the error.status property, which is a much more stable contract than the text of the error message.
  3. Clearer Logic: The shouldRetry function is now more explicit:
    • Do not retry on 400 Bad Request errors.
    • Do not retry on "maximum schema depth exceeded" errors.
    • Retry on 429 Rate Limit Exceeded errors.
    • Retry on all 5xx server-side errors.

This ensures the CLI is more predictable when interacting with the Gemini API, leading to a better user experience during transient network or server issues.
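The rules listed above can be sketched roughly as follows. This is not the code from the PR: `StubApiError` is a local stand-in for `ApiError` from `@google/genai` (only its `status` field is assumed) so the example runs without the SDK. The handling of other 4xx codes and of non-API errors is also an assumption, since the description does not cover them.

```typescript
// Local stand-in for ApiError from @google/genai (assumption: exposes `status`).
class StubApiError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = 'ApiError';
    this.status = status;
  }
}

function shouldRetry(error: unknown): boolean {
  // Schema depth errors are deterministic, so retrying cannot help.
  if (
    error instanceof Error &&
    error.message.toLowerCase().includes('maximum schema depth exceeded')
  ) {
    return false;
  }

  if (error instanceof StubApiError) {
    // 400 Bad Request: fail fast.
    if (error.status === 400) return false;
    // 429 and all 5xx are treated as transient.
    // Assumption: any other status (e.g. 404) is not retried.
    return error.status === 429 || (error.status >= 500 && error.status < 600);
  }

  // Assumption: errors that are not API errors are not retried in this sketch.
  return false;
}

for (const status of [400, 429, 500, 503]) {
  console.log(status, shouldRetry(new StubApiError(`status ${status}`, status)));
}
```

Checking the schema-depth message first means it wins even if that error arrives with a 5xx status; whether the real implementation orders the checks this way is not stated here.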

Reviewer Test Plan

Validating this change manually is difficult as it requires forcing the backend to produce specific HTTP error codes.

The most effective way to review this PR is to:

  1. Send the prompt "what's in @packages/". This should exceed the token limit and return a 400 Bad Request error.
  2. Verify that this API error does not trigger a retry.
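Because forcing specific status codes from the backend is hard, the same check can be approximated offline by counting attempts against a fake call. The sketch below assumes a simplified synchronous retry loop; `StubApiError`, `runWithRetry`, `retryOnTransient`, and `countAttempts` are hypothetical names, and a real loop would be asynchronous and back off between attempts.

```typescript
// Local stand-in for an API error carrying an HTTP status (hypothetical).
class StubApiError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.status = status;
  }
}

// Simplified retry loop: stops at the first success, the first
// non-retryable error, or after maxAttempts tries.
function runWithRetry<T>(
  fn: () => T,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts: number,
): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (error) {
      lastError = error;
      if (!shouldRetry(error)) break;
    }
  }
  throw lastError;
}

// Retry only on 429 and 5xx.
const retryOnTransient = (error: unknown): boolean =>
  error instanceof StubApiError &&
  (error.status === 429 || (error.status >= 500 && error.status < 600));

// Counts how many times a call that always fails with `status` is attempted.
function countAttempts(status: number, maxAttempts = 3): number {
  let attempts = 0;
  try {
    runWithRetry(
      () => {
        attempts++;
        throw new StubApiError(`status ${status}`, status);
      },
      retryOnTransient,
      maxAttempts,
    );
  } catch {
    // The call always fails; only the attempt count matters here.
  }
  return attempts;
}

console.log(countAttempts(400)); // 1: a bad request is tried once
console.log(countAttempts(429)); // 3: a rate limit uses every attempt
```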

Testing Matrix

🍏 🪟 🐧
npm run
npx
Docker
Podman - -
Seatbelt - -

@SandyTao520 SandyTao520 marked this pull request as ready for review September 25, 2025 17:45
@SandyTao520 SandyTao520 requested a review from a team as a code owner September 25, 2025 17:45
@github-actions

Size Change: +58 B (0%)

Total Size: 17.4 MB

ℹ️ View Unchanged
Filename Size Change
./bundle/gemini.js 17.4 MB +58 B (0%)
./bundle/sandbox-macos-permissive-closed.sb 1.03 kB 0 B
./bundle/sandbox-macos-permissive-open.sb 830 B 0 B
./bundle/sandbox-macos-permissive-proxied.sb 1.31 kB 0 B
./bundle/sandbox-macos-restrictive-closed.sb 3.29 kB 0 B
./bundle/sandbox-macos-restrictive-open.sb 3.36 kB 0 B
./bundle/sandbox-macos-restrictive-proxied.sb 3.56 kB 0 B

compressed-size-action

@anj-s anj-s left a comment

Collaborator

nice one!

@SandyTao520 SandyTao520 added this pull request to the merge queue Sep 25, 2025
Merged via the queue into main with commit e209724 Sep 25, 2025
17 of 19 checks passed
@SandyTao520 SandyTao520 deleted the st/fix-retry branch September 25, 2025 18:06
geoffdowns pushed a commit to geoffdowns/gemini-cli that referenced this pull request Sep 26, 2025
giraffe-tree pushed a commit to giraffe-tree/gemini-cli that referenced this pull request Oct 10, 2025
@sripasg sripasg added the size/m A medium sized PR label Jun 2, 2026
Labels

size/m A medium sized PR

3 participants