
feat: implement exponential backoff for Gemini API #63

Merged
Kavirubc merged 2 commits into similigh:main from Openclaw-ai-dev:feat/exponential-backoff-gemini
Feb 17, 2026

Conversation

Contributor

Openclaw-ai-dev commented Feb 15, 2026

Summary

Closes #50

Implements retry with exponential backoff for all Gemini API calls to handle rate limiting (HTTP 429) and transient server errors (5xx).

Changes

New: retry.go

  • Generic withRetry[T] function with configurable exponential backoff
  • RetryConfig: MaxRetries (5), BaseDelay (1s), MaxDelay (60s), JitterRatio (0.25)
  • isRetryableError(): only retries 429 and 5xx; client errors (400, 403, 404) fail immediately
  • Jitter to prevent thundering herd
  • Context cancellation respected during wait

Modified: embedder.go

  • Embed() wrapped with retry

Modified: llm.go

  • New generateWithRetry() helper to DRY up the common generate-and-extract pattern
  • All 5 generative methods now retry on transient errors:
    • AnalyzeIssue, GenerateResponse, RouteIssue, AssessQuality, DetectDuplicate

New: retry_test.go — 19 unit tests

  • TestIsRetryableError: 14 cases (429, 5xx, gRPC codes, client errors)
  • TestWithRetry_Success: immediate success
  • TestWithRetry_SuccessAfterRetries: succeeds on 3rd attempt
  • TestWithRetry_NonRetryableError: 400 fails immediately, no retry
  • TestWithRetry_ExhaustedRetries: gives up after max retries
  • TestWithRetry_ContextCancelled: respects context cancellation

Testing

go build ./...           # ✅ compiles
go test ./internal/integrations/gemini/ -v  # ✅ 19/19 pass

Summary by CodeRabbit

  • New Features

    • Automatic retries with exponential backoff for Gemini API calls to improve reliability of embeddings and content generation.
  • Bug Fixes / Stability

    • More consistent handling of empty LLM responses and safer downstream parsing to reduce nil/empty-result errors and transient failures.
  • Tests

    • Added comprehensive tests covering retry behavior, error classification, retry exhaustion, and context cancellation.


coderabbitai Bot commented Feb 15, 2026

📝 Walkthrough


Adds a retry utility with exponential backoff and integrates it into Gemini embedding and LLM methods to retry transient errors (HTTP 429 and 5xx), preserving existing error handling and public APIs.

Changes

Cohort / File(s) and Summary

Retry Infrastructure
internal/integrations/gemini/retry.go, internal/integrations/gemini/retry_test.go
  • Introduce RetryConfig, DefaultRetryConfig, isRetryableError, and generic withRetry[T any] implementing exponential backoff with jitter, context handling, and tests covering retryable classification, retry loops, exhaustion, and cancellation.

Embedder Integration
internal/integrations/gemini/embedder.go
  • Wrap the embedding call in withRetry so transient errors (429/5xx) are retried; existing empty-text/empty-embedding checks and public behavior remain unchanged.

LLM Integration
internal/integrations/gemini/llm.go
  • Add unexported generateWithRetry that wraps GenerateContent with retry logic; replace direct GenerateContent calls across generative methods (AnalyzeIssue, GenerateResponse, RouteIssue, AssessQuality, DetectDuplicate), centralize empty-response handling, and add nil-safe post-processing.

Sequence Diagram

sequenceDiagram
    actor Client
    participant LLMClient as LLMClient
    participant Retry as Retry Wrapper
    participant API as Gemini API

    Client->>LLMClient: Call method (e.g., AnalyzeIssue / Embed)
    LLMClient->>Retry: withRetry(operation, fn)

    loop Retry loop (until success or max retries)
        Retry->>API: Execute operation (GenerateContent / Embed)
        API-->>Retry: Response or error

        alt Retryable error (429 / 5xx)
            Retry->>Retry: compute exponential backoff with jitter
            Retry->>Retry: wait and retry
        else Non-retryable error
            Retry-->>LLMClient: return error (no retry)
        else Success
            Retry-->>LLMClient: return result
        end
    end

    alt Max retries exhausted
        Retry-->>LLMClient: return last error
    end

    LLMClient-->>Client: Return result or error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through loops of backoff and wait,
Jittered my steps to outfox a rate.
Retries soft as fur, steady and light,
Gemini hums through day and night—
A rabbit's nudge to make calls right.

🚥 Pre-merge checks: ✅ 5 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 33.33%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (5)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title concisely describes the main feature, implementing exponential backoff for the Gemini API, which aligns with the primary objective of the PR.
  • Linked Issues Check: ✅ Passed. All coding requirements from issue #50 are met: retry mechanism with exponential backoff added [retry.go], applied to embedder methods [embedder.go], applied to all five generative LLM methods [llm.go], transient error handling implemented (429/5xx), jitter and context cancellation included.
  • Out of Scope Changes Check: ✅ Passed. All changes are directly aligned with issue #50 objectives: retry infrastructure, embedder integration, LLM integration, and comprehensive test coverage. No out-of-scope modifications detected.
  • Merge Conflict Detection: ✅ Passed. No merge conflicts detected when merging into main.


Closes #50

Added retry mechanism with exponential backoff for all Gemini API calls
to handle rate limiting (429) and transient server errors (5xx).

Changes:
- New retry.go: generic withRetry[T] with configurable backoff
  - Exponential delay: base * 2^attempt, capped at MaxDelay
  - Jitter to prevent thundering herd
  - Context cancellation support
  - Only retries transient errors (429, 5xx); client errors (400, 403, 404) fail immediately
- embedder.go: Embed() wrapped with retry
- llm.go: All generative methods wrapped via generateWithRetry() helper
  - AnalyzeIssue, GenerateResponse, RouteIssue, AssessQuality, DetectDuplicate
- retry_test.go: 19 unit tests covering all retry scenarios

Default config: 5 retries, 1s base delay, 60s max, 25% jitter

Signed-off-by: openclaw-ai-dev <openclaw-ai-dev@users.noreply.github.com>
Openclaw-ai-dev force-pushed the feat/exponential-backoff-gemini branch from 83e803b to c5ee3ff on February 15, 2026 at 22:27.
coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/integrations/gemini/llm.go`:
- Around line 127-148: The generateWithRetry method in LLMClient can return a
whitespace-only string when resp.Candidates and Parts exist but contain no real
text; update the logic in generateWithRetry (the closure calling
model.GenerateContent and building responseText) to trim whitespace from the
concatenated responseText (or each genai.Text part) and, if the trimmed result
is empty, return an error (e.g., "empty response from LLM") so callers treat
whitespace-only content as an empty LLM response; ensure the check happens
before returning responseText to withRetry.
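The requested fix amounts to trimming the concatenated text and rejecting whitespace-only results before they reach withRetry. A minimal sketch of that guard, assuming a simplified `extractText` helper over plain strings rather than the actual genai part types:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractText concatenates text parts, trims whitespace, and treats a
// whitespace-only result as an empty LLM response, as the comment asks.
func extractText(parts []string) (string, error) {
	var b strings.Builder
	for _, p := range parts {
		b.WriteString(p)
	}
	text := strings.TrimSpace(b.String())
	if text == "" {
		return "", errors.New("empty response from LLM")
	}
	return text, nil
}

func main() {
	// Parts exist but contain no real text: callers now see an error.
	if _, err := extractText([]string{"  ", "\n\t"}); err != nil {
		fmt.Println("whitespace-only rejected:", err)
	}
	// Normal content is trimmed and returned.
	text, _ := extractText([]string{"  hello", " world \n"})
	fmt.Printf("%q\n", text) // "hello world"
}
```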
🧹 Nitpick comments (1)
internal/integrations/gemini/retry.go (1)

60-103: Seed math/rand so jitter isn’t deterministic across processes.

Without seeding, jitter uses a fixed sequence, which can reduce thundering‑herd protection across instances.

🛠️ Suggested fix
 import (
 	"context"
 	"fmt"
 	"math"
 	"math/rand"
 	"strings"
 	"time"
 )
+
+func init() {
+	rand.Seed(time.Now().UnixNano())
+}

Comment thread: internal/integrations/gemini/llm.go

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/integrations/gemini/retry.go`:
- Around line 82-92: The delay is currently capped to cfg.MaxDelay before jitter
so the final delay can exceed MaxDelay; change the logic in the retry delay
calculation (the block using cfg.BaseDelay, cfg.MaxDelay, cfg.JitterRatio and
variable delay) to apply jitter first (or add jitter to the computed base delay)
and then enforce the cap by re-checking and setting delay = cfg.MaxDelay if
delay > cfg.MaxDelay so the final delay never exceeds cfg.MaxDelay.
🧹 Nitpick comments (1)
internal/integrations/gemini/retry.go (1)

63-91: Use a per-call RNG to avoid relying on global state for jitter randomness.

Currently, rand.Float64() uses Go's global RNG. While modern Go versions (1.20+) automatically seed it, a per-call RNG is clearer and doesn't depend on implicit seeding behavior.

♻️ Suggested refactor
 func withRetry[T any](ctx context.Context, cfg RetryConfig, operation string, fn func() (T, error)) (T, error) {
 	var zero T
+	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
 
 	for attempt := 0; attempt <= cfg.MaxRetries; attempt++ {
@@
 		// Add jitter to prevent thundering herd
 		if cfg.JitterRatio > 0 {
-			jitter := time.Duration(rand.Float64() * cfg.JitterRatio * float64(delay))
+			jitter := time.Duration(rng.Float64() * cfg.JitterRatio * float64(delay))
 			delay += jitter
 		}

Comment thread: internal/integrations/gemini/retry.go (outdated)
- Cap delay AFTER jitter so MaxDelay is always respected
- Guard against empty/whitespace-only LLM responses in generateWithRetry
- Improve docstring coverage

Signed-off-by: openclaw-ai-dev <openclaw-ai-dev@users.noreply.github.com>
coderabbitai Bot left a comment

🧹 Nitpick comments (2)
internal/integrations/gemini/llm.go (1)

127-152: Consider making retry configuration injectable for testability.

The helper hardcodes DefaultRetryConfig(), which works for production but makes unit testing difficult (tests must wait through real delays or rely on stubbing the time package). Consider accepting RetryConfig as a parameter or storing it on LLMClient for flexibility.

// Option 1: Add to LLMClient
type LLMClient struct {
    client      *genai.Client
    model       string
    retryConfig RetryConfig // Add field
}

// Option 2: Accept as parameter
func (l *LLMClient) generateWithRetry(ctx context.Context, cfg RetryConfig, model *genai.GenerativeModel, prompt string, operation string) (string, error)

The whitespace guard on lines 147-149 correctly addresses the earlier review feedback.

internal/integrations/gemini/retry.go (1)

35-59: Consider using typed error checking instead of string matching for more robust retry logic.

String-based matching for "429", "Internal", "Unavailable" is fragile and could lead to false positives—for example, "Internal" appears in many non-5xx error contexts. The project already has google.golang.org/api v0.186.0 and github.com/googleapis/gax-go/v2 in dependencies, which expose *googleapi.Error (with Code field) and *apierror.APIError (with HTTPCode() method). Use errors.As() to type-assert these errors instead:

var gerr *googleapi.Error
if errors.As(err, &gerr) {
  return gerr.Code == 429 || (gerr.Code >= 500 && gerr.Code < 600)
}

This approach is more maintainable and eliminates substring-matching ambiguity.

Kavirubc (Contributor) left a comment

Hi,

LGTM. There are a few changes I need, but I will make those myself.

Thank you for contributing.

@Kavirubc Kavirubc merged commit 1701542 into similigh:main Feb 17, 2026
6 checks passed
Openclaw-ai-dev (Contributor, Author)

Anytime. The project is very interesting.



Development

Successfully merging this pull request may close these issues.

[Reliability]: Implement Exponential Backoff for Gemini API
