feat: implement exponential backoff for Gemini API#63
feat: implement exponential backoff for Gemini API#63Kavirubc merged 2 commits intosimiligh:mainfrom
Conversation
📝 WalkthroughWalkthroughAdds a retry utility with exponential backoff and integrates it into Gemini embedding and LLM methods to retry transient errors (HTTP 429 and 5xx), preserving existing error handling and public APIs. Changes
Sequence DiagramsequenceDiagram
actor Client
participant LLMClient as LLMClient
participant Retry as Retry Wrapper
participant API as Gemini API
Client->>LLMClient: Call method (e.g., AnalyzeIssue / Embed)
LLMClient->>Retry: withRetry(operation, fn)
loop Retry loop (until success or max retries)
Retry->>API: Execute operation (GenerateContent / Embed)
API-->>Retry: Response or error
alt Retryable error (429 / 5xx)
Retry->>Retry: compute exponential backoff with jitter
Retry->>Retry: wait and retry
else Non-retryable error
Retry-->>LLMClient: return error (no retry)
else Success
Retry-->>LLMClient: return result
end
end
alt Max retries exhausted
Retry-->>LLMClient: return last error
end
LLMClient-->>Client: Return result or error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
Closes #50 Added retry mechanism with exponential backoff for all Gemini API calls to handle rate limiting (429) and transient server errors (5xx). Changes: - New retry.go: generic withRetry[T] with configurable backoff - Exponential delay: base * 2^attempt, capped at MaxDelay - Jitter to prevent thundering herd - Context cancellation support - Only retries transient errors (429, 5xx); client errors (400, 403, 404) fail immediately - embedder.go: Embed() wrapped with retry - llm.go: All generative methods wrapped via generateWithRetry() helper - AnalyzeIssue, GenerateResponse, RouteIssue, AssessQuality, DetectDuplicate - retry_test.go: 19 unit tests covering all retry scenarios Default config: 5 retries, 1s base delay, 60s max, 25% jitter Signed-off-by: openclaw-ai-dev <openclaw-ai-dev@users.noreply.github.com>
83e803b to
c5ee3ff
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/integrations/gemini/llm.go`:
- Around line 127-148: The generateWithRetry method in LLMClient can return a
whitespace-only string when resp.Candidates and Parts exist but contain no real
text; update the logic in generateWithRetry (the closure calling
model.GenerateContent and building responseText) to trim whitespace from the
concatenated responseText (or each genai.Text part) and, if the trimmed result
is empty, return an error (e.g., "empty response from LLM") so callers treat
whitespace-only content as an empty LLM response; ensure the check happens
before returning responseText to withRetry.
🧹 Nitpick comments (1)
internal/integrations/gemini/retry.go (1)
60-103: Seedmath/randso jitter isn’t deterministic across processes.Without seeding, jitter uses a fixed sequence, which can reduce thundering‑herd protection across instances.
🛠️ Suggested fix
import ( "context" "fmt" "math" "math/rand" "strings" "time" ) + +func init() { + rand.Seed(time.Now().UnixNano()) +}
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/integrations/gemini/retry.go`:
- Around line 82-92: The delay is currently capped to cfg.MaxDelay before jitter
so the final delay can exceed MaxDelay; change the logic in the retry delay
calculation (the block using cfg.BaseDelay, cfg.MaxDelay, cfg.JitterRatio and
variable delay) to apply jitter first (or add jitter to the computed base delay)
and then enforce the cap by re-checking and setting delay = cfg.MaxDelay if
delay > cfg.MaxDelay so the final delay never exceeds cfg.MaxDelay.
🧹 Nitpick comments (1)
internal/integrations/gemini/retry.go (1)
63-91: Use a per-call RNG to avoid relying on global state for jitter randomness.Currently,
rand.Float64()uses Go's global RNG. While modern Go versions (1.20+) automatically seed it, a per-call RNG is clearer and doesn't depend on implicit seeding behavior.♻️ Suggested refactor
func withRetry[T any](ctx context.Context, cfg RetryConfig, operation string, fn func() (T, error)) (T, error) { var zero T + rng := rand.New(rand.NewSource(time.Now().UnixNano())) for attempt := 0; attempt <= cfg.MaxRetries; attempt++ { @@ // Add jitter to prevent thundering herd if cfg.JitterRatio > 0 { - jitter := time.Duration(rand.Float64() * cfg.JitterRatio * float64(delay)) + jitter := time.Duration(rng.Float64() * cfg.JitterRatio * float64(delay)) delay += jitter }
- Cap delay AFTER jitter so MaxDelay is always respected - Guard against empty/whitespace-only LLM responses in generateWithRetry - Improve docstring coverage Signed-off-by: openclaw-ai-dev <openclaw-ai-dev@users.noreply.github.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
internal/integrations/gemini/llm.go (1)
127-152: Consider making retry configuration injectable for testability.The helper hardcodes
DefaultRetryConfig(), which works for production but makes unit testing difficult (tests must wait through real delays or rely on stubbing the time package). Consider acceptingRetryConfigas a parameter or storing it onLLMClientfor flexibility.// Option 1: Add to LLMClient type LLMClient struct { client *genai.Client model string retryConfig RetryConfig // Add field } // Option 2: Accept as parameter func (l *LLMClient) generateWithRetry(ctx context.Context, cfg RetryConfig, model *genai.GenerativeModel, prompt string, operation string) (string, error)The whitespace guard on lines 147-149 correctly addresses the earlier review feedback.
internal/integrations/gemini/retry.go (1)
35-59: Consider using typed error checking instead of string matching for more robust retry logic.String-based matching for
"429","Internal","Unavailable"is fragile and could lead to false positives—for example,"Internal"appears in many non-5xx error contexts. The project already hasgoogle.golang.org/api v0.186.0andgithub.com/googleapis/gax-go/v2in dependencies, which expose*googleapi.Error(withCodefield) and*apierror.APIError(withHTTPCode()method). Useerrors.As()to type-assert these errors instead:var gerr *googleapi.Error if errors.As(err, &gerr) { return gerr.Code == 429 || (gerr.Code >= 500 && gerr.Code < 600) }This approach is more maintainable and eliminates substring-matching ambiguity.
Kavirubc
left a comment
There was a problem hiding this comment.
Hi,
LGTM. There are few changes I need but I will do that my self.
Thank you for contributing.
|
Anytime. |
Summary
Closes #50
Implements retry with exponential backoff for all Gemini API calls to handle rate limiting (HTTP 429) and transient server errors (5xx).
Changes
New:
retry.gowithRetry[T]function with configurable exponential backoffRetryConfig: MaxRetries (5), BaseDelay (1s), MaxDelay (60s), JitterRatio (0.25)isRetryableError(): only retries 429 and 5xx; client errors (400, 403, 404) fail immediatelyModified:
embedder.goEmbed()wrapped with retryModified:
llm.gogenerateWithRetry()helper to DRY up the common generate-and-extract patternAnalyzeIssue,GenerateResponse,RouteIssue,AssessQuality,DetectDuplicateNew:
retry_test.go— 19 unit testsTestIsRetryableError: 14 cases (429, 5xx, gRPC codes, client errors)TestWithRetry_Success: immediate successTestWithRetry_SuccessAfterRetries: succeeds on 3rd attemptTestWithRetry_NonRetryableError: 400 fails immediately, no retryTestWithRetry_ExhaustedRetries: gives up after max retriesTestWithRetry_ContextCancelled: respects context cancellationTesting
Summary by CodeRabbit
New Features
Bug Fixes / Stability
Tests