[FEATURE] Add Retry Logic with Exponential Backoff for HTTP Requests #19
Description
Summary
The SDK currently lacks retry logic for failed HTTP requests, causing applications to fail on transient network issues or temporary server errors. This is particularly problematic when dealing with:
- Network timeouts
- Temporary service unavailability (503 errors)
- Rate limiting (429 errors)
- Gateway timeouts (504 errors)
Current Behavior
When an HTTP request fails, the SDK immediately returns the error to the caller without attempting any retries:
// Current: Fails immediately on error
response, err := client.GenerateContent(ctx, provider, model, messages)
if err != nil {
	return err // No retry attempted
}
Expected Behavior
The SDK should automatically retry failed requests with configurable exponential backoff, similar to other production SDKs.
Proposed Solution
1. Add Retry Configuration to ClientOptions
type ClientOptions struct {
	BaseURL     string
	APIKey      string
	RetryConfig *RetryConfig // New optional field
}
type RetryConfig struct {
	Enabled           bool // Default: true
	MaxAttempts       int  // Default: 3
	InitialBackoffSec int  // Default: 2
	MaxBackoffSec     int  // Default: 30
	BackoffMultiplier int  // Default: 2
}
2. Default Retry Conditions
The SDK should retry on:
HTTP Status Codes:
- 408 Request Timeout
- 429 Too Many Requests
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
Network Errors:
- Connection refused
- Connection reset
- Timeout errors (including "timeout awaiting response headers")
- EOF errors
- DNS resolution failures
3. Implementation Example
func NewClient(options *ClientOptions) Client {
	// Use sensible defaults if no retry config provided
	if options.RetryConfig == nil {
		options.RetryConfig = &RetryConfig{
			Enabled:           true,
			MaxAttempts:       3,
			InitialBackoffSec: 2,
			MaxBackoffSec:     30,
			BackoffMultiplier: 2,
		}
	}
	// Create HTTP client with retry wrapper
	httpClient := newRetryableHTTPClient(options.RetryConfig)
	return &client{
		baseURL:    options.BaseURL,
		apiKey:     options.APIKey,
		httpClient: httpClient,
	}
}
4. Usage Examples
With default retry behavior:
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "https://api.example.com",
	APIKey:  "key",
})
With custom retry configuration:
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "https://api.example.com",
	APIKey:  "key",
	RetryConfig: &sdk.RetryConfig{
		Enabled:           true,
		MaxAttempts:       5,
		InitialBackoffSec: 1,
		MaxBackoffSec:     60,
		BackoffMultiplier: 3,
	},
})
Disable retries:
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "https://api.example.com",
	APIKey:  "key",
	RetryConfig: &sdk.RetryConfig{
		Enabled: false,
	},
})
Benefits
- Improved Reliability: Automatic recovery from transient failures
- Better UX: Applications using the SDK won't fail on temporary issues
- Production Ready: Aligns with industry best practices for API clients
- Configurable: Users can tune retry behavior for their specific needs
Additional Considerations
- Request Idempotency: Only retry safe/idempotent operations (GET, PUT, DELETE)
- Request Body: Ensure request body can be re-read for retries (use GetBody func)
- Context Cancellation: Respect context cancellation between retries
- Logging: Log retry attempts for debugging (at debug level)
- Metrics: Consider exposing retry metrics for monitoring
References
- https://raw.githubusercontent.com/cenkalti/backoff/refs/heads/v5/backoff.go
- Google Cloud Go retry logic: https://github.com/googleapis/google-cloud-go/blob/main/internal/retry.go
- OpenAI Go SDK retry: https://github.com/sashabaranov/go-openai/blob/master/client.go
Testing
- Unit tests for retry logic with various error conditions
- Integration tests with mock server returning retryable errors
- Timeout and cancellation handling tests
- Exponential backoff calculation tests
Acceptance Criteria
- It's possible to configure the SDK client with backoff retry
- It's tested
- It's documented