[FEATURE] Add Retry Logic with Exponential Backoff for HTTP Requests #19

@edenreich

Summary

The SDK currently lacks retry logic for failed HTTP requests, causing applications to fail on transient network issues or temporary server errors. This is particularly problematic when dealing with:

  • Network timeouts
  • Temporary service unavailability (503 errors)
  • Rate limiting (429 errors)
  • Gateway timeouts (504 errors)

Current Behavior

When an HTTP request fails, the SDK immediately returns the error to the caller without attempting any retries:

// Current: Fails immediately on error
response, err := client.GenerateContent(ctx, provider, model, messages)
if err != nil {
    return err // No retry attempted
}

Expected Behavior

The SDK should automatically retry failed requests with configurable exponential backoff, similar to other production SDKs.

Proposed Solution

1. Add Retry Configuration to ClientOptions

type ClientOptions struct {
    BaseURL     string
    APIKey      string
    RetryConfig *RetryConfig // New optional field
}

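// RetryConfig controls automatic retries. The defaults are applied when
// ClientOptions.RetryConfig is nil (see NewClient below).
type RetryConfig struct {
    Enabled           bool // Default: true
    MaxAttempts       int  // Default: 3
    InitialBackoffSec int  // Default: 2 (seconds)
    MaxBackoffSec     int  // Default: 30 (seconds)
    BackoffMultiplier int  // Default: 2
}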

2. Default Retry Conditions

The SDK should retry on:

HTTP Status Codes:

  • 408 Request Timeout
  • 429 Too Many Requests
  • 500 Internal Server Error
  • 502 Bad Gateway
  • 503 Service Unavailable
  • 504 Gateway Timeout

Network Errors:

  • Connection refused
  • Connection reset
  • Timeout errors (including "timeout awaiting response headers")
  • EOF errors
  • DNS resolution failures

3. Implementation Example

func NewClient(options *ClientOptions) Client {
    // Guard against a nil options pointer
    if options == nil {
        options = &ClientOptions{}
    }

    // Use sensible defaults if no retry config is provided,
    // without modifying the caller's options
    retryConfig := options.RetryConfig
    if retryConfig == nil {
        retryConfig = &RetryConfig{
            Enabled:           true,
            MaxAttempts:       3,
            InitialBackoffSec: 2,
            MaxBackoffSec:     30,
            BackoffMultiplier: 2,
        }
    }

    // Create HTTP client with retry wrapper
    httpClient := newRetryableHTTPClient(retryConfig)

    return &client{
        baseURL:    options.BaseURL,
        apiKey:     options.APIKey,
        httpClient: httpClient,
    }
}

4. Usage Examples

With default retry behavior:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "https://api.example.com",
    APIKey:  "key",
})

With custom retry configuration:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "https://api.example.com",
    APIKey:  "key",
    RetryConfig: &sdk.RetryConfig{
        Enabled:           true,
        MaxAttempts:       5,
        InitialBackoffSec: 1,
        MaxBackoffSec:     60,
        BackoffMultiplier: 3,
    },
})

Disable retries:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "https://api.example.com",
    APIKey:  "key",
    RetryConfig: &sdk.RetryConfig{
        Enabled: false,
    },
})

Benefits

  1. Improved Reliability: Automatic recovery from transient failures
  2. Better UX: Applications using the SDK won't fail on temporary issues
  3. Production Ready: Aligns with industry best practices for API clients
  4. Configurable: Users can tune retry behavior for their specific needs

Additional Considerations

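  • Request Idempotency: Only retry idempotent operations (GET, HEAD, PUT, DELETE) by default, since retrying a non-idempotent request such as POST can duplicate its side effects
  • Request Body: Ensure the request body can be re-read for retries (for example via http.Request.GetBody)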
  • Context Cancellation: Respect context cancellation between retries
  • Logging: Log retry attempts for debugging (at debug level)
  • Metrics: Consider exposing retry metrics for monitoring

Testing

  • Unit tests for retry logic with various error conditions
  • Integration tests with mock server returning retryable errors
  • Timeout and cancellation handling tests
  • Exponential backoff calculation tests

Acceptance Criteria

  • The SDK client can be configured to retry requests with exponential backoff
  • The retry logic is covered by tests
  • The retry configuration is documented
