Closed as not planned
Labels: bug (Something isn't working)
Description
Problem
When a model provider returns a 429 rate limit error, the current behavior is to fail the request and potentially fall back to another profile/model. There's no automatic retry with delay.
Current Behavior
- Rate limits are detected (`FailoverReason: 'rate_limit'`)
- The profile gets marked with a cooldown (`calculateAuthProfileCooldownMs`)
- The request fails immediately
Proposed Behavior
For 429 responses:
- Check the `Retry-After` header if present
- Apply exponential backoff: 1s, 2s, 4s, 8s, 16s (max 3-5 retries)
- Retry the same request
- Only fail/fallback after exhausting retries
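A minimal sketch of the proposed flow (all names here are illustrative, not from the codebase; the `Retry-After` parsing handles only the delay-seconds form, not the HTTP-date form):

```typescript
// Compute the delay before retrying a 429 response.
// `retryAfterHeader` is the raw Retry-After value, if the provider sent one.
function computeRetryDelayMs(attempt: number, retryAfterHeader?: string): number {
  if (retryAfterHeader !== undefined) {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Fall back to exponential backoff: 1s, 2s, 4s, 8s, capped at 16s.
  return Math.min(1000 * 2 ** attempt, 16_000);
}

// Retry wrapper around a provider call: retry on rate limits only,
// and rethrow (triggering the existing fail/fallback path) once retries
// are exhausted.
async function withRateLimitRetry<T>(
  call: () => Promise<T>,
  isRateLimit: (err: unknown) => boolean,
  retryAfterOf: (err: unknown) => string | undefined,
  maxRetries = 5,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (!isRateLimit(err) || attempt >= maxRetries) throw err;
      const delay = computeRetryDelayMs(attempt, retryAfterOf(err));
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```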
Why This Matters
Rate limits are often transient. Batch operations that hit rate limits shouldn't fail entirely; they should wait and retry. This is especially important for:
- Long-running batch jobs
- Multi-model operations
- High-throughput scenarios
Implementation Notes
The retry logic should integrate at the API call level, possibly in:
- `src/agents/pi-embedded-runner/run/attempt.ts`
- Or a wrapper around the provider SDK calls
The existing `calculateAuthProfileCooldownMs` already shows the backoff pattern in use for profile cooldowns.
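As one possible shape for the SDK-wrapper option, here is a hedged sketch of a generic wrapper the runner could apply around provider calls. `runWithBackoff` and `RateLimitError` are hypothetical names, and the jitter is an assumption borrowed from common backoff practice, not from the existing cooldown code:

```typescript
// Illustrative error type carrying an optional Retry-After-derived delay.
class RateLimitError extends Error {
  constructor(public retryAfterMs?: number) {
    super("rate_limit");
  }
}

// Retry rate-limited calls with exponential backoff plus a small random
// jitter; rethrow any other error immediately, and rethrow the last
// rate-limit error once retries are exhausted so the existing
// cooldown/failover path can take over.
async function runWithBackoff<T>(call: () => Promise<T>, maxRetries = 3): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastErr = err;
      if (!(err instanceof RateLimitError)) throw err;
      if (attempt === maxRetries) break;
      const base = err.retryAfterMs ?? 1000 * 2 ** attempt;
      const jitter = Math.random() * 250; // avoid thundering-herd retries
      await new Promise((resolve) => setTimeout(resolve, base + jitter));
    }
  }
  throw lastErr;
}
```

For example, a call that rate-limits twice and then succeeds would complete on the third attempt without surfacing an error to the caller.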
Prior Art
- OpenAI SDK has built-in retry
- Anthropic SDK has configurable retry
- Most HTTP clients support retry middleware