
fix(cron): stale announce retries drain one-per-trigger instead of bulk flush #2088

@alexey-pelykh

Description


Summary

When cron announce delivery fails for multiple consecutive runs (e.g., due to a misconfigured delivery target), the failed entries accumulate in the subagent registry. Once the configuration is fixed, these entries drain one per heartbeat/trigger instead of being flushed in a single pass: each trigger processes exactly one entry, so the user must manually trigger N times to clear a backlog of N entries.

Reproduction

  1. Configure an isolated cron job with a broken delivery target (e.g., user:D... instead of channel:D...)
  2. Let 6+ scheduled runs execute — each produces output but delivery fails
  3. Fix the delivery target
  4. Observe: each subsequent trigger delivers one old result, not the full queue
  5. Must trigger 6+ times to see all past results + fresh output

Expected Behavior

When delivery starts working again, the system should flush the entire pending queue in one pass, oldest-to-newest. In normal operation the queue holds a single entry (the latest run). While delivery is broken the queue accumulates, and once delivery recovers, all entries should be delivered sequentially in a single pass, preserving chronological order.

This is NOT a staleness problem — every run's output is valuable (it was the delivery that failed, not the content). The user expects to receive all missed results in order once the delivery issue is resolved.

Root Cause

In src/agents/subagent-registry.ts:

  1. One-at-a-time processing: retryDeferredCompletedAnnounces (lines 750-776) triggers resumeSubagentRun for pending entries, but only one entry is processed per cycle
  2. Requires external trigger: Each retry needs a heartbeat tick or manual cron run to process the next entry
  3. No queue drain loop: There's no mechanism to say "delivery is working now, flush all pending entries for this job"

Proposed Fix

When a deferred announce succeeds (delivery confirmed), immediately check for additional pending entries for the same cron job and process them in a loop until the queue is empty:

retryDeferredCompletedAnnounces:
  for each pending entry (oldest first):
    attempt delivery
    if success → continue to next entry
    if failure → stop (delivery still broken)
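
The pseudocode above could be sketched in TypeScript roughly as follows. This is a sketch only: the actual entry type and delivery call in subagent-registry.ts are not shown in this issue, so PendingAnnounce, createdAt, and deliver are assumed names.

```typescript
// Hypothetical sketch of the proposed drain loop; field and function
// names are assumptions, not the real subagent-registry.ts API.
interface PendingAnnounce {
  runId: string;
  createdAt: number; // epoch ms, used for oldest-first ordering
}

// Flush pending entries oldest-to-newest; stop at the first failure so a
// still-broken target keeps the remainder queued for the next trigger.
async function drainDeferredAnnounces(
  queue: PendingAnnounce[],
  deliver: (entry: PendingAnnounce) => Promise<boolean>,
): Promise<PendingAnnounce[]> {
  const pending = [...queue].sort((a, b) => a.createdAt - b.createdAt);
  while (pending.length > 0) {
    const ok = await deliver(pending[0]);
    if (!ok) break; // delivery still broken: stop and retry later
    pending.shift(); // delivered: drop it and continue with the next entry
  }
  return pending; // entries still awaiting delivery
}
```

Because the loop only advances on success, a queue of one behaves exactly as today, while a recovered outage drains the whole backlog in a single call.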

This gives the correct behavior:

  • Normal operation: the queue holds one entry, which is delivered immediately
  • Recovery after outage: the queue holds N entries; all N are flushed oldest-to-newest in one pass
  • Partial recovery: if delivery breaks again mid-flush, processing stops at the failure point and the remainder stays queued for the next trigger
