Description
Summary
Add configurable TTL for delivery queue messages to prevent stale/orphaned entries from flooding channels on gateway restart
Problem to solve
The delivery queue (introduced in v2026.2.13) persists outbound messages to disk indefinitely. When the gateway restarts, recoverPendingDeliveries() attempts to re-deliver ALL queued messages regardless of age, causing "message dumps" where stale/orphaned messages flood channels.
This is particularly problematic for users with daily session resets or who experience crashes - overnight accumulation leads to bursts of stale messages being replayed the next morning. There's currently no native way to prevent this behavior without external cleanup scripts.
Proposed solution
Add a configurable TTL that recoverPendingDeliveries() checks before attempting replay:
Config Schema:
{
  "messages": {
    "delivery": {
      "maxAgeMs": 7200000, // Recommended: 2 hours (unset by default)
      "expireAction": "move-to-failed" // or "skip" or "delete"
    }
  }
}
Behavior:
When processing queue entries:
- Read the enqueuedAt timestamp from each JSON file
- Calculate age: Date.now() - enqueuedAt
- If age > maxAgeMs, apply expireAction:
  - "skip" → Ignore during recovery (leave file in place)
  - "delete" → Delete the file silently
  - "move-to-failed" → Move to failed/ folder with .expired suffix (recommended default)
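The check above could be sketched roughly as follows. This is only an illustration, not the actual recoverPendingDeliveries() code: the QueueEntry shape, the decideRecovery name, and the fallback to "move-to-failed" when expireAction is unset are assumptions. Only enqueuedAt, maxAgeMs, and expireAction come from this proposal. File operations (delete, move to failed/) are left out and would be driven by the returned decision.

```typescript
// Hypothetical types for illustration; only enqueuedAt, maxAgeMs and
// expireAction are named in the proposal.
type ExpireAction = "skip" | "delete" | "move-to-failed";

interface QueueEntry {
  id: string;         // hypothetical: queue file name or message id
  enqueuedAt: number; // epoch milliseconds read from the entry's JSON file
}

interface DeliveryTtlConfig {
  maxAgeMs?: number;           // undefined means no age check (current behavior)
  expireAction?: ExpireAction; // assumed to fall back to "move-to-failed"
}

type RecoveryDecision = "replay" | ExpireAction;

// Decide what recovery should do with one queue entry.
function decideRecovery(
  entry: QueueEntry,
  config: DeliveryTtlConfig,
  now: number = Date.now(),
): RecoveryDecision {
  // No TTL configured: replay everything, as today.
  if (config.maxAgeMs === undefined) return "replay";

  const age = now - entry.enqueuedAt;
  if (age <= config.maxAgeMs) return "replay";

  // Entry is older than the TTL: apply the configured expire action.
  return config.expireAction ?? "move-to-failed";
}
```

Keeping the decision separate from the filesystem side effects would also make the expiry rule easy to unit test.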
Why 2 Hours?
- Retry schedule completes in ~12.5 minutes (5 attempts)
- 2 hours = ~10x safety margin, longer than any legitimate delivery delay
- Shorter than typical daily reset cycles
Backward Compatibility:
Default maxAgeMs: undefined preserves current behavior (no age check). Users opt in by setting the config value.
Alternatives considered
1. Clear queue on daily reset - Too aggressive, loses legitimate retry attempts still in progress
2. External cron cleanup - Works but shouldn't be the user's responsibility for basic message hygiene
3. Increase retry attempts - Doesn't solve the staleness issue, just delays it
Impact
Affected: Users with messaging integrations (especially iMessage, Discord, Telegram), particularly those using daily session resets or experiencing crashes
Severity: Annoying - Causes user confusion and message pollution but doesn't break core functionality
Frequency: Daily for users with session resets; intermittent for crash recovery scenarios
Consequence: Message dumps create poor UX, users receive bursts of stale/duplicate messages on gateway restart
Evidence/examples
Observed in production (v2026.2.13):
- Users report morning "message dumps" to messaging channels after daily session resets
- Queue directory (~/.openclaw/delivery-queue/) grows over time without manual cleanup
- recoverPendingDeliveries() replays all entries on gateway startup regardless of age
- No native config option to skip replay of old messages
Retry Schedule Reference:
- Attempts: 5s → 25s → 2min → 10min (max 5 attempts)
- Total window: ~12.5 minutes
- Current behavior: Messages persist indefinitely beyond retry window
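For reference, the arithmetic behind the retry window and the 2-hour figure, written out as a quick check. The delays are copied from the schedule above and the variable names are just for illustration:

```typescript
// Delays between the 5 attempts, in seconds, taken from the schedule above.
const retryDelaysSec = [5, 25, 2 * 60, 10 * 60];

// Total retry window: 5 + 25 + 120 + 600 = 750 seconds.
const retryWindowSec = retryDelaysSec.reduce((sum, delay) => sum + delay, 0);
const retryWindowMin = retryWindowSec / 60; // 12.5 minutes

// Suggested TTL from the config example (2 hours).
const proposedTtlMs = 7200000;

// Ratio of TTL to retry window: 7200000 / 750000 = 9.6, i.e. roughly 10x.
const marginFactor = proposedTtlMs / (retryWindowSec * 1000);
```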
Additional information
Implementation Notes:
- Must remain backward-compatible with existing config keys
- Default maxAgeMs: undefined preserves current behavior (no TTL check)
- Recommend move-to-failed as safer default over delete for debugging
- Related to delivery queue feature introduced in v2026.2.13
Optional Enhancement:
Consider logging when messages are expired (e.g., "Skipped 5 expired messages (age > 2h)") for visibility
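A possible shape for that log line, as a small sketch. The formatExpiredSummary helper is hypothetical, and how it would plug into the gateway's logger is left open:

```typescript
// Hypothetical helper: builds a one-line summary of expired entries,
// or returns null when nothing expired so that nothing is logged.
function formatExpiredSummary(expiredCount: number, maxAgeMs: number): string | null {
  if (expiredCount === 0) return null;

  // Show whole hours as "2h"; otherwise fall back to minutes.
  const hours = maxAgeMs / 3600000;
  const ageLabel = Number.isInteger(hours)
    ? `${hours}h`
    : `${Math.round(maxAgeMs / 60000)}min`;

  const noun = expiredCount === 1 ? "message" : "messages";
  return `Skipped ${expiredCount} expired ${noun} (age > ${ageLabel})`;
}
```

With 5 expired entries and maxAgeMs set to 7200000, this produces "Skipped 5 expired messages (age > 2h)".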