Skip to content

fix: prevent followup queue from delivering same message multiple times (closes #30604)#46170

Closed
Br1an67 wants to merge 1 commit into
openclaw:mainfrom
Br1an67:fix/30604
Closed

fix: prevent followup queue from delivering same message multiple times (closes #30604)#46170
Br1an67 wants to merge 1 commit into
openclaw:mainfrom
Br1an67:fix/30604

Conversation

@Br1an67

@Br1an67 Br1an67 commented Mar 14, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: When an inbound message arrives while the agent is busy, the followup queue can deliver the same message 2-3 times across successive drain cycles. After queue.items.splice, the dedupe check in isRunAlreadyQueued passes because the delivered item is no longer in the array.
  • Why it matters: Users see duplicate responses; agent wastes tokens processing the same message multiple times.
  • What changed: Added a TTL-based delivered-message dedupe cache (20 min window) to the followup queue. After items are drained, their message IDs are recorded. Re-delivery of the same provider message is rejected. Wired markFollowupRunsDelivered into the drain loop.
  • What did NOT change (scope boundary): Inbound dedupe unchanged. Queue structure unchanged. No changes to agent processing.

Change Type

  • Bug fix

Scope

  • Gateway / orchestration

Linked Issue/PR

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Steps

  1. Agent busy with long-running task (30+ min)
  2. User sends single message during busy period
  3. Message queued, then delivered multiple times across drain cycles

Expected

Each message delivered exactly once.

Actual (before fix)

Same message delivered 2-3 times.

Evidence

  • Failing test/log before + passing after
  • pnpm build passes

Human Verification

  • Verified scenarios: pnpm build passing; dedupe cache follows same pattern as inbound-dedupe.ts
  • Edge cases checked: messages without IDs bypass dedupe; cache cleared on reset; TTL prevents unbounded growth
  • What you did NOT verify: runtime end-to-end with actual long-running agent task

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery

  • How to disable/revert: revert this commit
  • Files/config to restore: src/auto-reply/reply/queue/enqueue.ts, src/auto-reply/reply/queue/drain.ts

Risks and Mitigations

None — additive cache following established pattern.


This PR was AI-assisted (fully tested with pnpm build/check/test).

@greptile-apps

greptile-apps Bot commented Mar 14, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces a TTL-based delivered-message dedupe cache (DELIVERED_QUEUE_MESSAGE_IDS) to prevent the followup queue from re-delivering the same provider message across successive drain cycles. The root cause was that after queue.items.splice removes a drained item, the existing isRunAlreadyQueued check could no longer find it, and the enqueue-time RECENT_QUEUE_MESSAGE_IDS cache (5-min TTL) could expire before a re-delivery attempt — leaving a gap where the same message could be re-enqueued and processed again.

Key changes:

  • A new createDedupeCache instance (DELIVERED_QUEUE_MESSAGE_IDS) with a 20-minute TTL is added, following the same pattern as the inbound dedupe cache.
  • markFollowupRunsDelivered is called after each of the three drain paths (collect-individual, collect-batch, and next-item/summary), using a "capture item before drain, mark after success" pattern.
  • enqueueFollowupRun gains an early-exit check against DELIVERED_QUEUE_MESSAGE_IDS before re-enqueue.
  • resetRecentQueuedMessageIdDedupe now also clears the delivered cache, maintaining consistent test/reset behavior.

The implementation is logically sound across all drain paths, and the peek/check semantics from createDedupeCache are used correctly.

Confidence Score: 4/5

  • This PR is safe to merge; the additive cache correctly closes the dedup gap without touching agent processing or queue structure.
  • The logic is correct across all three drain paths, follows the established dedupe pattern, and is purely additive. The slight reduction from 5 is because the fix is complex enough to warrant runtime end-to-end verification (explicitly called out as not done by the author), and because the collect "individual" path captures drainedCollectItem slightly early (before isCrossChannel is checked), which, while harmless today, slightly obscures intent.
  • No files require special attention; both changed files are well-structured.

Last reviewed commit: 5117b7b

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 32becb6e1a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@@ -128,6 +129,7 @@ export function scheduleFollowupDrain(
...routing,
});
queue.items.splice(0, items.length);
markFollowupRunsDelivered(items, key);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mark delivered IDs for all drain paths

This new delivered-dedupe mechanism is only updated in the collect-batch splice path, but other successful drain paths still remove items via drainNextQueueItem without recording them (for example individual collect drains when collectDrainResult === "drained", summary drains, and non-collect modes). In those flows, enqueueFollowupRun checks DELIVERED_QUEUE_MESSAGE_IDS but never finds the message, so after the 5-minute RECENT_QUEUE_MESSAGE_IDS window expires, the same provider message can still be re-enqueued and processed again.

Useful? React with 👍 / 👎.

@Br1an67 Br1an67 force-pushed the fix/30604 branch 2 times, most recently from 3545050 to 5117b7b Compare March 15, 2026 04:33
@Br1an67 Br1an67 closed this Mar 15, 2026
@Br1an67 Br1an67 reopened this Mar 15, 2026
…es (closes openclaw#30604)

Add a TTL-based delivered-message dedupe cache to the followup queue.
After items are drained via queue.items.splice, their message IDs are
recorded so re-delivery of the same provider message is rejected for
20 minutes instead of being re-enqueued.

- Add DELIVERED_QUEUE_MESSAGE_IDS cache in enqueue.ts
- Check delivered cache during enqueueFollowupRun
- Export markFollowupRunsDelivered for drain loop
- Wire markFollowupRunsDelivered in drain.ts after splice

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Br1an67

Br1an67 commented Mar 17, 2026

Copy link
Copy Markdown
Contributor Author

Closing to manage active PR count. Will reopen when slot is available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Followup queue delivers same message multiple times when agent is busy

1 participant