Skip to content

fix(queue): preserve queued platform message ids in collect drains (Fixes #36212)#91272

Open
deepujain wants to merge 1 commit into
openclaw:mainfrom
deepujain:fix/36212-queued-message-id
Open

fix(queue): preserve queued platform message ids in collect drains (Fixes #36212)#91272
deepujain wants to merge 1 commit into
openclaw:mainfrom
deepujain:fix/36212-queued-message-id

Conversation

@deepujain

Copy link
Copy Markdown
Contributor

Summary

Fixes #36212.

Collected queued followups were dropping the original platform message_id when the drain synthesized a combined [Queued messages while agent was busy] turn. The queue already stores each inbound platform message id, but the collect and overflow-summary drain paths were rebuilding a new followup run without carrying that metadata forward.

This change keeps the latest queued item's messageId (and reply-target id when present) on the synthesized followup run so downstream message tooling can still target the real platform message instead of a synthetic placeholder id.

Non-goals:

  • no queue-mode behavior changes
  • no changes to dedupe keys or routing selection
  • no provider-specific reply policy changes

Tests and validation

node scripts/run-vitest.mjs src/auto-reply/reply/queue.collect.test.ts
node scripts/run-vitest.mjs src/auto-reply/reply/followup-runner.test.ts --testNamePattern "restart sentinel routed followups use reply target ids for CLI current message context"
git diff --check

Real behavior proof

Behavior addressed:

  • collect-mode queued followups should preserve the latest queued platform message_id when they collapse multiple queued inbound messages into a single deferred turn

Environment tested:

  • local source checkout on current main-based branch

Evidence after the fix:

  • added a focused regression in src/auto-reply/reply/queue.collect.test.ts that enqueues two Feishu-style queued messages with distinct messageId values and verifies the drained collected followup keeps platform-msg-2
  • the existing followup-runner routing test for reply-target/current-message-id behavior remains green

Observed result:

  • the collect drain now forwards the latest queued message id instead of dropping back to a synthetic followup-only id

What was not tested:

  • a live Feishu im.message.reply call against a real channel account

Proof limitations:

  • I attempted a standalone ad-hoc local queue-drain probe, but the fresh worktree did not have the hydrated dependency layout needed for that direct runtime script, so the meaningful proof here is the focused regression test over the real queue drain code path

Risk

Low. This only changes metadata preservation on synthesized queued followup runs. It does not alter queue admission, batching rules, or provider routing. The main behavior change is that downstream tooling sees the real latest queued platform message id where previously it saw no preserved queued message id.

Keep the latest queued platform message id when collect-mode followups synthesize a combined deferred turn.

Fixes openclaw#36212

Copy link
Copy Markdown
Contributor Author

Focused fix for the collect/overflow queue drain path. I validated the new regression in src/auto-reply/reply/queue.collect.test.ts, kept the existing followup current-message-id routing test green, and left the proof limits in the PR body since I did not have a live Feishu account to exercise im.message.reply directly.

@openclaw-barnacle openclaw-barnacle Bot added size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 8, 2026
@clawsweeper

clawsweeper Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed June 7, 2026, 8:15 PM ET / 00:15 UTC.

Summary
The PR preserves queued platform message metadata when collect-mode and overflow-summary queue drains synthesize followup runs.

PR surface: Source +23, Tests +44. Total +67 across 2 files.

Reproducibility: yes. Source inspection gives a high-confidence path: enqueue same-route collect-mode FollowupRun items with messageId values, then drain; current main synthesizes the collected run without carrying messageId forward.

Review metrics: none identified.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted after-fix proof from a real or maintainer-credible Feishu/channel setup showing the collected queued message preserves the platform message id.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body provides tests only and says no live Feishu account was used; the contributor should add redacted live output, logs, terminal proof, or a screenshot/recording that shows the real post-fix behavior, then update the PR body for re-review.

Risk before merge

  • [P1] Real Feishu/native reply behavior remains unproven after the patch; the submitted evidence is regression tests plus an explicit note that no live account was exercised.

Maintainer options:

  1. Decide the mitigation before merge
    Land this focused queue-layer preservation only after redacted real behavior proof or maintainer-run channel proof confirms the collected queued turn can use the preserved platform message id.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] The remaining blocker is contributor-provided real behavior proof, not a concrete code repair that automation can safely supply.

Security
Cleared: The diff only changes in-process queue metadata propagation and a focused test; it does not touch secrets, dependencies, CI, packaging, or code-execution surfaces.

Review details

Best possible solution:

Land this focused queue-layer preservation only after redacted real behavior proof or maintainer-run channel proof confirms the collected queued turn can use the preserved platform message id.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection gives a high-confidence path: enqueue same-route collect-mode FollowupRun items with messageId values, then drain; current main synthesizes the collected run without carrying messageId forward.

Is this the best way to solve the issue?

Yes. The queue drain is the owner boundary where collected followup runs are synthesized, so preserving the already-stored message metadata there is narrower and cleaner than a Feishu-only workaround.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5c5391836b1e.

Label changes

Label justifications:

  • P2: The PR addresses a normal-priority queued message reply-target bug with limited blast radius in auto-reply delivery.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body provides tests only and says no live Feishu account was used; the contributor should add redacted live output, logs, terminal proof, or a screenshot/recording that shows the real post-fix behavior, then update the PR body for re-review.
Evidence reviewed

PR surface:

Source +23, Tests +44. Total +67 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 1 23 0 +23
Tests 1 44 0 +44
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 2 67 0 +67

What I checked:

  • Repository policy applied: Root AGENTS.md was read fully, no scoped AGENTS.md owns src/auto-reply, and the relevant rule here is the OpenClaw PR proof/review gate for message-delivery changes. (AGENTS.md:9, 5c5391836b1e)
  • Current main stores inbound message ids before queueing: runPreparedReply builds FollowupRun with messageId from MessageSidFull/MessageSid and originatingReplyToId from ReplyToId, so the data exists before queue drain synthesis. (src/auto-reply/reply/get-reply-run.ts:1255, 5c5391836b1e)
  • Current main drops message id in collect synthesis: The current collect drain rebuilds a synthetic FollowupRun with prompt, run, routing, runtime metadata, and images, but no messageId or originatingReplyToId. (src/auto-reply/reply/queue/drain.ts:484, 5c5391836b1e)
  • Downstream runtime depends on queued messageId: createFollowupRunner passes queued.messageId as currentMessageId for normal queued followups in both CLI and embedded runtime paths. (src/auto-reply/reply/followup-runner.ts:766, 5c5391836b1e)
  • PR diff addresses the source gap: The patch adds resolveOriginMessageMetadata, spreads it into collected followup runs, and copies item messageId/originatingReplyToId for overflow-summary drained items. (src/auto-reply/reply/queue/drain.ts:506, a332c2630940)
  • Related issue remains canonical context: The linked issue reports production Feishu failures where queued message ids become internal placeholders; its latest ClawSweeper review already marked the issue source-reproducible and queueable.

Likely related people:

  • steipete: Local blame for the current queue drain, followup-runner, and get-reply-run message metadata paths points to Peter Steinberger's ce015ce refactor; the related issue review also links steipete to adjacent followup origin routing work. (role: recent area contributor; confidence: high; commits: ce015cef5775, 54648a9cf15d, 125dc322f5c5; files: src/auto-reply/reply/queue/drain.ts, src/auto-reply/reply/followup-runner.ts, src/auto-reply/reply/get-reply-run.ts)
  • Agustin Rivera: The prior ClawSweeper review on the linked issue identifies auth-context collect batching work in the same queue drain and collect-test surface. (role: recent area contributor; confidence: medium; commits: 43d4be902755; files: src/auto-reply/reply/queue/drain.ts, src/auto-reply/reply/queue.collect.test.ts)
  • vincentkoc: The linked issue review identifies recent Feishu and queue-adjacent work relevant to the provider side of the reported reply-target behavior. (role: recent adjacent contributor; confidence: medium; commits: 268824ff3ab9, 2b96f53f9782; files: extensions/feishu/src/bot.ts, extensions/feishu/src/channel.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. labels Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P2 Normal backlog priority with limited blast radius. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Queued messages should preserve original platform message_id

1 participant