[Feature]: Feishu inbound dedup cache lost on SIGUSR1 restart, causing duplicate message processing #14431

@Jesse-HH

Summary

When the gateway restarts in-process (SIGUSR1, triggered by config changes, model switches, or /new), the inboundDedupeCache is re-initialized as an empty Map. The Feishu/Lark SDK's WebSocket client then reconnects and replays recent events. Since the cache is empty, previously-processed messages pass shouldSkipDuplicateInbound() and get dispatched to the agent again — resulting in duplicate replies to the user.

Reproduction:
1. Send a few messages via Feishu DM.
2. Trigger a gateway restart (e.g. config.patch to change model).
3. Observe: one or more of the recently sent messages get re-dispatched and replied to again.

Log evidence:
02:59:09.082 received message from ou_xxx (p2p)
02:59:09.087 dispatching to agent
02:59:09.095 received message from ou_xxx (p2p) ← same event, 13ms later
02:59:09.098 dispatching to agent
02:59:09.100 dispatch complete (replies=0) ← second copy gets no reply (race)
02:59:20.156 dispatch complete (replies=1) ← first copy gets replied

In other cases the duplicate does get a full reply, causing the user to see a response to a message they didn't just send.

Root cause:
inboundDedupeCache in src/auto-reply/reply/inbound-dedupe.ts is a pure in-memory Map (via createDedupeCache). It does not survive process restarts. The Lark SDK WSClient reconnects after restart and re-delivers recent events (standard at-least-once semantics). With an empty cache, all replayed events are treated as new.
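
To make the failure mode concrete, here is a minimal sketch of an in-memory TTL dedupe cache and what happens when it is re-created. It is not the actual `createDedupeCache` implementation: `createDedupeCacheSketch`, its `ttlMs`/`maxSize` parameters, and the `feishu:om_123` key are all made up for illustration.

```typescript
// Hypothetical stand-in for the real createDedupeCache; names and options are assumptions.
interface DedupeCache {
  /** Returns true if the key was seen within the TTL; otherwise records it and returns false. */
  check(key: string, now?: number): boolean;
  size(): number;
}

function createDedupeCacheSketch(ttlMs: number, maxSize: number): DedupeCache {
  const seen = new Map<string, number>(); // key -> last-seen timestamp (ms)

  function prune(now: number): void {
    // Drop entries whose TTL has expired.
    for (const [key, ts] of seen) {
      if (now - ts >= ttlMs) seen.delete(key);
    }
    // Map iterates in insertion order, so the first keys are the oldest.
    while (seen.size > maxSize) {
      const oldest = seen.keys().next().value;
      if (oldest === undefined) break;
      seen.delete(oldest);
    }
  }

  return {
    check(key, now = Date.now()) {
      const ts = seen.get(key);
      if (ts !== undefined && now - ts < ttlMs) {
        return true; // duplicate within the TTL
      }
      seen.delete(key); // re-insert so iteration order reflects recency
      seen.set(key, now);
      prune(now);
      return false;
    },
    size: () => seen.size,
  };
}

// Simulate the restart: the same event is replayed after the cache is re-created.
let cache = createDedupeCacheSketch(60_000, 1000);
const firstDelivery = cache.check("feishu:om_123", 1_000);     // false: new event
const replaySameProcess = cache.check("feishu:om_123", 1_013); // true: skipped as duplicate
cache = createDedupeCacheSketch(60_000, 1000);                 // what the restart effectively does
const replayAfterRestart = cache.check("feishu:om_123", 2_000); // false: treated as new again
```

The last call is the bug in miniature: the key is still well inside its TTL, but the fresh Map has no record of it.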

Suggested fix (any of):
1. Persist the dedup cache to SQLite (already available in the codebase) and restore on startup.
2. On SIGUSR1 in-process restart, preserve the inboundDedupeCache instance across the reload cycle instead of re-initializing.
3. Add a startup grace period: after Feishu WS reconnects, ignore events with timestamps older than N seconds before the restart.
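
A rough sketch of the two lighter-weight suggestions above (preserving the cache instance, and the startup grace period) might look like the following. Every name here is hypothetical and not taken from the OpenClaw codebase. The sketch assumes that the in-process restart keeps the same Node.js process, so `globalThis` survives it, and that each inbound event carries a creation timestamp in milliseconds. The grace-period function reads "older than N seconds before the restart" literally, which is one possible interpretation.

```typescript
// All names below are hypothetical; none come from the OpenClaw codebase.
const CACHE_KEY = "__inboundDedupeCacheSketch";

// Preserve the instance: keep one Map per process rather than one per module initialization.
function getProcessWideDedupeMap(): Map<string, number> {
  const holder = globalThis as unknown as Record<string, unknown>;
  const existing = holder[CACHE_KEY];
  if (existing instanceof Map) {
    return existing as Map<string, number>; // reuse what an earlier init created
  }
  const created = new Map<string, number>();
  holder[CACHE_KEY] = created;
  return created;
}

// Grace period: for a while after reconnecting, drop events created well before the restart.
interface ReplayFilterOptions {
  restartedAtMs: number;   // when the in-process restart began
  reconnectedAtMs: number; // when the WebSocket client reconnected
  gracePeriodMs: number;   // how long after reconnecting the filter stays active
  lookbackMs: number;      // the "N seconds" from the suggestion
}

function shouldDropAsStaleReplay(
  eventCreatedAtMs: number,
  nowMs: number,
  opts: ReplayFilterOptions,
): boolean {
  const withinGracePeriod = nowMs - opts.reconnectedAtMs <= opts.gracePeriodMs;
  const cutoffMs = opts.restartedAtMs - opts.lookbackMs;
  return withinGracePeriod && eventCreatedAtMs < cutoffMs;
}

// Usage: a second "initialization" sees the entries recorded by the first.
const beforeRestart = getProcessWideDedupeMap();
beforeRestart.set("feishu:om_123", 1_000);
const afterRestart = getProcessWideDedupeMap();
const survived = afterRestart.has("feishu:om_123"); // true

const opts: ReplayFilterOptions = {
  restartedAtMs: 100_000,
  reconnectedAtMs: 101_000,
  gracePeriodMs: 30_000,
  lookbackMs: 5_000,
};
const dropsOld = shouldDropAsStaleReplay(90_000, 102_000, opts);        // true: created before the cutoff
const keepsRecent = shouldDropAsStaleReplay(99_000, 102_000, opts);     // false: inside the lookback window
const keepsAfterGrace = shouldDropAsStaleReplay(90_000, 140_000, opts); // false: grace period is over
```

Under this reading, events created within the last N seconds before the restart still pass the timestamp filter, so it would complement a surviving dedupe cache rather than replace it.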
Impact: Affects any channel using WebSocket with at-least-once delivery (Feishu confirmed; potentially others). Users see "ghost replies" to messages they didn't just send, eroding trust in the system.

Environment: OpenClaw 2026.2.9, Feishu channel (websocket mode), macOS

Labels: enhancement (New feature or request)
