feat(feishu): persistent message deduplication to prevent duplicate replies #23377

Merged
steipete merged 3 commits into openclaw:main from Sid-Qin:feat/feishu-persistent-dedup on Feb 22, 2026

Conversation

@Sid-Qin (Contributor) commented Feb 22, 2026

Summary

  • Problem: Feishu redelivers the same message during WebSocket reconnects, process restarts, or network instability. The current in-memory dedup map is lost on restart, causing duplicate replies in one-on-one chats.
  • Why it matters: Users receive the same reply multiple times after any OpenClaw restart or network hiccup, degrading the conversational experience.
  • What changed: Added a dual-layer (memory + filesystem) dedup strategy with 24h TTL. New file dedup-store.ts provides atomic, per-account persistent storage under ~/.openclaw/feishu/dedup/. Updated dedup.ts with an async tryRecordMessagePersistent() that checks memory first (fast path) then disk. Updated bot.ts to call the new async function.
  • What did NOT change: The synchronous tryRecordMessage() is preserved for backward compatibility. No changes to message handling logic, reply dispatch, or any other extension.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Duplicate replies in Feishu DM conversations after restart/reconnect are eliminated.
  • Dedup TTL extended from 30 minutes to 24 hours for better reliability.
  • A ~/.openclaw/feishu/dedup/ directory is created on first message to store dedup state.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No — only writes message IDs + timestamps to a local directory

Repro + Verification

Environment

  • OS: macOS 15.4 (Sequoia)
  • Runtime: Node.js 22
  • Integration/channel: Feishu (飞书)

Steps

  1. Start OpenClaw with a Feishu account configured
  2. Send a message in a DM conversation
  3. Restart OpenClaw (or simulate WebSocket reconnect)
  4. Feishu redelivers the same message_id

Expected

  • The redelivered message is detected as duplicate and skipped

Actual

  • Before fix: duplicate reply sent to user
  • After fix: "feishu: skipping duplicate message <id>" is logged and no duplicate reply is sent

Evidence

Key changes:

dedup-store.ts — new persistent store:

// Atomic write: tmp file → rename to avoid partial reads
// Probabilistic cleanup (~2% chance per write) keeps file bounded
// Max 10k entries per account file
// (signatures only; method bodies omitted)
export declare class DedupStore {
  has(namespace: string, messageId: string, ttlMs: number): Promise<boolean>;
  record(namespace: string, messageId: string, ttlMs: number): Promise<void>;
}
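
To illustrate the atomic tmp-file + rename pattern named in the comments above, here is a minimal sketch. It is not the PR's dedup-store.ts: the helper names (dedupFilePath, writeAtomic, readOrEmpty), the one-JSON-file-per-account layout, and the `<namespace>.json` file name are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape: message ID -> timestamp (ms) when the message was first seen.
type DedupData = Record<string, number>;

// Assumed layout: one JSON file per account under ~/.openclaw/feishu/dedup/.
// The "<namespace>.json" file name is hypothetical.
function dedupFilePath(
  namespace: string,
  baseDir: string = path.join(os.homedir(), ".openclaw", "feishu", "dedup"),
): string {
  return path.join(baseDir, `${namespace}.json`);
}

// Write the full contents to a temp file in the same directory, then rename it
// over the target. Readers see the old file or the new one, not a partial write.
async function writeAtomic(filePath: string, data: DedupData): Promise<void> {
  await fs.promises.mkdir(path.dirname(filePath), { recursive: true });
  const tmpPath = `${filePath}.${process.pid}.${Date.now()}.tmp`;
  await fs.promises.writeFile(tmpPath, JSON.stringify(data), "utf-8");
  await fs.promises.rename(tmpPath, filePath);
}

// Treat a missing or corrupt file as an empty store instead of throwing.
async function readOrEmpty(filePath: string): Promise<DedupData> {
  try {
    return JSON.parse(await fs.promises.readFile(filePath, "utf-8")) as DedupData;
  } catch {
    return {};
  }
}
```

Keeping the temp file in the same directory as the target matters here, because a rename across filesystems is not guaranteed to be atomic.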

dedup.ts — dual-layer async dedup:

export async function tryRecordMessagePersistent(
  messageId: string,
  namespace = "global",
  log?: (...args: unknown[]) => void,
): Promise<boolean> {
  // 1. Check memory (sync fast path)
  // 2. Check disk (async, survives restarts)
  // 3. Record to both layers
  // Disk errors → fallback to memory-only, never blocks
}
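
The three numbered steps in the outline above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged dedup.ts: PersistentStore and createDedup are hypothetical names, the log text is invented, and the sketch deliberately writes to memory before the first await (a simplification that may differ from how the PR orders its writes).

```typescript
// Hypothetical persistent layer; in the PR this role is played by DedupStore.
interface PersistentStore {
  has(namespace: string, messageId: string, ttlMs: number): Promise<boolean>;
  record(namespace: string, messageId: string, ttlMs: number): Promise<void>;
}

const DEDUP_TTL_MS = 24 * 60 * 60 * 1000; // 24h, per the PR description

function createDedup(
  store: PersistentStore,
  log: (...args: unknown[]) => void = () => {},
) {
  const memory = new Map<string, number>();

  // Resolves to true when the message is new and should be processed.
  return async function tryRecord(messageId: string, namespace = "global"): Promise<boolean> {
    const key = `${namespace}:${messageId}`;
    const seenAt = memory.get(key);
    // 1. Memory: synchronous fast path.
    if (seenAt !== undefined && Date.now() - seenAt <= DEDUP_TTL_MS) return false;

    // Claim the key before any await so a concurrent call hits the fast path.
    memory.set(key, Date.now());
    try {
      // 2. Disk: survives restarts.
      if (await store.has(namespace, messageId, DEDUP_TTL_MS)) return false;
      // 3. Persist the new ID.
      await store.record(namespace, messageId, DEDUP_TTL_MS);
    } catch (err) {
      // Disk errors degrade to memory-only dedup; the message is still handled.
      log("dedup: disk layer failed, using memory only", err);
    }
    return true;
  };
}
```

A caller would build the function once (for example `const tryRecord = createDedup(store, log)`) and skip the message whenever `await tryRecord(messageId, accountId)` resolves to false.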

bot.ts — caller updated:

// Before:
if (!tryRecordMessage(messageId)) { ... }
// After:
if (!(await tryRecordMessagePersistent(messageId, account.accountId, log))) { ... }

Human Verification (required)

  • Verified scenarios: TypeScript compilation passes (tsc --noEmit), code review of all 3 files
  • Edge cases checked: disk write failure (graceful fallback), concurrent writes (atomic rename), file corruption (JSON parse catch), memory cache overflow (LRU eviction)
  • What I did not verify: live Feishu environment end-to-end test (no access to Feishu bot credentials in CI)

Compatibility / Migration

  • Backward compatible? Yes — sync tryRecordMessage still exported, existing callers unaffected
  • Config/env changes? No
  • Migration needed? No — dedup directory is created automatically on first use

Failure Recovery (if this breaks)

  • How to disable/revert: revert this PR; dedup falls back to memory-only (original behavior)
  • Files/config to restore: delete ~/.openclaw/feishu/dedup/ directory
  • Known bad symptoms: if filesystem is read-only, disk layer silently fails and logs a warning; memory-only dedup still works

Risks and Mitigations

  • Disk I/O latency: mitigated by checking memory first (sync fast path); disk is only hit on cold start or cache miss
  • File growth: mitigated by probabilistic cleanup + 10k entry cap per account
  • Partial writes: mitigated by atomic tmp-file + rename pattern

Greptile Summary

Adds persistent message deduplication for Feishu to prevent duplicate replies after restarts or reconnects. The implementation uses a dual-layer approach (memory + filesystem) with a 24-hour TTL, storing state in ~/.openclaw/feishu/dedup/.

Critical issues found:

  • Namespace isolation bug: The global memoryCache doesn't include namespace in keys, causing cross-account collisions when different accounts receive messages with identical IDs
  • Race condition: Concurrent messages with the same ID can both pass the check-then-record sequence, allowing duplicates through
  • Cache staleness: The DedupStore cache is never invalidated after external writes, potentially serving stale data indefinitely
  • Incomplete TTL enforcement: has() deletes expired entries from memory but never flushes to disk, causing them to reappear after restart

The approach is sound but the implementation has multiple correctness issues that undermine the deduplication guarantee. These issues particularly affect multi-account setups and high-concurrency scenarios.

Confidence Score: 2/5

  • Not safe to merge - contains multiple correctness bugs that break deduplication guarantees
  • Score reflects four critical logic errors: namespace isolation bug causes cross-account message collisions, race condition allows concurrent duplicates through, cache never invalidates causing stale reads, and incomplete TTL enforcement allows expired entries to resurrect. These aren't edge cases - they affect core functionality in multi-account and concurrent scenarios.
  • Pay close attention to extensions/feishu/src/dedup.ts (namespace isolation + race condition) and extensions/feishu/src/dedup-store.ts (cache invalidation + TTL consistency)

Last reviewed commit: aece3d3

@openclaw-barnacle bot added the channel: feishu and size: S labels on Feb 22, 2026

@greptile-apps bot (Contributor) left a comment

3 files reviewed, 4 comments

Comment on lines 60 to 66
+   if (memoryCache.has(messageId)) return false;

-   // Throttled cleanup: evict expired entries at most once per interval.
-   if (now - lastCleanupTime > DEDUP_CLEANUP_INTERVAL_MS) {
-     for (const [id, ts] of processedMessageIds) {
-       if (now - ts > DEDUP_TTL_MS) {
-         processedMessageIds.delete(id);
-       }
+   try {
+     if (await store.has(namespace, messageId, DEDUP_TTL_MS)) {
+       memoryCache.set(messageId, Date.now());
+       return false;
+     }
race condition between memory and disk checks allows duplicates

if two identical messages arrive concurrently:

  1. both pass memory check (line 60)
  2. both check disk and find nothing (line 63)
  3. both record to memory+disk (lines 75, 78)
  4. both return true → duplicate processing

the check (line 63) and record (line 78) are not atomic, so concurrent calls with the same messageId can both see "not found" and both proceed
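
One way to close a check-then-record window like this is to track in-flight checks per key, which is the direction the follow-up commit describes ("inflight tracking map"). The sketch below is an assumption about how such tracking could look, not the code that landed; dedupOnce and its callback parameter are hypothetical.

```typescript
// Hypothetical helper: at most one check-and-record runs per key at a time.
const inflight = new Map<string, Promise<boolean>>();

async function dedupOnce(
  key: string,
  checkAndRecord: () => Promise<boolean>,
): Promise<boolean> {
  const pending = inflight.get(key);
  if (pending) {
    try {
      // Another call is already deciding this key. If it completes, this
      // call is the duplicate regardless of what the first one returned.
      await pending;
      return false;
    } catch {
      // The first attempt failed, so nothing was recorded. Try again.
      return dedupOnce(key, checkAndRecord);
    }
  }
  // Register the promise synchronously, before yielding to the event loop,
  // and remove it once the check settles.
  const run = checkAndRecord().finally(() => inflight.delete(key));
  inflight.set(key, run);
  return run;
}
```

Because the map entry is set before the first await completes, two deliveries of the same message ID in the same process share one disk check instead of both seeing "not found". This does not protect against a second process writing the same file.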

This comment was marked as spam.

Comment on lines +47 to +56
async has(namespace: string, messageId: string, ttlMs: number): Promise<boolean> {
  const data = await this.load(namespace);
  const ts = data[messageId];
  if (ts == null) return false;
  if (Date.now() - ts > ttlMs) {
    delete data[messageId];
    return false;
  }
  return true;
}
mutating cached data without flush creates inconsistency

has() deletes expired entries from the in-memory cache (line 52) but never writes back to disk. if process restarts before next record() call, the expired entry reappears from disk

either skip the delete, or call flush() after modifying cache
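
The "skip the delete" option can be sketched as a read-only check plus a write path that owns all mutation, with cleanup triggered probabilistically on writes as the PR description states (~2% per write, 10k entry cap). The function names and the pure-function style below are assumptions for illustration, not the PR's code.

```typescript
// Assumed shape: message ID -> first-seen timestamp (ms).
type DedupData = Record<string, number>;

const MAX_ENTRIES = 10000; // cap per account file, per the PR description
const CLEANUP_PROBABILITY = 0.02; // ~2% of writes, per the PR description

// Read-only: an expired entry counts as "not seen" but is left in place,
// so the cached copy and the file on disk cannot drift apart here.
function hasFresh(data: DedupData, messageId: string, ttlMs: number, now = Date.now()): boolean {
  const ts = data[messageId];
  return ts !== undefined && now - ts <= ttlMs;
}

// Drop expired entries, then keep only the newest MAX_ENTRIES.
function prune(data: DedupData, ttlMs: number, now = Date.now()): DedupData {
  const freshIds = Object.keys(data)
    .filter((id) => now - data[id] <= ttlMs)
    .sort((a, b) => data[b] - data[a]) // newest first
    .slice(0, MAX_ENTRIES);
  const out: DedupData = {};
  for (const id of freshIds) out[id] = data[id];
  return out;
}

// All mutation happens on the write path, so the caller can flush right after.
function recordEntry(
  data: DedupData,
  messageId: string,
  ttlMs: number,
  now = Date.now(),
  random: () => number = Math.random,
): DedupData {
  const next: DedupData = { ...data, [messageId]: now };
  const overCap = Object.keys(next).length > MAX_ENTRIES;
  return overCap || random() < CLEANUP_PROBABILITY ? prune(next, ttlMs, now) : next;
}
```

The `now` and `random` parameters exist only so the behavior can be exercised deterministically in a test.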

This comment was marked as spam.

Comment on lines +31 to +45
async load(namespace: string): Promise<DedupData> {
  const cached = this.cache.get(namespace);
  if (cached) return cached;

  try {
    const raw = await fs.promises.readFile(this.filePath(namespace), "utf-8");
    const data: DedupData = JSON.parse(raw);
    this.cache.set(namespace, data);
    return data;
  } catch {
    const data: DedupData = {};
    this.cache.set(namespace, data);
    return data;
  }
}
cache is never invalidated, can return stale data after concurrent writes

the in-memory cache Map (line 20) is populated once per namespace and never refreshed. if another process (or concurrent call in same process) writes to disk via flush(), this instance keeps serving stale data from cache

consider removing the cache entirely (always read from disk), or use file watchers / mtime checks to detect external changes
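
A lighter alternative to file watchers is to expire the cached copy after a fixed age, which is what the follow-up commit describes (a 30s time-based invalidation plus a refreshed load timestamp after flush). The sketch below shows that idea in isolation; TimedCache, markFlushed, and the injectable clock are hypothetical names, not the PR's DedupStore.

```typescript
const CACHE_MAX_AGE_MS = 30000; // 30s, per the follow-up commit message

class TimedCache<T> {
  private readonly entries = new Map<string, { value: T; loadedAt: number }>();
  private readonly loadFromDisk: (namespace: string) => Promise<T>;
  private readonly now: () => number;
  private readonly maxAgeMs: number;

  constructor(
    loadFromDisk: (namespace: string) => Promise<T>,
    now: () => number = () => Date.now(),
    maxAgeMs: number = CACHE_MAX_AGE_MS,
  ) {
    this.loadFromDisk = loadFromDisk;
    this.now = now;
    this.maxAgeMs = maxAgeMs;
  }

  // Serve the cached value while it is younger than maxAgeMs, else re-read.
  async load(namespace: string): Promise<T> {
    const hit = this.entries.get(namespace);
    if (hit && this.now() - hit.loadedAt < this.maxAgeMs) return hit.value;
    const value = await this.loadFromDisk(namespace);
    this.entries.set(namespace, { value, loadedAt: this.now() });
    return value;
  }

  // Call after a successful flush so data that was just written is not re-read.
  markFlushed(namespace: string, value: T): void {
    this.entries.set(namespace, { value, loadedAt: this.now() });
  }
}
```

With this approach external writes become visible after at most the configured age, at the cost of one extra disk read per namespace per interval.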

This comment was marked as spam.

Comment on lines 9 to 10
const memoryCache = new Map<string, number>();
let lastCleanupTime = Date.now();
shared memory cache breaks namespace isolation

memoryCache is global (line 9) but doesn't include namespace in the key. if two accounts both receive message_id="abc", the second account's check (line 60) incorrectly returns false (duplicate) even though it's a different namespace

prefix keys with namespace: memoryCache.set(`${namespace}:${messageId}`, ...)
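
As a small illustration of the suggested key prefix, the sketch below routes every read and write of the shared map through one key-building helper. The helper names are hypothetical, and the `:` separator assumes account IDs do not themselves contain a colon.

```typescript
// Shared across accounts, so every key must carry its namespace.
const memoryCache = new Map<string, number>();

// Hypothetical helper: the single place where cache keys are built.
// Assumes namespaces contain no ":" so keys stay unambiguous.
function cacheKey(namespace: string, messageId: string): string {
  return `${namespace}:${messageId}`;
}

function seenInMemory(namespace: string, messageId: string): boolean {
  return memoryCache.has(cacheKey(namespace, messageId));
}

function rememberInMemory(namespace: string, messageId: string, now = Date.now()): void {
  memoryCache.set(cacheKey(namespace, messageId), now);
}
```

With this in place, message_id "abc" recorded for one account no longer marks "abc" as a duplicate for another account.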

This comment was marked as spam.

SidQin-cyber and others added 3 commits February 22, 2026 11:13
…eplies

Closes openclaw#23369

Feishu may redeliver the same message during WebSocket reconnects or process
restarts.  The existing in-memory dedup map is lost on restart, so duplicates
slip through.

This adds a dual-layer dedup strategy:
- Memory cache (fast synchronous path, unchanged capacity)
- Filesystem store (~/.openclaw/feishu/dedup/) that survives restarts

TTL is extended from 30 min to 24 h.  Disk writes use atomic rename and
probabilistic cleanup to keep each per-account file under 10 k entries.
Disk errors are caught and logged — message handling falls back to
memory-only behaviour so it is never blocked.
…ache staleness

- Prefix memoryCache keys with namespace to prevent cross-account false
  positives when different accounts receive the same message_id
- Add inflight tracking map to prevent TOCTOU race where concurrent
  async calls for the same message both pass the check and both proceed
- Remove expired-entry deletion from has() to avoid silent cache/disk
  divergence; actual cleanup happens probabilistically inside record()
- Add time-based cache invalidation (30s) to DedupStore.load() so
  external writes are eventually picked up
- Refresh cacheLoadedAt after flush() so we don't immediately re-read
  data we just wrote

Co-authored-by: Cursor <cursoragent@cursor.com>
@steipete steipete force-pushed the feat/feishu-persistent-dedup branch from f37eacb to a73ef4c Compare February 22, 2026 10:13
@vercel bot commented Feb 22, 2026

@steipete is attempting to deploy a commit to the 0xBuns Team on Vercel.

A member of the Team first needs to authorize it.

@steipete steipete merged commit bf56196 into openclaw:main Feb 22, 2026
9 of 10 checks passed
@steipete

Landed via temp rebase onto main.

  • Gate: pnpm lint && pnpm build && pnpm test
  • Land commit: a73ef4c
  • Merge commit: bf56196

Thanks @SidQin-cyber!

hughmadden pushed a commit to turquoisebaydev/openclaw that referenced this pull request Feb 23, 2026
* test(cli): use lightweight clears in daemon lifecycle setup

* test(models): use lightweight clears in shared config setup

* test(agents): use lightweight clears for stable subagent announce defaults

* Sessions: persist prompt-token totals without usage

* fix(security): normalize hook auth rate-limit client keys

* refactor(cli): dedupe skills command report loading

* refactor(cli): dedupe channel auth resolution flow

* refactor(cli): dedupe allowlist command wiring

* test(cli): dedupe update restart fallback scenario setup

* test(cli): dedupe cron shared test fixtures

* refactor(cli): extract fish completion line builders

* test(cli): share nodes ios fixture helpers

* refactor(cli): share npm install metadata helpers

* refactor(cli): share pinned npm install record helper

* refactor(slack): dedupe modal lifecycle interaction handlers

* refactor(commands): share preview streaming migration logic

* test(gateway): reuse last agent command assertion helper

* test(discord): share provider lifecycle test harness

* test(discord): share thread binding sweep fixtures

* test(infra): dedupe shell env fallback test setup

* refactor(discord): dedupe voice command runtime checks

* test(discord): share model picker fallback fixtures

* test(discord): share message handler draft fixtures

* test(discord): share resolve-users guild probe fixture

* test(inbound): share dispatch capture mock across channels

* test(security): dedupe external marker sanitization assertions

* test(wizard): share onboarding prompter scaffold

* test(memory): share memory-tool manager mock fixture

* test(subagents): dedupe focus thread setup fixtures

* test(auth-profiles): dedupe cleared-state assertions

* test(memory): share short-timeout test helper

* test(outbound): share resolveOutboundTarget test suite

* test(auth-profiles): dedupe oauth mode resolution setup

* test(gateway): dedupe transcript seed fixtures in fs session tests

* refactor(text): share code-region parsing for reasoning tags

* refactor(node-host): share invoke type definitions

* refactor(logging): share node createRequire resolution

* test(models): dedupe auth-sync command assertions

* test(pi): share overflow-compaction test setup

* test(discord): dedupe guild permission route mocks

* refactor(config): dedupe legacy stream-mode migration paths

* test(gateway): dedupe tailscale header auth fixtures

* test(browser): dedupe relay probe server scaffolding

* test(cron): dedupe delivered-status run scaffolding

* test(gateway): dedupe control-ui not-found fixture assertions

* test(gateway): dedupe openai context assertions

* test(config): dedupe traversal include assertions

* test(config): dedupe nested redaction round-trip assertions

* test(gateway): reuse shared openai timeout e2e helpers

* test(gateway): dedupe chat history transcript helpers

* test(gateway): dedupe canvas ws connect assertions

* test(hooks): dedupe unsupported npm spec assertion

* test(agent): reuse isolated agent mock setup

* test(utils): share temp-dir helper across cli and web tests

* test(browser): dedupe generated-token persistence assertions

* test(browser): dedupe pw-session playwright mock wiring

* test(agents): dedupe spawn-hook wait mocks and add readiness error coverage

* test(agents): dedupe sanitize-session-history copilot fixtures

* test: dedupe lifecycle oauth and prompt-limit fixtures

* refactor(agents): share volc model catalog helpers

* refactor(agents): reuse shared tool-policy base helpers

* refactor: eliminate remaining duplicate blocks across draft streams and tests

* refactor(core): dedupe gateway runtime and config tests

* refactor(channels): dedupe message routing and telegram helpers

* refactor(agents): dedupe plugin hooks and test helpers

* chore: remove dead plugin hook loader

* fix(security): harden gateway command/audit guardrails

* test: dedupe telegram draft stream setup and extend state-dir env coverage

* Agents: drop stale pre-compaction usage snapshots

* docs(changelog): note next npm release for hook auth fix

* test(telegram): dedupe native-command test setup

* fix(gateway): block avatar symlink escapes

* test: dedupe cron and slack monitor test harness setup

* refactor(security): unify hook rate-limit and hook module loading

* test(gateway): dedupe loopback cases and trim setup resets

* test(agents): use lightweight clears in supervisor and session-status setup

* test(auto-reply): centralize subagent command test reset setup

* test(agents): centralize sessions tool gateway mock reset

* test(telegram): centralize native command session-meta mock setup

* test(browser): use lightweight clears in server lifecycle setup

* test(gateway): use lightweight clears in cron service setup

* test(commands): use lightweight clears in doctor memory search setup

* test(outbound): dedupe shared setup hooks in message e2e

* test(gateway): use lightweight clears in push handler setup

* test(gateway): use lightweight clears in node invoke wake setup

* test(gateway): use lightweight clears in node event setup

* test(gateway): use lightweight clears for hook cron run fences

* test(auto-reply): use lightweight clears in dispatch setup

* test(agents): use lightweight clears in sandbox browser create setup

* test(auto-reply): use lightweight clears in agent runner setup

* test(plugins): use lightweight clears in wired hooks setup

* test(gateway): use lightweight clears in client close setup

* test(ui): use lightweight clears in theme and telegram media retry setup

* test(agents): use lightweight clears in skills install e2e setup

* test(gateway): use lightweight clears for chat-b reply spy fences

* test(gateway): use lightweight clears for openai http agent fences

* test(gateway): use lightweight clears for openresponses agent fences

* test(core): use lightweight clears in update, child adapter, and copilot token setup

* test(agents): dedupe sessions_spawn e2e reset setup

* test(core): use lightweight clears in stable mock setup

* test(agents): dedupe sessions_spawn allowlist reset setup

* test(agents): drop redundant subagent registry cleanups

* test(core): trim redundant mock resets in heartbeat suites

* test(daemon): use lightweight clears in systemd mocks

* test(infra): use lightweight clears in update startup mocks

* test(gateway): use lightweight clears in agent handler tests

* test(infra): use lightweight clears in message action threading setup

* test(telegram): use lightweight clears in media handler setup

* test(commands): use lightweight clears in agents/channels setup

* fix: align draft/outbound typings and tests

* test: stabilize pw-session cdp mocking in parallel runs

* chore(docs): normalize security finding table formatting

* fix(ci): add explicit mock types in pw-session mock setup

* test(core): use lightweight clears in command and dispatch setup

* test(agents): use lightweight clears in skills/sandbox setup

* test(core): use lightweight clears in subagent and browser setup

* test(core): use lightweight clears in runtime and telegram setup

* test(core): trim redundant test resets and use mockClear

* test(slack): use lightweight clear in interactions modal-close case

* test(slack): avoid redundant reset in slash metadata wait case

* test(reply): replace heavy resets in media and runner helper specs

* test(agents): reduce reset overhead in session visibility and hooks specs

* test(subagents): lighten session delete mock reset in announce spec

* test(memory): prefer clear over reset in qmd spawn setup

* test(agents): keep targeted resets minimal in overflow retry spec

* chore: remove verified dead code paths

* test(core): reduce mock reset overhead across unit and e2e specs

* Agents: add fallback reply for tool-only completions

* test(core): trim reset usage in gateway and install source specs

* test(commands): use lightweight clears in config snapshot specs

* refactor(gateway)!: remove legacy v1 device-auth handshake

* test(subagents): use lightweight clears in sessions spawn suites

* test(core): continue mock reset reductions in auth, gateway, npm install

* test(core): continue reset-to-clear cleanup in subagent focus and web fetch

* test(config): use lightweight clear in session pruning e2e setup

* test(core): reduce reset overhead in messaging and agent e2e mocks

* test(core): tighten reset usage in auth, registry restart, and memory search

* fix: decouple owner display secret from gateway auth token

* chore: remove dead macos relay and daemon code

* test(core): use lightweight clear in cron, claude runner, and telegram delivery specs

* Agents/Subagents: honor subagent alsoAllow grants

* test(core): reduce mock reset overhead in targeted suites

* fix(security): block HOME and ZDOTDIR env override injection

* test(core): dedupe auth rotation and credential injection specs

* test(agents): dedupe subagent announce direct-send variants

* docs(changelog): add shell startup env override fix note

* chore(test): make shell-env trusted-shell assertion platform-aware

* test(commands): dedupe subagent status assertions

* fix: harden exec allowlist wrapper resolution

* test(agents): avoid full mock resets in cli credential specs

* chore(test): harden models status mock restoration

* test(core): dedupe command gating and trim announce reset overhead

* test(agents): unify hook thread-target announce assertions

* test(agents): collapse repeated announce direct-send scenarios

* test(reply): merge duplicate runReplyAgent streaming and fallback cases

* test(agents): use lightweight clear for active-run announce mock

* test(agents): remove overflow compaction mock reset dependency

* test(reply): use lightweight clears for runner-level mocks

* test(agents): consolidate repeated announce deferral and fallback matrices

* test(commands): replace subagent gateway reset with lightweight clear

* TUI: preserve RTL text order in terminal output

* docs(security): clarify dangerous control-ui bypass policy

* feat(security): warn on dangerous config flags at startup

* perf(test): bypass queue debounce in fast mode and tighten announce defaults

* fix(security): harden channel token and id generation

* refactor(security): unify secure id paths and guard weak patterns

* fix(gateway): remove hello-ok host and commit fields

* fix(security): block hook transform symlink escapes

* refactor: unify exec wrapper resolution and parity fixtures

* TUI: make Ctrl+C exit behavior reliably responsive

* test(heartbeat): dedupe sandbox/session helpers and collapse ack cases

* test(agents): simplify subagent announce suite imports and call assertions

* test(heartbeat): reuse shared temp sandbox in model override suite

* test(heartbeat): reuse shared sandbox for ghost reminder scenarios

* perf(test): compact heartbeat session fixture writes

* perf(test): shrink subagent announce fast-mode settle waits

* fix: use SID-based ACL classification for non-English Windows

* fix: detect zombie processes in isPidAlive on Linux

kill(pid, 0) succeeds for zombie processes, causing the gateway lock
to treat a zombie lock owner as alive. Read /proc/<pid>/status on
Linux to check for 'Z' (zombie) state before reporting the process
as alive. This prevents the lock from being held indefinitely by a
zombie process during gateway restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: release gateway lock before process.exit in run-loop

process.exit() called from inside an async IIFE bypasses the outer
try/finally block that releases the gateway lock. This leaves a stale
lock file pointing to a zombie PID, preventing the spawned child or
systemctl restart from acquiring the lock. Release the lock explicitly
before calling exit in both the restart-spawned and stop code paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: release gateway lock before spawning restart child

Move lock.release() before restartGatewayProcessWithFreshPid() so the
spawned child can immediately acquire the lock without racing against
a zombie parent. This eliminates the root cause of the restart loop
where the child times out waiting for a lock held by its now-dead parent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: guard entry.ts top-level code with isMainModule to prevent duplicate gateway start

The bundler exports shared symbols from dist/entry.js, so other chunks
import it as a dependency. When dist/index.js is the actual entry point
(e.g. systemd service), lazy module loading eventually imports entry.js,
triggering its unguarded top-level code which calls runCli(process.argv)
a second time. This starts a duplicate gateway that fails on lock/port
contention and crashes the process with exit(1), causing a restart loop.

Wrap all top-level executable code in an isMainModule() check so it only
runs when entry.ts is the actual main module, not when imported as a
shared dependency by the bundler.

* fix: tighten gateway restart loop handling (openclaw#23416) (thanks @jeffwnli)

* chore: fix temp-path guard skip for *.test-helpers.ts

* fix: include modelByChannel in config validator allowedChannels

The hand-written config validator rejects `channels.modelByChannel` as
"unknown channel id: modelByChannel" even though the Zod schema, TypeScript
types, runtime code, and CLI docs all treat it as valid. The `defaults`
meta-key was already whitelisted but `modelByChannel` was missed when
the feature was added in 2026.2.21.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* also skip modelByChannel in plugin-auto-enable channel iteration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: cover channels.modelByChannel validation/auto-enable

* fix: finalize modelByChannel validator landing (openclaw#23412) (thanks @ProspectOre)

* refactor: simplify windows ACL parsing and expand coverage

* refactor(gateway): simplify restart flow and expand lock tests

* refactor(plugin-sdk): unify channel dedupe primitives

* fix(acp): wait for gateway connection before processing ACP messages

- Move gateway.start() before AgentSideConnection creation
- Wait for hello message to confirm connection is established
- This fixes issues where messages were processed before gateway was ready

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden ACP gateway startup sequencing (openclaw#23390) (thanks @janckerchen)

* Memory/QMD: normalize Han-script BM25 search queries

* fix(stability): patch regex retries and timeout abort handling

* fix: handle intentional signal daemon shutdown on abort (openclaw#23379) (thanks @frankekn)

* refactor(signal): extract daemon lifecycle and typed exit handling

* Exec: fail closed when sandbox host is unavailable

* fix: harden exec sandbox fallback semantics (openclaw#23398) (thanks @bmendonca3)

* test: stabilize temp-path guard across runtimes (openclaw#23398)

* test: harden temp path guard detection (openclaw#23398)

* fix(feishu): avoid template tmpdir join in dedup state path (openclaw#23398)

* feat(feishu): persistent message deduplication to prevent duplicate replies

Closes openclaw#23369

Feishu may redeliver the same message during WebSocket reconnects or process
restarts.  The existing in-memory dedup map is lost on restart, so duplicates
slip through.

This adds a dual-layer dedup strategy:
- Memory cache (fast synchronous path, unchanged capacity)
- Filesystem store (~/.openclaw/feishu/dedup/) that survives restarts

TTL is extended from 30 min to 24 h.  Disk writes use atomic rename and
probabilistic cleanup to keep each per-account file under 10 k entries.
Disk errors are caught and logged — message handling falls back to
memory-only behaviour so it is never blocked.

* fix(feishu): address dedup race condition, namespace isolation, and cache staleness

- Prefix memoryCache keys with namespace to prevent cross-account false
  positives when different accounts receive the same message_id
- Add inflight tracking map to prevent TOCTOU race where concurrent
  async calls for the same message both pass the check and both proceed
- Remove expired-entry deletion from has() to avoid silent cache/disk
  divergence; actual cleanup happens probabilistically inside record()
- Add time-based cache invalidation (30s) to DedupStore.load() so
  external writes are eventually picked up
- Refresh cacheLoadedAt after flush() so we don't immediately re-read
  data we just wrote

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: tighten feishu dedupe boundary (openclaw#23377) (thanks @SidQin-cyber)

* Feat/logger support log level validation0222 (openclaw#23436)

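* 1. Environment variable: adds `OPENCLAW_LOG_LEVEL`, which accepts `silent|fatal|error|warn|info|debug|trace`. When set, it overrides the level for both file logging and the console, and takes precedence over the config file.
2. Startup flag: adds `--log-level <level>` to `openclaw gateway run`, which applies to both file and console logging for that process; when omitted, the environment variable or config file is still used.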

* fix(logging): make log-level override global and precedence-safe

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>

* fix(telegram): prevent update offset skipping queued updates (openclaw#23284)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 92efaf9
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus

* fix: stop hardcoded channel fallback and auto-pick sole configured channel (openclaw#23357) (thanks @lbo728)

Co-authored-by: lbo728 <extreme0728@gmail.com>

* docs(security): clarify workspace memory trust boundary

* Security: expand audit checks for mDNS and real-IP fallback

* fix: land security audit severity + temp-path guard fixes (openclaw#23428) (thanks @bmendonca3)

* test(heartbeat): use shared sandbox in sender target suite

* perf(test): compact remaining heartbeat fixture writes

* test(reply): align native trigger suite with fast-test fixture patterns

* perf(test): speed subagent announce retry polling in fast mode

* test(agents): dedupe auth profile rotation fixture setup

* perf(test): trim background abort settle waits and dedupe cmd fixture

* perf(test): trim nested subagent output wait floor in fast mode

* perf(test): lower fast-mode nested output wait floor to 80ms

* test(agents): remove dead shell-timeout override in safeBins suite

* perf(test): lower fast-mode nested output wait floor to 70ms

* perf(test): remove flaky transport timeout and dedupe safeBins checks

* perf(test): mock compact module in auth rotation e2e

* perf(test): reduce subagent announce fast-mode polling waits

* perf(test): lower subagent fast-mode wait floors

* perf(test): trim bash e2e sleep and poll windows

* perf(test): narrow pi-embedded runner e2e import path

* test: reclassify mocked runner/safe-bins suites as unit tests

* test: reclassify auth-profile-rotation suite as unit test

* test: reclassify mocked announce and sandbox suites as unit tests

* perf(test): tighten background abort timing windows

* test: reclassify sandbox merge and exec path suites as unit tests

* perf(test): speed up sessions_spawn lifecycle suite setup

* test: reclassify sessions_spawn lifecycle suite as unit test

* perf(test): reduce bash e2e wait windows

* fix(gateway): strip directive tags from non-streaming webchat broadcasts

Closes openclaw#23053

The streaming path already strips [[reply_to_current]] and other
directive tags via stripInlineDirectiveTagsForDisplay, but the
non-streaming broadcastChatFinal path and the chat.inject path
sent raw message content to webchat clients, causing tags to
appear in rendered messages after streaming completes.
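
For illustration, stripping tags of the `[[reply_to_current]]` form could look roughly like the sketch below. The real `stripInlineDirectiveTagsForDisplay` is not shown in the source, so the function name `stripDirectiveTagsSketch` and the tag grammar in the regular expression (a lowercase identifier with an optional `:argument`) are assumptions.

```typescript
// Assumed tag shape: [[name]] or [[name: argument]].
const DIRECTIVE_TAG = /\[\[\s*[a-z][a-z0-9_]*(?:\s*:[^\]]*)?\s*\]\]/gi;

// Hypothetical display helper: remove directive tags, then trim the ends.
function stripDirectiveTagsSketch(text: string): string {
  return text.replace(DIRECTIVE_TAG, "").trim();
}
```

Applying the same helper on the streaming path and on the final, non-streaming broadcast would keep what webchat clients render consistent, which is the gap this commit closes.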

* fix: add non-streaming directive-tag regression tests (openclaw#23298) (thanks @SidQin-cyber)

* test: reclassify skills suites from e2e to unit lane

* test: reclassify models-config suites from e2e to unit lane

* test: harden models-config env isolation list

* refactor: clarify strict loopback proxy audit rules

* fix(session): resolve agent session path with configured sessions dir

Co-authored-by: David Rudduck <david@rudduck.org.au>

* fix(telegram): classify undici fetch errors as recoverable for retry (openclaw#16699)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 67b5bce
Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus

* fix(config): add missing comment field to BindingsSchema

Strict validation (added in d1e9490) rejects the legitimate 'comment'
field on bindings. This field is used for annotations in config files.

Changes:
- BindingsSchema: added comment: z.string().optional()
- AgentBinding type: added comment?: string

Fixes openclaw#23385

* fix: add bindings comment regression test (openclaw#23458) (thanks @echoVic)

* fix(bluebubbles): treat null privateApiStatus as disabled, not enabled

Bug: privateApiStatus cache expires after 10 minutes, returning null.
The check '!== false' treats null as truthy, causing 500 errors when
trying to use Private API features that aren't actually available.

Root cause: In JavaScript, null !== false evaluates to true.

Fix: Changed all checks from '!== false' to '=== true', so null (cache
expired/unknown) is treated as disabled (safe default).
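
The difference between the two checks can be seen in a small sketch. The helper names are hypothetical; the comparisons themselves are the ones quoted in the commit message.

```typescript
// null models an expired cache entry or an unknown status.
type PrivateApiStatus = boolean | null;

// Old check: anything that is not exactly `false` counts as enabled.
function isEnabledLoose(status: PrivateApiStatus): boolean {
  return status !== false;
}

// New check: only an explicit `true` counts as enabled.
function isEnabledStrict(status: PrivateApiStatus): boolean {
  return status === true;
}
```

With `null` as input, `isEnabledLoose` returns `true` while `isEnabledStrict` returns `false`, so an expired cache no longer routes requests to the Private API.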

Files changed:
- extensions/bluebubbles/src/send.ts (line 376)
- extensions/bluebubbles/src/monitor-processing.ts (line 423)
- extensions/bluebubbles/src/attachments.ts (lines 210, 220)

Fixes openclaw#23393

* fix: align BlueBubbles private-api null fallback + warning (openclaw#23459) (thanks @echoVic)

* refactor(session): centralize transcript path option resolution

* fix: add operator.read and operator.write to default CLI scopes (openclaw#22582)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 8569fc8
Co-authored-by: YuzuruS <1485195+YuzuruS@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus

* feat(workspace): add PROFILE-<name>.md bootstrap file support

When OPENCLAW_PROFILE is set (and not "default"), automatically load a
PROFILE-<profileName>.md file from the workspace as an additional
bootstrap context file. This gives each profile instance its own
personality/context overlay without needing hook configuration.

Changes:
- Add isProfileBootstrapName() helper to validate PROFILE-*.md pattern
- Update loadWorkspaceBootstrapFiles() to load profile file when env var is set
- Insert profile file in correct order (after USER.md, before HEARTBEAT.md)
- Update loadExtraBootstrapFiles() to accept PROFILE-*.md filenames
- Update filterBootstrapFilesForSession() to preserve profile files in subagent/cron sessions
- Widen WorkspaceBootstrapFileName type to include dynamic profile filenames
- Add comprehensive test coverage for all profile file scenarios
- Update bootstrap-extra-files hook documentation

The profile file is optional - if it doesn't exist, it's silently skipped
without adding a [MISSING] marker. This makes it zero-config for
multi-instance setups like hive clusters.
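
A minimal sketch of the filename logic described above, under stated assumptions: the set of characters allowed in a profile name is a guess, and both helper names are hypothetical stand-ins rather than the PR's `isProfileBootstrapName()`.

```typescript
// Assumed pattern: PROFILE-<name>.md, where <name> avoids path separators.
const PROFILE_FILE_PATTERN = /^PROFILE-[A-Za-z0-9._-]+\.md$/;

// Hypothetical validator for PROFILE-*.md filenames.
function isProfileBootstrapNameSketch(name: string): boolean {
  return PROFILE_FILE_PATTERN.test(name);
}

// Maps an OPENCLAW_PROFILE value to the bootstrap filename to look for,
// or undefined when no profile file applies (unset, "default", or invalid).
function profileBootstrapFileFor(profile: string | undefined): string | undefined {
  const name = profile?.trim();
  if (!name || name === "default") return undefined;
  const file = `PROFILE-${name}.md`;
  return isProfileBootstrapNameSketch(file) ? file : undefined;
}
```

A caller would then check whether the returned file exists in the workspace and skip it silently when it does not, matching the optional behaviour the commit message describes.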

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* ci: add promoted release workflow for v*-turq.* tags

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: Vignesh Natarajan <vigneshnatarajan92@gmail.com>
Co-authored-by: SK Akram <skcodewizard786@gmail.com>
Co-authored-by: jeffr <jeffr@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: pickaxe <54486432+ProspectOre@users.noreply.github.com>
Co-authored-by: janckerchen <janckerchen@gmail.com>
Co-authored-by: Frank Yang <frank.ekn@gmail.com>
Co-authored-by: Brian Mendonca <brianmendonca@Brians-MacBook-Air.local>
Co-authored-by: SidQin-cyber <sidqin0410@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: maweibin <532282155@qq.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Co-authored-by: lbo728 <extreme0728@gmail.com>
Co-authored-by: David Rudduck <david@rudduck.org.au>
Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: echoVic <echoVic@users.noreply.github.com>
Co-authored-by: Yuzuru Suzuki <navitima@gmail.com>
Co-authored-by: YuzuruS <1485195+YuzuruS@users.noreply.github.com>
centminmod added a commit to centminmod/clawdbot that referenced this pull request Feb 23, 2026
- Security audit enhancements: mDNS/real-IP fallback detection
- Directive tag stripping in webchat broadcasts
- Hardcoded channel fallback removal (openclaw#23357)
- Telegram update offset race condition fix (openclaw#23284)
- Exec sandbox fallback hardening (openclaw#23398)
- Feishu dedupe boundary tightening (openclaw#23377)
gabrielkoo pushed a commit to gabrielkoo/openclaw that referenced this pull request Feb 23, 2026
mreedr pushed a commit to mreedr/openclaw-custom that referenced this pull request Feb 24, 2026
mylukin pushed a commit to mylukin/openclaw that referenced this pull request Feb 26, 2026
hughdidit pushed a commit to hughdidit/DAISy-Agency that referenced this pull request Mar 1, 2026
…cyber)

(cherry picked from commit bf56196)

# Conflicts:
#	CHANGELOG.md
#	extensions/feishu/src/dedup.ts
hughdidit pushed a commit to hughdidit/DAISy-Agency that referenced this pull request Mar 3, 2026
…cyber)

(cherry picked from commit bf56196)

# Conflicts:
#	CHANGELOG.md
#	extensions/feishu/src/dedup.ts
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026
Labels

channel: feishu Channel integration: feishu size: XS
