feat(openclaw): improve extraction quality with noise filtering, deduplication, and better instructions #4302

Merged
whysosaket merged 13 commits into main from feat/openclaw-improvements
Mar 18, 2026

Conversation

@utkarsh240799
Contributor

@utkarsh240799 utkarsh240799 commented Mar 11, 2026

Description

Problem

The OpenClaw mem0 plugin was suffering from poor extraction and recall quality in production. Analysis of real-world conversation data (~29K events) revealed compounding issues:

  1. Noise pollution in extraction pipeline — System messages (heartbeats, timestamps, routing logs), single-word acknowledgments, and pre-compaction flushes were being sent to Mem0 for extraction, wasting API calls and diluting memory quality.
  2. Generic assistant responses diluting signal — Boilerplate assistant replies were being included in extraction context, causing Mem0 to extract non-durable pleasantries instead of actual user facts.
  3. Weak recall precision — No client-side threshold filtering or dynamic scoring meant irrelevant memories were injected into agent context.
  4. Extraction instructions lacked specificity — Custom instructions were too generic, leading to verbose, non-temporal, intent-focused memories.
  5. No trigger filtering — Cron jobs and heartbeats polluted memory with system-generated noise.
  6. No multi-agent awareness — Subagent sessions had no memory isolation, causing namespace collisions and orphaned memories.
  7. No user attribution in recall — Recalled memories lacked user identity context, risking misattribution.
  8. Subagent hallucination — Ephemeral subagent UUIDs created empty namespaces for recall and orphaned namespaces for capture.
  9. SQLite init failures in OSS mode — Native SQLite binding resolution fails under jiti, crashing the plugin with no recovery path.
  10. Session race condition — Shared mutable currentSessionId variable caused cross-session data leaks when multiple sessions ran concurrently.

Solution

Noise filtering pipeline (isNoiseMessage, isGenericAssistantMessage, stripNoiseFromContent, truncateMessage):

  • Pattern-based detection for heartbeats, timestamps, single-word acks, system routing messages, and post-compaction audit logs
  • Generic assistant response detection (short boilerplate acknowledgments, help offers, and signaling phrases)
  • Inline noise stripping: removes embedded timestamps, system prefixes, and routing lines from otherwise useful messages
  • Message truncation to 2000 chars to avoid sending excessive context
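
The detection and truncation steps above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: the regular expressions in `NOISE_PATTERNS` are assumptions chosen to match the categories listed (the real patterns are not shown in this description), and only the 2000-character limit comes from the text.

```typescript
// Hypothetical patterns for illustration; the plugin's actual regexes may differ.
const NOISE_PATTERNS: RegExp[] = [
  /^HEARTBEAT_OK$/i, // heartbeat marker
  /^NO_REPLY$/i, // explicit no-reply marker
  /^(ok|okay|thanks|yes|no|sure|done)[.!]?$/i, // single-word acknowledgments
  /^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?\]?$/, // a line that is only a timestamp
];

const MAX_MESSAGE_CHARS = 2000; // limit stated in the description

// Returns true when the whole message is noise and should be dropped.
function isNoiseMessage(content: string): boolean {
  const text = content.trim();
  if (text.length === 0) return true;
  return NOISE_PATTERNS.some((re) => re.test(text));
}

// Caps a message at `max` characters so oversized context is not sent for extraction.
function truncateMessage(content: string, max: number = MAX_MESSAGE_CHARS): string {
  return content.length <= max ? content : content.slice(0, max);
}
```

Anchoring every pattern with `^` and `$` means a message is dropped only when it consists entirely of noise; messages that merely contain a timestamp are left for the separate inline-stripping step.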

Improved recall pipeline:

  • Client-side threshold filtering raised to 0.6 for auto-recall (stricter than explicit tool searches at 0.5)
  • Dynamic thresholding: drop memories scoring < 50% of top result's score
  • Broad recall for short/new-session prompts to provide better initial context (cold-start broadening)
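
A rough sketch of how the absolute and dynamic thresholds might combine. The `ScoredMemory` shape and the function name `filterRecalled` are assumptions for illustration; only the 0.6 cutoff and the 50%-of-top rule come from the description.

```typescript
// Assumed result shape: a memory string with a similarity score.
interface ScoredMemory {
  memory: string;
  score: number;
}

// Keeps results that pass the absolute threshold, then drops any result
// scoring below `relativeRatio` times the best remaining score.
function filterRecalled(
  results: ScoredMemory[],
  absoluteThreshold: number = 0.6,
  relativeRatio: number = 0.5,
): ScoredMemory[] {
  const kept = results.filter((r) => r.score >= absoluteThreshold);
  if (kept.length === 0) return [];
  const top = Math.max(...kept.map((r) => r.score));
  return kept.filter((r) => r.score >= top * relativeRatio);
}
```

If scores are bounded by 1.0, the relative rule only removes results when the absolute threshold is configured below half of the top score; whether the plugin's scores are bounded this way is not stated in this description.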

Rewritten custom extraction instructions:

  • Temporal anchoring: memories are prefixed with dates for time-aware recall
  • Outcome-over-intent: store what happened, not what was planned
  • Language preservation: store memories in the user's original language
  • Related facts kept together to preserve context (not forced into atomic statements)

Non-interactive trigger filtering (isNonInteractiveTrigger):

  • Skips recall and capture for cron, heartbeat, automation, schedule triggers
  • Also detects trigger type from session key patterns (:cron:, :heartbeat:)
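
One way the trigger check could look, as a sketch only. The `HookContext` shape is an assumption (the real hook context is not shown here); the trigger names and the `:cron:` / `:heartbeat:` session key patterns are taken from the list above.

```typescript
// Assumed minimal shape of the hook context.
interface HookContext {
  trigger?: string;
  sessionKey?: string;
}

const NON_INTERACTIVE_TRIGGERS = new Set(["cron", "heartbeat", "automation", "schedule"]);

// True when the turn was started by automation rather than a person,
// in which case both recall and capture would be skipped.
function isNonInteractiveTrigger(ctx: HookContext): boolean {
  if (ctx.trigger && NON_INTERACTIVE_TRIGGERS.has(ctx.trigger.toLowerCase())) {
    return true;
  }
  // Fallback when ctx.trigger is not set: inspect the session key.
  const key = ctx.sessionKey ?? "";
  return key.includes(":cron:") || key.includes(":heartbeat:");
}
```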

User identity in recall preamble:

  • Recall preamble now includes cfg.userId for better attribution
  • Extraction preamble includes user identity so memories are stored with correct attribution

Subagent hallucination prevention (isSubagentSession):

  • Detects ephemeral subagent sessions via :subagent: in session keys
  • Routes subagent recall to parent (main user) namespace — subagents get the user's long-term context
  • Skips capture for subagents — prevents orphaned memories
  • Subagent-specific preamble to prevent identity assumption
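
The routing decision described above might be sketched like this. `SessionPlan`, `planSession`, and the preamble wording are hypothetical; only the `:subagent:` detection, the parent-namespace recall, and the skipped capture come from the description.

```typescript
// Hypothetical summary of what the hooks decide for a session.
interface SessionPlan {
  recallUserId: string; // namespace searched during auto-recall
  capture: boolean; // whether the session's output is sent for extraction
  preamble: string; // text placed before recalled memories (wording is illustrative)
}

function isSubagentSession(sessionKey: string | undefined): boolean {
  return typeof sessionKey === "string" && sessionKey.includes(":subagent:");
}

function planSession(sessionKey: string | undefined, parentUserId: string): SessionPlan {
  if (isSubagentSession(sessionKey)) {
    return {
      recallUserId: parentUserId, // read the parent's long-term context
      capture: false, // never write from an ephemeral subagent
      preamble: `These memories belong to user "${parentUserId}". You are a subagent; do not assume this identity.`,
    };
  }
  return {
    recallUserId: parentUserId,
    capture: true,
    preamble: `Memories for user "${parentUserId}":`,
  };
}
```

This sketch ignores named agents, which the description says get their own isolated namespaces.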

Multi-agent isolation:

  • extractAgentId correctly parses subagent session keys
  • effectiveUserId produces isolated namespaces per named agent
  • Named agents have completely isolated memory namespaces
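
A sketch of the namespace derivation, under stated assumptions. The subagent key format (`agent:main:subagent:<uuid>`) and the `<userId>:agent:<agentId>` namespace shape appear in the commit messages of this PR; the assumption that named-agent keys look like `agent:<name>:...` and that `main` maps to the base user is mine, and the real function signatures may differ.

```typescript
// Derives an agent id from a session key, or undefined for the main agent.
function extractAgentId(sessionKey: string | undefined): string | undefined {
  if (!sessionKey) return undefined;
  // agent:<parent>:subagent:<uuid> -> "subagent-<uuid>"
  const sub = sessionKey.match(/^agent:[^:]+:subagent:([^:]+)/);
  if (sub) return `subagent-${sub[1]}`;
  // agent:<name>:... -> "<name>", except the main agent (assumed key layout)
  const named = sessionKey.match(/^agent:([^:]+)/);
  if (named && named[1] !== "main") return named[1];
  return undefined;
}

// Builds the memory namespace: the base user for the main agent,
// an agent-scoped namespace otherwise.
function effectiveUserId(baseUserId: string, sessionKey?: string): string {
  const agentId = extractAgentId(sessionKey);
  return agentId ? `${baseUserId}:agent:${agentId}` : baseUserId;
}
```

Because the agent id is always appended to the base user id, two named agents for the same user can never share a namespace in this scheme.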

Session race condition fix:

  • Lifecycle hooks now use ctx.sessionKey directly from the event context instead of shared mutable currentSessionId
  • Tools still read currentSessionId as best-effort fallback (they don't receive ctx)
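
To illustrate why the shared variable leaks across sessions, here is a small self-contained sketch. The hook bodies and the `tick` helper are hypothetical stand-ins for real asynchronous work; only the names `currentSessionId` and `ctx.sessionKey` come from the description.

```typescript
// Shared mutable state: the problematic pattern.
let currentSessionId: string | undefined;

// Stand-in for any awaited work (network call, disk I/O).
const tick = (): Promise<void> => new Promise((resolve) => setTimeout(resolve, 0));

// Racy: another session's hook can overwrite the variable during the await.
async function racyHook(sessionKey: string): Promise<string | undefined> {
  currentSessionId = sessionKey;
  await tick();
  return currentSessionId; // may now belong to a different session
}

// Safe: the session key travels with the call, so nothing is shared.
async function safeHook(ctx: { sessionKey: string }): Promise<string> {
  await tick();
  return ctx.sessionKey;
}

async function demo(): Promise<{ racy: (string | undefined)[]; safe: string[] }> {
  const racy = await Promise.all([racyHook("session-a"), racyHook("session-b")]);
  const safe = await Promise.all([
    safeHook({ sessionKey: "session-a" }),
    safeHook({ sessionKey: "session-b" }),
  ]);
  return { racy, safe };
}
```

In `demo`, both racy hooks assign the variable before either resumes, so the first hook observes the second session's id, while the safe hooks each return their own key.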

SQLite resilience for OSS mode:

  • Provider init promises reset on failure, allowing retry
  • OSSProvider retries with history disabled when native SQLite bindings fail
  • New oss.disableHistory config option
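
The recovery behaviour above might be sketched as follows. `LazyProvider`, `MemoryFactory`, and `ensureReady` are hypothetical names, and this sketch retries on any initialization error, whereas the plugin is described as retrying specifically when the native SQLite bindings fail.

```typescript
interface OssOptions {
  disableHistory?: boolean;
}

// Hypothetical factory standing in for constructing the OSS memory instance.
type MemoryFactory<T> = (opts: OssOptions) => Promise<T>;

class LazyProvider<T> {
  private initPromise: Promise<T> | null = null;

  constructor(
    private readonly factory: MemoryFactory<T>,
    private readonly opts: OssOptions = {},
  ) {}

  private async init(): Promise<T> {
    try {
      return await this.factory(this.opts);
    } catch (err) {
      // Nothing left to fall back to if history was already disabled.
      if (this.opts.disableHistory) throw err;
      // Simplified fallback: retry once with history disabled.
      return await this.factory({ ...this.opts, disableHistory: true });
    }
  }

  ensureReady(): Promise<T> {
    if (!this.initPromise) {
      this.initPromise = this.init().catch((err) => {
        this.initPromise = null; // reset so a later call can retry
        throw err;
      });
    }
    return this.initPromise;
  }
}
```

Without the reset, a single failed initialization would leave a rejected promise cached, and every later call would fail the same way.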

User-content guard: Skip extraction when no meaningful user content remains after filtering.

Expanded extraction window: from last 10 → last 20 messages + earlier summary messages.
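
The window selection and the user-content guard could be sketched together as below. The `ChatMessage` shape and the `isSummary` flag are assumptions; how the plugin actually recognizes summary messages is not described here.

```typescript
// Assumed message shape; `isSummary` is a hypothetical marker for summary messages.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
  isSummary?: boolean;
}

// Keeps the last `windowSize` messages plus any earlier summary messages, in order.
function selectForExtraction(messages: ChatMessage[], windowSize: number = 20): ChatMessage[] {
  const cut = Math.max(0, messages.length - windowSize);
  const earlierSummaries = messages.slice(0, cut).filter((m) => m.isSummary === true);
  return [...earlierSummaries, ...messages.slice(cut)];
}

// Guard: extraction is only worthwhile if some non-empty user message remains.
function hasUserContent(messages: ChatMessage[]): boolean {
  return messages.some((m) => m.role === "user" && m.content.trim().length > 0);
}
```

A caller would run the guard after noise filtering and skip the extraction request entirely when it returns false.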

Code refactor: Split monolithic 1772-line index.ts into 6 focused modules.

Build pipeline: Fixed tsconfig for modular .ts imports and typed MemoryClient opts for clean DTS generation.

Type of change

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Unit Tests (78 tests passing)

cd openclaw && npx vitest run   # 78 tests, 0 failures
npx tsc --noEmit                # 0 type errors
npx tsup                        # ESM + DTS build success
  • isNoiseMessage: heartbeats, NO_REPLY, timestamps, single-word acks, system routing, post-compaction audit, real content passthrough
  • isGenericAssistantMessage: boilerplate detection, long messages passthrough, substantive responses passthrough
  • stripNoiseFromContent: embedded timestamps, system prefixes, routing lines, mixed content preservation
  • filterMessagesForExtraction: end-to-end pipeline, generic assistant dropping, assistant-only payload filtering
  • extractAgentId: main agent, named agents, subagent session keys
  • effectiveUserId: main user, agent-scoped users
  • isNonInteractiveTrigger: cron, heartbeat, automation, schedule triggers; session key patterns (10 tests)
  • isSubagentSession: subagent detection, main agent passthrough, named agent passthrough (4 tests)
  • mem0ConfigSchema — disableHistory: config parsing for oss.disableHistory (4 tests)
  • OSSProvider — disableHistory passthrough: Memory constructor receives flag (2 tests)
  • OSSProvider — SQLite fallback: retry with history disabled on failure (3 tests)
  • PlatformProvider — init recovery: initPromise reset on failure (1 test)

Live E2E Tests — npm v0.3.3 vs Local v0.4.0 Comparison

Identical test inputs were run on both plugin versions, with the mem0 platform cleaned between runs.

Test scenarios:

  1. Professional + personal context (staff ML engineer, team, family, pets, hobbies)
  2. Technical decision with reasoning (Feast → Tecton migration with benchmarks)
  3. Multi-turn debug session (model drift → root cause → fix)
  4. Korean language policy update
  5. Noise filtering (ok / thanks / HEARTBEAT_OK)
  6. Recall — relationships ("Who are important people?")
  7. Recall — decisions ("What technical decisions and why?")
  8. Recall — unrelated query (sourdough bread)
  9. Subagent spawn + entity hygiene

Comparison results:

| Test | npm v0.3.3 | Local v0.4.0 | Winner |
| --- | --- | --- | --- |
| Temporal anchoring | 0/8 memories (0%) | 4/10 memories (40%) | Local |
| Recall — relationships | Failed (stale data from prior sessions) | Passed (correct people with context) | Local |
| Recall — decisions | Empty response | Passed (Feast→Tecton with reasoning) | Local |
| Noise filtering | Failed (7→8, 1 leak) | Passed (9→9, 0 leaks) | Local |
| Korean content | Translated to English | Preserved in Korean | Local |
| Unrelated query | Pass (no injection) | Pass (no injection) | Tie |
| Entity hygiene | 3 entities | 1 entity | Local |
| Subagent orphaned entities | 0 | 0 | Tie |

Additional Test Rounds (prior to reviewer feedback)

  • 7-phase comprehensive suite: 55 tests, 0 failures
  • Multi-agent namespace isolation: 12 tests across researcher/coder/reviewer agents
  • Concurrent session safety: 2 simultaneous sessions, both recalled correctly
  • userId exploit test: verified parameter behavior
  • GUI tests: trigger filtering and subagent isolation verified via OpenClaw GUI

Known Issues (Not Plugin Bugs)

  1. Dollar sign corruption ($0.08 → /bin/zsh.08) — upstream mem0 platform bug
  2. Fact updates sometimes create new memories instead of updating — platform-side extraction model behavior
  • Unit Test (78 tests)
  • Type Check (tsc --noEmit, 0 errors)
  • Build (tsup, ESM + DTS)
  • Live E2E Comparison (npm v0.3.3 vs Local v0.4.0)
  • GUI Test (trigger filtering, subagent isolation)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Made sure Checks passed

@utkarsh240799 utkarsh240799 force-pushed the feat/openclaw-improvements branch from c04edab to 009f5cb March 16, 2026 16:03
utkarsh240799 and others added 10 commits March 16, 2026 21:44
…plication, and better instructions

- Add noise filtering pipeline (isNoiseMessage, stripNoiseFromContent, filterMessagesForExtraction) to drop cron heartbeats, acknowledgments, and system routing metadata before extraction
- Add word-overlap deduplication (deduplicateByContent) for recalled memories to avoid redundant context injection
- Rewrite DEFAULT_CUSTOM_INSTRUCTIONS with temporal anchoring, conciseness guidelines, outcome-over-intent extraction, and explicit exclusion rules
- Expand agent_end message selection from last 10 to last 20 messages plus earlier summary messages
- Add client-side threshold filtering and broad recall for short/new-session prompts in before_agent_start
- Add pre-check for near-duplicate memories in memory_store tool
- Add comprehensive unit tests for all new filtering and deduplication functions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add isGenericAssistantMessage() to detect boilerplate assistant responses
like "I see you've shared an update. How can I help?" that contain no
extractable facts. Integrates into filterMessagesForExtraction to drop
these before sending to mem0.add(), preventing the extraction model from
wasting capacity on empty acknowledgments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract types, providers, config, filtering, and isolation into
separate files for better maintainability. No behavioral changes —
all 55 tests pass unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add message filtering and deduplication sections to README
- Fix searchThreshold default to 0.5 in code, docs, and config
- Add v0.3.1 changelog entry with all new features
- Bump version from 0.3.0 to 0.3.1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… triggers

- Fix extractAgentId() to handle OpenClaw's subagent session key format
  (agent:main:subagent:<uuid>) so subagent memories go to isolated
  namespaces (utkarsh:agent:subagent-<uuid>) instead of the base userId
- Add isNonInteractiveTrigger() to skip autocapture/autorecall for cron,
  heartbeat, automation, and schedule triggers — prevents system prompts
  from polluting the user's memory store
- Fallback detection via session key patterns (:cron:, :heartbeat:) when
  ctx.trigger is not set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add user identity to extraction preamble so memories are attributed to
  the correct user instead of cross-referencing cached patterns (OPE-6 #1)
- Skip mem0.add() when no user messages remain after noise filtering,
  avoiding wasted API calls on assistant-only payloads (OPE-6 #2)
- Raise auto-recall threshold to 0.6 (vs 0.5 for explicit search) and
  add dynamic thresholding that drops memories below 50% of the top
  result's score to reduce irrelevant context injection (OPE-6 #3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ution

The auto-recall injection now tells the agent whose memories are being
provided, so it can correctly distinguish the current user from third
parties mentioned in memories.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…outing

Subagents get ephemeral UUIDs (agent:main:subagent:<uuid>) that create
empty, orphaned namespaces. This fix:

- Adds isSubagentSession() to detect subagent session keys
- Routes subagent recall to parent (main user) namespace so they get
  the user's long-term context instead of searching an empty namespace
- Skips capture for subagents to prevent orphaned memories that are
  never read again (main agent captures consolidated output)
- Adds subagent-specific preamble to prevent identity assumption

Tested with 8 subagent spawns: 0 orphaned entities, 0 orphaned
memories, 100% cross-session recall accuracy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds changelog entries for all enhancements since 0.3.1:
- Non-interactive trigger filtering (cron/heartbeat)
- Subagent hallucination prevention (namespace routing)
- User identity in recall/extraction preambles
- Dynamic recall thresholding
- User-content guard
- 72 unit tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… changelog

Ports the SQLite resilience fix (bfe730a) from main into the
refactored module files:
- types.ts: add disableHistory to oss config
- providers.ts: init error recovery + retry with history disabled
- index.ts: re-export mem0ConfigSchema and createProvider for tests
- CHANGELOG.md: include SQLite resilience in 0.4.0 release notes

All 82 tests passing (72 plugin + 10 SQLite resilience).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@utkarsh240799 utkarsh240799 force-pushed the feat/openclaw-improvements branch from 009f5cb to 9175f14 March 16, 2026 17:33
utkarsh240799 and others added 2 commits March 17, 2026 22:03
…meter

- Hooks now use ctx.sessionKey directly instead of shared mutable
  currentSessionId, preventing cross-session data leaks when multiple
  sessions run concurrently (e.g. multiple Telegram users)
- Removed userId from memory_search, memory_store, memory_list tool
  parameters to prevent LLM prompt injection from accessing other
  users' namespaces. agentId is kept (safe — always namespaced)
- Fixed tsconfig for modular imports (allowImportingTsExtensions)
- Fixed providers.ts MemoryClient type for DTS generation
- Updated README with subagent handling, concurrency safety,
  trigger filtering, security note, and disableHistory docs
- Updated CHANGELOG with race condition fix and userId removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Regenerate lockfile to match package.json dependency bump.
Fixes CI frozen-lockfile failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 31.60714% with 383 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| openclaw/index.ts | 0.00% | 203 Missing and 84 partials ⚠️ |
| openclaw/providers.ts | 49.39% | 63 Missing and 21 partials ⚠️ |
| openclaw/config.ts | 76.47% | 11 Missing and 1 partial ⚠️ |

…serId, relax prompts

Per Saket's review feedback:

1. Remove client-side deduplication (deduplicateByContent) — mem0
   handles dedup internally via its new algo
2. Restore userId tool parameter — not a security boundary since
   any user with org/project access can already see all memories
3. Relax extraction prompts — keep related facts together instead
   of forcing atomic 1-2 sentence memories, preserving context

What's kept: noise filtering, trigger filtering, subagent isolation,
temporal anchoring, dynamic thresholding, cold-start broadening,
race condition fix, SQLite resilience.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@whysosaket whysosaket merged commit b971b61 into main Mar 18, 2026
7 checks passed
@whysosaket whysosaket deleted the feat/openclaw-improvements branch March 18, 2026 22:21
jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026
…plication, and better instructions (mem0ai#4302)

Co-authored-by: utkarsh240799 <utkarsh240799@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

3 participants