feat(openclaw): improve extraction quality with noise filtering, deduplication, and better instructions by utkarsh240799 · Pull Request #4302 · mem0ai/mem0

utkarsh240799 · 2026-03-11T14:48:36Z

Description

Problem

The OpenClaw mem0 plugin was suffering from poor extraction and recall quality in production. Analysis of real-world conversation data (~29K events) revealed compounding issues:

Noise pollution in extraction pipeline — System messages (heartbeats, timestamps, routing logs), single-word acknowledgments, and pre-compaction flushes were being sent to Mem0 for extraction, wasting API calls and diluting memory quality.
Generic assistant responses diluting signal — Boilerplate assistant replies were being included in extraction context, causing Mem0 to extract non-durable pleasantries instead of actual user facts.
Weak recall precision — No client-side threshold filtering or dynamic scoring meant irrelevant memories were injected into agent context.
Extraction instructions lacked specificity — Custom instructions were too generic, leading to verbose, non-temporal, intent-focused memories.
No trigger filtering — Cron jobs and heartbeats polluted memory with system-generated noise.
No multi-agent awareness — Subagent sessions had no memory isolation, causing namespace collisions and orphaned memories.
No user attribution in recall — Recalled memories lacked user identity context, risking misattribution.
Subagent hallucination — Ephemeral subagent UUIDs created empty namespaces for recall and orphaned namespaces for capture.
SQLite init failures in OSS mode — Native SQLite binding resolution fails under jiti, crashing the plugin with no recovery path.
Session race condition — Shared mutable currentSessionId variable caused cross-session data leaks when multiple sessions ran concurrently.

Solution

Noise filtering pipeline (isNoiseMessage → isGenericAssistantMessage → stripNoiseFromContent → truncateMessage):

Pattern-based detection for heartbeats, timestamps, single-word acks, system routing messages, and post-compaction audit logs
Generic assistant response detection (short boilerplate acknowledgments, help offers, and signaling phrases)
Inline noise stripping: removes embedded timestamps, system prefixes, and routing lines from otherwise useful messages
Message truncation to 2000 chars to avoid sending excessive context
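
The four stages above could be chained roughly as follows. This is a minimal sketch rather than the plugin's code: the function names and the 2000-character limit come from this description, while every regex, the 200-character cutoff for "long" assistant replies, and the `ChatMessage` shape are illustrative assumptions.

```typescript
// Hypothetical sketch: patterns and types below are illustrative assumptions.
type Role = "user" | "assistant" | "system";
interface ChatMessage { role: Role; content: string; }

const MAX_MESSAGE_CHARS = 2000; // truncation limit stated in the PR

const NOISE_PATTERNS: RegExp[] = [
  /^HEARTBEAT_OK$/i,                                      // heartbeat pings
  /^NO_REPLY$/i,                                          // explicit no-reply markers
  /^(ok|okay|thanks|yes|no|sure)[.!]?$/i,                 // single-word acknowledgments
  /^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?\S*\]?$/, // bare timestamps
];

function isNoiseMessage(content: string): boolean {
  const text = content.trim();
  return text.length === 0 || NOISE_PATTERNS.some((p) => p.test(text));
}

function isGenericAssistantMessage(content: string): boolean {
  const text = content.trim();
  if (text.length > 200) return false; // long replies are assumed substantive
  return /^(i see|got it|understood|sure)[^.!?]*[.!?]\s*how can i help\??$/i.test(text);
}

function stripNoiseFromContent(content: string): string {
  return content
    .split("\n")
    .filter((line) => !/^\s*\[system\]/i.test(line)) // drop system-prefixed lines
    .map((line) => line.replace(/\[\d{4}-\d{2}-\d{2}[T ][\d:]+\S*\]\s*/g, "")) // drop embedded timestamps
    .join("\n")
    .trim();
}

function truncateMessage(content: string, max: number = MAX_MESSAGE_CHARS): string {
  return content.length > max ? content.slice(0, max) : content;
}

function filterMessagesForExtraction(messages: ChatMessage[]): ChatMessage[] {
  const kept: ChatMessage[] = [];
  for (const msg of messages) {
    if (isNoiseMessage(msg.content)) continue;
    if (msg.role === "assistant" && isGenericAssistantMessage(msg.content)) continue;
    const cleaned = truncateMessage(stripNoiseFromContent(msg.content));
    if (cleaned.length > 0) kept.push({ role: msg.role, content: cleaned });
  }
  return kept;
}
```

Running whole-message checks before inline stripping means a message that is pure noise never reaches the more expensive line-by-line pass.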

Improved recall pipeline:

Client-side threshold filtering raised to 0.6 for auto-recall (stricter than explicit tool searches at 0.5)
Dynamic thresholding: drop memories scoring < 50% of top result's score
Broad recall for short/new-session prompts to provide better initial context (cold-start broadening)
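
As a rough illustration of the two score filters, the sketch below applies an absolute floor and then a cutoff relative to the top score. The 0.6, 0.5, and 50% values come from this description; the `ScoredMemory` shape, the `filterRecalled` name, and the order in which the two filters run are assumptions.

```typescript
// Hypothetical sketch: names and filter order are assumptions, thresholds are from the PR.
interface ScoredMemory { memory: string; score: number; }

const AUTO_RECALL_THRESHOLD = 0.6; // auto-recall floor
const SEARCH_THRESHOLD = 0.5;      // explicit tool searches
const DYNAMIC_RATIO = 0.5;         // keep results scoring at least 50% of the top score

function filterRecalled(results: ScoredMemory[], threshold: number): ScoredMemory[] {
  // Absolute floor first.
  const passing = results.filter((r) => r.score >= threshold);
  if (passing.length === 0) return [];
  // Then drop anything far below the best remaining match.
  const top = Math.max(...passing.map((r) => r.score));
  return passing.filter((r) => r.score >= top * DYNAMIC_RATIO);
}
```

One consequence worth noting: if scores lie in [0, 1], a floor of 0.6 is always stricter than 50% of the top score (at most 0.5), so the relative cutoff only prunes results when a lower floor is configured.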

Rewritten custom extraction instructions:

Temporal anchoring: memories are prefixed with dates for time-aware recall
Outcome-over-intent: store what happened, not what was planned
Language preservation: store memories in the user's original language
Related facts kept together to preserve context (not forced into atomic statements)

Non-interactive trigger filtering (isNonInteractiveTrigger):

Skips recall and capture for cron, heartbeat, automation, schedule triggers
Also detects trigger type from session key patterns (:cron:, :heartbeat:)
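
A minimal sketch of the trigger check, assuming the hook receives an optional trigger string and an optional session key. The four trigger names and the two session key patterns are from this description; the signature and the example key `agent:main:cron:nightly` are hypothetical.

```typescript
// Hypothetical sketch: the signature is an assumption, the trigger names are from the PR.
const NON_INTERACTIVE_TRIGGERS = new Set(["cron", "heartbeat", "automation", "schedule"]);

function isNonInteractiveTrigger(
  trigger: string | undefined,
  sessionKey: string | undefined,
): boolean {
  // Prefer the explicit trigger when the host provides one.
  if (trigger && NON_INTERACTIVE_TRIGGERS.has(trigger.toLowerCase())) return true;
  // Fall back to session key patterns when the trigger is not set.
  if (sessionKey && /:(cron|heartbeat):/.test(sessionKey)) return true;
  return false;
}
```

A hook would call this first and return early, skipping both recall and capture, when it yields `true`.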

User identity in recall preamble:

Recall preamble now includes cfg.userId for better attribution
Extraction preamble includes user identity so memories are stored with correct attribution

Subagent hallucination prevention (isSubagentSession):

Detects ephemeral subagent sessions via :subagent: in session keys
Routes subagent recall to parent (main user) namespace — subagents get the user's long-term context
Skips capture for subagents — prevents orphaned memories
Subagent-specific preamble to prevent identity assumption
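
The routing rules above might look like the following sketch. Only `isSubagentSession` and the `:subagent:` marker are taken from this description; `routeMemory`, the `MemoryRouting` shape, and the preamble wording are hypothetical.

```typescript
// Hypothetical sketch: routeMemory and the preamble text are illustrative assumptions.
function isSubagentSession(sessionKey: string | undefined): boolean {
  return typeof sessionKey === "string" && sessionKey.includes(":subagent:");
}

interface MemoryRouting {
  recallUserId: string; // namespace to search
  capture: boolean;     // whether to store new memories from this session
  preamble: string;     // text injected ahead of recalled memories
}

function routeMemory(userId: string, sessionKey: string | undefined): MemoryRouting {
  if (isSubagentSession(sessionKey)) {
    return {
      recallUserId: userId, // parent namespace, not the ephemeral subagent UUID
      capture: false,       // avoid orphaned memories nobody reads again
      preamble: `Memories about user "${userId}". You are a subagent; do not assume this identity.`,
    };
  }
  return {
    recallUserId: userId,
    capture: true,
    preamble: `Memories about user "${userId}".`,
  };
}
```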

Multi-agent isolation:

extractAgentId correctly parses subagent session keys
effectiveUserId produces isolated namespaces per named agent
Named agents have completely isolated memory namespaces
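
To make the namespace scheme concrete, here is a sketch of the two helpers. The `agent:main:subagent:<uuid>` key format and the `<userId>:agent:subagent-<uuid>` namespace appear in the commit messages; the `agent:<name>:...` format for named agents and both function bodies are assumptions extrapolated from that.

```typescript
// Hypothetical sketch: bodies and the named-agent key format are assumptions.
function extractAgentId(sessionKey: string | undefined): string | undefined {
  if (!sessionKey) return undefined;
  // Subagent keys such as agent:main:subagent:<uuid> map to "subagent-<uuid>".
  const subagent = sessionKey.match(/:subagent:([^:]+)/);
  if (subagent) return `subagent-${subagent[1]}`;
  // Named agents are assumed to appear as agent:<name>:...; "main" is the base user.
  const named = sessionKey.match(/^agent:([^:]+)/);
  if (named && named[1] !== "main") return named[1];
  return undefined;
}

function effectiveUserId(baseUserId: string, sessionKey: string | undefined): string {
  const agentId = extractAgentId(sessionKey);
  return agentId ? `${baseUserId}:agent:${agentId}` : baseUserId;
}
```

For example, `effectiveUserId("utkarsh", "agent:researcher:main")` would give `utkarsh:agent:researcher` under these assumptions, so a researcher agent never reads or writes the base user's namespace.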

Session race condition fix:

Lifecycle hooks now use ctx.sessionKey directly from the event context instead of shared mutable currentSessionId
Tools still read currentSessionId as best-effort fallback (they don't receive ctx)
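
The race can be reproduced in a small, deterministic simulation. In the sketch below, a `deferred` queue stands in for awaited I/O so that a second hook starts before the first one finishes. Only `currentSessionId` and `ctx.sessionKey` come from this description; `racyCapture`, `safeCapture`, and the queue are illustrative.

```typescript
// Hypothetical sketch: a synchronous stand-in for two hooks interleaving around an await.
interface HookContext { sessionKey: string; }
interface Captured { sessionKey: string; text: string; }

// Work queued here runs only after other hooks have had a chance to start.
const deferred: Array<() => void> = [];
function flushDeferred(): void {
  while (deferred.length > 0) {
    const task = deferred.shift();
    if (task) task();
  }
}

let currentSessionId: string | undefined; // shared mutable state

// Racy: reads the shared variable when the deferred work finally runs.
function racyCapture(ctx: HookContext, text: string, out: Captured[]): void {
  currentSessionId = ctx.sessionKey;
  deferred.push(() => out.push({ sessionKey: currentSessionId ?? "unknown", text }));
}

// Safe: closes over the hook's own ctx.sessionKey.
function safeCapture(ctx: HookContext, text: string, out: Captured[]): void {
  currentSessionId = ctx.sessionKey; // still updated as a best-effort fallback for tools
  const sessionKey = ctx.sessionKey;
  deferred.push(() => out.push({ sessionKey, text }));
}
```

With two sessions `s1` and `s2` started back to back, the racy version attributes both captures to `s2`, while the safe version keeps each capture with its own session.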

SQLite resilience for OSS mode:

Provider init promises reset on failure, allowing retry
OSSProvider retries with history disabled when native SQLite bindings fail
New oss.disableHistory config option
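
The retry behaviour can be sketched as below. This is not the plugin's `OSSProvider`: the `ResilientProvider` class, the `Store` type, and the injected `StoreFactory` are hypothetical stand-ins for the real memory constructor, used so the example runs without native SQLite.

```typescript
// Hypothetical sketch: ResilientProvider and StoreFactory are illustrative stand-ins.
interface Store { historyEnabled: boolean; }
type StoreFactory = (opts: { disableHistory: boolean }) => Promise<Store>;

class ResilientProvider {
  private initPromise: Promise<Store> | null = null;
  private readonly createStore: StoreFactory;
  private readonly disableHistory: boolean;

  constructor(createStore: StoreFactory, disableHistory: boolean = false) {
    this.createStore = createStore;
    this.disableHistory = disableHistory;
  }

  ensureInitialized(): Promise<Store> {
    if (!this.initPromise) {
      this.initPromise = this.init().catch((err) => {
        this.initPromise = null; // reset so the next call can retry instead of reusing a rejected promise
        throw err;
      });
    }
    return this.initPromise;
  }

  private async init(): Promise<Store> {
    try {
      return await this.createStore({ disableHistory: this.disableHistory });
    } catch (err) {
      if (this.disableHistory) throw err; // history was already off, nothing left to try
      // Assume the failure came from the history store and retry without it.
      return await this.createStore({ disableHistory: true });
    }
  }
}
```

Caching the promise keeps concurrent callers from initializing twice, and clearing it on failure is what provides the recovery path.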

User-content guard: Skip extraction when no meaningful user content remains after filtering.

Expanded extraction window: from last 10 → last 20 messages + earlier summary messages.
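
The window selection and the user-content guard might combine as in this sketch. The window size of 20 is from this description; how the plugin recognizes a summary message is not stated here, so `isSummary` is a caller-supplied, hypothetical predicate, and both function names are illustrative.

```typescript
// Hypothetical sketch: function names and the isSummary predicate are assumptions.
interface ChatMessage { role: "user" | "assistant" | "system"; content: string; }

const RECENT_WINDOW = 20; // per the PR, up from 10

function selectForExtraction(
  messages: ChatMessage[],
  isSummary: (m: ChatMessage) => boolean,
): ChatMessage[] {
  const splitAt = Math.max(0, messages.length - RECENT_WINDOW);
  // Keep summaries from the older part, plus everything in the recent window.
  const earlierSummaries = messages.slice(0, splitAt).filter(isSummary);
  return [...earlierSummaries, ...messages.slice(splitAt)];
}

function hasUserContent(messages: ChatMessage[]): boolean {
  return messages.some((m) => m.role === "user" && m.content.trim().length > 0);
}
```

A capture hook would run its noise filters on the selected messages and call the extraction API only if `hasUserContent` still returns `true`.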

Code refactor: Split monolithic 1772-line index.ts into 6 focused modules.

Build pipeline: Fixed tsconfig for modular .ts imports and typed MemoryClient opts for clean DTS generation.

Type of change

New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Unit Tests (78 tests passing)

cd openclaw && npx vitest run   # 78 tests, 0 failures
npx tsc --noEmit                # 0 type errors
npx tsup                        # ESM + DTS build success

isNoiseMessage: heartbeats, NO_REPLY, timestamps, single-word acks, system routing, post-compaction audit, real content passthrough
isGenericAssistantMessage: boilerplate detection, long messages passthrough, substantive responses passthrough
stripNoiseFromContent: embedded timestamps, system prefixes, routing lines, mixed content preservation
filterMessagesForExtraction: end-to-end pipeline, generic assistant dropping, assistant-only payload filtering
extractAgentId: main agent, named agents, subagent session keys
effectiveUserId: main user, agent-scoped users
isNonInteractiveTrigger: cron, heartbeat, automation, schedule triggers; session key patterns (10 tests)
isSubagentSession: subagent detection, main agent passthrough, named agent passthrough (4 tests)
mem0ConfigSchema — disableHistory: config parsing for oss.disableHistory (4 tests)
OSSProvider — disableHistory passthrough: Memory constructor receives flag (2 tests)
OSSProvider — SQLite fallback: retry with history disabled on failure (3 tests)
PlatformProvider — init recovery: initPromise reset on failure (1 test)

Live E2E Tests — npm v0.3.3 vs Local v0.4.0 Comparison

Identical test inputs run on both plugin versions with clean mem0 platform between runs.

Test scenarios:

Professional + personal context (staff ML engineer, team, family, pets, hobbies)
Technical decision with reasoning (Feast → Tecton migration with benchmarks)
Multi-turn debug session (model drift → root cause → fix)
Korean language policy update
Noise filtering (ok / thanks / HEARTBEAT_OK)
Recall — relationships ("Who are important people?")
Recall — decisions ("What technical decisions and why?")
Recall — unrelated query (sourdough bread)
Subagent spawn + entity hygiene

Comparison results:

Test	npm v0.3.3	Local v0.4.0	Winner
Temporal anchoring	0/8 memories (0%)	4/10 memories (40%)	Local
Recall — relationships	Failed (stale data from prior sessions)	Passed (correct people with context)	Local
Recall — decisions	Empty response	Passed (Feast→Tecton with reasoning)	Local
Noise filtering	Failed (7→8, 1 leak)	Passed (9→9, 0 leaks)	Local
Korean content	Translated to English	Preserved in Korean	Local
Unrelated query	Pass (no injection)	Pass (no injection)	Tie
Entity hygiene	3 entities	1 entity	Local
Subagent orphaned entities	0	0	Tie

Additional Test Rounds (prior to reviewer feedback)

7-phase comprehensive suite: 55 tests, 0 failures
Multi-agent namespace isolation: 12 tests across researcher/coder/reviewer agents
Concurrent session safety: 2 simultaneous sessions, both recalled correctly
userId exploit test: verified parameter behavior
GUI tests: trigger filtering and subagent isolation verified via OpenClaw GUI

Known Issues (Not Plugin Bugs)

Dollar sign corruption ($0.08 → /bin/zsh.08) — upstream mem0 platform bug
Fact updates sometimes create new memories instead of updating — platform-side extraction model behavior

Unit Test (78 tests)
Type Check (tsc --noEmit, 0 errors)
Build (tsup, ESM + DTS)
Live E2E Comparison (npm v0.3.3 vs Local v0.4.0)
GUI Test (trigger filtering, subagent isolation)

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged and published in downstream modules
I have checked my code and corrected any misspellings

Maintainer Checklist

Made sure Checks passed

…plication, and better instructions

- Add noise filtering pipeline (isNoiseMessage, stripNoiseFromContent, filterMessagesForExtraction) to drop cron heartbeats, acknowledgments, and system routing metadata before extraction
- Add word-overlap deduplication (deduplicateByContent) for recalled memories to avoid redundant context injection
- Rewrite DEFAULT_CUSTOM_INSTRUCTIONS with temporal anchoring, conciseness guidelines, outcome-over-intent extraction, and explicit exclusion rules
- Expand agent_end message selection from last 10 to last 20 messages plus earlier summary messages
- Add client-side threshold filtering and broad recall for short/new-session prompts in before_agent_start
- Add pre-check for near-duplicate memories in memory_store tool
- Add comprehensive unit tests for all new filtering and deduplication functions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add isGenericAssistantMessage() to detect boilerplate assistant responses like "I see you've shared an update. How can I help?" that contain no extractable facts. Integrates into filterMessagesForExtraction to drop these before sending to mem0.add(), preventing the extraction model from wasting capacity on empty acknowledgments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extract types, providers, config, filtering, and isolation into separate files for better maintainability. No behavioral changes — all 55 tests pass unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add message filtering and deduplication sections to README
- Fix searchThreshold default to 0.5 in code, docs, and config
- Add v0.3.1 changelog entry with all new features
- Bump version from 0.3.0 to 0.3.1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… triggers

- Fix extractAgentId() to handle OpenClaw's subagent session key format (agent:main:subagent:<uuid>) so subagent memories go to isolated namespaces (utkarsh:agent:subagent-<uuid>) instead of the base userId
- Add isNonInteractiveTrigger() to skip autocapture/autorecall for cron, heartbeat, automation, and schedule triggers — prevents system prompts from polluting the user's memory store
- Fallback detection via session key patterns (:cron:, :heartbeat:) when ctx.trigger is not set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add user identity to extraction preamble so memories are attributed to the correct user instead of cross-referencing cached patterns (OPE-6 #1)
- Skip mem0.add() when no user messages remain after noise filtering, avoiding wasted API calls on assistant-only payloads (OPE-6 #2)
- Raise auto-recall threshold to 0.6 (vs 0.5 for explicit search) and add dynamic thresholding that drops memories below 50% of the top result's score to reduce irrelevant context injection (OPE-6 #3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ution The auto-recall injection now tells the agent whose memories are being provided, so it can correctly distinguish the current user from third parties mentioned in memories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…outing

Subagents get ephemeral UUIDs (agent:main:subagent:<uuid>) that create empty, orphaned namespaces. This fix:

- Adds isSubagentSession() to detect subagent session keys
- Routes subagent recall to parent (main user) namespace so they get the user's long-term context instead of searching an empty namespace
- Skips capture for subagents to prevent orphaned memories that are never read again (main agent captures consolidated output)
- Adds subagent-specific preamble to prevent identity assumption

Tested with 8 subagent spawns: 0 orphaned entities, 0 orphaned memories, 100% cross-session recall accuracy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds changelog entries for all enhancements since 0.3.1:

- Non-interactive trigger filtering (cron/heartbeat)
- Subagent hallucination prevention (namespace routing)
- User identity in recall/extraction preambles
- Dynamic recall thresholding
- User-content guard
- 72 unit tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… changelog

Ports the SQLite resilience fix (bfe730a) from main into the refactored module files:

- types.ts: add disableHistory to oss config
- providers.ts: init error recovery + retry with history disabled
- index.ts: re-export mem0ConfigSchema and createProvider for tests
- CHANGELOG.md: include SQLite resilience in 0.4.0 release notes

All 82 tests passing (72 plugin + 10 SQLite resilience).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…meter

- Hooks now use ctx.sessionKey directly instead of shared mutable currentSessionId, preventing cross-session data leaks when multiple sessions run concurrently (e.g. multiple Telegram users)
- Removed userId from memory_search, memory_store, memory_list tool parameters to prevent LLM prompt injection from accessing other users' namespaces. agentId is kept (safe — always namespaced)
- Fixed tsconfig for modular imports (allowImportingTsExtensions)
- Fixed providers.ts MemoryClient type for DTS generation
- Updated README with subagent handling, concurrency safety, trigger filtering, security note, and disableHistory docs
- Updated CHANGELOG with race condition fix and userId removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Regenerate lockfile to match package.json dependency bump. Fixes CI frozen-lockfile failure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov-commenter · 2026-03-17T17:40:26Z

Codecov Report

❌ Patch coverage is 31.60714% with 383 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
openclaw/index.ts	0.00%	203 Missing and 84 partials ⚠️
openclaw/providers.ts	49.39%	63 Missing and 21 partials ⚠️
openclaw/config.ts	76.47%	11 Missing and 1 partial ⚠️


…serId, relax prompts

Per Saket's review feedback:

1. Remove client-side deduplication (deduplicateByContent) — mem0 handles dedup internally via its new algo
2. Restore userId tool parameter — not a security boundary since any user with org/project access can already see all memories
3. Relax extraction prompts — keep related facts together instead of forcing atomic 1-2 sentence memories, preserving context

What's kept: noise filtering, trigger filtering, subagent isolation, temporal anchoring, dynamic thresholding, cold-start broadening, race condition fix, SQLite resilience.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…plication, and better instructions (mem0ai#4302) Co-authored-by: utkarsh240799 <utkarsh240799@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

utkarsh240799 force-pushed the feat/openclaw-improvements branch from c04edab to 009f5cb Compare March 16, 2026 16:03

utkarsh240799 and others added 10 commits March 16, 2026 21:44

refactor(openclaw): split monolithic index.ts into focused modules

a9703f2


utkarsh240799 requested review from deshraj and whysosaket March 16, 2026 16:31

utkarsh240799 force-pushed the feat/openclaw-improvements branch from 009f5cb to 9175f14 Compare March 16, 2026 17:33

utkarsh240799 and others added 2 commits March 17, 2026 22:03

chore(openclaw): update pnpm-lock.yaml for mem0ai ^2.3.0

c1d3ae5


whysosaket approved these changes Mar 18, 2026


utkarsh240799 mentioned this pull request Mar 18, 2026

docs(openclaw): align searchThreshold default with plugin code #4230

Closed

whysosaket merged commit b971b61 into main Mar 18, 2026
7 checks passed

whysosaket deleted the feat/openclaw-improvements branch March 18, 2026 22:21

jamebobob mentioned this pull request Mar 27, 2026

What we found after auditing 10,134 mem0 entries: 97.8% were junk #4573

Open

whysosaket merged 13 commits into main from feat/openclaw-improvements

3 participants
