Fix: prevent double embedding in mem0.add (fixes #3723) #3996
Merged
kartik-mem0 merged 2 commits into mem0ai:main on Mar 25, 2026
Conversation
Contributor
Author
Manual Test Results

Verified the fix prevents double embedding.

Test 1: Old behavior, infer=False path. Issue: embeddings were passed directly without a dict wrapper.
Test 2: New behavior, infer=False path. Fix: wrap embeddings in a dict before calling _create_memory.
Test 3: Old behavior, infer=True path. Issue: the LLM rephrases facts, so the cache key doesn't match.
Test 4: New behavior, infer=True path. Fix: proactively cache action_text before processing ADD/UPDATE.
Test 5: Performance and cost impact. Assumptions:

OLD BEHAVIOR (duplicate embeddings):
NEW BEHAVIOR (fixed):
SAVINGS:

Summary

Conclusion: This fix significantly improves performance and reduces costs for all users.
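The cost claim follows from simple arithmetic: the buggy path embeds every fact twice, the fixed path once. A sketch of that calculation (the facts-per-call count is an illustrative assumption, not a number from the PR):

```python
# Illustrative arithmetic for the embedding-cost impact of this fix.
# The facts-per-call figure below is an assumption for the example,
# not a measurement from the PR.

def embedding_calls(num_facts: int, double_embedding: bool) -> int:
    """Embedding API calls made by one mem0.add() invocation."""
    return num_facts * (2 if double_embedding else 1)

facts = 10
old = embedding_calls(facts, double_embedding=True)   # buggy path
new = embedding_calls(facts, double_embedding=False)  # fixed path

savings = (old - new) / old
print(old, new, savings)  # 20 10 0.5
```

Whatever the per-call price, halving the number of embedding requests halves both the embedding cost and the embedding latency of each add() call.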
Contributor
Author
Friendly ping - any chance someone could take a look at this when they get a chance? Happy to make any changes if needed.
This fix addresses issue mem0ai#3723 where mem0.add() was calling the embedding API twice, unnecessarily doubling costs and latency.

Changes made:

1. Modified _create_memory() to accept embeddings as either a dict (for caching) or a precomputed vector, preventing redundant calls
2. Updated the infer=False path to pass embeddings as a dict
3. Added caching for action_text embeddings in the infer=True path for both ADD and UPDATE operations, since the LLM may rephrase facts
4. Applied the same fixes to both sync and async Memory classes
5. Added a regression test to verify the embedder is called only once

The root cause was that when infer=False, embeddings were passed directly to _create_memory without a dict wrapper, causing it to re-embed. When infer=True, if the LLM rephrased extracted facts, the action_text wouldn't match the cache key, triggering re-embedding.
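The infer=False failure mode comes down to Python's membership semantics: `in` checks keys on a dict but elements on a list, so a text key never matches a vector of floats. A minimal sketch of the bug (simplified; not the actual mem0 source):

```python
# Sketch of the infer=False bug described above (simplified; not the
# actual mem0 code). existing_embeddings was supposed to be a dict
# mapping text -> vector, but the infer=False path passed a raw vector.

def create_memory_buggy(data, existing_embeddings, embed):
    # `in` on a dict checks keys; `in` on a list checks elements.
    # A text string is never an element of a list of floats, so the
    # cache lookup silently misses and we embed a second time.
    if data in existing_embeddings:
        return existing_embeddings[data]
    return embed(data)  # redundant call when a vector was passed

calls = []
def embed(text):
    calls.append(text)
    return [0.1, 0.2, 0.3]

vector = embed("user likes tea")                      # first embed call
create_memory_buggy("user likes tea", vector, embed)  # re-embeds!
print(len(calls))  # 2
```

The miss is silent because comparing a string to a float with `==` is simply False, so no error ever surfaces; the only symptom is the doubled API bill.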
…st (mem0ai#3723)

Make isinstance checks numpy-safe by using `not isinstance(x, dict)` instead of `isinstance(x, list)`, add type hints to the existing_embeddings parameter, and add a regression test for the infer=True UPDATE path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
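The numpy-safe check in this commit matters because embedders are not required to return Python lists. A sketch of the difference, using a tuple as a stand-in for a numpy array to keep the example dependency-free (function names are illustrative, not from the mem0 code):

```python
# Why `not isinstance(x, dict)` beats `isinstance(x, list)`: an embedder
# may return a numpy array, a tuple, or another sequence type. A tuple
# stands in for a numpy array here to avoid the dependency.

def is_raw_vector_narrow(x):
    return isinstance(x, list)      # misses numpy arrays and tuples

def is_raw_vector_safe(x):
    return not isinstance(x, dict)  # anything that isn't the cache dict

vector_as_tuple = (0.1, 0.2, 0.3)

print(is_raw_vector_narrow(vector_as_tuple))  # False: misclassified
print(is_raw_vector_safe(vector_as_tuple))    # True: handled as a vector
print(is_raw_vector_safe({"text": [0.1]}))    # False: treated as cache
```

Inverting the check also avoids truth-value pitfalls with numpy arrays: it never evaluates the vector itself, only its type.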
kartik-mem0
approved these changes
Mar 25, 2026
Summary

This PR fixes issue #3723 where mem0.add() was calling the embedding API twice, unnecessarily doubling costs and latency for users.

Root Cause
The issue had two causes:

1. infer=False path: When infer=False, embeddings were passed directly to _create_memory() without a dict wrapper. The method checked if data in existing_embeddings, which failed since existing_embeddings was a vector list, not a dict, triggering a redundant embedding call.
2. infer=True path: When infer=True, facts extracted from messages were embedded and cached using the original fact text as the key. However, the LLM might rephrase these facts when generating ADD/UPDATE actions. If action_text didn't exactly match the cache key, _create_memory() would embed the text again.

Solution

- Modified _create_memory() and _update_memory() to handle embeddings as either a dict (for caching) or a precomputed vector
- Updated the infer=False path to wrap embeddings in a dict before calling _create_memory()
- Added proactive caching in the infer=True path: before processing ADD/UPDATE events, check if action_text is already in the cache; if not, embed it once and cache it
- Applied all fixes to both sync (Memory) and async (AsyncMemory) classes
- Made isinstance checks numpy-safe by using not isinstance(x, dict) instead of isinstance(x, list), so embedding models returning numpy arrays or other vector types are handled correctly
- Added type hints (Union[Dict[str, List[float]], List[float]]) to the existing_embeddings parameter on all four methods to document the dual contract
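The dual contract described above can be sketched as follows (simplified; parameter names follow the PR description, but the body is an illustration, not the merged mem0 code):

```python
# Sketch of the fixed embedding resolution: existing_embeddings may be
# a cache dict (text -> vector) or a precomputed raw vector. Simplified
# illustration; not the merged mem0 implementation.
from typing import Dict, List, Union

def resolve_embedding(
    data: str,
    existing_embeddings: Union[Dict[str, List[float]], List[float]],
    embed,
) -> List[float]:
    """Return a vector for `data`, embedding only on a true cache miss."""
    if not isinstance(existing_embeddings, dict):
        # Precomputed vector (list, numpy array, ...): use it directly.
        return list(existing_embeddings)
    if data in existing_embeddings:
        # Cache hit: the dict maps text -> vector.
        return existing_embeddings[data]
    # Genuine miss (e.g. the LLM rephrased the fact): embed once, cache.
    vec = embed(data)
    existing_embeddings[data] = vec
    return vec

calls = []
def embed(text):
    calls.append(text)
    return [0.5, 0.5, 0.5]

# infer=False path: precomputed vector, no embed call at all
assert resolve_embedding("fact", [0.1, 0.2], embed) == [0.1, 0.2]
# infer=True path with an LLM-rephrased fact: embeds exactly once
cache = {"original fact": [0.9, 0.9, 0.9]}
resolve_embedding("rephrased fact", cache, embed)
resolve_embedding("rephrased fact", cache, embed)
print(len(calls))  # 1
```

Because the type check never inspects the vector's contents, this single branch covers both the wrapped-dict callers and the defensive raw-vector fallback.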
Automated tests (all passing)
tests/test_memory.py— 24/24 passed, including 3 regression tests for this fix:test_add_infer_false_embeds_once— verifiesinfer=Falsepath calls embed exactly oncetest_add_infer_true_caches_embedding_on_llm_rewrite— verifiesinfer=TrueADD path pre-caches rewritten text, no redundant embed inside_create_memorytest_update_infer_true_caches_embedding_on_llm_rewrite— verifiesinfer=TrueUPDATE path pre-caches rewritten text, no redundant embed inside_update_memorytests/memory/test_main.py— 12/12 passed (timestamps, error handling, async variants)tests/memory/test_graph_memory_soft_delete.py— 20/20 passedtests/memory/full directory — 216 passed, 43 skipped (skips are external services like Neo4j/Kuzu)tests/llms/test_openai.py— 7/7 passedtests/vector_stores/test_qdrant.py— 58/58 passedtests/embeddings/test_openai_embeddings.py— 6/6 passedBackward compatibility verification
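The counting approach behind the regression tests listed above can be sketched with a stub embedder (the test names are real, but this body is an illustration, not the merged tests/test_memory.py code):

```python
# Illustration of the embed-exactly-once regression check: a stub
# embedder counts its calls, and the assertion fails if the old
# double-embedding behavior ever reappears. Simplified; not the
# actual tests/test_memory.py code.

class CountingEmbedder:
    def __init__(self):
        self.calls = 0

    def embed(self, text):
        self.calls += 1
        return [0.0, 1.0, 2.0]

def add_memory(text, embedder):
    # Fixed behavior: embed once, then pass the vector along in a dict
    # so downstream lookups hit the cache instead of re-embedding.
    cache = {text: embedder.embed(text)}
    return cache[text]  # cache hit: no second embed call

embedder = CountingEmbedder()
add_memory("user likes tea", embedder)
assert embedder.calls == 1, "regression: embedding called more than once"
print(embedder.calls)  # 1
```

A call counter is more robust here than asserting on return values, since the bug produced correct vectors; only the number of API calls was wrong.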
Backward compatibility verification

- All existing callers of _create_memory and _update_memory (across production code and tests) were audited: every one passes a dict, so the primary code path is unchanged
- The not isinstance(x, dict) fallback branch is purely defensive for the documented List[float] type
- existing_embeddings is never None, so the relaxed check introduces no new failure modes
Manual testing

- Verified that both infer=True and infer=False paths now reuse cached embeddings

Impact

Closes #3723