fix(oss): normalize malformed LLM fact output before embedding by amahuli03 · Pull Request #4224 · mem0ai/mem0

amahuli03 · 2026-03-05T17:41:24Z

Description

When calling POST /api/v1/memories/ with infer=true, smaller LLMs (e.g. Ollama's llama3.1:8b) intermittently return facts as objects ({"fact": "..."} or {"text": "..."}) instead of plain strings. These non-string values are passed directly to embedding_model.embed(), which calls .replace("\n", " ") on them and raises an AttributeError such as 'dict' object has no attribute 'replace' (issue #4100 reports the 'list' variant of the same error).

Fixes #4100

Root cause

The fact extraction path in mem0/memory/main.py parses the LLM's JSON response and passes each fact directly to the embedding model without validating its type:

new_retrieved_facts = json.loads(response)["facts"]
# ...
for new_mem in new_retrieved_facts:
    messages_embeddings = self.embedding_model.embed(new_mem, "add")  # assumes string

The prompt asks the LLM to return {"facts": ["string1", "string2"]}, but smaller models don't reliably follow this format. Instead they return structures like:

{"facts": [{"fact": "User likes Python"}, {"text": "User is a developer"}]}

This is a known issue — the TypeScript SDK already fixed it in db15d5c6 by adding a FactRetrievalSchema that normalizes these malformed shapes before embedding. See mem0-ts/src/oss/src/prompts/index.ts. The Python SDK was missing equivalent validation.
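The failure mode described in this section can be reproduced in isolation. The following is a minimal sketch, not the mem0 code: StubEmbedder is a hypothetical stand-in that only mimics the .replace("\n", " ") call mentioned in the description, and its second parameter name is an assumption.

```python
import json


class StubEmbedder:
    """Hypothetical stand-in for the embedding model (not mem0's class)."""

    def embed(self, text, memory_action=None):
        # Mirrors the string-only assumption: works for str, fails for dict/list.
        text = text.replace("\n", " ")
        return [float(len(text))]


# Malformed response shape taken from the PR description.
response = '{"facts": [{"fact": "User likes Python"}, {"text": "User is a developer"}]}'
facts = json.loads(response)["facts"]

embedder = StubEmbedder()
try:
    embedder.embed(facts[0], "add")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Because each fact is a dict here, the call fails inside embed() rather than at parse time, which is why the error surfaces far from the LLM response that caused it.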

Fix

This PR ports that validation to Python by adding normalize_facts() in mem0/memory/utils.py, which handles:

Plain strings (passthrough)
{"fact": "..."} objects (extracts the fact value)
{"text": "..."} objects (extracts the text value)
Other types (converts via str())
Empty strings (filtered out)

It is called in both the sync and async _add_to_vector_store paths immediately after JSON parsing, before facts reach the embedder.
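A minimal sketch of the normalization rules listed above follows. It is an illustration written from this description, not the merged normalize_facts(): the name normalize_facts_sketch is hypothetical, and the None guard and the skipping of dicts without a "fact" or "text" key are assumptions based on the title of the later hardening commit.

```python
from typing import Any, List


def normalize_facts_sketch(raw_facts: Any) -> List[str]:
    """Illustrative stand-in for normalize_facts(); not the merged code."""
    if raw_facts is None:  # assumed guard: "facts" may be null in the LLM output
        return []

    normalized: List[str] = []
    for item in raw_facts:
        if isinstance(item, str):
            fact = item  # plain strings pass through unchanged
        elif isinstance(item, dict):
            # Accept the two malformed shapes named in the PR.
            value = item.get("fact", item.get("text"))
            if value is None:
                continue  # assumed: unknown dict shapes are skipped
            fact = str(value)
        elif item is None:
            continue  # assumed: None entries are skipped
        else:
            fact = str(item)  # other types are converted via str()
        if fact:  # empty strings are filtered out
            normalized.append(fact)
    return normalized


facts = normalize_facts_sketch(
    ["User likes Python", {"fact": "User is a developer"}, {"text": "Lives in Pune"}, ""]
)
```

In this sketch every value that reaches the embedder is a non-empty str, so the .replace call inside embed() can no longer fail on dict or list input.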

Related issues: #3439, #3238

Type of change


Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?


Unit Test
hatch run pytest tests/test_memory.py — 17/17 pass.

Tests added:

Reproduction test (test_add_infer_with_malformed_llm_facts): mocks LLM returning dict-shaped facts with infer=True, confirms no AttributeError.
normalize_facts unit tests: plain strings, {"fact": ...}, {"text": ...}, mixed lists, empty string filtering
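The reproduction test could take roughly the following shape. This is a sketch under stated assumptions, not the test added in this PR: extract_and_embed is a hypothetical, heavily reduced stand-in for the fact-extraction path, the mocked method name generate_response is assumed, and the inline normalization covers only the two dict shapes named above.

```python
import json
from unittest.mock import MagicMock


def extract_and_embed(llm, embedder):
    """Hypothetical, reduced stand-in for the fact-extraction path."""
    raw = json.loads(llm.generate_response())["facts"]
    # Inline normalization, limited to strings and the two known dict shapes.
    facts = [f if isinstance(f, str) else f.get("fact") or f.get("text") for f in raw]
    return [embedder.embed(f, "add") for f in facts if f]


def test_add_infer_with_malformed_llm_facts_sketch():
    llm = MagicMock()
    llm.generate_response.return_value = json.dumps(
        {"facts": [{"fact": "User likes Python"}, {"text": "User is a developer"}]}
    )
    embedder = MagicMock()
    # Mimic the real failure mode: embedding calls .replace on its input.
    embedder.embed.side_effect = lambda text, action: [float(len(text.replace("\n", " ")))]

    vectors = extract_and_embed(llm, embedder)

    # No AttributeError was raised, and the embedder only ever saw plain strings.
    assert len(vectors) == 2
    embedder.embed.assert_any_call("User likes Python", "add")
    embedder.embed.assert_any_call("User is a developer", "add")
```

Giving the mocked embedder a side_effect that calls .replace keeps the test honest: if a dict slipped through, the mock would raise the same AttributeError as the real embedder.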

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have checked my code and corrected any misspellings

Maintainer Checklist

Closes #4100: OpenMemory infer=true fails with 'list' object has no attribute 'replace' (Qdrant path)
Made sure Checks passed

…facts When smaller LLMs return facts as objects ({"fact": "..."}) instead of plain strings, embedding_model.embed() crashes with 'dict' object has no attribute 'replace'. This test confirms the bug by mocking the LLM to return dict-shaped facts and asserting the failure at mem0/memory/main.py:473.


kartik-mem0

thank you for your contribution @amahuli03

the solution lgtm!


amahuli03 and others added 5 commits March 5, 2026 11:41

Add normalize_facts() utility for malformed LLM fact extraction output

3ae1120

Port of TypeScript FactRetrievalSchema to Python. Normalizes facts that smaller LLMs return as {"fact": "..."} or {"text": "..."} objects back into plain strings before embedding.

Wire normalize_facts into sync and async fact extraction paths

206ec35

After parsing LLM JSON response, normalize facts before passing them to the embedding model. Fixes 'list'/'dict' object has no attribute 'replace' when smaller LLMs return malformed fact objects.

Add unit tests for normalize_facts covering all LLM fact shapes

99797a1

Tests plain strings, {"fact": ...}, {"text": ...}, mixed lists, and empty string filtering.

fix: harden normalize_facts with None guard and skip unknown dict shapes

2fc9368

kartik-mem0 self-requested a review March 18, 2026 14:42

kartik-mem0 approved these changes Mar 18, 2026


kartik-mem0 merged commit 577a5a2 into mem0ai:main Mar 18, 2026
8 checks passed

jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026

fix(oss): normalize malformed LLM fact output before embedding (mem0a…

bb4afab

…i#4224) Co-authored-by: kartik-mem0 <kartik.labhshetwar@mem0.ai>

kartik-mem0 merged 5 commits into mem0ai:main from amahuli03:4100/post-memories-with-infer-true
