
fix(oss): validate LLM fact output via FactRetrievalSchema before embedding #4083

Merged
deshraj merged 1 commit into mem0ai:main from mgoulart:fix/ollama-embed-non-string-facts
Feb 23, 2026

Conversation

@mgoulart
Contributor

Summary

Fixes #4081

addToVectorStore passes raw LLM-extracted facts directly to embedder.embed() without validating that they are strings. Local or smaller LLMs (e.g. llama3.1:8b via Ollama) sometimes return facts as objects instead of plain strings:

// Expected
{"facts": ["User prefers dark mode", "Name is Alice"]}

// What llama3.1:8b sometimes returns
{"facts": [{"fact": "User prefers dark mode"}, {"fact": "Name is Alice"}]}

This causes Ollama's Go server to crash with:

ResponseError: json: cannot unmarshal object into Go struct field EmbeddingRequest.prompt of type string

Approach

FactRetrievalSchema already exists in prompts/index.ts but was never used at the parse site. This PR:

  1. prompts/index.ts — Extends FactRetrievalSchema with a z.union transform that accepts string | { fact: string } | { text: string } and normalizes to string[], filtering out empty strings.

  2. memory/index.ts — Replaces raw JSON.parse().facts with FactRetrievalSchema.parse() so the schema is actually used.

  3. embeddings/ollama.ts — One-line safety net (typeof text === "string" ? text : JSON.stringify(text)) in case non-string values reach the embedder from other callers.
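As a dependency-free sketch, this is the normalization the Zod union transform performs (the actual PR expresses it with z.union and .transform inside FactRetrievalSchema; the function names here are illustrative, not the real mem0 code):

```typescript
// Sketch of the fact-normalization logic. Each raw item may be a plain
// string or a { fact } / { text } wrapper object from a smaller LLM.
type RawFact = string | { fact: string } | { text: string };

function normalizeFact(item: unknown): string | null {
  if (typeof item === "string") return item;
  if (item && typeof item === "object") {
    const o = item as Record<string, unknown>;
    if (typeof o.fact === "string") return o.fact; // { fact: "..." } shape
    if (typeof o.text === "string") return o.text; // { text: "..." } shape
  }
  return null; // unrecognized shape is dropped
}

function normalizeFacts(items: unknown[]): string[] {
  return items
    .map(normalizeFact)
    .filter((s): s is string => s !== null && s.length > 0); // drop empties
}
```

Unrecognized shapes and empty strings are silently dropped rather than rejected, matching the PR's goal of keeping memory.add() resilient to malformed model output.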

Test plan

  • memory.add() with llama3.1:8b returning [{ fact: "..." }] — normalizes and stores correctly
  • memory.add() with models returning ["string"] — no behavior change
  • Zod .parse() catches completely invalid JSON and falls through to facts = []
  • Empty facts are filtered out and don't enter the vector store
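The fall-through behavior in the last two bullets can be sketched as follows (function name and structure hypothetical, not the actual memory/index.ts code):

```typescript
// Hypothetical sketch of the parse site: validate the LLM response and
// fall through to an empty fact list on invalid output.
function extractFacts(llmResponse: string): string[] {
  try {
    const parsed = JSON.parse(llmResponse); // throws on invalid JSON
    const items: unknown[] = Array.isArray(parsed?.facts) ? parsed.facts : [];
    return items
      .map((it) => {
        if (typeof it === "string") return it;
        const o = it as Record<string, unknown> | null;
        if (o && typeof o.fact === "string") return o.fact;
        if (o && typeof o.text === "string") return o.text;
        return "";
      })
      .filter((s) => s.length > 0); // empty facts never reach the vector store
  } catch {
    return []; // completely invalid JSON falls through to facts = []
  }
}
```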

🤖 Generated with Claude Code

…edding

Local/smaller LLMs (e.g. llama3.1:8b via Ollama) sometimes return facts
as objects ({ fact: "..." }) instead of plain strings. addToVectorStore
passed these directly to the embedder, crashing Ollama's Go server:

  ResponseError: json: cannot unmarshal object into Go struct
    field EmbeddingRequest.prompt of type string

Fix: use the existing (but unused) FactRetrievalSchema with a z.union
transform to accept string | { fact } | { text } shapes and normalize
to string[]. Add a one-line safety net in OllamaEmbedder.embed().

Fixes mem0ai#4081

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@CLAassistant

CLAassistant commented Feb 20, 2026

CLA assistant check
All committers have signed the CLA.

@mem0-bot
Contributor

mem0-bot commented Feb 23, 2026

Great fix for a real production issue! 👍

Thank you for tackling this crash with local LLMs. The approach is solid and the implementation is clean.

What I like:

  • Smart use of existing infrastructure: Leveraging the already-defined FactRetrievalSchema that wasn't being used
  • Robust normalization: The Zod union with transforms elegantly handles the common malformed shapes from smaller LLMs
  • Defense in depth: Adding the safety net in OllamaEmbedder is good defensive programming
  • Backward compatibility: Existing working code continues to work unchanged
  • Clear problem statement: The issue description and reproduction steps are excellent

Code quality observations:

mem0-ts/src/oss/src/prompts/index.ts: The factItem union type is well-designed. The transformation logic cleanly extracts text from {fact: string} and {text: string} objects while preserving plain strings.

mem0-ts/src/oss/src/memory/index.ts: Good replacement of raw JSON parsing with proper schema validation. Error handling maintains existing behavior.

mem0-ts/src/oss/src/embeddings/ollama.ts: The defensive JSON.stringify() fallback is reasonable, though for complex objects it might produce less meaningful embeddings than extracting a text field would.
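For reference, the coercion described above can be sketched like this (function name assumed for illustration; the PR inlines the ternary in OllamaEmbedder.embed()):

```typescript
// Illustrative one-line safety net: coerce any non-string value to a
// string before it is sent as the embedding prompt, so Ollama's Go
// server never receives an object where a string is expected.
function toEmbeddingPrompt(text: unknown): string {
  return typeof text === "string" ? text : JSON.stringify(text);
}
```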

The fix is focused, low-risk, and addresses the root cause appropriately. Nice work! 🚀

@deshraj deshraj merged commit db15d5c into mem0ai:main Feb 23, 2026
1 of 2 checks passed
jamebobob pushed a commit to jamebobob/mem0-vigil-recall that referenced this pull request Mar 29, 2026


Development

Successfully merging this pull request may close these issues.

addToVectorStore passes raw LLM output (may be object, not string) to embedder.embed() — crashes Ollama with 'cannot unmarshal object'

3 participants