research(memory): MemMachine — retrieval-depth-first personalized memory architecture (arXiv:2604.04853) #3325

@bug-ops

Description

MemMachine presents a ground-truth-preserving personalized memory system that stores full conversational episodes alongside processed summaries. Key finding: retrieval-stage optimization substantially outperforms ingestion-stage optimization.

Key Insights

Retrieval improvements (total ~9.5%):

  • Retrieval depth tuning: +4.2%
  • Context formatting: +2.0%
  • Search prompt design: +1.8%
  • Query bias correction: +1.4%

Ingestion improvements:

  • Sentence chunking: +0.8% (much smaller gain)

Conclusion: how you retrieve stored information matters far more than how you chunk/process it at ingestion time.

Efficiency: 80% fewer input tokens vs. competing systems; 93% accuracy on LongMemEvalS.

Three Memory Layers

  1. Short-term (in-context)
  2. Long-term episodic (full conversation episodes preserved)
  3. Profile memory (distilled user model)
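The three layers can be sketched as Rust types. This is a minimal illustration of the layering described above, not MemMachine's actual data model; the names (`MemoryLayer`, `Turn`, etc.) are assumptions.

```rust
/// A single stored conversational turn (illustrative type).
#[derive(Debug, Clone)]
struct Turn {
    role: String,
    content: String,
}

/// The three memory layers from the paper, as a hypothetical enum.
#[derive(Debug)]
enum MemoryLayer {
    /// Short-term: turns currently held in the model's context window.
    ShortTerm(Vec<Turn>),
    /// Long-term episodic: full episodes preserved verbatim, with a summary.
    Episodic {
        episode_id: u64,
        turns: Vec<Turn>,
        summary: String,
    },
    /// Profile: a distilled user model, here as key-value facts.
    Profile(Vec<(String, String)>),
}

fn main() {
    let layer = MemoryLayer::Episodic {
        episode_id: 1,
        turns: vec![Turn {
            role: "user".into(),
            content: "I moved to Berlin last month".into(),
        }],
        summary: "User relocated to Berlin".into(),
    };
    // The episodic layer keeps the raw turns alongside the summary,
    // which is the ground-truth-preserving property the paper emphasizes.
    if let MemoryLayer::Episodic { turns, summary, .. } = &layer {
        assert_eq!(turns.len(), 1);
        assert!(summary.contains("Berlin"));
    }
}
```

The key design point for Zeph: `Episodic` holds both `turns` and `summary`, so summarization never discards the underlying episode.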

Relevance to Zeph

Zeph uses summarization at write time (ingestion side). MemMachine's findings suggest that investing in retrieval-side tuning (search prompt design, retrieval depth, context formatting, query bias correction) would yield higher returns than further improving the summarization pipeline.

Concrete actions:

  • Add retrieval depth config: number of candidates to fetch before MMR/reranking
  • Improve search prompt templates sent to the embedding/retrieval model
  • Add query bias correction: detect when query is first-person vs. topic query and adjust embedding
  • Preserve raw episode data alongside summaries (the current implementation may store only summaries)
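The first three actions could be sketched as a retrieval-side config plus a naive query-bias check. Everything here is an assumption for illustration: the field names, default values, and the `is_first_person` heuristic are not existing Zeph APIs.

```rust
/// Hypothetical retrieval-stage parameters (the paper's biggest wins live here).
#[derive(Debug, Clone)]
struct RetrievalConfig {
    /// Candidates fetched from the index before MMR/reranking.
    retrieval_depth: usize,
    /// Final results kept after reranking.
    top_k: usize,
    /// Template wrapped around the query before embedding.
    search_prompt: String,
}

impl Default for RetrievalConfig {
    fn default() -> Self {
        Self {
            retrieval_depth: 50, // fetch wide, then rerank down (assumed default)
            top_k: 8,
            search_prompt: "Search the user's memory for: {query}".into(),
        }
    }
}

/// Naive query-bias check: first-person queries ("what did I say...") should
/// be steered toward profile/episodic memory rather than topical search.
fn is_first_person(query: &str) -> bool {
    query.split_whitespace().any(|w| {
        let w = w.to_lowercase();
        matches!(
            w.trim_matches(|c: char| !c.is_alphanumeric()),
            "i" | "my" | "me" | "mine"
        )
    })
}

fn main() {
    let cfg = RetrievalConfig::default();
    // Depth must be at least top_k so reranking has candidates to discard.
    assert!(cfg.retrieval_depth >= cfg.top_k);
    assert!(is_first_person("what did I say about Berlin?"));
    assert!(!is_first_person("history of Berlin"));
}
```

A real implementation would replace the keyword heuristic with something sturdier (e.g. a small classifier), but even this split lets the retriever bias first-person queries toward profile memory.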

Source

Metadata

Metadata

Labels

Labels:

  • P3
  • Research — medium-high complexity
  • memory
  • zeph-memory crate (SQLite)
  • research
  • Research-driven improvement
