Description
arXiv:2604.01007 (April 1, 2026) — Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory by Liu et al. (UNC-Chapel Hill, UC Berkeley, UC Santa Cruz, Cisco).
OmniMem proposes an autonomous research pipeline that systematically discovers and validates memory architecture improvements through self-experimentation — running ~50 experiments across two benchmarks to diagnose failure modes and propose architectural modifications.
Code: https://github.com/aiming-lab/OmniMem
Key Results
- LoCoMo benchmark: F₁ improved from 0.117 → 0.598 (+411%)
- Mem-Gallery benchmark: F₁ improved from 0.254 → 0.797 (+214%)
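The percentages above are relative gains over the baseline F₁, not absolute differences. A minimal sketch of the arithmetic, using only the scores quoted above (the helper name `relative_gain_pct` is illustrative, not from the paper):

```python
def relative_gain_pct(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100.0


# F1 scores as quoted in the Key Results list.
reported = {
    "LoCoMo": (0.117, 0.598),
    "Mem-Gallery": (0.254, 0.797),
}

for name, (before, after) in reported.items():
    print(f"{name}: {relative_gain_pct(before, after):+.0f}%")
# LoCoMo: +411%
# Mem-Gallery: +214%
```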
What the Autoresearch Pipeline Discovered
Top gains by category:
- Bug fixes: +175% — the autoresearch pipeline finds and patches its own memory bugs
- Prompt engineering: +188% on specific categories
- Architectural changes: +44%
Crucially, architecture changes from self-experimentation outperformed all hyperparameter tuning combined.
Relevance to Zeph
Zeph's self-learning system (zeph-skills, ReasoningMemory, SkillHeuristics) already accumulates outcomes and extracts heuristics. OmniMem extends this concept to the memory architecture itself:
- Memory architecture self-improvement: Rather than only tuning skill strategies, the pipeline diagnoses why memory retrieval fails and proposes architectural patches
- Multimodal memory: OmniMem handles image + text memories; Zeph is currently text-only. This opens a path to image/attachment memory.
- Autoresearch evaluation loop: The pipeline runs automated evaluations (LoCoMo-style) without human oversight — a natural fit for Zeph's CI cycle
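For the evaluation loop, a CI job needs a score it can compute without a human in the loop. The sketch below assumes a token-overlap F₁ of the kind commonly used for QA-style benchmarks. The source only says the results are F₁ scores, so the exact tokenization and normalization used by LoCoMo may differ, and `token_f1`, `mean_f1`, and `passes_gate` are hypothetical names rather than Zeph or OmniMem APIs.

```python
from collections import Counter
from typing import Iterable, Tuple


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average F1 over (prediction, reference) pairs; 0.0 for an empty set."""
    scores = [token_f1(p, r) for p, r in pairs]
    return sum(scores) / len(scores) if scores else 0.0


def passes_gate(pairs: Iterable[Tuple[str, str]], threshold: float) -> bool:
    """CI-style gate: fail the run when mean F1 drops below `threshold`."""
    return mean_f1(pairs) >= threshold
```

A CI step could call `passes_gate` on a small fixed question set and fail the build on regression. The threshold would have to be chosen from Zeph's own baseline runs.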
Proposed Design Direction
- Extend the skill_outcomes / reasoning_strategies self-learning loop to also log memory retrieval failures (no-hit turns, low-confidence recalls)
- Add a periodic background task (via zeph-scheduler) that runs a self-evaluation micro-benchmark on recent memory retrievals
- Use retrieval failure analysis to tune SYNAPSE spreading activation parameters (decay factor, depth, MMR threshold)
- Track memory architecture improvement suggestions in the skill_heuristics table, distinct from skill heuristics
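The first three steps above can be sketched as follows. This is an illustration under stated assumptions, not Zeph's implementation: `RetrievalEvent`, `classify`, `failure_rate`, `ActivationParams`, and `grid_search` are hypothetical names, the 0.5 confidence cutoff is arbitrary, and exhaustive grid search is a simple stand-in for whatever tuning strategy is eventually chosen. Only the three parameter names (decay factor, depth, MMR threshold) come from the proposal itself.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Optional, Sequence


@dataclass(frozen=True)
class RetrievalEvent:
    """One logged memory lookup (hypothetical schema)."""
    turn_id: int
    hits: int          # number of memories returned
    top_score: float   # best confidence score; 0.0 when there are no hits


def classify(event: RetrievalEvent, min_confidence: float = 0.5) -> Optional[str]:
    """Label a lookup as a failure kind, or None if it looks healthy."""
    if event.hits == 0:
        return "no_hit"
    if event.top_score < min_confidence:
        return "low_confidence"
    return None


def failure_rate(events: Iterable[RetrievalEvent], min_confidence: float = 0.5) -> float:
    """Share of lookups that were no-hit or low-confidence."""
    events = list(events)
    if not events:
        return 0.0
    failures = sum(classify(e, min_confidence) is not None for e in events)
    return failures / len(events)


@dataclass(frozen=True)
class ActivationParams:
    """Spreading-activation knobs named in the proposal."""
    decay: float
    depth: int
    mmr_threshold: float


def grid_search(
    evaluate: Callable[[ActivationParams], float],
    decays: Sequence[float],
    depths: Sequence[int],
    mmr_thresholds: Sequence[float],
) -> ActivationParams:
    """Return the candidate with the lowest `evaluate` score (e.g. failure rate).

    Raises ValueError if any of the candidate lists is empty.
    """
    candidates = [
        ActivationParams(d, k, m)
        for d, k, m in product(decays, depths, mmr_thresholds)
    ]
    return min(candidates, key=evaluate)
```

In a scheduled task, `evaluate` would replay recent queries with the candidate parameters and return the resulting `failure_rate`. The winning parameters and the failure breakdown could then be recorded as a memory-architecture suggestion rather than applied automatically.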