feat: memory consolidation upgrades (LLM Merge, Search-and-Ask, diversity penalty) #704

## Context

Deep dive on ["Complementary Reinforcement Learning" (arXiv:2603.17621)](https://arxiv.org/abs/2603.17621), a co-evolutionary actor + experience-extractor framework with a living experience bank.

The RL training loop itself is out of scope (no model weights), but four patterns are adoptable today without it. Key risk, from the paper's own ablation (Figure 3a): a static extractor without RL yields only marginal gains, so the architecture is adoptable but the performance claims are RL-specific.

## Action Items (No RL Required)

- [ ] **LLMConsolidationStrategy**: Implement an LLM-based Merge operation behind the existing `ConsolidationStrategy` protocol. Feeds related memories + trajectory outcomes to an LLM for semantic deduplication and synthesis. Addresses the gap that `SimpleConsolidationStrategy` performs no semantic dedup.
- [ ] **Search-and-Ask retrieval tool**: Add `search_memory` as a tool-based retrieval strategy in `memory/injection.py`. The agent can explicitly query memory mid-execution rather than relying only on pre-injected context.
- [ ] **Retrieval diversity penalty**: Augment `memory/ranking.py` scoring to penalize retrieving memories that are too similar to each other. Promotes coverage over redundancy.
- [ ] **DistillationRequest capture**: New model capturing (trajectory_summary, outcome, retrieved_memories) at task completion. Feeds into the consolidation pipeline for outcome-driven memory curation.
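
One way the retrieval diversity penalty could work is a greedy, MMR-style (maximal marginal relevance) re-rank: pick the best candidate, then discount each remaining candidate by its highest similarity to anything already picked. The sketch below is illustrative only. `ScoredMemory`, `rerank_with_diversity`, and the `penalty` weight are hypothetical names, not the current contents of `memory/ranking.py`, and cosine similarity over embeddings is an assumed similarity measure.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt
from typing import Sequence


@dataclass(frozen=True)
class ScoredMemory:
    """Hypothetical stand-in for a memory entry that has already been scored."""

    memory_id: str
    relevance: float              # base score from the existing ranker (assumed)
    embedding: tuple[float, ...]  # vector used to measure similarity (assumed)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank_with_diversity(
    candidates: Sequence[ScoredMemory], k: int, penalty: float = 0.5
) -> list[ScoredMemory]:
    """Greedily select up to k memories, discounting near-duplicates.

    Adjusted score = relevance - penalty * (max similarity to any memory
    already selected). With penalty=0 this reduces to plain top-k by relevance.
    """
    selected: list[ScoredMemory] = []
    remaining = list(candidates)

    while remaining and len(selected) < k:

        def adjusted(m: ScoredMemory) -> float:
            max_sim = max(
                (cosine(m.embedding, s.embedding) for s in selected),
                default=0.0,
            )
            return m.relevance - penalty * max_sim

        best = max(remaining, key=adjusted)
        selected.append(best)
        remaining.remove(best)

    return selected
```

The greedy loop costs O(k·n) similarity computations per query, which is likely acceptable for small candidate sets; the `penalty` value would need tuning against real retrieval traces.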
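
The `DistillationRequest` capture and the LLM Merge strategy could fit together roughly as sketched below. This is a sketch under stated assumptions: the `consolidate` method signature, the string-based memory representation, and the injected `complete` callable are all hypothetical, and the project's actual `ConsolidationStrategy` protocol may look different. Only the three `DistillationRequest` fields come from the action item above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Protocol, Sequence


@dataclass(frozen=True)
class DistillationRequest:
    """Captured at task completion; fields follow the action item above."""

    trajectory_summary: str
    outcome: str
    retrieved_memories: tuple[str, ...]


class ConsolidationStrategy(Protocol):
    """Assumed shape of the protocol; the real one may differ."""

    def consolidate(
        self, memories: Sequence[str], request: DistillationRequest
    ) -> list[str]: ...


class LLMConsolidationStrategy:
    """Asks an LLM to merge related memories into a deduplicated list."""

    def __init__(self, complete: Callable[[str], str]) -> None:
        # `complete` is a hypothetical prompt -> text callable, injected so
        # that any provider (or a test double) can be plugged in.
        self._complete = complete

    def consolidate(
        self, memories: Sequence[str], request: DistillationRequest
    ) -> list[str]:
        if not memories:
            return []  # nothing to merge, so skip the LLM call

        bullet_list = "\n".join(f"- {m}" for m in memories)
        prompt = (
            "Merge the memories below into a minimal, non-redundant list, "
            "one per line.\n"
            f"Task summary: {request.trajectory_summary}\n"
            f"Outcome: {request.outcome}\n"
            f"Memories:\n{bullet_list}"
        )
        reply = self._complete(prompt)

        # Drop blank lines and leading bullet markers from the reply.
        cleaned = (line.strip().lstrip("-* ").strip() for line in reply.splitlines())
        merged = [c for c in cleaned if c]

        # Fall back to the unmerged input if the reply is unusable.
        return merged or list(memories)
```

Injecting the completion function keeps the strategy testable without network access, and the fallback avoids losing memories when the model returns an empty reply.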

## Comparison with Prior Research

- EvoSkill (2026-03-14): Failure-driven skill discovery. Complementary RL adds reward-signal-driven curation.
- Memex(RL) (2026-03-14): Indexed experience memory. Complementary RL adds the extractor LLM concept.
- XSkill (2026-03-22): Dual-stream extraction. Complementary RL provides the co-evolutionary training framework.

## References

- [arXiv:2603.17621](https://arxiv.org/abs/2603.17621)
- Related: #694 (hybrid search), #695 (embedding selection), #702 (memory consistency)
