
[Feature]: Memory Compression and Deduplication Optimization #141

@Teingi


Describe your use case

Over time, memory databases can accumulate:

  • Duplicate or highly similar memories
  • Redundant information that can be consolidated
  • Outdated memories that should be compressed or merged
  • Low-importance memories taking up storage

Users need automatic optimization to:

  • Reduce storage costs
  • Improve search performance
  • Maintain memory quality
  • Consolidate related memories

Describe the solution you'd like

Implement an intelligent compression and deduplication system:

  1. Deduplication:

    • Detect duplicate memories using embedding-based semantic similarity
    • Detect near-duplicates with a configurable similarity threshold
    • Merge duplicates intelligently (keep the most important memory, combine metadata)
    • Report deduplication results
  2. Compression:

    • Identify similar memories that can be merged
    • Use an LLM to summarize and consolidate related memories
    • Preserve key information while reducing redundancy
    • Maintain memory relationships and metadata
  3. Optimization strategies:

    • Automatic: Run periodically based on configuration
    • Manual: memory.optimize(strategy="deduplicate") or memory.compress()
    • Selective: Optimize specific user/agent memories or date ranges
  4. Configuration:

    • Similarity threshold for deduplication
    • Compression aggressiveness (conservative vs. aggressive)
    • Scheduling (daily, weekly, on-demand)
    • Dry-run mode to preview changes
  5. API:

    # Deduplication
    results = memory.deduplicate(user_id="user123", threshold=0.95)
    # Returns: {"duplicates_found": 10, "merged": 10, "saved_space": "..."}
    
    # Compression
    results = memory.compress(user_id="user123", strategy="conservative")
    # Returns: {"compressed": 5, "original_count": 20, "compressed_count": 15}
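As a rough illustration of the deduplication item above, the following is a minimal sketch of threshold-based near-duplicate detection over embeddings. It is not the project's implementation: the `Memory` record shape, the `importance` field, and the standalone `deduplicate` function are assumptions made for the example, and the greedy pass is just one simple way to group near-duplicates.

```python
from dataclasses import dataclass, field, replace
from math import sqrt


@dataclass
class Memory:
    # Hypothetical record shape; the real schema is not specified in this issue.
    id: str
    text: str
    embedding: list
    importance: float = 0.0
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(memories, threshold=0.95, dry_run=False):
    """Greedy pass: each memory is merged into the first kept memory it resembles."""
    kept = []          # surviving memories (copies, so inputs are not mutated)
    merged_pairs = []  # (duplicate_id, survivor_id)

    # Visit the most important memories first so each group's survivor
    # is its most important member.
    for mem in sorted(memories, key=lambda m: m.importance, reverse=True):
        match = next(
            (k for k in kept
             if cosine_similarity(k.embedding, mem.embedding) >= threshold),
            None,
        )
        if match is None:
            kept.append(replace(mem, metadata=dict(mem.metadata)))
            continue
        merged_pairs.append((mem.id, match.id))
        # Combine metadata; the survivor's values win on key conflicts.
        match.metadata = {**mem.metadata, **match.metadata}

    report = {
        "duplicates_found": len(merged_pairs),
        "merged": 0 if dry_run else len(merged_pairs),
        "pairs": merged_pairs,
    }
    # In dry-run mode the original list is returned unchanged.
    return (list(memories) if dry_run else kept), report
```

The pairwise comparison is quadratic in the number of memories, so a real system would likely delegate the similarity search to the vector store's own nearest-neighbour index.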

The solution should:

  • Be safe (backup before optimization, rollback on errors)
  • Preserve important information
  • Maintain memory relationships and graph connections
  • Provide detailed reports of optimizations
  • Support incremental optimization (process in batches)
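
The safety and batching requirements above could be sketched roughly as follows. This is only an illustration under stated assumptions: `store` is a plain dict standing in for the memory backend, and `optimize_in_batches` and the `optimize_batch` callback are hypothetical names, not part of any existing API.

```python
import copy


def optimize_in_batches(store, optimize_batch, batch_size=100):
    """Apply optimize_batch to slices of store; restore a snapshot on any error.

    store: dict mapping memory id -> record (hypothetical in-memory stand-in).
    optimize_batch: callable that takes a dict of records and returns
        (updated_records, removed_ids).
    """
    backup = copy.deepcopy(store)  # backup before optimization
    report = {"batches": 0, "removed": 0, "rolled_back": False}
    ids = list(store)
    try:
        for start in range(0, len(ids), batch_size):
            # Skip ids that an earlier batch already removed.
            batch = {i: store[i] for i in ids[start:start + batch_size] if i in store}
            updated, removed = optimize_batch(batch)
            store.update(updated)
            for i in removed:
                store.pop(i, None)
            report["batches"] += 1
            report["removed"] += len(removed)
    except Exception as exc:
        # Rollback on errors: discard partial changes and restore the snapshot.
        store.clear()
        store.update(backup)
        report = {"batches": 0, "removed": 0, "rolled_back": True, "error": str(exc)}
    return report
```

A full deep copy is the simplest form of backup but doubles memory use; a production system would more plausibly rely on database transactions or per-batch snapshots.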

Describe alternatives you've considered

No response

Additional context

No response

Status: Done
