Feature: Structured Memory System — Typed Nodes, Graph Edges, and Hybrid Search #346

@teknium1

Overview

Hermes Agent's current memory system uses flat text files (MEMORY.md and USER.md) with character limits and delimiter-separated entries. While functional and simple, it cannot express relationships between memories, distinguish between types of knowledge, decay stale information, or perform semantic search. As conversations accumulate, the flat format becomes a bottleneck — important facts compete for limited space with transient observations, and the agent has no way to know which memories relate to or contradict each other.

Spacedrive's Spacebot implements a substantially more sophisticated memory architecture: 8 typed memory nodes, 6 graph edge types, hybrid vector + FTS + graph search, importance decay, and a Memory Bulletin (a periodic knowledge synthesis injected into all conversations). Spacebot's compactor also extracts and saves memories during context compression, so information dropped from the ephemeral context is moved to persistent storage rather than lost.

This feature proposes upgrading Hermes Agent's memory to a structured system inspired by Spacebot's approach, while preserving the simplicity that makes the current system easy to understand.

Research source: Spacebot memory source code (roughly 2,000 lines across types.rs, store.rs, search.rs, lance.rs, and maintenance.rs)


Research Findings

Spacebot's Memory Architecture

8 Memory Types (with default importance)

| Type | Default Importance | Purpose |
|---|---|---|
| Identity | 1.0 | Core facts about the agent or user |
| Goal | 0.9 | Active objectives and aspirations |
| Decision | 0.8 | Choices made with their rationale |
| Todo | 0.8 | Action items and tasks |
| Preference | 0.7 | Likes, dislikes, style preferences |
| Fact | 0.6 | General knowledge and information |
| Event | 0.4 | Things that happened |
| Observation | 0.3 | Patterns noticed, inferences |
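
On the Hermes side, the eight types and their defaults could be captured as a small lookup. This is a minimal sketch, not Spacebot's code: the names `DEFAULT_IMPORTANCE` and `default_importance` are hypothetical, and lowercasing the type names is an assumption made here for convenience.

```python
# Default importance per memory type, copied from the table above.
# The dict and function names are illustrative, not an existing Hermes API.
DEFAULT_IMPORTANCE = {
    "identity": 1.0,
    "goal": 0.9,
    "decision": 0.8,
    "todo": 0.8,
    "preference": 0.7,
    "fact": 0.6,
    "event": 0.4,
    "observation": 0.3,
}


def default_importance(memory_type: str) -> float:
    """Return the default importance for a type name, case-insensitively."""
    key = memory_type.strip().lower()
    if key not in DEFAULT_IMPORTANCE:
        raise ValueError(f"unknown memory type: {memory_type!r}")
    return DEFAULT_IMPORTANCE[key]
```

Rejecting unknown type names early keeps a typo from the model (for example "Prefrence") from silently creating a ninth type.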

6 Graph Edge Types

| Relation | Search Multiplier | Purpose |
|---|---|---|
| Updates | 1.5x | This memory supersedes another |
| CausedBy | 1.3x | Causal relationship |
| ResultOf | 1.3x | Outcome of another memory |
| RelatedTo | 1.0x | General association |
| PartOf | 0.8x | Component relationship |
| Contradicts | 0.5x | Conflicting information |

SQLite Schema

-- Memories table
CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    memory_type TEXT NOT NULL,
    importance REAL,  -- 0.0 to 1.0
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    last_accessed_at TIMESTAMP,
    access_count INTEGER,
    source TEXT,
    channel_id TEXT,
    forgotten INTEGER  -- soft-delete
);

-- Graph edges (associations)
CREATE TABLE associations (
    id TEXT PRIMARY KEY,
    source_id TEXT REFERENCES memories(id),
    target_id TEXT REFERENCES memories(id),
    relation_type TEXT,
    weight REAL,  -- 0.0 to 1.0
    created_at TIMESTAMP,
    UNIQUE(source_id, target_id, relation_type)
);
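
To show how this schema might be used from Python, here is a minimal sketch with the standard-library `sqlite3` module. The helper names `save_memory` and `relate` are hypothetical, and using UUID strings for ids and ISO-8601 strings for timestamps are assumptions, not details confirmed from Spacebot.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Same columns as the schema above; IF NOT EXISTS makes setup repeatable.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    memory_type TEXT NOT NULL,
    importance REAL,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    last_accessed_at TIMESTAMP,
    access_count INTEGER,
    source TEXT,
    channel_id TEXT,
    forgotten INTEGER
);
CREATE TABLE IF NOT EXISTS associations (
    id TEXT PRIMARY KEY,
    source_id TEXT REFERENCES memories(id),
    target_id TEXT REFERENCES memories(id),
    relation_type TEXT,
    weight REAL,
    created_at TIMESTAMP,
    UNIQUE(source_id, target_id, relation_type)
);
"""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def save_memory(conn, content, memory_type, importance):
    """Insert one memory row and return its generated id."""
    memory_id = str(uuid.uuid4())
    now = _now()
    conn.execute(
        "INSERT INTO memories (id, content, memory_type, importance, "
        "created_at, updated_at, last_accessed_at, access_count, forgotten) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, 0, 0)",
        (memory_id, content, memory_type, importance, now, now, now),
    )
    return memory_id


def relate(conn, source_id, target_id, relation_type, weight=1.0):
    """Add a graph edge; the UNIQUE constraint makes repeats a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO associations "
        "(id, source_id, target_id, relation_type, weight, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (str(uuid.uuid4()), source_id, target_id, relation_type, weight, _now()),
    )
```

A replace action could call `save_memory` for the new text and then `relate(conn, new_id, old_id, "Updates")`, which keeps the superseded memory available for audit.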

Hybrid Search (Reciprocal Rank Fusion)

Three search signals merged via RRF (k=60):

  1. Full-text search — LanceDB FTS with keyword matching
  2. Vector similarity — all-MiniLM-L6-v2 embeddings (384 dims), HNSW index, cosine distance
  3. Graph traversal — BFS from high-importance seeds (>0.8) that keyword-match the query, following edges with typed multipliers
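
Reciprocal Rank Fusion itself is small: each signal contributes 1 / (k + rank) for every result it returns, and the contributions are summed per memory. The sketch below uses k=60 as stated above; the function name and the tie-break on id are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists into one list, best first.

    rankings: iterable of lists, each ordered best-to-worst by one signal
    (for example FTS, vector similarity, graph traversal).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id so output is deterministic.
    return sorted(scores, key=lambda memory_id: (-scores[memory_id], memory_id))


fts_hits = ["a", "b", "c"]
vector_hits = ["b", "a", "d"]
graph_hits = ["b", "c"]
merged = reciprocal_rank_fusion([fts_hits, vector_hits, graph_hits])
```

Because RRF only looks at ranks, the three signals do not need comparable score scales, which is why it suits mixing keyword, vector, and graph results.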

Memory Maintenance

  • Importance decay: importance *= age_decay * access_boost (Identity type exempt)
  • Pruning: Delete memories below 0.1 importance after 30 days of staleness
  • Association building: Periodic (every 5 min) similarity check; creates RelatedTo edges at 0.85 threshold, Updates edges at 0.95 threshold
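
The three maintenance rules can be sketched as pure functions. The thresholds (0.1, 30 days, 0.85, 0.95) and the Identity exemption come from the list above; the concrete formulas for `age_decay` (a half-life curve) and `access_boost` (a logarithmic bump) are assumptions, since the source only names the two factors.

```python
import math


def decayed_importance(importance, memory_type, days_since_access,
                       access_count, half_life_days=30.0):
    """Apply importance *= age_decay * access_boost, clamped to [0, 1]."""
    if memory_type.lower() == "identity":
        return importance  # Identity memories are exempt from decay.
    # Assumed formulas: halve every `half_life_days`, boost by log of accesses.
    age_decay = 0.5 ** (days_since_access / half_life_days)
    access_boost = 1.0 + 0.1 * math.log1p(access_count)
    return max(0.0, min(1.0, importance * age_decay * access_boost))


def should_prune(importance, days_stale, threshold=0.1, min_stale_days=30):
    """Prune only memories that are both unimportant and stale."""
    return importance < threshold and days_stale >= min_stale_days


def association_for(similarity):
    """Map an embedding similarity to an edge type, or None for no edge."""
    if similarity >= 0.95:
        return "Updates"
    if similarity >= 0.85:
        return "RelatedTo"
    return None
```

Keeping these as side-effect-free functions makes the decay rate easy to tune and test before wiring it into a scheduled job.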

Memory Bulletin (Cortex)

Every hour (configurable), the Cortex generates a bulletin:

  1. Programmatically gathers memories across 8 sections (identity, recent, decisions, important, preferences, goals, events, observations)
  2. Sends to LLM for synthesis into a concise briefing (max 1500 words)
  3. Result is injected into every conversation's system prompt via ArcSwap
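
Step 1, the programmatic gathering, might look like the sketch below. How Spacebot actually assigns memories to its eight sections is not shown in this issue, so the mapping here (six sections by type, plus "recent" by creation time and "important" by score) is an assumption, as are the function names and the dict-based memory records.

```python
# Assumed mapping from memory type to bulletin section.
SECTION_FOR_TYPE = {
    "identity": "identity",
    "decision": "decisions",
    "preference": "preferences",
    "goal": "goals",
    "event": "events",
    "observation": "observations",
}
SECTIONS = ("identity", "recent", "decisions", "important",
            "preferences", "goals", "events", "observations")


def gather_sections(memories, per_section=5):
    """Group live memories into the eight bulletin sections."""
    live = [m for m in memories if not m.get("forgotten")]
    by_importance = sorted(live, key=lambda m: m["importance"], reverse=True)
    by_recency = sorted(live, key=lambda m: m["created_at"], reverse=True)

    sections = {name: [] for name in SECTIONS}
    for memory in by_importance:
        name = SECTION_FOR_TYPE.get(memory["memory_type"])
        if name is not None and len(sections[name]) < per_section:
            sections[name].append(memory["content"])
    sections["recent"] = [m["content"] for m in by_recency[:per_section]]
    sections["important"] = [m["content"] for m in by_importance[:per_section]]
    return sections


def build_synthesis_prompt(sections, max_words=1500):
    """Render the gathered sections as the text sent to the LLM."""
    lines = [f"Synthesize a briefing of at most {max_words} words from these memories."]
    for name in SECTIONS:
        if sections[name]:
            lines.append(f"\n## {name}")
            lines.extend(f"- {item}" for item in sections[name])
    return "\n".join(lines)
```

The LLM call and the prompt injection are left out on purpose; they depend on Hermes's own provider and prompt-builder code.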

Memory Extraction During Compaction

When the Compactor summarizes old context, it doesn't just create a text summary — it also calls memory_save to extract and persist facts, decisions, and preferences as typed memory nodes. This ensures information transitions from ephemeral context to permanent storage.
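
One way to get the same effect in Hermes is to have the summarization call return structured output and validate it before saving. The JSON shape below (`summary` plus a `memories` list of `type`/`content` objects) is purely an assumption for this sketch; neither Spacebot nor Hermes is known to use it.

```python
import json

VALID_TYPES = {"identity", "goal", "decision", "todo",
               "preference", "fact", "event", "observation"}


def parse_compaction_output(raw):
    """Split a compactor response into (summary, [(type, content), ...]).

    Entries with an unknown type or empty content are skipped, so a
    malformed item cannot reach the memory store.
    """
    data = json.loads(raw)
    summary = str(data.get("summary", ""))
    extracted = []
    for item in data.get("memories", []):
        memory_type = str(item.get("type", "fact")).strip().lower()
        content = str(item.get("content", "")).strip()
        if memory_type in VALID_TYPES and content:
            extracted.append((memory_type, content))
    return summary, extracted
```

Each returned pair would then be passed to the memory tool's save path, so compression and extraction happen in a single model call.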


Current State in Hermes Agent

What We Have

| Component | Implementation | Limitations |
|---|---|---|
| MEMORY.md | Flat text, §-delimited entries, 2200 char limit | No types, no importance, no relations |
| USER.md | Flat text, §-delimited entries, 1375 char limit | Fixed capacity, manual management |
| Session search | FTS5 over SQLite session transcripts | Keyword only, no vector/semantic |
| Context compression | Head+tail protection, middle summarization | Summarizes but doesn't extract memories |
| Memory injection | Snapshot at session start, injected into system prompt | Static per-session, no dynamic bulletin |
| Memory security | Injection/exfiltration scanning | Solid security layer |

Key Gaps

  • No memory types — All memories are undifferentiated text entries
  • No graph structure — No way to express "this decision was caused by this event" or "this fact contradicts this other fact"
  • No vector search — Only FTS5 keyword matching; no semantic similarity
  • No importance scoring — All memories equally weighted; no decay
  • No memory extraction during compaction: details dropped from the summarized middle of the context are lost instead of being saved as memories
  • No periodic bulletin — Memory is static per-session; no ambient awareness of evolving knowledge
  • Fixed capacity — Hard character limits force manual pruning instead of intelligent decay

Implementation Plan

Skill vs. Tool Classification

This should be a core codebase feature because:

  • It replaces/extends the existing memory tool (tools/memory_tool.py)
  • It requires SQLite schema changes (hermes_state.py or new DB)
  • It integrates with context compression (agent/context_compressor.py)
  • It modifies system prompt building (agent/prompt_builder.py)
  • It needs embedding infrastructure (new dependency for vector search)

What We'd Need

  1. SQLite schema for typed memories — memories + associations tables in state.db
  2. Memory tool upgrade — Add type, importance, and relation parameters
  3. Embedding infrastructure — Local embedding model (sentence-transformers, fastembed, or API-based)
  4. Hybrid search — FTS5 + vector similarity + graph traversal via RRF
  5. Memory maintenance — Importance decay, pruning, association building
  6. Compactor integration — Extract memories during context compression
  7. Migration — Import existing MEMORY.md/USER.md entries into the new system

Phased Rollout

Phase 1: Typed memories with importance in SQLite

  • Add memories table to state.db with type, importance, timestamps
  • Upgrade memory tool to accept type parameter (default: Fact)
  • Importance-based ordering when injecting into system prompt
  • Migrate existing MEMORY.md/USER.md entries on first run
  • Keep MEMORY.md/USER.md as human-readable exports (regenerated from DB)
  • Backward compatible: old add/replace/remove actions still work
  • Deliverable: Typed, importance-weighted memory with no capacity ceiling
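
The first-run migration could be as simple as splitting each file on its delimiter. This sketch assumes entries are separated by a bare "§" character and that every imported entry starts as a Fact at 0.6, matching the proposed default; the function names and the row dict layout are hypothetical.

```python
def split_entries(text, delimiter="§"):
    """Split a MEMORY.md / USER.md body into trimmed, non-empty entries."""
    return [entry.strip() for entry in text.split(delimiter) if entry.strip()]


def migrate_file(text, source, memory_type="fact", importance=0.6):
    """Turn one flat file into rows ready to insert into `memories`."""
    return [
        {
            "content": entry,
            "memory_type": memory_type,
            "importance": importance,
            "source": source,  # records which file the entry came from
        }
        for entry in split_entries(text)
    ]


legacy = "User likes tea\n§\nUses Arch Linux\n§\n"
rows = migrate_file(legacy, source="MEMORY.md")
```

Recording the originating file in `source` lets the export step regenerate MEMORY.md and USER.md separately from the same table.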

Phase 2: Graph edges and enhanced search

  • Add associations table for graph edges between memories
  • Memory tool gains relate action: connect two memories with a relation type
  • Agent auto-creates Updates edges when replacing a memory
  • FTS5 search enhanced with graph traversal (BFS from relevant seeds)
  • Session search tool also queries memory graph for context
  • Deliverable: Connected knowledge graph with relationship-aware search
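
Graph-enhanced search could start from FTS5 hits and walk outward, scaling scores by the relation multipliers from the research section. In this sketch, following edges only from source to target, multiplying by edge weight, and capping depth at 2 are all assumptions; `graph_expand` is a hypothetical name.

```python
from collections import deque

# Search multipliers per relation type, from the edge table in this issue.
EDGE_MULTIPLIER = {
    "Updates": 1.5,
    "CausedBy": 1.3,
    "ResultOf": 1.3,
    "RelatedTo": 1.0,
    "PartOf": 0.8,
    "Contradicts": 0.5,
}


def graph_expand(seeds, edges, max_depth=2):
    """Breadth-first expansion from seed memories.

    seeds: {memory_id: score} from the keyword search.
    edges: iterable of (source_id, target_id, relation_type, weight).
    Returns [(memory_id, score), ...] sorted best first.
    """
    adjacency = {}
    for source, target, relation, weight in edges:
        adjacency.setdefault(source, []).append((target, relation, weight))

    best = dict(seeds)
    queue = deque((memory_id, score, 0) for memory_id, score in seeds.items())
    while queue:
        node, score, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth cap also guarantees termination on cycles
        for target, relation, weight in adjacency.get(node, []):
            new_score = score * EDGE_MULTIPLIER[relation] * weight
            if new_score > best.get(target, 0.0):
                best[target] = new_score
                queue.append((target, new_score, depth + 1))
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


edges = [
    ("a", "b", "Updates", 1.0),
    ("a", "c", "Contradicts", 1.0),
    ("b", "d", "RelatedTo", 0.5),
]
ranked = graph_expand({"a": 0.9}, edges)
```

With these multipliers a memory reached through an Updates edge can outrank its seed, while a contradicting memory is still returned but pushed down.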

Phase 3: Vector search and hybrid retrieval

  • Add local embedding model (fastembed with all-MiniLM-L6-v2, or API-based)
  • Generate embeddings for all memory entries
  • Implement RRF-based hybrid search (FTS5 + vector + graph)
  • Memory retrieval during conversations: recall action for semantic search
  • Deliverable: Semantic memory search that finds related memories even without keyword overlap
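
The vector half of the recall action reduces to ranking stored embeddings by cosine similarity to the query embedding. The sketch below is dependency-free and uses tiny hand-written vectors; in practice the vectors would come from the chosen embedding model (384 dimensions for all-MiniLM-L6-v2), and a brute-force scan is an assumption that only holds for small memory stores.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vector, stored, top_k=3):
    """Return the ids of the top_k stored vectors most similar to the query.

    stored: {memory_id: embedding_vector}
    """
    scored = [
        (cosine_similarity(query_vector, vector), memory_id)
        for memory_id, vector in stored.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [memory_id for _, memory_id in scored[:top_k]]


stored = {"m1": [1.0, 0.0], "m2": [0.0, 1.0], "m3": [0.9, 0.1]}
nearest = recall([1.0, 0.0], stored, top_k=2)
```

The ranked ids from `recall` would be one of the three input lists to the RRF merge, alongside the FTS5 and graph results.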

Phase 4: Memory bulletin and compaction integration

  • Periodic memory bulletin generation (configurable interval, default 1hr)
  • Inject bulletin into system prompt alongside static memory
  • Context compressor gains memory extraction: during summarization, extract typed memories
  • Importance decay: periodic maintenance job (daily) that decays stale, unaccessed memories
  • Pruning: remove decayed memories below threshold after N days
  • Deliverable: Self-maintaining memory system with ambient awareness

Pros & Cons

Pros

  • Richer knowledge representation — Types and relations capture what flat text cannot
  • Scales better — Importance decay + pruning vs. fixed character limits
  • Semantic search — Find relevant memories without exact keyword matches
  • No information loss — Memory extraction during compaction preserves knowledge
  • Ambient awareness — Memory bulletin keeps all conversations contextually informed
  • Backward compatible — Phase 1 maintains the existing memory tool interface
  • Proven architecture: Spacebot already runs this design in roughly 2,000 lines of Rust, which suggests it is workable in practice

Cons / Risks

  • Complexity — Moving from 2 text files to SQLite + embeddings + graph is a major jump
  • Performance: Embedding generation and vector search add latency to memory writes and recalls; the overhead must stay small enough that it does not noticeably slow each turn
  • Storage — Vector embeddings require more disk space than text files
  • Migration risk — Must carefully migrate existing MEMORY.md/USER.md without data loss
  • Dependency — Local embedding model adds a new dependency (fastembed, sentence-transformers)
  • Over-engineering risk — The current simple system works well for most users; complexity must be justified
  • LLM integration — Agent must learn when to create which memory type and relation

Open Questions

  • Should we use local embeddings (fastembed, ~100MB model) or API-based (OpenAI, Voyage)? Local is private but needs disk space.
  • Should the graph be agent-managed (LLM creates edges) or automatic (similarity-based association building)?
  • How should the memory bulletin interact with the existing system prompt memory injection?
  • Should we support user-readable exports (regenerated MEMORY.md) or fully move to SQLite?
  • What's the right default importance decay rate? Spacebot uses age_decay * access_boost with Identity exempt.
  • Should memories be per-user (scoped to conversations with one user) or global?
