Skip to content

feat(memory): native structured memory — typed SQLite fact store with FTS5#3093

Open
Mibayy wants to merge 1 commit into
NousResearch:mainfrom
Mibayy:feat/structured-memory-native
Open

feat(memory): native structured memory — typed SQLite fact store with FTS5#3093
Mibayy wants to merge 1 commit into
NousResearch:mainfrom
Mibayy:feat/structured-memory-native

Conversation

@Mibayy

@Mibayy Mibayy commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

Closes #2692 (supersedes the MCP server prototype).

What changed from #2692

PR #2692 implemented this as a standalone MCP server (hermes-memory, published on PyPI). After reviewer feedback asking why MCP was necessary, the architecture was rethought: the MCP boundary adds subprocess + stdio transport overhead, requires the user to pip install hermes-memory and configure mcp: in config.yaml, and prevents the agent loop from calling memory_tick and injecting the gauge automatically.

The core logic is identical — same schema, same gauge tiers, same MEMORY_SPEC notation, same 52 tests, same abbreviation dictionary and compression map developed across #2692. Only the delivery changed.

What was dropped:

  • MCP subprocess and stdio transport
  • pip install hermes-memory requirement
  • mcp: config block
  • memory_tick as an explicit tool call (now automatic)
  • memory_status as an explicit tool call (now injected at prompt build time)

What was gained:

  • Zero user configuration beyond - structured_memory in enabled toolsets
  • memory_tick fires on every user message inside the agent loop, no turn consumed
  • Gauge + hot facts injected into system prompt at startup, no tool call consumed
  • Direct SQLite access to state.db — one DB for sessions, session search, and structured memory

Feature overview

A typed, searchable fact store using MEMORY_SPEC notation:

C[db.id]: UUID mndtry, nvr autoincrement    ← Constraint
D[auth]: JWT 7d refresh 6d                   ← Decision  
V[srv.prod]: api.example.com:3005            ← Value
?[deploy]: rolling or blue-green?            ← Unknown
✓[auth]: deployed to prod                    ← Done
~[db.id]: old autoincrement scheme           ← Obsolete

Facts live in state.db (sm_facts / sm_scopes / sm_sessions). A FTS5 virtual table gives sub-millisecond keyword search regardless of how many facts are stored.

Files

tools/structured_memory/
  constants.py   gauge thresholds, ABBREV_DICT, COMPRESS_MAP, TYPE_MAP, FACT_RE
  db.py          schema SQL, get_sm_connection(), sm_now()
  facts.py       write(), search(), get_hot(), purge(), parse_notation()
  gauge.py       read(), check_and_act(), merge/archive/push logic
  scopes.py      get_or_create(), tick(), touch(), close(), auto-cooling
  optimize.py    compress MEMORY.md/USER.md + migrate MEMORY_SPEC lines

tools/structured_memory_tool.py   7 registered tools + injection/tick helpers
toolsets.py                        new structured_memory toolset
model_tools.py                     module load entry
run_agent.py                       automatic tick hook + system prompt injection

The 7 tools

Tool Purpose
mcp_memory_write Store a typed fact (gauge check before every write)
mcp_memory_search FTS5 keyword search, default limit 5
mcp_memory_reflect Synthesize facts by topic, grouped by type
mcp_memory_export Dump all facts as MEMORY_SPEC notation
mcp_memory_purge Hard-delete superseded/archived facts
mcp_memory_optimize Compress flat-file memory + migrate to structured store
mcp_memory_gauge Return current pressure state

Automatic pressure management

gauge.check_and_act() fires before every write:

Threshold Action
≥70% Merge duplicate facts (same target + scope)
≥80% Warning in tool response
≥85% Archive facts from closed scopes to cold
≥95% Push oldest active facts to cold storage

Tests

52 tests ported from #2692's test suite, adapted for native imports and sm_* table names. All pass with isolated tmp_path fixtures.

tests/structured_memory/test_facts.py           9 tests
tests/structured_memory/test_gauge.py           4 tests
tests/structured_memory/test_scopes.py          6 tests
tests/structured_memory/test_status.py          5 tests
tests/structured_memory/test_reflect.py         4 tests
tests/structured_memory/test_export_archive.py  5 tests
tests/structured_memory/test_current_turn.py    4 tests
tests/structured_memory/test_optimize.py       15 tests
─────────────────────────────────────────────
Total                                          52 tests  ✓

Documentation

  • website/docs/user-guide/features/structured-memory.md — full feature doc (tools, notation, pressure tiers, scope lifecycle, comparison table with flat-file memory)
  • website/docs/user-guide/features/memory.md — cross-reference added
  • website/docs/user-guide/configuration.md — toolset config example

Note on semantic search

A reviewer on #2692 suggested combining FTS5 with semantic/vector search. For agent-written typed facts this would be marginally beneficial — the target field already acts as the semantic category (C[auth], D[auth], V[auth] all cluster under a single FTS5 query), and facts are short and written by a single author with consistent vocabulary. Tracked as a future enhancement for when the store grows to thousands of facts from multiple authors.

… FTS5

Closes NousResearch#2692 (supersedes the MCP server prototype).

Adds a typed, searchable fact store directly into hermes-agent with no
external process, no MCP transport, and zero user configuration beyond
enabling the toolset.

## Background

PR NousResearch#2692 implemented this feature as a standalone MCP server (hermes-memory
on PyPI). After review feedback, the MCP boundary was dropped in favour of
a tighter native integration: same core logic, same schema, same 52-test
suite — just without the subprocess overhead and configuration friction.

## What is structured memory

A SQLite-backed typed fact store using MEMORY_SPEC notation:

  C[db.id]: UUID mndtry, nvr autoincrement   ← Constraint
  D[auth]: JWT 7d refresh 6d                  ← Decision
  V[srv.prod]: api.example.com:3005           ← Value
  ?[deploy]: rolling or blue-green?           ← Unknown
  ✓[auth]: deployed to prod                   ← Done
  ~[db.id]: old autoincrement scheme          ← Obsolete

Facts are stored in state.db (sm_facts / sm_scopes / sm_sessions tables)
with a FTS5 virtual table for sub-millisecond keyword search.

## New files

tools/structured_memory/
  constants.py   — gauge thresholds, ABBREV_DICT, COMPRESS_MAP, TYPE_MAP, FACT_RE
  db.py          — schema SQL, get_sm_connection(), sm_now(); tables co-located in state.db
  facts.py       — write(), search(), get_hot(), purge(), parse_notation()
  gauge.py       — read(), check_and_act(), _merge_duplicates(), _archive_cold_scopes()
  scopes.py      — get_or_create(), tick(), touch(), close(), auto-cooling logic
  optimize.py    — compress MEMORY.md/USER.md + migrate MEMORY_SPEC lines to store

tools/structured_memory_tool.py
  7 tools registered in the structured_memory toolset:
    mcp_memory_write    — store a typed fact (gauge check before every write)
    mcp_memory_search   — FTS5 keyword search (default limit 5, max 20)
    mcp_memory_reflect  — synthesize facts by topic, grouped by type
    mcp_memory_export   — dump all facts as MEMORY_SPEC notation
    mcp_memory_purge    — hard-delete superseded/archived facts
    mcp_memory_optimize — compress flat-file memory + migrate to structured store
    mcp_memory_gauge    — return current pressure state

  Also exports:
    get_structured_memory_injection(session_id) — gauge + hot facts for system prompt
    tick_structured_memory(turn, message_text, session_id) — silent tick hook

## Wiring changes

run_agent.py
  - Automatic memory_tick on every user message (no tool-call turn consumed)
  - get_structured_memory_injection() called at system prompt build time
    (gauge + hot facts injected before session starts, zero tool calls)

toolsets.py
  - New structured_memory toolset with all 7 tools

model_tools.py
  - tools.structured_memory_tool added to the module load list

## Automatic pressure management

At each write, gauge.check_and_act() fires automatically:
  ≥70%  merge duplicate facts (same target + scope)
  ≥80%  warning in tool response
  ≥85%  archive facts from closed scopes to cold
  ≥95%  push oldest active facts to cold storage

## Tests

52 tests ported from hermes-memory test suite, adapted for native imports
and sm_* table names. All pass with isolated tmp_path fixtures.

  tests/structured_memory/test_facts.py           (9 tests)
  tests/structured_memory/test_gauge.py           (4 tests)
  tests/structured_memory/test_scopes.py          (6 tests)
  tests/structured_memory/test_status.py          (5 tests)
  tests/structured_memory/test_reflect.py         (4 tests)
  tests/structured_memory/test_export_archive.py  (5 tests)
  tests/structured_memory/test_current_turn.py    (4 tests)
  tests/structured_memory/test_optimize.py        (15 tests)

## Documentation

website/docs/user-guide/features/structured-memory.md  — full feature doc
website/docs/user-guide/features/memory.md             — cross-reference added
website/docs/user-guide/configuration.md               — toolset config example
Mibayy added a commit to Mibayy/hermes-agent that referenced this pull request Mar 27, 2026
Mibayy added a commit to Mibayy/hermes-agent that referenced this pull request Mar 27, 2026
Tranquil-Flow added a commit to Tranquil-Flow/hermes-agent that referenced this pull request Apr 4, 2026
…ore with FTS5)

Ported from Mibayy/hermes-agent:feat/structured-memory-native

Typed fact store using MEMORY_SPEC notation:
  C[db.id]: UUID mandatory    ← Constraint
  D[auth]: JWT 7d refresh 6d  ← Decision
  V[srv.prod]: api.example.com ← Value
  ?[deploy]: rolling?         ← Unknown
  ✓[auth]: deployed to prod   ← Done
  ~[db.id]: old scheme        ← Obsolete

7 tools: write, search, reflect, export, purge, optimize, gauge
Automatic pressure management (merge at 70%, warn 80%, archive 85%)
SQLite FTS5 for sub-ms keyword search
Scope lifecycle with auto-cooling

Coexists with cognitive_memory — both toolsets available.
52 tests included. No regression on existing benchmarks.
@ether-btc

Copy link
Copy Markdown
Contributor

Charon Code Review — PR #3093

Large PR (25 files, ~3100 diff lines). Reviewed all core modules: db.py, facts.py, scopes.py, gauge.py, optimize.py, structured_memory_tool.py, and run_agent.py integration points.


🔴 Critical — Tool activation guard silent failure

run_agent.py lines 2406 and 5650:
If the tool name is misspelled or changes, this silently does nothing. Memory injection stops working with no log, no warning. Fix: Add in the block so this is discoverable.


🔴 Critical — sm_gauge view DROP/CREATE on every connection

db.py get_sm_connection(): Runs on every call. Wasteful and can momentarily break concurrent queries. Fix: Use and only recreate when changes.


🟡 Warning — sm_gauge undercounts actual storage

The formula misses SQLite record overhead, 8 indexes, FTS5 shadow tables, and WAL log. Actual DB can be 3-5x the gauge estimate. is a lower bound.


🟡 Warning — tick() updates ALL sessions when session_id=None

scopes.py tick(): When called with from run_agent.py, every concurrent session gets its turn counter overwritten. Premature scope cooling across all users. Fix: Skip global update when no session_id.


🟡 Warning — FTS5 token escaping misses operators

(C++), (node:api), , are FTS5 operators that pass through unescaped. Fix: filter all FTS5 special chars before escaping.


🟡 Warning — FactTooLargeError not explicitly caught

In _handle_mcp_memory_write, add before the bare .


🟡 Warning — Duplicate regex lookbehind in _CLOSING_SIGNALS

appears twice. Remove one.


🟡 Warning — ABBREV_DICT from hermes_memory.core.db vs local constants.py

If installed package is older version without the re-export, import fails.


🟢 Nit — Two DB paths: state.db vs memory.db

structured_memory_tool uses . In-repo db.py uses . Document the split or unify.


✅ Solid

Schema (WAL, FKs, partial unique index, FTS5 triggers), threading (threading.local per-connection), gauge tiers (70/80/85/95/100 with merge/archive/cold/synthesis), scope cooling heuristics, FTS5 rank ordering, check_and_act re-reading gauge after each action, actionable user-facing errors.

Verdict: Request Changes — two criticals need fixes before merge.

@ether-btc ether-btc left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hermes Agent Code Review — PR #3093

Verdict: Request Changes

A large and well-scoped feature addition. Two correctness issues require fixes before merge.


🔴 Critical Issue 1 — Silent Tool Guard Bypass

Files: run_agent.py:24, run_agent.py:44

The structured memory integration silently swallows all exceptions with bare except Exception: pass:

if "mcp_memory_write" in self.valid_tool_names:
    try:
        from tools.structured_memory_tool import get_structured_memory_injection
        sm_block = get_structured_memory_injection(session_id=self._session_id)
        if sm_block:
            prompt_parts.append(sm_block)
    except Exception:   # ← silently swallows everything
        pass

If get_structured_memory_injection() raises — DB corruption, malformed data, schema mismatch — the user gets zero feedback. The system prompt is silently incomplete for the entire session. What happens if the DB is corrupted mid-session? The except Exception: pass means the gauge %, hot facts, and scope state are all silently dropped.

Additionally, "mcp_memory_write" in self.valid_tool_names is not a reliable gate — valid_tool_names is derived from _discover_tools() at startup. If structured_memory is temporarily unavailable, the toolset gate won't reflect it.

Suggested fix:

  1. Log failures: logger.warning("Structured memory injection failed: %s", exc) instead of bare pass
  2. Distinguish between "toolset not present" (continue silently) and "toolset present but failed" (log + continue). A ToolMissingError-style check vs catching runtime failures.

🔴 Critical Issue 2 — View DROP/CREATE Race on Connection Reuse

File: tools/structured_memory/db.py:1101

conn.executescript(f"""
DROP VIEW IF EXISTS sm_gauge;
CREATE VIEW sm_gauge AS
    SELECT ...;
""")

get_sm_connection() is called from multiple call sites in the same session (tick_structured_memory, get_structured_memory_injection, the tool handlers). If two threads/coroutines call get_sm_connection() simultaneously:

  1. Thread A drops sm_gauge (returns None for gauge queries)
  2. Thread B queries sm_gauge → SQLite error: "no such view"
  3. Thread B fails or gets wrong results

The connection has check_same_thread=False and is reused across turns. Between any two tick_structured_memory() calls, any other code path that calls get_sm_connection() will re-run this DROP VIEW script, potentially mid-session.

The sm_gauge view is purely for the gauge percentage in the system prompt — dropping and recreating it on every new connection is expensive and introduces a race window.

Suggested fix:

  • Only run the DROP VIEW IF EXISTS sm_gauge / CREATE VIEW block when the connection is first established (e.g., track a flag on the connection object), not on every get_sm_connection() call.
  • Alternatively, inline the gauge computation at query time instead of using a view.

⚠️ Minor — except ImportError: pass in Module Root

File: tools/structured_memory_tool.py:2093

except ImportError:
    pass

If litellm or another dependency is missing, the module imports successfully but all functions become no-ops. This is a deployment-time footgun — the user won't discover the missing dependency until a structured memory tool is actually called.


💡 Suggestions

  • Tests (tests/structured_memory/test_current_turn.py): The test fixture hardcodes /root/hermes-agent on sys.path. This will break in any non-root deployment. Consider using pytest.importorskip or detecting the project root dynamically.
  • Gauge view (tools/structured_memory/db.py:1100): The comment "CREATE VIEW IF NOT EXISTS is sticky — DROP + CREATE ensures it's current" is misleading. CREATE VIEW IF NOT EXISTS is not sticky — the DROP is there specifically to replace it. The comment should clarify why recreation is needed (MAX_ACTIVE_CHARS constant may change).

✅ Looks Good

  • Clean schema design with IF NOT EXISTS on all tables, indexes, triggers
  • sm_hot_facts view correctly joins sm_facts to sm_scopes with LEFT JOIN + NULL check
  • FTS5 trigger synchronization is correct (delete + insert on update, proper old/new row references)
  • Gauge pressure system is well-designed: four distinct pressure levels with escalating actions
  • Auto-cooling via scopes.tick() with SCOPE_COOL_TURNS constant is a good pattern

Reviewed by Hermes Agent

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have tool/memory Memory tool and memory providers labels May 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have tool/memory Memory tool and memory providers type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants