feat: add hermes-memory MCP integration (structured persistent memory) by Mibayy · Pull Request #2692 · NousResearch/hermes-agent

Mibayy · 2026-03-23T23:41:02Z

The problem, and why existing solutions miss it

Every LLM agent forgets. After ~30 turns, context compression kicks in and silently removes older messages. A constraint decided at turn 5 ("always use UUID, never autoincrement") vanishes by turn 50. The agent contradicts itself, re-asks questions, and the user has to repeat things.

The current memory tool (MEMORY.md) helps for user preferences, but it is unstructured free-text with no search, no lifecycle, and no pressure management. It cannot answer "what did we decide about auth?" without injecting everything.

Existing external solutions (Mem0, Zep, MemGPT) add cloud infra, embedding models, and vector stores. They treat memory as a retrieval problem. But the real problem is what to remember, when to forget, and how to keep the context small.

Why MCP, and why not a core integration

The short answer: MCP is the right boundary for this feature. But if the team prefers a native integration, the code is ready for it.

Why MCP makes sense here:

hermes-memory touches three concerns that are hard to integrate cleanly into a monolith: persistent storage, session lifecycle, and pressure-based memory management. As an MCP server it:

ships independently, versioned separately
works with any MCP-compatible agent, not just Hermes
can be replaced or extended without touching the agent core
keeps the agent codebase clean

What hermes-memory provides

A structured fact store with 8 MCP tools, typed notation, FTS5 search, scoped lifecycle, and automatic gauge-based pressure management:

Typed facts: C[target] (constraints), D[target] (decisions), V[target] (values), ?[target] (unknowns), ✓[target] (resolved), ~[target] (obsolete)
Scope lifecycle: auto-cooling after 6 turns of silence, topic shift detection
Gauge pressure: automatic dedup at 70%, archival at 85%, synthesis at 95%
Zero infra: SQLite + FTS5, no cloud, no embedding model, no API keys
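A minimal sketch of how the typed notation above could be parsed, assuming a `PREFIX[target]: text` line format. `FACT_TYPES`, `LINE_RE`, `Fact`, and `parse_fact` are hypothetical names for illustration and are not taken from the hermes-memory package.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Prefix -> fact type, following the list of typed facts above.
FACT_TYPES = {
    "C": "constraint",
    "D": "decision",
    "V": "value",
    "?": "unknown",
    "✓": "resolved",
    "~": "obsolete",
}

# One prefix character, a bracketed target, an optional colon, then free text.
LINE_RE = re.compile(r"^\s*([CDV?✓~])\[([^\]]+)\]\s*:?\s*(.*)$")


@dataclass(frozen=True)
class Fact:
    kind: str
    target: str
    text: str


def parse_fact(line: str) -> Fact | None:
    """Return a Fact for a typed-notation line, or None for free text."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    prefix, target, text = match.groups()
    return Fact(FACT_TYPES[prefix], target.strip(), text.strip())
```

Lines that do not match the notation return `None`, so free-form entries can stay in the flat file while typed ones go to the store.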

The 8 MCP tools

| Tool | Description |
| --- | --- |
| `memory_write` | Store a typed fact in MEMORY_SPEC notation |
| `memory_search` | FTS5 search across hot + cold facts |
| `memory_tick` | Advance turn counter, trigger scope cooling |
| `memory_status` | Session injection payload (gauge + hot facts) |
| `memory_reflect` | On-demand synthesis grouped by fact type |
| `memory_export` | Dump all facts as plain notation |
| `memory_purge` | Hard-delete superseded / archived facts |
| `memory_optimize` | [v0.3.0] Compress MEMORY.md/USER.md + migrate facts to DB |
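To illustrate the kind of lookup `memory_search` relies on, here is a small sketch of keyword and prefix search with SQLite FTS5. The table name `facts_fts`, its columns, and the sample rows are assumptions made for this example, not the package's actual schema. It needs a Python build whose SQLite includes FTS5.

```python
from __future__ import annotations

import sqlite3

# In-memory database with a hypothetical FTS5 table (not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE facts_fts USING fts5(kind, target, body)")
conn.executemany(
    "INSERT INTO facts_fts (kind, target, body) VALUES (?, ?, ?)",
    [
        ("C", "db.id", "always use UUID, never autoincrement"),
        ("D", "auth", "JWT with refresh tokens"),
        ("V", "srv.port", "service listens on 3005"),
    ],
)


def search(query: str, limit: int = 5) -> list[tuple[str, str, str]]:
    """Full-text search over all columns, best match first."""
    rows = conn.execute(
        "SELECT kind, target, body FROM facts_fts "
        "WHERE facts_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

A bare term such as `search("auth")` matches whole tokens, while a trailing `*` as in `search("auto*")` gives the prefix matching mentioned in the comparison with MEMORY.md.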

memory_optimize — automatic MEMORY.md pressure relief (v0.3.0)

MEMORY.md is injected into every turn — every byte costs tokens on every call. memory_optimize acts as a relief valve:

Reads MEMORY.md and USER.md from ~/.hermes/memories/
Scans for C[...] / D[...] / V[...] lines → migrates them to the hermes-memory DB, removes them from the flat file
Applies abbreviation compression on remaining entries (~40-60% reduction)
No-op if both files are below threshold (default 55%) — safe to call on a schedule

# output when action taken:
optimized:
  MEMORY: 85.0% → 38.2%  (3 facts migrated)
  USER:   62.0% → 41.5%
  3 fact(s) moved to hermes-memory DB

# output when healthy:
no action needed
MEMORY: 36.0%  USER: 37.0%
(both below 55% threshold)
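The threshold check and migration step can be sketched roughly as follows. This assumes the percentage is the file length divided by a character limit, which the PR does not spell out. `usage_pct`, `optimize`, and `char_limit` are hypothetical names, and abbreviation compression is left out.

```python
from __future__ import annotations

import re

# Lines starting with C[...], D[...] or V[...] are treated as typed facts.
TYPED_LINE = re.compile(r"^\s*[CDV]\[[^\]]+\]")
THRESHOLD_PCT = 55.0  # default threshold quoted above


def usage_pct(text: str, char_limit: int) -> float:
    """Assumed gauge: share of the character budget already used."""
    return 100.0 * len(text) / char_limit


def optimize(text: str, char_limit: int) -> tuple[str, list[str]]:
    """Return (remaining_text, migrated_lines); no-op below the threshold."""
    if usage_pct(text, char_limit) < THRESHOLD_PCT:
        return text, []
    kept: list[str] = []
    migrated: list[str] = []
    for line in text.splitlines():
        (migrated if TYPED_LINE.match(line) else kept).append(line)
    return "\n".join(kept), migrated
```

Because the function returns its input unchanged when the file is under the threshold, calling it repeatedly on a schedule does no harm.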

Relationship to existing memory tool

Complementary, not a replacement.

| | memory tool (MEMORY.md) | hermes-memory |
| --- | --- | --- |
| Storage | flat text file | SQLite + FTS5 |
| Search | substring match | full-text search with prefix matching |
| Structure | free-form entries | typed notation (C/D/V/?/✓/~) |
| Scoping | none | auto-scoped lifecycle with 3 closing triggers |
| Pressure | manual char limit | automatic gauge (merge → archive → synthesis) |
| Overflow relief | none | `memory_optimize` migrates + compresses automatically |
| Best for | user prefs, env facts | project constraints, decisions, values |

Both run simultaneously. The memory tool handles "who is the user". hermes-memory handles "what did we decide".

Changes in this PR

docs/hermes-memory-integration.md: full integration guide including MEMORY.md pressure management strategy and cron automation pattern
Updated MCP usage guide with hermes-memory section and config example

Installation

pip install hermes-memory  # v0.3.0

mcp_servers:
  hermes-memory:
    command: "hermes-memory"

Set HERMES_MEMORY_DB to override default storage path (~/.hermes/memory.db).
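One plausible way to resolve the storage path, shown as a sketch only. `resolve_db_path` is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

DEFAULT_DB = "~/.hermes/memory.db"  # default path quoted above


def resolve_db_path(env: Mapping[str, str] | None = None) -> Path:
    """Prefer HERMES_MEMORY_DB when set and non-empty, else the default."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_MEMORY_DB") or DEFAULT_DB
    return Path(raw).expanduser()
```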

Technical details

52 tests, all passing
Full code review by Claude Code (18 findings, 11 fixed, 7 accepted for v0.1)
PyPI: https://pypi.org/project/hermes-memory/0.3.0/
Repo: https://github.com/Mibayy/hermes-memory
Related: Add reply_in_thread config option for Slack platform #2662

hermes-memory is a structured memory layer for LLM agents that survives context compression. Ships as a PyPI package (pip install hermes-memory) exposing 7 MCP tools.

Changes:
- Add docs/hermes-memory-integration.md with full integration guide
- Add hermes-memory section to the MCP usage guide
- Document relationship to existing memory tool (complementary)

PyPI: https://pypi.org/project/hermes-memory/
Related: NousResearch#2662

Mibayy · 2026-03-24T09:25:02Z

fix: memory_tick type validation (0.1.2)

Bugfix pushed to PyPI as 0.1.2.

Root cause: memory_tick(turn) was rejecting valid turn values with the error '1' is not of type 'integer'. The MCP JSON Schema validator runs before the handler, and some client contexts serialize turn as a JSON string rather than a number. The handler already coerced the value with int(), but schema validation rejected the call before the handler ran.

Fix: Changed turn schema type from "integer" to ["integer", "string"]. Coercion in the handler stays, so behavior is unchanged for well-typed clients.
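The effect of widening the schema type can be illustrated with a hand-rolled type check. This is a simplified stand-in for a JSON Schema validator, not the MCP SDK's own validation, and `matches_type` and `handle_tick` are hypothetical names.

```python
from __future__ import annotations

# JSON Schema type name -> Python check (only the two types needed here).
# bool is excluded because it is a subclass of int in Python.
_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def matches_type(value: object, schema_type: str | list[str]) -> bool:
    """True if value satisfies one of the allowed schema types."""
    names = [schema_type] if isinstance(schema_type, str) else schema_type
    return any(_CHECKS[name](value) for name in names)


def handle_tick(arguments: dict, schema_type: str | list[str]) -> int:
    """Validate first (as the validator does), then coerce in the handler."""
    turn = arguments["turn"]
    if not matches_type(turn, schema_type):
        raise TypeError(f"{turn!r} is not of type {schema_type!r}")
    return int(turn)
```

With `"integer"` a string-encoded turn is rejected before coercion can happen. With `["integer", "string"]` both forms reach `int()` and yield the same result.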

pip install --upgrade hermes-memory  # 0.1.2

Mibayy · 2026-03-24T12:03:35Z

Live test audit — 2026-03-24

Running ongoing validation against a live instance (1-2 days). Updated as we go.

hermes-memory test suite

v0.1.3 — 42/42 passing, no regressions.

tests/test_current_turn.py     4/4   scope lifecycle, silence cooling, dedup
tests/test_export_archive.py   5/5   notation export, cold/archived, atomic close
tests/test_facts.py            9/9   write, dedup, contradiction, truncation, MemoryFullError
tests/test_gauge.py            4/4   threshold detection, cold push
tests/test_reflect.py          4/4   grouping, cold facts, empty LLM response guard
tests/test_scopes.py           6/6   get/create, lifecycle, tick, silence cooling
tests/test_status.py           5/5   type display, notation symbols, hot/cold filter

Live session audit — functional pass (in-process)

Every tool validated against the running MCP instance:

memory_status() — gauge %, hot facts, notation block correct
memory_write() — hash + gauge returned on each write
contradiction/supersession — rewriting C[audit.test] superseded previous entry (a89d0710 -> 5115b4fd); superseded fact excluded from search
memory_search() — FTS retrieves correct fact immediately
memory_reflect() — groups by type, correct output
memory_export() — clean MEMORY_SPEC notation, superseded facts excluded
memory_purge() — removes facts cleanly
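The supersession behaviour observed in the audit can be sketched with a small in-memory store. Deriving the 8-character hash from a truncated SHA-256 is an assumption, and `FactStore` is a hypothetical class rather than the package's implementation.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class FactStore:
    # (kind, target) -> hash of the currently active fact
    active: dict[tuple[str, str], str] = field(default_factory=dict)
    # hash -> fact body
    bodies: dict[str, str] = field(default_factory=dict)
    superseded: set[str] = field(default_factory=set)

    def write(self, kind: str, target: str, body: str) -> str:
        """Store a fact; an older fact on the same kind+target is superseded."""
        raw = f"{kind}[{target}]: {body}".encode()
        digest = hashlib.sha256(raw).hexdigest()[:8]
        key = (kind, target)
        previous = self.active.get(key)
        if previous is not None and previous != digest:
            self.superseded.add(previous)
        self.superseded.discard(digest)  # a rewritten fact becomes active again
        self.active[key] = digest
        self.bodies[digest] = body
        return digest

    def search(self, term: str) -> list[str]:
        """Substring search over active facts only."""
        needle = term.lower()
        return [h for h in self.active.values() if needle in self.bodies[h].lower()]
```

Since `search` only walks `active`, a superseded entry never appears in results, which matches the audit observation above.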

Gauge pressure relief — all 3 mechanisms verified

| Tier | Condition | Result |
| --- | --- | --- |
| MERGE (70%) | 5 active duplicates on same target+scope | 4 superseded, 1 kept: PASS |
| ARCHIVE (85%) | 10 cold facts in closed scope, >24h old | 10 archived, grace window respected: PASS |
| PUSH TO COLD (95%, no LLM) | 92 active facts at 95.6% | 12 oldest pushed to cold, gauge 95.6% -> 83.2%: PASS |

Note: actions: [] when there is nothing to merge/archive is correct behavior, not a bug.
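A sketch of how the three tiers could be selected from the gauge value. It assumes the tiers are cumulative, so a higher gauge also triggers the lower tiers. `TIERS` and `pressure_actions` are hypothetical names.

```python
from __future__ import annotations

# (threshold in percent, action), in ascending order, from the table above.
TIERS = [
    (70.0, "merge"),
    (85.0, "archive"),
    (95.0, "push_to_cold"),
]


def pressure_actions(gauge_pct: float) -> list[str]:
    """Return the relief tiers eligible to run at this gauge level."""
    return [action for threshold, action in TIERS if gauge_pct >= threshold]
```

This only decides which tiers are eligible. Whether a tier has anything to do is a separate question, which is why an empty `actions` list can still be correct.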

Bug found and fixed during audit

memory_search(limit=N) and memory_reflect(limit=N) raised '10' is not of type 'integer' — same root cause as the memory_tick fix in 0.1.2. Schema type widened to ["integer", "string"] on both params. Released as 0.1.3 — https://pypi.org/project/hermes-memory/0.1.3/

Documentation

CHANGELOG.md added covering 0.1.1 -> 0.1.2 -> 0.1.3 with root cause notes
Inline comments in run_agent.py explain the [AGENT INSTRUCTION] injection pattern and why it exists (see lines 5103-5105)

Still pending

1-2 days live validation (automated healthcheck every 12h)
Request review once validation period is clean

Mibayy · 2026-03-24T12:46:25Z

Relationship to Honcho

Came up in discussion so worth documenting here.

hermes-memory and Honcho solve adjacent problems with different philosophies.

Honcho reasons over conversation history automatically — it infers conclusions, patterns, and user models without the agent explicitly deciding what to store. Powerful for cross-session personalization, costs $0.001-$0.50 per query, requires a hosted API.

hermes-memory is explicit: the agent decides what matters, stores it in typed notation, zero external calls, zero cost, works offline. The value proposition depends on staying zero-infra — introducing Honcho as a dependency would break that.

They're complementary rather than competing. In the 5-type taxonomy:

hermes-memory covers semantic memory (structured facts, decisions, constraints)
Honcho covers episodic + semantic via automated inference (user modeling, conversation patterns)

Both can run simultaneously without knowing about each other. Gaël is already working on the Honcho side in #2150 (startup context cache for recallMode=tools). That's the right place for Honcho integration — hermes-agent level, not hermes-memory.

Mibayy · 2026-03-24T13:01:59Z

Real-world integration test results

Ran 4 end-to-end tests in a live CLI session after the hermes-memory MCP server was installed and configured. No context given to Hermes between tests — cold session each time.

| Query | Expected behavior | Result |
| --- | --- | --- |
| "what's the port for [project X]?" | `memory_search("project port")` → find stored V[] fact | ✔ Found correct port immediately |
| "how do I build [project Y]?" | `memory_search` → no result → fallback to skill | ✔ Correctly fell back to `skill_view()` (procedural knowledge lives in skills, not memory, by design) |
| "what's the API token for [service Z]?" | `memory_search("service token")` → find stored V[] fact | ✔ Found correct token immediately (value redacted) |
| "how do I open a GitHub PR?" | load `github-pr-workflow` skill | ✔ Skill loaded correctly (minor: skipped memory_search before loading skill; acceptable) |

Observations

hermes-memory correctly stores and retrieves structured values (ports, tokens, API keys, IDs)
The skill/memory split works as intended: volatile values → memory_write(), reusable procedures → skills
FTS5 retrieval is fast and accurate on short keyword queries
One minor behavioral note: for the GitHub PR question, Hermes jumped straight to the skill without calling memory_search first. Not a bug, but a behavior worth tracking — ideally memory_search should always be called first per the system prompt instructions.

Overall: integration is solid. The memory layer behaves correctly in a real multi-session context.

Mibayy · 2026-03-24T16:53:19Z

hermes-memory 0.2.0 — plugin integration following hermes-agent v0.4.0

Took advantage of the new plugin system in v0.4.0 to extend hermes-memory. Published on PyPI: https://pypi.org/project/hermes-memory/0.2.0/

What was added

Plugin ~/.hermes/plugins/hermes-memory/

Native integration with the v0.4.0 lifecycle:

on_session_end hook — auto-cools all active scopes when a session closes. Previously, facts would stay hot indefinitely if the session ended without an explicit memory_tick call. The hook fixes this cleanly.
on_session_start hook — foundation for a future warm-cache.
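The cooling behaviour behind these hooks can be sketched as below. The 6-turn silence window comes from the PR description, but treating exactly 6 silent turns as enough to cool is an assumption. `Scope`, `tick`, and `cool_all_scopes` are hypothetical names and do not reflect the plugin API.

```python
from __future__ import annotations

from dataclasses import dataclass

SILENCE_TURNS = 6  # auto-cooling window from the PR description


@dataclass
class Scope:
    name: str
    last_touched: int  # turn number of the last write or mention
    hot: bool = True


def tick(scopes: list[Scope], turn: int) -> list[str]:
    """Cool scopes that have been silent for SILENCE_TURNS or more turns."""
    cooled = []
    for scope in scopes:
        if scope.hot and turn - scope.last_touched >= SILENCE_TURNS:
            scope.hot = False
            cooled.append(scope.name)
    return cooled


def cool_all_scopes(scopes: list[Scope]) -> list[str]:
    """What a session-end hook might do: cool every scope still hot."""
    cooled = [scope.name for scope in scopes if scope.hot]
    for scope in scopes:
        scope.hot = False
    return cooled
```

Without the session-end step, a scope touched shortly before the session closed would never reach the silence window and would stay hot.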

Slash commands via /memory (alias /mem):

/memory status — gauge %, active/cold fact counts, active scopes
/memory search <query> — FTS5 search across all facts
/memory reflect <topic> — grouped synthesis by fact type (C/D/V/✓/?)
/memory purge — hard-delete cold/superseded facts

Implemented register_command() in hermes-agent

PluginContext.register_command() and get_plugin_command_handler() were documented in the v0.4.0 plugin guide but never implemented (test_plugins.py line 370 confirms this). Both are now functional — handlers stored in PluginManager._plugin_commands, dispatched by cli.py and gateway/run.py. Any future plugin wanting to register slash commands benefits from this.

${ENV_VAR} substitution in config.yaml

HERMES_MEMORY_DB migrated to ${HERMES_MEMORY_DB} in config.yaml, resolved from the environment. Portable across VPS installs without manual edits.
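A minimal sketch of `${ENV_VAR}` substitution on a config value. Leaving unset variables untouched is an assumption about the desired behaviour, and `expand_env` is a hypothetical helper, not the hermes-agent implementation.

```python
from __future__ import annotations

import os
import re
from collections.abc import Mapping

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Mapping[str, str] | None = None) -> str:
    """Replace ${NAME} with its value; unknown names are left as written."""
    env = os.environ if env is None else env
    return _VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)
```

Passing an explicit mapping keeps the function easy to test, and omitting it falls back to the process environment.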

MCP standalone toolset

hermes-memory now appears as mcp-hermes-memory in hermes tools, togglable per platform — via v0.4.0 feature #1907, no extra code required.

- Compression guidelines: abbreviate first (40-60% reduction), then migrate structured facts to hermes-memory DB, then remove duplicates
- Before/after example showing C/D/V migration pattern
- Automated pressure relief via cron job (2x/day, threshold-based)
- Establishes hermes-memory as a relief valve for MEMORY.md overflow

Mibayy · 2026-03-25T13:35:43Z

Update: MEMORY.md pressure management section

Added a new section "Managing MEMORY.md pressure" to docs/hermes-memory-integration.md (commit c8afb44).

Why this matters

MEMORY.md is injected into every single turn — every byte in there costs tokens on every call. As projects grow, it fills up fast. Until now the doc explained what hermes-memory is, but not how to use it as a relief valve for MEMORY.md overflow.

This came out of real usage: after a long session, MEMORY.md hit 85% capacity. The fix wasn't obvious — abbreviate first, then migrate structured facts to the DB, then remove duplicates. That workflow deserves to be documented.

What was added

Three rules in priority order:

Abbreviate first — 40-60% reduction with standard shorthands before touching anything else
Migrate structured facts — any C/D/V entry not needed every turn belongs in hermes-memory, not MEMORY.md. Includes a before/after example showing the pattern.
Remove duplicates — facts already in the DB don't need to live in MEMORY.md too

Plus a cron automation pattern: run 2x/day, check thresholds (55%), compress + migrate if needed, do nothing if already under. Keeps injection cost stable across long-running projects without manual intervention.
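The "abbreviate first" rule can be illustrated with a word-level shorthand table. The entries `mndtry` and `nvr` appear in notation examples elsewhere in this thread. The other entries, the `SHORTHANDS` table, and `abbreviate` are made up for illustration and are not the package's abbreviation dictionary.

```python
from __future__ import annotations

import re

# Illustrative shorthand table (lowercase word -> abbreviation).
SHORTHANDS = {
    "mandatory": "mndtry",
    "never": "nvr",
    "configuration": "cfg",   # made-up entry
    "production": "prod",     # made-up entry
}

_WORD = re.compile(r"[A-Za-z]+")


def abbreviate(text: str) -> str:
    """Replace known words with their shorthand; leave everything else alone."""
    return _WORD.sub(lambda m: SHORTHANDS.get(m.group(0).lower(), m.group(0)), text)
```

For example, `abbreviate("UUID mandatory, never autoincrement")` yields `"UUID mndtry, nvr autoincrement"`. How much this saves on real files depends on how many entries match.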

Also pushed to hermes-memory

The same section lives in the hermes-memory README (the standalone PyPI package) where it arguably belongs as the primary source of truth. The doc in hermes-agent serves as the integration guide perspective.

Mibayy · 2026-03-25T13:41:46Z

v0.3.0 — `memory_optimize` shipped to PyPI

Following up on the MEMORY.md pressure management section added to the docs, the feature is now fully implemented and published in the package itself.

What's new in v0.3.0

memory_optimize — 8th MCP tool. Anyone running pip install hermes-memory gets it automatically.

What it does:

Reads MEMORY.md and USER.md from ~/.hermes/memories/
Scans for any C[...], D[...], V[...], ?[...] lines → migrates them to the hermes-memory DB, removes them from the flat file
Applies abbreviation compression on remaining entries (FR/EN, ~40-60% reduction)
Only acts if either file exceeds the threshold (default 55%). If both are healthy, returns immediately with no changes.
dry_run=true for preview without touching files

Output example:

optimized:
  MEMORY: 85.0% → 38.2%  (3 facts migrated)
  USER:   62.0% → 41.5%
  3 fact(s) moved to hermes-memory DB

or if nothing to do:

no action needed
MEMORY: 36.0%  USER: 37.0%
(both below 55% threshold)

Safe to run on a schedule — the cron pattern from the docs works out of the box:

# config.yaml or cron prompt
memory_optimize()  # 2x/day, no-op if healthy

52 tests passing. PyPI: https://pypi.org/project/hermes-memory/0.3.0/

Mibayy · 2026-03-25T23:34:31Z

Ready for review

Validation period complete (initiated 2026-03-24, 1-2 days as noted above).

Status summary:

52/52 tests passing (up from 42 in 0.1.3 — coverage expanded for 0.2.0 plugin integration and 0.3.0 memory_optimize)
All 3 gauge pressure relief mechanisms validated on a live instance
End-to-end integration tests passed (ports, tokens, skills, cross-session recall)
CI: 3/3 passing (supply chain scan, docs, test suite)
Automated healthcheck running 2x/day since 2026-03-24, no regressions

What the PR adds:

docs/hermes-memory-integration.md — full MCP integration guide
website/docs/guides/use-mcp-with-hermes.md — hermes-memory section added

No code changes to hermes-agent core. hermes-memory remains an optional zero-dependency MCP server — two lines of YAML to configure, pip install hermes-memory to install.

Happy to address any feedback. @teknium1

… FTS5

Closes NousResearch#2692 (supersedes the MCP server prototype). Adds a typed, searchable fact store directly into hermes-agent with no external process, no MCP transport, and zero user configuration beyond enabling the toolset.

## Background

PR NousResearch#2692 implemented this feature as a standalone MCP server (hermes-memory on PyPI). After review feedback, the MCP boundary was dropped in favour of a tighter native integration: same core logic, same schema, same 52-test suite, just without the subprocess overhead and configuration friction.

## What is structured memory

A SQLite-backed typed fact store using MEMORY_SPEC notation:

    C[db.id]: UUID mndtry, nvr autoincrement    ← Constraint
    D[auth]: JWT 7d refresh 6d                  ← Decision
    V[srv.prod]: api.example.com:3005           ← Value
    ?[deploy]: rolling or blue-green?           ← Unknown
    ✓[auth]: deployed to prod                   ← Done
    ~[db.id]: old autoincrement scheme          ← Obsolete

Facts are stored in state.db (sm_facts / sm_scopes / sm_sessions tables) with a FTS5 virtual table for sub-millisecond keyword search.

## New files

tools/structured_memory/

- constants.py: gauge thresholds, ABBREV_DICT, COMPRESS_MAP, TYPE_MAP, FACT_RE
- db.py: schema SQL, get_sm_connection(), sm_now(); tables co-located in state.db
- facts.py: write(), search(), get_hot(), purge(), parse_notation()
- gauge.py: read(), check_and_act(), _merge_duplicates(), _archive_cold_scopes()
- scopes.py: get_or_create(), tick(), touch(), close(), auto-cooling logic
- optimize.py: compress MEMORY.md/USER.md + migrate MEMORY_SPEC lines to store

tools/structured_memory_tool.py

7 tools registered in the structured_memory toolset:

- mcp_memory_write: store a typed fact (gauge check before every write)
- mcp_memory_search: FTS5 keyword search (default limit 5, max 20)
- mcp_memory_reflect: synthesize facts by topic, grouped by type
- mcp_memory_export: dump all facts as MEMORY_SPEC notation
- mcp_memory_purge: hard-delete superseded/archived facts
- mcp_memory_optimize: compress flat-file memory + migrate to structured store
- mcp_memory_gauge: return current pressure state

Also exports:

- get_structured_memory_injection(session_id): gauge + hot facts for system prompt
- tick_structured_memory(turn, message_text, session_id): silent tick hook

## Wiring changes

run_agent.py

- Automatic memory_tick on every user message (no tool-call turn consumed)
- get_structured_memory_injection() called at system prompt build time (gauge + hot facts injected before session starts, zero tool calls)

toolsets.py

- New structured_memory toolset with all 7 tools

model_tools.py

- tools.structured_memory_tool added to the module load list

## Automatic pressure management

At each write, gauge.check_and_act() fires automatically:

- ≥70%: merge duplicate facts (same target + scope)
- ≥80%: warning in tool response
- ≥85%: archive facts from closed scopes to cold
- ≥95%: push oldest active facts to cold storage

## Tests

52 tests ported from hermes-memory test suite, adapted for native imports and sm_* table names. All pass with isolated tmp_path fixtures.

- tests/structured_memory/test_facts.py (9 tests)
- tests/structured_memory/test_gauge.py (4 tests)
- tests/structured_memory/test_scopes.py (6 tests)
- tests/structured_memory/test_status.py (5 tests)
- tests/structured_memory/test_reflect.py (4 tests)
- tests/structured_memory/test_export_archive.py (5 tests)
- tests/structured_memory/test_current_turn.py (4 tests)
- tests/structured_memory/test_optimize.py (15 tests)

## Documentation

- website/docs/user-guide/features/structured-memory.md: full feature doc
- website/docs/user-guide/features/memory.md: cross-reference added
- website/docs/user-guide/configuration.md: toolset config example

Mibayy mentioned this pull request Mar 24, 2026

fix(memory): surface write failures to user in gateway/IM sessions #2774

Open

Mibayy mentioned this pull request Mar 26, 2026

feat(memory): native structured memory — typed SQLite fact store with FTS5 #3093

Open

Mibayy closed this Mar 27, 2026

Mibayy wants to merge 2 commits into NousResearch:main from Mibayy:feat/hermes-memory-integration
