feat(lcm): Lossless Context Management — never lose a message again#4033
Closed
dusterbloom wants to merge 14 commits into
Conversation
b81c0d3 to
f3f0fb9
This was referenced Apr 7, 2026
…ory plugin
LCM core: dual-state architecture with immutable message store, summary DAG, and 7 agent tools (expand, pin, forget, search, focus, budget, toc). Three-level escalation, tool-call pair protection, pin cap, auto-recompact.
DAM plugin: Dense Associative Memory implementing exponential-capacity pattern-completion search (arxiv 2601.00984v2). Neural retrieval with compositional queries (AND/OR/NOT). Pure numpy, no GPU required.
Also fixes env var isolation in provider resolution tests and updates context pressure and token tracking tests for the LCM-based architecture.
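For readers skimming the timeline, here is a minimal sketch of how an append-only store and a summary DAG can make compaction reversible. It is an illustration under assumptions, not the PR's code: `AppendOnlyStore`, `SummaryNode`, and `expand` are hypothetical names, and the real `store.py` and `dag.py` may be structured differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredMessage:
    """One archived message; frozen so it cannot be mutated after ingest."""
    msg_id: int
    role: str
    content: str


class AppendOnlyStore:
    """Hypothetical stand-in for the PR's immutable message store."""

    def __init__(self) -> None:
        self._messages: list[StoredMessage] = []

    def append(self, role: str, content: str) -> int:
        msg_id = len(self._messages)
        self._messages.append(StoredMessage(msg_id, role, content))
        return msg_id

    def get(self, msg_id: int) -> StoredMessage:
        return self._messages[msg_id]


@dataclass
class SummaryNode:
    """A summary plus what it replaced: raw message ids and older summaries."""
    text: str
    message_ids: list[int] = field(default_factory=list)
    child_summaries: list["SummaryNode"] = field(default_factory=list)


def expand(node: SummaryNode, store: AppendOnlyStore) -> list[StoredMessage]:
    """Walk the DAG and recover every original message behind a summary."""
    ids: set[int] = set()  # a set, so shared children are not duplicated
    stack = [node]
    while stack:
        current = stack.pop()
        ids.update(current.message_ids)
        stack.extend(current.child_summaries)
    return [store.get(i) for i in sorted(ids)]


store = AppendOnlyStore()
ids = [store.append("user", f"message {n}") for n in range(4)]
first = SummaryNode("summary of 0-1", message_ids=ids[:2])
second = SummaryNode("summary of all", message_ids=ids[2:], child_summaries=[first])
originals = expand(second, store)
```

Because summaries only point at stored ids and never overwrite them, expanding the top-level summary returns all four original messages in order.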
…ing and session persistence
- Wire _call_summary_llm to auxiliary_client with structured prompts (Goal/Progress/Decisions/Files/Next Steps) and L1→L2→L3 escalation
- Fix multimodal token counting in TokenEstimator (was str()-ifying image URLs into token count)
- Add max_store_size config + ImmutableStore.prune() with GC for unreferenced entries, auto-prune on ingest
- Add lcm_metadata field to gateway SessionEntry for cross-session DAG persistence (backward-compatible)
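The L1→L2→L3 escalation can be pictured roughly as follows. This is a sketch under assumptions: the PR does not say what triggers escalation, so the code assumes a level is abandoned when the model call fails or the summary exceeds the budget. `summarize_with_escalation`, `rough_tokens`, and `fake_llm` are hypothetical names, and the whitespace token count is a crude placeholder for a real tokenizer.

```python
from typing import Callable, Optional

# Hypothetical signature: (text, level) -> summary, or None on failure.
Summarizer = Callable[[str, int], Optional[str]]


def rough_tokens(text: str) -> int:
    """Crude whitespace token estimate, used only for this sketch."""
    return len(text.split())


def deterministic_truncate(text: str, budget: int) -> str:
    """Level 3: no model call, so the result always fits the budget."""
    return " ".join(text.split()[:budget])


def summarize_with_escalation(text: str, budget: int, llm: Summarizer) -> tuple[int, str]:
    """Try L1 (preserve details), then L2 (bullet points), then L3."""
    for level in (1, 2):
        try:
            summary = llm(text, level)
        except Exception:
            summary = None  # assumption: a failed call escalates like an oversized one
        if summary is not None and rough_tokens(summary) <= budget:
            return level, summary
    return 3, deterministic_truncate(text, budget)


def fake_llm(text: str, level: int) -> Optional[str]:
    """Stand-in model: L1 keeps too much, L2 is terse."""
    words = text.split()
    if level == 1:
        return " ".join(words[:8])
    return "- " + " ".join(words[:3])


level, summary = summarize_with_escalation("a b c d e f g h i j", budget=5, llm=fake_llm)
```

The design point is that the last level never depends on a model, so compaction always terminates with something that fits.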
…ence, store pruning, DAM e2e
- 19 tests for _call_summary_llm wiring and L1→L2→L3 escalation
- 17 tests for tiktoken integration and multimodal token counting
- 14 tests for gateway SessionEntry lcm_metadata round-trip
- 20 tests for store pruning with max_store_size cap and GC
- 18 tests for DAM end-to-end pipeline (ingest→index→search→recall→compose→persist→reload)
…M, wire async compaction
- Replace global _engine_ref with contextvars.ContextVar for thread-safe concurrent agent isolation
- Split 714-line engine.py into focused modules: query.py (LcmQueryMixin), format.py (LcmFormatMixin), session.py (LcmSessionMixin)
- Bundle DAM plugin from ~/.hermes/plugins/hermes-dam/ into agent/lcm/dam/ with proper absolute imports; plugin delegates to bundled code
- Wire CompactionAction.ASYNC with threading.Lock + daemon thread pool; async_compact() runs compaction off main thread with callback support
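A short sketch of why swapping a module-level global for `contextvars.ContextVar` isolates concurrent agents. The variable and helper names (`_current_engine`, `set_engine`, `get_engine`) are hypothetical; the commit only states that `_engine_ref` was replaced.

```python
import contextvars
import threading

# Hypothetical name; the commit only says a global _engine_ref was replaced.
_current_engine: contextvars.ContextVar = contextvars.ContextVar(
    "lcm_engine", default=None
)


def set_engine(engine) -> None:
    _current_engine.set(engine)


def get_engine():
    return _current_engine.get()


results: dict[str, str] = {}


def agent(name: str) -> None:
    """Each thread runs in its own context, so a set() here does not leak."""
    set_engine(f"engine-for-{name}")
    results[name] = get_engine()


threads = [threading.Thread(target=agent, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_engine = get_engine()  # still the default: the threads' values stayed local
```

With a plain global, the second agent's assignment would overwrite the first; here each thread sees only its own engine and the main thread still sees the default.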
…async compaction
- 6 tests for ContextVar thread/asyncio isolation
- 10 tests for bundled DAM imports and functionality
- 34 tests for engine mixin split with backward compat
- 17 tests for async compaction thread safety, callbacks, and locking
…_warned
- Only add LCM tool schemas to tool surface when tools are enabled (fixes test_no_tools_never_injects CI failure)
- Initialize _context_pressure_warned in __init__ to prevent AttributeError before first compression
…ssion rebuild
- Auto-initialize DAMRetriever in LcmEngine when numpy available
- Auto-sync DAM with store every 5 ingests (configurable interval)
- Add sync_from_store() to DAMRetriever to break circular get_engine() dep
- Search uses DAM ranking when retriever has patterns, keyword fallback otherwise
- Track compaction metrics: total, tokens saved, compression ratio, level distribution
- format_metrics() for human-readable stats
- Persist DAM state + metrics in session metadata for cross-session continuity
- Harden rebuild_from_session: validate pinned IDs, handle missing/null fields, degrade gracefully on corrupt data
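To make the "DAM ranking" concrete, here is a rough sketch of trigram-hash encoding followed by a softmax lookup in the style of a modern Hopfield network, matching the PR description's "Trigram hashing → 2048-dim unit vectors". Several things are assumptions: the PR's implementation is pure numpy while this sketch is stdlib-only, and the hash function (SHA-256), the padding, the inverse temperature `beta`, and the names `encode` and `rank` are all illustrative choices.

```python
import hashlib
import math

DIM = 2048  # dimensionality quoted in the PR description


def encode(text: str, dim: int = DIM) -> list[float]:
    """Hash character trigrams into buckets, then L2-normalize."""
    vec = [0.0] * dim
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        digest = hashlib.sha256(padded[i:i + 3].encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def rank(query: str, patterns: list[str], beta: float = 8.0) -> list[tuple[str, float]]:
    """Softmax over scaled similarities between the query and stored patterns."""
    if not patterns:
        return []
    q = encode(query)
    scores = [beta * dot(q, encode(p)) for p in patterns]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    ranked = zip(patterns, (w / total for w in weights))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


memories = [
    "fix token counting for images",
    "bundle the DAM plugin",
    "restore context compressor shim",
]
ranked = rank("token counting bug", memories)
```

A larger `beta` sharpens the softmax toward the single best match, which is what gives this family of models its pattern-completion behaviour.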
…ild hardening
- 18 tests for DAM auto-init, periodic sync, search integration, session persistence
- 17 tests for compaction metrics tracking, format_metrics, async coverage
- 10 tests for rebuild edge cases: invalid pinned IDs, missing keys, corrupt data
…test
reset_session_state() was missing _user_turn_count and context_compressor._previous_summary resets, failing 4 tests in test_session_reset_fix. The toolset derivation test now patches LCM_TOOL_SCHEMAS to keep its controlled tool set precise.
Cherry-pick holographic.py (phase-encoded HRR math), store.py (SQLite fact store with entity resolution and trust scoring), and retrieval.py (FTS5 + Jaccard + HRR hybrid search) from the holographic-memory-store branch into agent/lcm/hrr/ following the DAM bundling pattern. 12 tests verify imports, CRUD, and retrieval.
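For context on "phase-encoded HRR math", the usual formulation represents each symbol as a vector of phases, binds by adding phases, and unbinds by subtracting them. The sketch below follows that textbook formulation and is not taken from `holographic.py`. The dimension, the role and filler names, and the stdlib-only style are assumptions.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def random_phases(dim: int, rng: random.Random) -> list[float]:
    return [rng.uniform(0.0, TWO_PI) for _ in range(dim)]


def bind(a: list[float], b: list[float]) -> list[float]:
    """Binding adds phases element-wise (mod 2*pi)."""
    return [(x + y) % TWO_PI for x, y in zip(a, b)]


def unbind(bound: list[float], key: list[float]) -> list[float]:
    """Unbinding subtracts the key's phases, inverting bind()."""
    return [(x - y) % TWO_PI for x, y in zip(bound, key)]


def bundle(*vectors: list[float]) -> list[float]:
    """Superpose by summing unit phasors and keeping the resulting angle."""
    out = []
    for phases in zip(*vectors):
        real = sum(math.cos(p) for p in phases)
        imag = sum(math.sin(p) for p in phases)
        out.append(math.atan2(imag, real) % TWO_PI)
    return out


def similarity(a: list[float], b: list[float]) -> float:
    """Mean cosine of phase differences: 1.0 if identical, near 0 if unrelated."""
    return sum(math.cos(x - y) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Hypothetical roles and fillers, purely for illustration.
role_author, role_repo, alice, hermes = (random_phases(1024, rng) for _ in range(4))
memory = bundle(bind(role_author, alice), bind(role_repo, hermes))
recovered = unbind(memory, role_author)
```

Unbinding the bundled memory with `role_author` yields a noisy vector that is still far closer to `alice` than to `hermes`, which is what lets a fact store answer role-based probes from a single superposed vector.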
Compaction crystallizer: engine.compact() now auto-stores summaries as persistent facts in the HRR store via _crystallize_to_hrr(). Failures are swallowed to never block compaction. 14 tests.
Session primer: prime_from_hrr() queries the HRR store for relevant prior knowledge and injects it into the LCM active context at session start. 13 tests.
Replace 11 fragmented tools (7 lcm_* + 4 dam_*) with 6 unified memory_* tools that auto-route across all three memory layers:
- memory_search: DAM → HRR → keyword cascade
- memory_pin: LCM pin + auto-crystallize to HRR
- memory_expand: in-session or cross-session recall
- memory_forget: LCM forget + optional HRR trust reduction
- memory_reason: HRR compositional queries (probe/related/reason/contradict)
- memory_budget: token breakdown (delegates to lcm_budget)
Old lcm_* and dam_* tools remain functional for backward compat. 84 tests including schema validation, handler behavior, and compat.
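The DAM → HRR → keyword cascade behind memory_search might look roughly like this. The sketch assumes the first layer that returns any hits wins and that a failing layer is logged and skipped; the commit does not say whether results are merged instead. `cascade_search` and the three stand-in search functions are hypothetical.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger("memory_search_sketch")

SearchFn = Callable[[str], list]


def cascade_search(
    query: str, layers: Sequence[tuple[str, SearchFn]]
) -> tuple[str, list]:
    """Return the name and hits of the first layer that yields any results."""
    for name, search in layers:
        try:
            hits = search(query)
        except Exception as exc:
            # Assumption: a broken layer must not block the ones after it.
            logger.warning("layer %s failed: %s", name, exc)
            continue
        if hits:
            return name, hits
    return "none", []


def dam_search(query: str) -> list:
    raise RuntimeError("numpy unavailable")  # simulate a failing layer


def hrr_search(query: str) -> list:
    return []  # simulate a layer with no matching facts


def keyword_search(query: str) -> list:
    corpus = ["pin cap reached", "token budget exceeded"]
    return [line for line in corpus if query.lower() in line]


layer_used, hits = cascade_search(
    "budget",
    [("dam", dam_search), ("hrr", hrr_search), ("keyword", keyword_search)],
)
```

Ordering the layers from most to least semantic means the cheap keyword scan only runs when the richer layers have nothing to offer or are unavailable.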
…y_budget
- Warn when numpy unavailable and HRR weights redistributed
- Promote cascade search failures from debug to warning level
- Extract _normalize_ids() helper, removing duplicated normalization
- Validate and sanitize entities in memory_reason (strip, reject non-strings)
- Remove memory_budget passthrough (lcm_budget already covers it)
- Patch MEMORY_TOOL_SCHEMAS in toolset derivation test
- Wire HRR store init + memory_* tool dispatch in run_agent.py
The rebase onto v0.7.0 removed the ContextCompressor initialization but v0.7.0 code paths still reference self.context_compressor for token counting, fallback handling, and session state. Restore it as a lightweight shim — LCM engine drives actual compression.
11a0ed0 to
2cac51b
Contributor
Author
Superseded by feat/lcm-as-plugin: cleaner history, same feature set.
4 tasks
dusterbloom added a commit to dusterbloom/hermes-agent that referenced this pull request May 14, 2026
Three-layer memory hierarchy (L1 Hot / L2 Warm DAM / L3 Cold HRR) as a self-contained plugin under plugins/context_engine/lcm/. Implements the ContextEngine ABC from PR NousResearch#6126 -- zero changes to run_agent.py. Activated by setting context.engine: lcm in config.yaml.
Features:
- ImmutableStore: append-only archive of all messages
- SummaryDAG: reversible compaction with expansion
- Dense Associative Memory (L2): Modern Hopfield network
- HRR persistent knowledge store (L3): cross-session facts
- LLM escalation: structured summary generation
- 7 agent-facing tools: expand, pin, forget, search, budget, toc, focus
- Session persistence and rebuild
Tests: 199 tests (171 internal + 28 ABC contract), all passing.
Config: lcm section in DEFAULT_CONFIG, context.engine selection.
Based on original PR NousResearch#4033, restructured as a plugin per the ContextEngine ABC design in PR NousResearch#6126 by @teknium1 / @stephenschoettler.
Summary
Lossless Context Management (LCM) with a three-layer memory hierarchy for hermes-agent. Replaces the legacy ContextCompressor with a structured, multi-level compaction system backed by an immutable message store, a summary DAG, and cross-session persistent knowledge.

Architecture: Three-Layer Memory Hierarchy
LCM Core (agent/lcm/, 12 modules)
- engine.py: Orchestrates ingest, thresholds, compaction, expand, pin/unpin
- store.py: Immutable append-only message store with pruning
- dag.py: Summary DAG with recursive source tracking for reversibility
- escalation.py: 3-level: L1 preserve_details → L2 bullet_points → L3 deterministic
- config.py, tokens.py, refusal.py, tools.py: Supporting modules
- query.py, format.py, session.py: Mixins for search, formatting, persistence

DAM Plugin (agent/lcm/dam/, 7 modules, bundled)
- DenseAssociativeMemory: Modern Hopfield network, pure numpy
- MessageEncoder: Trigram hashing → 2048-dim unit vectors
- DAMRetriever: sync_with_store, search, compose (AND/OR/NOT), recall_similar

HRR Persistent Store (agent/lcm/hrr/, 5 modules, bundled)
- holographic.py: Phase-encoded HRR: bind, unbind, bundle, similarity
- store.py: SQLite fact store with FTS5, entity resolution, trust scoring
- retrieval.py: Hybrid FTS5 + Jaccard + HRR search, probe, related, reason, contradict
- schemas.py + tools.py: 5 unified memory_* tools

Event Hooks (Connecting the Layers)

Unified Tool Surface (5 tools)
- memory_search
- memory_pin
- memory_expand
- memory_forget
- memory_reason

Old lcm_* and dam_* tools remain functional for backward compat.

Gateway Integration
- SessionEntry.lcm_metadata field (backward-compatible, None for legacy sessions)
- to_dict()/from_dict() round-trip for session persistence
- ContextCompressor kept as backward-compat shim for v0.7.0 code paths

Commits
- c5eaa14d: Core LCM engine + DAM plugin
- cb7072a2: Wire LLM summarization, fix token counting, add store pruning, session persistence
- f3a9c2d7: 88 tests for escalation, tiktoken, persistence, pruning, DAM e2e
- 27a85ac1: DI via ContextVar, split engine into mixins, bundle DAM, wire async compaction
- 7cc320a5: 67 tests for DI isolation, bundled DAM, engine split, async compaction
- d3396f1d: Conditional LCM tool registration
- 8af42cba: Auto-wire DAM retriever, add compaction metrics, harden session rebuild
- 803c9557: 45 tests for DAM auto-wiring, compaction metrics, rebuild hardening
- aef58c20: Clear session state fully + isolate LCM tools from toolset test
- 5964f1ec: Bundle HRR persistent knowledge store under agent.lcm.hrr
- be7b2d2c: Wire compaction crystallizer and session primer hooks
- b456a3cf: Add 5 unified memory_* tools across LCM, DAM, and HRR layers
- f3f0fb95: Address review feedback: logging, validation, remove memory_budget
- 11a0ed09: Restore ContextCompressor init for v0.7.0 backward compat

Test plan
- hermes chat -q "Say hello" works on rebased v0.7.0