
feat(lcm): Lossless Context Management — never lose a message again #4033

Closed
dusterbloom wants to merge 14 commits into NousResearch:main from dusterbloom:feat/lossless-context-management


@dusterbloom commented Mar 30, 2026

Contributor

Summary

Lossless Context Management (LCM) with a three-layer memory hierarchy for hermes-agent. Replaces the legacy ContextCompressor with a structured, multi-level compaction system backed by an immutable message store, summary DAG, and cross-session persistent knowledge.

Architecture: Three-Layer Memory Hierarchy

| Layer | System | Timescale | Math |
|---|---|---|---|
| L1 Hot | LCM | Within-turn | LLM summarization (L1/L2/L3 escalation) |
| L2 Warm | DAM | Within-session | Modern Hopfield network (arxiv 2601.00984v2) |
| L3 Cold | HRR Store | Cross-session | Holographic Reduced Representations (Plate 1995) |

LCM Core (agent/lcm/, 12 modules)

  • engine.py — Orchestrates ingest, thresholds, compaction, expand, pin/unpin
  • store.py — Immutable append-only message store with pruning
  • dag.py — Summary DAG with recursive source tracking for reversibility
  • escalation.py — 3-level: L1 preserve_details → L2 bullet_points → L3 deterministic
  • config.py, tokens.py, refusal.py, tools.py — Supporting modules
  • query.py, format.py, session.py — Mixins for search, formatting, persistence
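To make the store and DAG relationship concrete, here is a minimal sketch of how an append-only store and a summary DAG with recursive source tracking could make compaction reversible. All class and function names (`AppendOnlyStore`, `SummaryNode`, `expand`) are hypothetical stand-ins for illustration, not the actual classes in `store.py` or `dag.py`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredMessage:
    """One immutable entry in the append-only store."""
    msg_id: int
    role: str
    content: str


class AppendOnlyStore:
    """Hypothetical stand-in for the message store: entries are never edited."""

    def __init__(self) -> None:
        self._messages: list[StoredMessage] = []

    def append(self, role: str, content: str) -> int:
        msg_id = len(self._messages)
        self._messages.append(StoredMessage(msg_id, role, content))
        return msg_id

    def get(self, msg_id: int) -> StoredMessage:
        return self._messages[msg_id]


@dataclass
class SummaryNode:
    """A summary whose sources are raw message ids and/or earlier summaries."""
    text: str
    message_ids: list[int] = field(default_factory=list)
    child_summaries: list["SummaryNode"] = field(default_factory=list)


def expand(node: SummaryNode, store: AppendOnlyStore) -> list[StoredMessage]:
    """Recursively walk the DAG and return every original message, in id order."""
    seen: dict[int, StoredMessage] = {}

    def visit(n: SummaryNode) -> None:
        for msg_id in n.message_ids:
            seen[msg_id] = store.get(msg_id)
        for child in n.child_summaries:
            visit(child)

    visit(node)
    return [seen[k] for k in sorted(seen)]


store = AppendOnlyStore()
ids = [store.append("user", f"message {i}") for i in range(4)]
first = SummaryNode("summary of 0-1", message_ids=ids[:2])
second = SummaryNode("summary of everything", message_ids=ids[2:], child_summaries=[first])
recovered = [m.content for m in expand(second, store)]
```

Because a summary only references ids in the store and never replaces the entries, expanding a summary of summaries still reaches the original messages.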

DAM Plugin (agent/lcm/dam/, 7 modules — bundled)

  • DenseAssociativeMemory — Modern Hopfield network, pure numpy
  • MessageEncoder — Trigram hashing → 2048-dim unit vectors
  • DAMRetriever — sync_with_store, search, compose (AND/OR/NOT), recall_similar
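The encoder and retrieval steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the bundled implementation: it uses only the standard library (the PR's version is pure numpy), and the hashing scheme, the `beta` value, and the function names `encode` and `hopfield_weights` are all hypothetical. Only the trigram input and the 2048-dimensional unit vectors come from the description.

```python
import hashlib
import math

DIM = 2048  # dimensionality quoted in the PR description


def encode(text: str, dim: int = DIM) -> list[float]:
    """Hash character trigrams into a fixed-size vector and L2-normalize it."""
    vec = [0.0] * dim
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        digest = hashlib.blake2b(trigram.encode("utf-8"), digest_size=8).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] & 1 else -1.0  # signed hashing, an assumption
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def hopfield_weights(query: list[float], patterns: list[list[float]],
                     beta: float = 8.0) -> list[float]:
    """Softmax over beta-scaled similarities, as in Modern Hopfield retrieval."""
    scores = [beta * dot(query, p) for p in patterns]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


messages = [
    "fix the failing tokenizer test in pytest",
    "deploy the gateway to staging",
    "update README with install steps",
]
patterns = [encode(m) for m in messages]
weights = hopfield_weights(encode("tokenizer test failing in pytest"), patterns)
best = messages[weights.index(max(weights))]
```

A larger `beta` sharpens the softmax toward the single closest stored pattern, while a smaller one blends several patterns.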

HRR Persistent Store (agent/lcm/hrr/, 5 modules — bundled)

  • holographic.py — Phase-encoded HRR: bind, unbind, bundle, similarity
  • store.py — SQLite fact store with FTS5, entity resolution, trust scoring
  • retrieval.py — Hybrid FTS5 + Jaccard + HRR search, probe, related, reason, contradict
  • schemas.py + tools.py — 5 unified memory_* tools
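For readers unfamiliar with phase-encoded HRR, the four operations named for `holographic.py` can be sketched like this. The formulas are a common textbook formulation and an assumption about what the module does; the role and filler names are invented for the example.

```python
import math
import random

TWO_PI = 2.0 * math.pi


def random_phases(dim: int, rng: random.Random) -> list[float]:
    """A random HRR vector: one phase angle per dimension."""
    return [rng.uniform(0.0, TWO_PI) for _ in range(dim)]


def bind(a: list[float], b: list[float]) -> list[float]:
    """Binding is element-wise phase addition (multiplying unit phasors)."""
    return [(x + y) % TWO_PI for x, y in zip(a, b)]


def unbind(bound: list[float], key: list[float]) -> list[float]:
    """Unbinding subtracts the key's phases, which inverts bind."""
    return [(x - y) % TWO_PI for x, y in zip(bound, key)]


def bundle(vectors: list[list[float]]) -> list[float]:
    """Superpose vectors by summing phasors and keeping the resulting angle."""
    out = []
    for phases in zip(*vectors):
        re = sum(math.cos(p) for p in phases)
        im = sum(math.sin(p) for p in phases)
        out.append(math.atan2(im, re) % TWO_PI)
    return out


def similarity(a: list[float], b: list[float]) -> float:
    """Mean cosine of phase differences: 1.0 if identical, near 0 if unrelated."""
    return sum(math.cos(x - y) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
dim = 1024
role_lang, role_db = random_phases(dim, rng), random_phases(dim, rng)
python, sqlite, unrelated = (random_phases(dim, rng) for _ in range(3))

# Store two role/filler pairs in one vector, then probe it with a role.
fact = bundle([bind(role_lang, python), bind(role_db, sqlite)])
probe = unbind(fact, role_lang)
```

Probing the bundled fact with `role_lang` yields a noisy copy of `python`, so its similarity to `python` is well above its similarity to an unrelated vector.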

Event Hooks (Connecting the Layers)

  1. Compaction Crystallizer — When LCM compacts, auto-stores summaries as persistent facts in HRR. Failures never block compaction.
  2. Session Primer — On session start, queries HRR for relevant prior knowledge and injects it into the LCM context.
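The "failures never block compaction" guarantee of the crystallizer hook can be illustrated with a small sketch. `FlakyFactStore` and this standalone `compact` function are hypothetical; only the `add_fact` name and the swallow-and-continue behaviour are taken from the PR text.

```python
import logging

logger = logging.getLogger("lcm.crystallizer")


class FlakyFactStore:
    """Hypothetical HRR-like store that fails on its second write."""

    def __init__(self) -> None:
        self.facts: list[str] = []
        self._calls = 0

    def add_fact(self, text: str) -> None:
        self._calls += 1
        if self._calls == 2:
            raise RuntimeError("store unavailable")
        self.facts.append(text)


def compact(messages: list[str], fact_store: FlakyFactStore) -> str:
    """Summarize, then try to persist the summary without letting errors escape."""
    summary = f"{len(messages)} messages: " + "; ".join(m[:20] for m in messages)
    try:
        fact_store.add_fact(summary)
    except Exception:
        # Crystallization is best-effort: log the failure and carry on.
        logger.warning("crystallization failed; compaction continues", exc_info=True)
    return summary


fact_store = FlakyFactStore()
first_summary = compact(["set up repo", "wrote tests"], fact_store)
second_summary = compact(["fixed CI"], fact_store)  # the store raises here
```

The second call still returns a summary even though the store raised, which is the property the hook is described as preserving.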

Unified Tool Surface (5 tools)

| Tool | What it does |
|---|---|
| memory_search | DAM → HRR → keyword cascade |
| memory_pin | Pin in LCM + auto-crystallize to HRR |
| memory_expand | In-session expand or cross-session recall |
| memory_forget | LCM forget + optional HRR trust reduction |
| memory_reason | HRR compositional queries (probe/related/reason/contradict) |

Old lcm_* and dam_* tools remain functional for backward compat.
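One plausible reading of the `memory_search` cascade is "try each layer in order and return the first non-empty result". The sketch below assumes that reading; whether the real tool merges results across layers instead is not stated in the PR. The three search functions are invented stand-ins.

```python
import logging

logger = logging.getLogger("lcm.memory_search")


def cascade_search(query, layers):
    """Try each (name, search_fn) pair in order; return the first non-empty hits."""
    for name, search in layers:
        try:
            hits = search(query)
        except Exception:
            # A failing layer is logged and skipped so later layers still run.
            logger.warning("%s search failed; falling through", name, exc_info=True)
            continue
        if hits:
            return name, hits
    return None, []


def dam_search(query):
    raise RuntimeError("numpy unavailable")  # simulate a missing optional dependency


def hrr_search(query):
    return []  # nothing stored about this query yet


def keyword_search(query):
    corpus = ["fix tokenizer bug", "deploy gateway"]
    return [doc for doc in corpus if query.lower() in doc]


layer_used, hits = cascade_search(
    "tokenizer",
    [("dam", dam_search), ("hrr", hrr_search), ("keyword", keyword_search)],
)
```

Here the DAM layer errors and the HRR layer is empty, so the keyword layer answers the query.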

Gateway Integration

  • SessionEntry.lcm_metadata field (backward-compatible, None for legacy sessions)
  • to_dict()/from_dict() round-trip for session persistence
  • Legacy ContextCompressor kept as backward-compat shim for v0.7.0 code paths
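A backward-compatible optional field with a dict round-trip might look like the sketch below. The `session_id` field and the shape of the metadata are assumptions; only `lcm_metadata`, `to_dict()`, `from_dict()`, and the `None` default for legacy sessions come from the description.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SessionEntry:
    """Simplified stand-in for the gateway's session record."""
    session_id: str  # hypothetical field, for illustration only
    lcm_metadata: Optional[dict[str, Any]] = None  # None for legacy sessions

    def to_dict(self) -> dict[str, Any]:
        data: dict[str, Any] = {"session_id": self.session_id}
        if self.lcm_metadata is not None:
            data["lcm_metadata"] = self.lcm_metadata
        return data

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SessionEntry":
        # .get() keeps old payloads without the key loadable.
        return cls(session_id=data["session_id"], lcm_metadata=data.get("lcm_metadata"))


legacy = SessionEntry.from_dict({"session_id": "old-1"})
current = SessionEntry("new-1", lcm_metadata={"pinned": [3, 7], "dag": {"nodes": []}})
restored = SessionEntry.from_dict(json.loads(json.dumps(current.to_dict())))
```

Omitting the key when the value is `None` keeps newly written legacy-style sessions identical to what older code produced.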

Commits

  1. c5eaa14d — Core LCM engine + DAM plugin
  2. cb7072a2 — Wire LLM summarization, fix token counting, add store pruning, session persistence
  3. f3a9c2d7 — 88 tests for escalation, tiktoken, persistence, pruning, DAM e2e
  4. 27a85ac1 — DI via ContextVar, split engine into mixins, bundle DAM, wire async compaction
  5. 7cc320a5 — 67 tests for DI isolation, bundled DAM, engine split, async compaction
  6. d3396f1d — Conditional LCM tool registration
  7. 8af42cba — Auto-wire DAM retriever, add compaction metrics, harden session rebuild
  8. 803c9557 — 45 tests for DAM auto-wiring, compaction metrics, rebuild hardening
  9. aef58c20 — Clear session state fully + isolate LCM tools from toolset test
  10. 5964f1ec — Bundle HRR persistent knowledge store under agent.lcm.hrr
  11. be7b2d2c — Wire compaction crystallizer and session primer hooks
  12. b456a3cf — Add 5 unified memory_* tools across LCM, DAM, and HRR layers
  13. f3f0fb95 — Address review feedback: logging, validation, remove memory_budget
  14. 11a0ed09 — Restore ContextCompressor init for v0.7.0 backward compat

Test plan

  • 278 new tests across 20+ test files
  • Full regression suite: 3,985 passed (1 pre-existing env-dependent failure)
  • Thread safety: ContextVar isolation verified across threads + asyncio tasks
  • Async compaction: lock contention, callback, pending-flag tested
  • DAM e2e: ingest → index → search → recall → compose → persist → reload
  • HRR e2e: add_fact → search → probe → reason → contradict
  • Backward compat: all existing imports unchanged, old sessions load cleanly
  • CLI smoke test: hermes chat -q "Say hello" works on rebased v0.7.0

| Test File | Tests | Coverage |
|---|---|---|
| test_lcm_components | 75 | Config, summarizer, tokens, semantic index |
| test_lcm_engine | 19 | Engine init, ingest, thresholds, compact, expand, search, pin |
| test_lcm_escalation | 19 | L1→L2→L3 escalation, structured prompts, fallback |
| test_lcm_tiktoken | 17 | tiktoken, multimodal, cache, fallback |
| test_lcm_gateway_persistence | 14 | SessionEntry lcm_metadata round-trip |
| test_lcm_store_pruning | 20 | max_store_size, GC, pinned/referenced preservation |
| test_lcm_engine_di | 6 | ContextVar thread/asyncio isolation |
| test_lcm_dam_bundled | 10 | Bundled imports, functionality |
| test_lcm_engine_split | 34 | Mixin inheritance, backward compat |
| test_lcm_async_compaction | 17 | Thread pool, lock, callback, pending flag |
| test_hrr_bundled | 12 | HRR imports, CRUD, entity extraction, retrieval |
| test_memory_crystallizer | 14 | Compaction → HRR crystallization hook |
| test_memory_primer | 13 | Session primer injection |
| test_memory_unified_tools | 76 | Unified tools, cascade search, backward compat |
| test_session_reset_fix | 5 | Session state reset |

mraxai added 14 commits April 8, 2026 23:41
…ory plugin

LCM core: dual-state architecture with immutable message store, summary DAG,
and 7 agent tools (expand, pin, forget, search, focus, budget, toc).
Three-level escalation, tool-call pair protection, pin cap, auto-recompact.

DAM plugin: Dense Associative Memory implementing exponential-capacity
pattern-completion search (arxiv 2601.00984v2). Neural retrieval with
compositional queries (AND/OR/NOT). Pure numpy, no GPU required.

Also fixes env var isolation in provider resolution tests, updates context
pressure and token tracking tests for LCM-based architecture.
…ing and session persistence

- Wire _call_summary_llm to auxiliary_client with structured prompts
  (Goal/Progress/Decisions/Files/Next Steps) and L1→L2→L3 escalation
- Fix multimodal token counting in TokenEstimator (was str()-ifying
  image URLs into token count)
- Add max_store_size config + ImmutableStore.prune() with GC for
  unreferenced entries, auto-prune on ingest
- Add lcm_metadata field to gateway SessionEntry for cross-session
  DAG persistence (backward-compatible)
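The L1→L2→L3 escalation mentioned in this commit can be sketched as a loop over two LLM prompt modes with a deterministic fallback. The mode names come from the PR description; the escalation trigger (summary too long or call failed), the character-based budget (the real engine counts tokens), and the head/tail truncation used for L3 are assumptions made for illustration.

```python
MARKER = "\n[... truncated ...]\n"


def deterministic_truncate(text: str, budget_chars: int) -> str:
    """L3 fallback: keep the head and tail of the text, with no model call."""
    if len(text) <= budget_chars:
        return text
    half = (budget_chars - len(MARKER)) // 2
    if half <= 0:
        return text[:budget_chars]
    return text[:half] + MARKER + text[-half:]


def summarize_with_escalation(text, llm_summarize, budget_chars):
    """Return (level, summary), escalating until the result fits the budget."""
    for level, mode in enumerate(("preserve_details", "bullet_points"), start=1):
        try:
            candidate = llm_summarize(text, mode)
        except Exception:
            continue  # treat a failed call like an oversized summary
        if candidate and len(candidate) <= budget_chars:
            return level, candidate
    return 3, deterministic_truncate(text, budget_chars)


def verbose_llm(text, mode):
    """Stand-in model: only the bullet mode manages to be short."""
    return "- fixed tokenizer\n- tests pass" if mode == "bullet_points" else text * 2


def broken_llm(text, mode):
    raise TimeoutError("model unavailable")


level, summary = summarize_with_escalation("x" * 500, verbose_llm, budget_chars=100)
fallback_level, fallback = summarize_with_escalation("x" * 500, broken_llm, budget_chars=100)
```

Since the last level never calls a model, the function always returns something within budget, even when every LLM call fails.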
…ence, store pruning, DAM e2e

- 19 tests for _call_summary_llm wiring and L1→L2→L3 escalation
- 17 tests for tiktoken integration and multimodal token counting
- 14 tests for gateway SessionEntry lcm_metadata round-trip
- 20 tests for store pruning with max_store_size cap and GC
- 18 tests for DAM end-to-end pipeline (ingest→index→search→recall→compose→persist→reload)
…M, wire async compaction

- Replace global _engine_ref with contextvars.ContextVar for thread-safe
  concurrent agent isolation
- Split 714-line engine.py into focused modules: query.py (LcmQueryMixin),
  format.py (LcmFormatMixin), session.py (LcmSessionMixin)
- Bundle DAM plugin from ~/.hermes/plugins/hermes-dam/ into agent/lcm/dam/
  with proper absolute imports; plugin delegates to bundled code
- Wire CompactionAction.ASYNC with threading.Lock + daemon thread pool;
  async_compact() runs compaction off main thread with callback support
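The isolation that `contextvars.ContextVar` provides over a module-level global can be demonstrated in a few lines. The `set_engine`/`get_engine` helpers and the string "engines" are hypothetical; only the use of `ContextVar` in place of a global reference comes from the commit message.

```python
import contextvars
import threading

# Each thread (and each asyncio task) sees its own value for this variable.
_engine: contextvars.ContextVar = contextvars.ContextVar("lcm_engine", default=None)


def set_engine(engine) -> None:
    _engine.set(engine)


def get_engine():
    return _engine.get()


def worker(name: str, results: dict, barrier: threading.Barrier) -> None:
    set_engine(f"engine-for-{name}")
    barrier.wait(timeout=5)  # make sure both threads have set their value
    results[name] = get_engine()


results: dict = {}
barrier = threading.Barrier(2)
threads = [
    threading.Thread(target=worker, args=(name, results, barrier))
    for name in ("agent-a", "agent-b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_view = get_engine()  # the main thread never set a value
```

With a plain global, the second `set_engine` call would have overwritten the first and both agents would read the same engine.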
…async compaction

- 6 tests for ContextVar thread/asyncio isolation
- 10 tests for bundled DAM imports and functionality
- 34 tests for engine mixin split with backward compat
- 17 tests for async compaction thread safety, callbacks, and locking
…_warned

- Only add LCM tool schemas to tool surface when tools are enabled
  (fixes test_no_tools_never_injects CI failure)
- Initialize _context_pressure_warned in __init__ to prevent
  AttributeError before first compression
…ssion rebuild

- Auto-initialize DAMRetriever in LcmEngine when numpy available
- Auto-sync DAM with store every 5 ingests (configurable interval)
- Add sync_from_store() to DAMRetriever to break circular get_engine() dep
- Search uses DAM ranking when retriever has patterns, keyword fallback otherwise
- Track compaction metrics: total, tokens saved, compression ratio, level distribution
- format_metrics() for human-readable stats
- Persist DAM state + metrics in session metadata for cross-session continuity
- Harden rebuild_from_session: validate pinned IDs, handle missing/null fields,
  degrade gracefully on corrupt data
…ild hardening

- 18 tests for DAM auto-init, periodic sync, search integration, session persistence
- 17 tests for compaction metrics tracking, format_metrics, async coverage
- 10 tests for rebuild edge cases: invalid pinned IDs, missing keys, corrupt data
…test

reset_session_state() was missing _user_turn_count and
context_compressor._previous_summary resets, failing 4 tests in
test_session_reset_fix. The toolset derivation test now patches
LCM_TOOL_SCHEMAS to keep its controlled tool set precise.

Cherry-pick holographic.py (phase-encoded HRR math), store.py
(SQLite fact store with entity resolution and trust scoring), and
retrieval.py (FTS5 + Jaccard + HRR hybrid search) from the
holographic-memory-store branch into agent/lcm/hrr/ following the
DAM bundling pattern. 12 tests verify imports, CRUD, and retrieval.

Compaction crystallizer: engine.compact() now auto-stores summaries
as persistent facts in the HRR store via _crystallize_to_hrr().
Failures are swallowed to never block compaction. 14 tests.

Session primer: prime_from_hrr() queries the HRR store for relevant
prior knowledge and injects it into the LCM active context at
session start. 13 tests.

Replace 11 fragmented tools (7 lcm_* + 4 dam_*) with 6 unified
memory_* tools that auto-route across all three memory layers:
- memory_search: DAM → HRR → keyword cascade
- memory_pin: LCM pin + auto-crystallize to HRR
- memory_expand: in-session or cross-session recall
- memory_forget: LCM forget + optional HRR trust reduction
- memory_reason: HRR compositional queries (probe/related/reason/contradict)
- memory_budget: token breakdown (delegates to lcm_budget)

Old lcm_* and dam_* tools remain functional for backward compat.
84 tests including schema validation, handler behavior, and compat.
…y_budget

- Warn when numpy unavailable and HRR weights redistributed
- Promote cascade search failures from debug to warning level
- Extract _normalize_ids() helper, removing duplicated normalization
- Validate and sanitize entities in memory_reason (strip, reject non-strings)
- Remove memory_budget passthrough (lcm_budget already covers it)
- Patch MEMORY_TOOL_SCHEMAS in toolset derivation test
- Wire HRR store init + memory_* tool dispatch in run_agent.py

The rebase onto v0.7.0 removed the ContextCompressor initialization
but v0.7.0 code paths still reference self.context_compressor for
token counting, fallback handling, and session state. Restore it as
a lightweight shim — LCM engine drives actual compression.
@dusterbloom dusterbloom force-pushed the feat/lossless-context-management branch from 11a0ed0 to 2cac51b Compare April 8, 2026 21:48
@dusterbloom

Contributor Author

Superseded by feat/lcm-as-plugin — cleaner history, same feature set.

dusterbloom added a commit to dusterbloom/hermes-agent that referenced this pull request May 14, 2026
Three-layer memory hierarchy (L1 Hot / L2 Warm DAM / L3 Cold HRR) as a
self-contained plugin under plugins/context_engine/lcm/.

Implements the ContextEngine ABC from PR NousResearch#6126 -- zero changes to run_agent.py.
Activated by setting context.engine: lcm in config.yaml.

Features:
  - ImmutableStore: append-only archive of all messages
  - SummaryDAG: reversible compaction with expansion
  - Dense Associative Memory (L2): Modern Hopfield network
  - HRR persistent knowledge store (L3): cross-session facts
  - LLM escalation: structured summary generation
  - 7 agent-facing tools: expand, pin, forget, search, budget, toc, focus
  - Session persistence and rebuild

Tests: 199 tests (171 internal + 28 ABC contract), all passing.
Config: lcm section in DEFAULT_CONFIG, context.engine selection.

Based on original PR NousResearch#4033, restructured as a plugin per the ContextEngine
ABC design in PR NousResearch#6126 by @teknium1 / @stephenschoettler.