Skip to content

fix(memory): defense-in-depth scrub of <memory-context> leaks#33248

Open
kenyonj wants to merge 1 commit into
NousResearch:mainfrom
kenyonj:fix/honcho-memory-context-leak
Open

fix(memory): defense-in-depth scrub of <memory-context> leaks#33248
kenyonj wants to merge 1 commit into
NousResearch:mainfrom
kenyonj:fix/honcho-memory-context-leak

Conversation

@kenyonj

@kenyonj kenyonj commented May 27, 2026

Copy link
Copy Markdown

Summary

Fixes a memory-context leak where <memory-context>...</memory-context> blocks
(internal injected recall payloads, gated by a [System note: ... NOT new user input ...] preamble) could be persisted into the conversation transcript and
then re-surfaced verbatim in later turns — both to the model and, via Honcho's
peer representation / summary recall, back to the UI as part of the model's
own output.

Root cause

Honcho's recall surfaces (peer representation, peer card, session summary,
recent-messages) return whatever was previously stored. Legacy transcripts
written before the streaming-side scrubber (StreamingContextScrubber) and
the gateway-side stripper landed contain raw <memory-context> blocks. Once
poisoned, every subsequent recall re-injects them, and the model — seeing
authoritative-looking system framing — echoes them.

Fix — defense in depth

Three independent layers, any one of which is sufficient on its own:

Layer 1 — read-side sanitize (plugins/memory/honcho/session.py)
get_session_context, _fetch_peer_context, and the summary fetch all run
their Honcho-returned strings through agent.memory_manager.sanitize_context
before handing them back. This defangs every legacy poisoned record at the
boundary — no migration required.

Layer 2 — write-side sanitize (gateway/run.py)
All three user-content persistence paths (early-exit error, no-new-messages
fallback, and the main per-message loop) now sanitize message_text /
entry["content"] before calling append_to_transcript. Prevents future
poison from ever being written.

Layer 3 — in-memory persist override (agent/conversation_loop.py)
When memory injection mutates the API-bound api_msg during
run_conversation, we defensively set _persist_user_message_override to
the pre-injection clean content. Today the injection only touches a shallow
copy so the in-memory messages entry stays clean, but if a future refactor
ever mutates the wrong dict, the override path in
_apply_persist_user_message_override will still scrub the persisted
content back to the clean text. Idempotent — only sets when no override is
already in effect.

Layer 4 scope-out

The original plan included a Layer 4 bulk-sanitize CLI that would rewrite
historical Honcho-stored messages in place. The Honcho SDK does not
expose a message-content rewrite API — stored message content is immutable
from the client side — so the bulk migration is unimplementable as designed.
Layer 1's read-side sanitize already defangs every legacy poisoned record at
the boundary on every recall, so the migration is also unnecessary. Skipped.

Tests

tests/test_honcho_memory_context_leak.py adds 6 regression tests covering
all three layers:

  • Layer 1: poisoned Honcho.context() payload (summary, representation,
    card, messages) → get_session_context returns sanitized data; source
    inspection confirms the fallback path also sanitizes.
  • Layer 2: sanitize_context strips the block; source inspection confirms
    all 3 gateway persistence sites funnel through it; end-to-end round-trip
    via SessionStore.append_to_transcriptload_transcript returns
    clean content.
  • Layer 3: source inspection confirms run_conversation sets the
    persist-override to the pre-injection content inside the injection
    branch, gated on not-already-overridden.

Full targeted suite: 297 passed
pytest tests/test_honcho_memory_context_leak.py tests/honcho_plugin/ tests/agent/test_streaming_context_scrubber.py tests/gateway/test_vision_memory_leak.py

@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery comp/plugins Plugin system and bundled plugins tool/memory Memory tool and memory providers P2 Medium — degraded but workaround exists labels May 27, 2026
@liuhao1024

Copy link
Copy Markdown
Contributor

Verified this defense-in-depth fix across all three layers — the implementation is correct and well-tested.

Layer 1 (read-side): Confirmed sanitize_context is applied to all Honcho return paths — summary, representation, card (list comprehension), and recent_messages content. The _fetch_peer_context return is also wrapped. No read-side path can surface legacy poisoned records.

Layer 2 (write-side): Verified all three gateway persistence sites in gateway/run.py now funnel through sanitize_context before append_to_transcript: (1) early-exit error path, (2) no-new-messages fallback, (3) main per-message loop. The source-inspection test (assert src.count("sanitize_context") >= 3) is a clever way to guard against future regressions that add new persistence sites without sanitization.

Layer 3 (in-memory): The _persist_user_message_override mechanism correctly snapshots the pre-injection msg content (not the injected api_msg), so even if a future refactor mutates the wrong dict, the persisted message stays clean. The idempotency guard (getattr(...) is None) prevents re-snapshotting on multi-turn injection.

One note: the test_fetch_peer_context_return_is_sanitized and test_conversation_loop_sets_override_when_injecting_memory tests use source inspection rather than behavioral assertions. This is pragmatic given the complex initialization requirements, but means a refactor that moves the sanitize call to an equivalent location would fail the test even though the behavior is correct. Not a blocker — the behavioral tests in Layer 1 and Layer 2 provide the real regression coverage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery comp/plugins Plugin system and bundled plugins P2 Medium — degraded but workaround exists tool/memory Memory tool and memory providers type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants