arch(security): per-turn memory policy filtering is defeated by LLM output persistence

## Context

Discovered during #370 review. Intersects with the trust context security policy work in PR #249.

## Problem

The current memory recall architecture filters memories per-turn based on trust context, audience, and sensitivity policy. But this protection is illusory because the LLM's own **responses are persisted to `_state.History`** while the memory injections are transient.

**Example:**

1. Turn 1 (high trust, `audience=personal`): Memory recalled — "User's preferred airport is IEH"
2. LLM responds: "I'll book your flight from IEH..." → **persisted to `_state.History`**
3. Trust degrades mid-session (e.g., reading an email from an untrusted source enters the context)
4. Memory policy dutifully excludes the airport preference from recall on subsequent turns
5. **Doesn't matter** — "IEH" is already in the conversation history from the LLM's prior response

The sensitive information leaked into the persistent conversation the moment the LLM referenced it. Per-turn recall filtering is protecting the front door while the information already walked out through the LLM's output.

## Implications for PR #249

PR #249's trust context spec includes scenarios like:

> **GIVEN** a stored memory item is marked `audience=personal`  
> **WHEN** a `public` turn runs recall  
> **THEN** the item is excluded from both automatic recall and inline retrieval

This scenario is correct for **first-contact gating** (preventing a sensitive memory from being *introduced* during a low-trust turn). But it provides no protection for information that was already surfaced in a prior high-trust turn within the same session.

## What per-turn filtering still protects

- **Cross-session isolation** — memories from one session domain don't leak into a different session. This boundary is real.
- **First-contact gating** — prevents sensitive memories from being introduced during a low-trust turn (the memory never enters the conversation at all).
- **Raw secret rejection at write time** — preventing credentials from being stored as memories in the first place. Genuinely valuable.

## Proposed direction

The trust boundary should be at the **session** level, not the **turn** level:

- Trust context evaluated at session start (or when a new participant/channel joins)
- If trust degrades mid-session, the session forks or terminates rather than trying to filter recall while history is already contaminated
- Per-turn recall filtering becomes simpler — apply the session's established trust level, don't re-evaluate policy on every call

This also has implications for the transient injection architecture. If policy is session-scoped, the main justification for making memory injections transient (re-evaluating policy per-turn) weakens. Memories could potentially be persisted to session history on first injection, eliminating the per-call re-injection cost entirely.

## Related

- #370 — memory recall deduplication (where this was discovered)
- PR #249 — trust context security policy (directly affected)
- #374 — tool-result-aware recall (future consideration)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

arch(security): per-turn memory policy filtering is defeated by LLM output persistence #376

Context

Problem

Implications for PR #249

What per-turn filtering still protects

Proposed direction

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

arch(security): per-turn memory policy filtering is defeated by LLM output persistence #376

Description

Context

Problem

Implications for PR #249

What per-turn filtering still protects

Proposed direction

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions