Memory recall unconditionally injects top-N items with no score floor, polluting unrelated sessions #582

@Aaronontheweb

Problem

SQLiteMemoryRecallCoordinator.RecallAsync runs on every turn and always returns up to maxItems (default 3) candidates, regardless of whether any of them are actually relevant to the current conversation. Those items are then unconditionally injected into the prompt as a [memory-recall] system message by SessionRecallManager.InjectIntoMessages. There is no relevance gate — "recalled" effectively equals "injected."

When the memory DB is sparse or populated with off-topic content (e.g., eval artifacts, doctor diagnostic notes, operational memos), every turn pulls those unrelated facts into the prompt as authoritative context, even for conversations that share no topical overlap with them.

Observed behavior

In a recent session the only durable facts in the DB were operational/diagnostic docs created by earlier eval and doctor runs (things like Default Context Window Configuration, Shell Execution Environment Restriction, Slack Channel Access Restrictions, Full Host Shell Access Permission, Development Environment Configuration). The session itself was on an entirely unrelated topic.

Daemon logs show injectedCount=3 on every turn, with candidate sets where:

  • On the first turn, durable facts that matched only on generic tokens received high raw scores (200+), and the coordinator took the top 3 regardless of semantic relevance.
  • On a later turn every candidate scored exactly 6.0 (a flat tie), and the coordinator still returned the top 3 and injected them.

None of the injected items were topically related to the user message. The model received them as a system-level [memory-recall] block labeled status: healthy.

Root causes

  1. No minimum score floor. RecallAsync does scoredCandidates.OrderByDescending(...).Take(maxItems) — it will always take maxItems items if that many exist, even if the top candidate's score is indistinguishable from noise. See src/Netclaw.Actors/Sessions/SQLiteMemoryRecallCoordinator.cs around the deterministicItems selection.
  2. Unconditional injection. SessionRecallManager.InjectIntoMessages at src/Netclaw.Actors/Sessions/Pipelines/SessionRecallManager.cs:118 inserts the recall block whenever recall.Items.Count > 0. There is no gate between "coordinator returned items" and "items go into the prompt."
  3. RecallRank boosts durable facts by ~200 points regardless of query match. DurableFact + MergeDocument gets +200 in the tiebreaker, so operational docs dominate rankings whenever nothing else is available.
  4. The planner tokenizes the user message into lexical terms and runs an FTS search over the entire domain. With a sparse DB, the eval/ops docs are the only documents that can match, so they win by default.
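
To make root cause 1 concrete, here is a minimal sketch of the selection shape described above. It is not the actual Netclaw code: ScoredCandidate and UnfilteredSelection are hypothetical names, and only the OrderByDescending(...).Take(maxItems) pattern is taken from this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the coordinator's scored candidate type.
public sealed record ScoredCandidate(string Title, double Score);

public static class UnfilteredSelection
{
    // Mirrors the OrderByDescending(...).Take(maxItems) shape from root cause 1:
    // nothing here looks at how large or how distinct the scores are.
    public static IReadOnlyList<ScoredCandidate> SelectTop(
        IEnumerable<ScoredCandidate> scoredCandidates, int maxItems = 3) =>
        scoredCandidates
            .OrderByDescending(c => c.Score)
            .Take(maxItems)
            .ToList();
}
```

Given four candidates that all score 6.0, SelectTop still returns three of them, which matches the flat-tie turn in the daemon logs.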

Impact

  • Users running on a fresh or eval-seeded DB get operational trivia injected into unrelated conversations.
  • The model is given falsely authoritative context, which degrades response quality and risks the LLM treating unrelated configuration notes as relevant to the current topic.
  • This also inflates prompt tokens on every turn with content that contributes nothing.

Proposed fix

  • Add a minimum-score floor to the deterministic selector so low-confidence candidates are dropped, allowing RecallAsync to legitimately return zero items.
  • Gate InjectIntoMessages on that same floor — if nothing cleared the bar, do not emit a [memory-recall] system message at all (rather than emitting one with marginal content).
  • Revisit RecallRank: the +200 boost for DurableFact + MergeDocument should not be large enough to override a near-zero selector score.
  • Consider logging memory_retrieval_skipped_low_score when the coordinator suppresses injection, so it's observable in the daemon log.
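
One possible shape for the floor, as a sketch under assumptions rather than a proposed patch: RecallCandidate, FlooredSelection, the split between MatchScore and RankBoost, and the minMatchScore parameter are all hypothetical. The idea is to apply the floor to the query-dependent score before any type-based boost is added, so a +200 boost cannot lift an irrelevant item over the bar.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical candidate shape: MatchScore is the query-dependent part,
// RankBoost is the type-based bonus (e.g. the +200 for durable facts).
public sealed record RecallCandidate(string Title, double MatchScore, double RankBoost);

public static class FlooredSelection
{
    // Keeps only candidates whose query-dependent score clears minMatchScore,
    // then orders the survivors by match score plus boost.
    // May legitimately return an empty list.
    public static IReadOnlyList<RecallCandidate> Select(
        IEnumerable<RecallCandidate> candidates,
        double minMatchScore,
        int maxItems = 3,
        Action<string>? log = null)
    {
        var selected = candidates
            .Where(c => c.MatchScore >= minMatchScore)
            .OrderByDescending(c => c.MatchScore + c.RankBoost)
            .Take(maxItems)
            .ToList();

        // Make suppression observable, using the event name suggested above.
        if (selected.Count == 0)
            log?.Invoke("memory_retrieval_skipped_low_score");

        return selected;
    }
}
```

With a selector like this, the existing recall.Items.Count > 0 check in InjectIntoMessages would act as the injection gate, because an empty result means nothing cleared the floor. The right floor value would need to be tuned against real score distributions.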

Acceptance

  • A session with no relevant memories in the DB sees injectedCount=0 and no [memory-recall] block in the prompt.
  • A session with clearly relevant memories still gets them injected as today.
  • Candidate sets with flat scores (all ties at a low value) do not produce an injection.
