fix(memory): automatic recall overfetches irrelevant memories at scale #560

@Aaronontheweb

Summary

Automatic pre-turn memory recall injects irrelevant memories into the LLM context once the memory store grows beyond a trivial size. Observed in session C0AGM484P0Q/1775569879.583199, where a question about kegerators recalled memories about testlab-setup repo locations, Reddit scan results, and Akka.NET release versions alongside the one relevant kegerator memory.

Root Cause

The deterministic retrieval path uses SearchAcrossDomainsByPlanAsync, which fetches an overfetch pool of limit * 5 candidates via full-text search (FTS) and then ranks them with DeterministicCandidateSelector.Score(). Scoring is purely additive: starting from a baseline of 1.0, it adds +4 for lexical term matches, +6 for facet matches, +8 for anchor hints, +3.5 for soft scopes, and +5 for domain affinity. Every candidate that survives the FTS match therefore scores at least the baseline, so noise gets through when:

  1. The FTS query terms are common enough to match many unrelated memories
  2. The candidate limit (30 ranked, 60 bundle) is high relative to the number of truly relevant results
  3. The scoring never penalizes irrelevance; it only boosts relevance
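
To illustrate why every FTS survivor ends up in a slot, here is a minimal Python sketch of an additive scorer using the weights quoted above. It is not the actual DeterministicCandidateSelector code: the `Candidate` fields, the `select` helper, the sample pool, and the assumption that each weight is applied once per match are all hypothetical.

```python
from dataclasses import dataclass

# Weights and baseline quoted in this issue; everything else is hypothetical.
BASELINE = 1.0
WEIGHTS = {
    "lexical": 4.0,
    "facet": 6.0,
    "anchor": 8.0,
    "soft_scope": 3.5,
    "domain": 5.0,
}


@dataclass
class Candidate:
    name: str
    lexical: int = 0      # count of lexical term matches (assumed per-match weighting)
    facet: int = 0
    anchor: int = 0
    soft_scope: int = 0
    domain: int = 0


def score(c: Candidate) -> float:
    """Additive score: baseline plus a positive boost per matched signal."""
    return BASELINE + sum(w * getattr(c, key) for key, w in WEIGHTS.items())


def select(candidates, limit):
    """Rank by score and fill every slot; no candidate is ever rejected."""
    return sorted(candidates, key=score, reverse=True)[:limit]


pool = [
    Candidate("kegerator-notes", lexical=2, facet=1),
    Candidate("testlab-setup-repos", lexical=1),
    Candidate("reddit-scan-results"),  # survived FTS but matched no signal
]
picked = select(pool, limit=3)
print([(c.name, score(c)) for c in picked])
```

Because no term in the sum can be negative, the weakest candidate still scores the 1.0 baseline and takes a slot whenever the pool is no larger than the limit.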

Observed Impact

  • 3 automatic recall slots used per turn, only 1 relevant → 2 of 3 slots wasted, about a 67% noise rate
  • Noise memories consume context window tokens that could be used for the actual conversation
  • At scale (hundreds of memories), the noise-to-signal ratio will worsen
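
For reference, the noise rate above is simply the share of used slots that held an irrelevant memory. A small Python helper (the function name is hypothetical) makes the arithmetic explicit:

```python
def noise_rate(slots_used: int, relevant: int) -> float:
    """Fraction of used recall slots that held an irrelevant memory."""
    if slots_used <= 0:
        raise ValueError("slots_used must be positive")
    if not 0 <= relevant <= slots_used:
        raise ValueError("relevant must be between 0 and slots_used")
    return (slots_used - relevant) / slots_used


rate = noise_rate(slots_used=3, relevant=1)
print(f"{rate:.1%}")
```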

Possible Mitigations

  • Score threshold: reject candidates below a minimum score instead of always filling all slots
  • Negative scoring: penalize candidates that match only on common/stop-word terms
  • Recency weighting: boost recently accessed or created memories
  • Query-topic coherence: detect topic drift between the query and candidate content
  • Reduce default candidate limit from 3 to 2 for ranked mode
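
A rough sketch of how the first three mitigations and the reduced limit could combine, written in Python for brevity. Every constant here (threshold, penalty, boost, half-life, stop-word list) and the dictionary shape of a candidate are hypothetical placeholders, not values from the codebase; real values would need tuning against actual recall sessions.

```python
from datetime import datetime, timedelta, timezone

# All constants below are hypothetical and would need tuning.
STOP_WORDS = {"the", "a", "of", "in", "setup", "results"}
MIN_SCORE = 6.0           # score threshold: below this, leave the slot empty
STOP_WORD_PENALTY = 4.0   # negative scoring for stop-word-only matches
RECENCY_BOOST = 2.0       # maximum boost for a memory accessed just now
HALF_LIFE_DAYS = 30.0     # boost halves every 30 days


def adjusted_score(base_score, matched_terms, last_accessed, now):
    score = base_score
    # Negative scoring: the candidate matched only on common terms.
    if matched_terms and all(t.lower() in STOP_WORDS for t in matched_terms):
        score -= STOP_WORD_PENALTY
    # Recency weighting: exponentially decaying boost.
    age_days = max((now - last_accessed).total_seconds() / 86400.0, 0.0)
    score += RECENCY_BOOST * 0.5 ** (age_days / HALF_LIFE_DAYS)
    return score


def select(candidates, limit, now):
    scored = [
        (adjusted_score(c["score"], c["terms"], c["last_accessed"], now), c["name"])
        for c in candidates
    ]
    # Score threshold: drop weak candidates instead of always filling slots.
    kept = sorted((s, n) for s, n in scored if s >= MIN_SCORE)
    return [name for _, name in reversed(kept)][:limit]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
candidates = [
    {"name": "kegerator-notes", "score": 15.0, "terms": ["kegerator"],
     "last_accessed": now - timedelta(days=1)},
    {"name": "testlab-setup-repos", "score": 5.0, "terms": ["setup"],
     "last_accessed": now - timedelta(days=90)},
    {"name": "reddit-scan-results", "score": 5.0, "terms": ["results"],
     "last_accessed": now - timedelta(days=1)},
]
print(select(candidates, limit=2, now=now))
```

With these placeholder values only the kegerator memory clears the threshold, so the second slot stays empty rather than being filled with noise. Query-topic coherence is not covered here, since it would likely need an embedding or classifier step.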

Notes

The audience fail-close fix (#558) partially mitigates this by shrinking the eligible candidate pool: NULL-audience records now surface only in Personal contexts. The underlying ranking precision issue remains, however, for large memory stores within a single audience tier.
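
As a rough illustration of the fail-close behavior described above (not the actual #558 implementation), the eligibility check might look like the following Python sketch. The function name, the "Team" tier, and the exact-match rule for non-NULL audiences are assumptions made for the example; only the NULL-audience rule comes from this issue.

```python
from typing import Optional


def is_eligible(record_audience: Optional[str], context_audience: str) -> bool:
    """Fail closed: a record with no audience is treated as Personal-only."""
    if record_audience is None:
        return context_audience == "Personal"
    # Assumption: non-NULL audiences must match the context tier exactly.
    return record_audience == context_audience


records = [("kegerator-notes", None), ("release-versions", "Team")]
visible_in_team = [name for name, aud in records if is_eligible(aud, "Team")]
print(visible_in_team)
```

This narrows the pool before ranking, but every record that passes the filter is still scored additively, which is why precision within one tier is unaffected.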

Labels

context-pipeline: LLM context assembly (prompt layers, dynamic injection, memory recall, temporal grounding)
memory: Memory formation, recall, curation pipeline