fix(memory): unblock recall and improve formation quality#409

Closed
Aaronontheweb wants to merge 1 commit into dev from claude-wt-memory-formation


Summary

Phase 1 of memory formation fixes based on analysis of three production sessions
(2026-03-24) that showed zero memory proposals and zero recall matches.

  • Recall: Allow the evidence class in deterministic retrieval, add a baseline candidate
    score, and switch to audience-primary recall (domain becomes a ranking preference, not a security gate)
  • Formation: Revise sidecar prompt with agent-derived findings category, tighten
    ProjectStatementPattern to reject conversational fragments
  • Observability: Progressive recall exhaustion logging, eval debugging guidance

Key changes

| Fix | File | Impact |
|---|---|---|
| Evidence in retrieval | DeterministicRetrievalPlanning.cs | 10/28 memories now visible |
| Baseline candidate score | DeterministicCandidateSelector.cs | Fixes rawCount>0 selectedCount=0 |
| Audience-primary recall | SQLiteMemoryRecallCoordinator.cs | Cross-domain user facts visible |
| Sidecar prompt revision | MemorySidecarPromptBuilder.cs | Agent conclusions → evidence |
| Conversational fragment guard | MemoryCurationPipeline.cs | Blocks junk durable_facts |
| Exhaustion logging | LlmSessionActor.cs | Observability |

Test plan

  • All 1,312 existing tests pass
  • 13 new tests added (candidate selection, cross-domain recall, fragment rejection)
  • dotnet slopwatch analyze clean
  • Eval suite (running against Qwen 3.5 27B; some failures may be instrumentation issues rather than model errors)
  • Manual validation: restart daemon, verify recall surfaces cross-domain memories

Follow-up: Phase 2 — Session-Level Distillation

Per-turn observation produces proposalCount=0 even with the prompt fix because the
sidecar only sees the assistant's final reply text, not the journey. Phase 2 will
replace per-turn observation with session-level distillation on idle, giving the sidecar
full conversation context. Separate branch/PR.
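
The idle-triggered distillation described above is not part of this PR, but the idea can be sketched roughly as follows. This is a Python illustration under stated assumptions: `SessionBuffer`, `record_turn`, `tick`, and the injected `distill` callback are hypothetical names, and the idle threshold is whatever the caller passes in. It only shows the shape of "buffer every turn, hand the whole transcript to the sidecar once the session goes quiet".

```python
class SessionBuffer:
    """Hypothetical buffer that collects turns and distills them once idle."""

    def __init__(self, idle_seconds, distill):
        self.idle_seconds = idle_seconds  # assumed threshold, supplied by the caller
        self.distill = distill            # callback that receives the full transcript
        self.turns = []
        self.last_activity = None

    def record_turn(self, role, text, now):
        # Keep every turn (user and assistant), not only the final reply.
        self.turns.append((role, text))
        self.last_activity = now

    def tick(self, now):
        # Do nothing while the buffer is empty or the session is still active.
        if not self.turns or now - self.last_activity < self.idle_seconds:
            return None
        # Hand over the whole conversation and start a fresh buffer.
        transcript, self.turns = self.turns, []
        return self.distill(transcript)
```

Passing the clock value in as `now` keeps the sketch deterministic and easy to test; a real implementation would more likely use a timer owned by the session.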

Phase 1 of memory formation fixes based on analysis of three production
sessions that showed zero memory proposals and zero recall matches.

Recall fixes:
- Allow evidence class in deterministic retrieval (was hardcoded to
  durable_fact only, making 10/28 stored memories invisible)
- Add baseline score (1.0) in candidate selector so SQL-matched
  candidates aren't silently dropped by zero-score threshold
- Switch to audience-primary recall: remove domain as hard filter,
  use audience+boundary as security gates, add domain affinity boost
  for ranking. Removes ShouldWidenAcrossDomains two-path design.
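
The three recall changes above can be sketched together roughly as follows. This is a Python illustration, not the C# in DeterministicCandidateSelector.cs or SQLiteMemoryRecallCoordinator.cs. The `Memory` fields, the function name, the exact-match gate, and the boost value of 0.5 are assumptions made for the example; only the `durable_fact` and `evidence` classes, the baseline score of 1.0, and the use of audience and boundary as gates come from the description above.

```python
from dataclasses import dataclass

# Classes eligible for deterministic retrieval; "evidence" was previously excluded.
RETRIEVABLE_CLASSES = {"durable_fact", "evidence"}
BASELINE_SCORE = 1.0         # from the PR: every SQL-matched candidate starts here
DOMAIN_AFFINITY_BOOST = 0.5  # hypothetical value, chosen only for illustration


@dataclass(frozen=True)
class Memory:
    text: str
    memory_class: str
    audience: str
    boundary: str
    domain: str


def select_candidates(memories, audience, boundary, domain, threshold=0.0):
    """Return (score, memory) pairs, highest score first."""
    scored = []
    for m in memories:
        if m.memory_class not in RETRIEVABLE_CLASSES:
            continue
        # Audience and boundary act as the security gates (assumed exact match).
        if m.audience != audience or m.boundary != boundary:
            continue
        score = BASELINE_SCORE
        # Domain only influences ranking; it no longer filters candidates out.
        if m.domain == domain:
            score += DOMAIN_AFFINITY_BOOST
        # With a baseline above zero, matched candidates survive a zero threshold.
        if score > threshold:
            scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

In this sketch a memory from another domain is still returned as long as it passes the audience and boundary gates; it simply ranks below same-domain memories.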

Formation fixes:
- Revise observation sidecar prompt: add agent-derived findings
  classification category, evidence example, soften conservative bias
- Tighten ProjectStatementPattern to reject conversational fragments
  via IsConversationalFragment prefix check (blocks junk like "Well I
  was going to has You do some Netclaw work")
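
A prefix check of this kind might look like the sketch below. It is written in Python for illustration only; the actual `IsConversationalFragment` logic and its prefix list are not shown in this PR, so the opener set here is a hypothetical stand-in.

```python
import re

# Hypothetical opener list; the real prefixes used by the pipeline are not shown.
CONVERSATIONAL_OPENERS = frozenset({"well", "so", "ok", "okay", "um", "hmm", "anyway"})


def is_conversational_fragment(statement: str) -> bool:
    """Return True when the statement starts with a conversational opener."""
    # Grab the first word, ignoring leading whitespace and trailing punctuation.
    match = re.match(r"\s*([A-Za-z']+)", statement)
    if match is None:
        return False
    return match.group(1).lower() in CONVERSATIONAL_OPENERS
```

Under these assumptions the junk example quoted above would be rejected because it opens with "Well", while a statement that merely begins with a longer word such as "Wellington" would not be.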

Observability:
- Log progressive_recall_exhausted when all candidates already injected
- Add eval debugging guidance to CLAUDE.md (failures are almost always
  instrumentation, rarely the model)
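
The exhaustion log could be emitted along these lines. This is a minimal Python sketch, not the code in LlmSessionActor.cs: the function name, its parameters, and the log fields are assumptions, and only the `progressive_recall_exhausted` event name comes from the commit message.

```python
import logging

logger = logging.getLogger("recall")


def pick_new_candidates(candidate_ids, already_injected, session_id):
    """Return candidates not yet injected, logging when none remain."""
    fresh = [c for c in candidate_ids if c not in already_injected]
    # Recall found matches, but every one was injected on an earlier turn.
    if candidate_ids and not fresh:
        logger.info(
            "progressive_recall_exhausted session=%s candidates=%d",
            session_id,
            len(candidate_ids),
        )
    return fresh
```

Logging only when candidates existed but none were new separates "recall found nothing" from "recall found only what the session already has", which is the distinction the observability change is after.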
@Aaronontheweb
Collaborator Author

Superseded by #410 which includes all Phase 1 changes plus the session-level memory observer.

@Aaronontheweb Aaronontheweb deleted the claude-wt-memory-formation branch May 6, 2026 18:12