feat(memory): session-level memory observer actor (spike) #410

Merged: Aaronontheweb merged 8 commits into dev from claude-wt-session-memory-observer on Mar 25, 2026

Conversation

@Aaronontheweb

Collaborator

Summary

Spike: adds SessionMemoryObserverActor, a persistent child actor that watches the
conversation stream and distills memories when the session goes idle. It replaces per-turn
observation, which produced proposalCount=0 on every prompt.

  • Observer subscribes to the same stream as SessionLogActor (user messages, assistant
    text, tool call names, recalled memories, turn boundaries)
  • Runs a sidecar LLM call with the full transcript plus a skip list of already-proposed anchors
  • Proposals flow through the existing gate → curation actor pipeline
  • Token usage is emitted through the standard pipeline for accurate billing
  • Persistent: journals proposed anchors so the skip list survives across incarnations
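
The skip-list behaviour described above can be illustrated with a small, language-neutral sketch. It is not the PR's C# code: `ObserverState`, `distill`, and the `call_sidecar` callback are hypothetical names, and the journal is modelled as a plain in-memory list rather than a real persistence journal.

```python
from dataclasses import dataclass, field


@dataclass
class ObserverState:
    """Hypothetical stand-in for the observer's persisted state."""
    journal: list = field(default_factory=list)     # append-only record of proposed anchors
    transcript: list = field(default_factory=list)  # conversation events seen so far

    def skip_list(self) -> set:
        # Rebuilt from the journal, so it survives a restart of the observer.
        return set(self.journal)


def distill(state: ObserverState, call_sidecar) -> list:
    """Run one distillation pass and return only anchors not proposed before.

    `call_sidecar(transcript, skip)` is an assumed callback standing in for
    the sidecar LLM call; it returns a list of proposed anchor strings.
    """
    skip = state.skip_list()
    fresh = []
    for anchor in call_sidecar(list(state.transcript), set(skip)):
        if anchor not in skip:
            skip.add(anchor)
            fresh.append(anchor)
    state.journal.extend(fresh)  # journal first, so a later incarnation skips them
    return fresh
```

Recreating the state from the same journal yields the same skip list, which is the property the "survives across incarnations" bullet relies on.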

Builds on PR #409 (Phase 1 recall fixes).

Known limitation

No passivation-triggered distillation. The observer's idle timer (90s) fires before
the session's passivation timeout (~5min), so memories form during normal pauses.
Final distillation before session death requires formalizing the session actor's state
machine — tracked in #TBD.

Test plan

  • All 1,202 existing tests pass (including passivation test)
  • dotnet slopwatch analyze clean
  • Binary swap → netclaw -p exercises → verify proposalCount>0 in daemon logs
  • Memory documents grow in SQLite after idle distillation

Phase 1 of memory formation fixes based on analysis of three production
sessions that showed zero memory proposals and zero recall matches.

Recall fixes:
- Allow evidence class in deterministic retrieval (was hardcoded to
  durable_fact only, making 10/28 stored memories invisible)
- Add baseline score (1.0) in candidate selector so SQL-matched
  candidates aren't silently dropped by zero-score threshold
- Switch to audience-primary recall: remove domain as hard filter,
  use audience+boundary as security gates, add domain affinity boost
  for ranking. Removes ShouldWidenAcrossDomains two-path design.
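
A rough sketch of the recall shape these three fixes describe, written in Python for brevity rather than the project's own language. Only the baseline score of 1.0 and the `durable_fact`/`evidence` class names come from the text above. The `Memory` type, the boost value, the exact-match gate logic, and the `recall` function are illustrative assumptions.

```python
from dataclasses import dataclass

BASELINE_SCORE = 1.0         # from the commit text
DOMAIN_AFFINITY_BOOST = 0.5  # assumed value, for illustration only
MIN_SCORE = 0.0              # candidates at or below this are dropped
ALLOWED_CLASSES = {"durable_fact", "evidence"}


@dataclass(frozen=True)
class Memory:
    anchor: str
    audience: str
    boundary: str
    domain: str
    memory_class: str


def recall(memories, audience, boundary, domain):
    """Return (score, anchor) pairs, best first."""
    ranked = []
    for m in memories:
        if m.memory_class not in ALLOWED_CLASSES:
            continue
        # Audience and boundary are hard gates (exact match assumed here).
        if m.audience != audience or m.boundary != boundary:
            continue
        # Domain no longer filters; it only nudges the ranking.
        score = BASELINE_SCORE
        if m.domain == domain:
            score += DOMAIN_AFFINITY_BOOST
        if score > MIN_SCORE:
            ranked.append((score, m.anchor))
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked
```

With the baseline in place, a gated candidate from another domain still clears the zero-score threshold and simply ranks below same-domain matches.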

Formation fixes:
- Revise observation sidecar prompt: add agent-derived findings
  classification category, evidence example, soften conservative bias
- Tighten ProjectStatementPattern to reject conversational fragments
  via IsConversationalFragment prefix check (blocks junk like "Well I
  was going to has You do some Netclaw work")
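
As an illustration of a prefix check of this kind (not the actual `IsConversationalFragment` implementation, whose prefix list is not shown here), a minimal Python sketch might look like the following. The filler-word set is a hypothetical example.

```python
import re

# Hypothetical filler words; the real prefix list is not given in this PR.
FILLER_FIRST_WORDS = {"well", "um", "uh", "hey", "ok", "okay", "anyway"}


def is_conversational_fragment(text: str) -> bool:
    """Return True when the statement opens with a conversational filler word."""
    match = re.match(r"\s*([A-Za-z']+)", text)
    return bool(match) and match.group(1).lower() in FILLER_FIRST_WORDS
```

Under these assumptions the junk string quoted above is rejected because it opens with "Well", while a statement that opens with a project name passes through.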

Observability:
- Log progressive_recall_exhausted when all candidates already injected
- Add eval debugging guidance to CLAUDE.md (failures are almost always
  instrumentation, rarely the model)

Adds SessionMemoryObserverActor, a persistent child actor that watches
the conversation stream and distills memories when the session goes idle.
Replaces the per-turn observation sidecar which produced proposalCount=0
on every prompt because it lacked session-level context.

Observer design:
- Persistent actor (journals proposed anchors for skip list durability)
- Subscribes to same stream as SessionLogActor (user messages, assistant
  text, tool call names, recalled memories, turn boundaries)
- Owns its own ReceiveTimeout (90s default, configurable)
- Runs sidecar LLM call with full transcript + skip list
- Sends proposals to parent → gate → curation actor pipeline
- Token usage emitted through standard EmitUsageOutput pipeline
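
The idle-timer behaviour can be sketched without an actor framework. The class below is a simplified Python model, not the actor itself: `IdleObserver` and its methods are hypothetical, the clock is passed in explicitly so the logic stays deterministic, and only the 90s default comes from the text above.

```python
class IdleObserver:
    """Toy model of an observer that distills after a quiet period."""

    def __init__(self, idle_seconds: float = 90.0):
        self.idle_seconds = idle_seconds
        self.last_message_at = None
        self.pending_events = []

    def on_message(self, now: float, event: str) -> None:
        # Every incoming message restarts the idle window.
        self.pending_events.append(event)
        self.last_message_at = now

    def should_distill(self, now: float) -> bool:
        # Fire only when something was observed and the stream went quiet.
        if not self.pending_events or self.last_message_at is None:
            return False
        return now - self.last_message_at >= self.idle_seconds

    def take_transcript(self) -> list:
        # Hand the buffered events to the distillation call and reset.
        events, self.pending_events = self.pending_events, []
        return events
```

Because each message pushes the deadline forward, distillation in this model only happens during a genuine pause, which matches the intent of giving the observer its own timeout.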

Session actor changes:
- Creates observer alongside curation actor in RecoveryCompleted
- Forwards SendUserMessage and SessionOutput to observer
- Forwards ObserverSystemContext for recalled memories
- Handles SessionDistillationCompleted with token tracking
- Removes: ObserveTurnForMemory, ObserveMemoryAsync,
  BuildStrongAssertions, _sidecarMemoryObserver field,
  MemoryObservationCompleted/Failed handlers

Known limitation: no passivation-triggered distillation yet. The observer's
idle timer (90s) is shorter than passivation timeout (~5min), so memories
form during normal pauses. Final distillation before death requires
formalizing the session actor's state machine (separate work).

- Double the sidecar timeout for distillation (full transcript is larger
  than single-turn prompts)
- Preserve _hasNewContent on distillation failure so the observer retries
  on the next idle cycle instead of silently losing the content
- DistillationFinished now carries a success flag for this purpose
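
The retry behaviour can be modelled as a small state holder. This Python sketch uses hypothetical names (`DistillationTracker`, `on_idle`, `on_finished`). It also clears the flag when a call starts and restores it on failure, which gives the same retry effect as preserving the flag, though the PR's actual ordering may differ.

```python
class DistillationTracker:
    """Toy model of the new-content flag and the single-call guard."""

    def __init__(self):
        self.has_new_content = False
        self.in_flight = False

    def on_content(self) -> None:
        self.has_new_content = True

    def on_idle(self) -> bool:
        """Return True if a distillation call should start now."""
        if self.in_flight or not self.has_new_content:
            return False
        self.in_flight = True
        self.has_new_content = False  # content is now owned by the running call
        return True

    def on_finished(self, success: bool) -> None:
        self.in_flight = False
        if not success:
            # Failure: mark the content as still pending so the next idle retries.
            self.has_new_content = True
```

A failed call therefore leaves the tracker ready to fire again on the next idle cycle, and a second idle signal during a running call is ignored.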

Known: Qwen 3.5 27B times out even at 180s on distillation calls.
The architecture works correctly (fires, retries, guards concurrency).
A faster or API-hosted model is needed for practical use.
…ault

Qwen 3.5 27B needs more time for full-transcript distillation.
2x was still timing out. 5x (450s with default 90s sidecar config)
gives the local model enough room.

Confirmed: session D0AC6CKBK5K/1774388427.214549 successfully distilled
2 proposals (litellm-supply-chain-attack, litellm-mitigation-steps),
routed through gate (accepted=2) and curation (created=2).

ApplyInlineCurationBatchAsync was not writing boundary or audience
columns, causing all observer-distilled memories to have NULL values.
Since recall filters on COALESCE(boundary, 'boundary:legacy-restricted'),
NULL-boundary documents were invisible to Slack sessions that expect
'boundary:trusted-instance'.
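
The invisibility effect can be reproduced with an in-memory SQLite database. In this sketch the table name `memory_documents`, the `anchor` column, the sample rows, and the `'team'` audience value are assumptions. Only the `boundary` and `audience` columns and the `COALESCE` default come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memory_documents (anchor TEXT PRIMARY KEY, boundary TEXT, audience TEXT)"
)
conn.executemany(
    "INSERT INTO memory_documents VALUES (?, ?, ?)",
    [
        # Row written without boundary/audience, as in the bug.
        ("written-without-columns", None, None),
        # Row written with both columns populated, as after the fix.
        ("written-with-columns", "boundary:trusted-instance", "team"),
    ],
)


def visible_anchors(session_boundary: str) -> list:
    """Anchors a session with the given boundary can recall."""
    rows = conn.execute(
        "SELECT anchor FROM memory_documents "
        "WHERE COALESCE(boundary, 'boundary:legacy-restricted') = ? "
        "ORDER BY anchor",
        (session_boundary,),
    )
    return [row[0] for row in rows]
```

Calling `visible_anchors("boundary:trusted-instance")` returns only the row with a populated boundary. The NULL row is treated as `boundary:legacy-restricted` and never matches.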

Also fix HandleDistillationResult to resolve audience from session ID
when _currentTurnSource is null (distillation fires after idle, turn
context is cleared). Slack → Team, SignalR → Personal, etc.