v0.6.1.0 feat(assistant): twin profile + memory context (#147) #150

Merged
jayzalowitz merged 1 commit into main from jayzalowitz/assistant-twin-context on May 5, 2026

@jayzalowitz

Owner

Summary

Closes #147 (assistant phase 2b). The conversational assistant now reads the user's twin profile (preferences + inferences) and recent episodic memories before composing each reply, so it can answer the two killer use cases that distinguish a personal twin from generic ChatGPT:

  • "What did I tell you about X last month?"
  • "What's my preference for Y?"

What landed

ContextBuilder in @skytwin/assistant

Composes a compact context block prepended to the system prompt:

```
What I know about you

Trust tier: moderate_autonomy
Preferences:
- email/auto_archive = yes (high)
- calendar/default_meeting_length = 30 (confirmed)
Inferences (not yet user-confirmed):
- finance/monthly_subscription_threshold = 50 (high) — based on 12 prior approvals

Relevant past episodes

- [2026-04-12] email · Archived a Stripe receipt without asking · auto-archive (approved)
```

Two ports keep the @skytwin/assistant package free of @skytwin/db / @skytwin/mempalace deps and unit-testable with stubs:

| Port | Backing (in API) |
| --- | --- |
| TwinContextProvider.fetch | TwinService.getOrCreateProfile + userRepository.findById |
| MemoryContextProvider.search | mempalaceRepository.searchEpisodes |
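
As a rough illustration of the port split, the two providers could be modelled as small interfaces that a unit test satisfies with in-memory stubs. Only the port names, the `fetch` / `search` method names, and the `search(userId, query, limit = 5)` parameter list appear in this PR; the `TwinContext`, `TwinFact`, and `MemoryHit` shapes below are assumptions made for the sketch.

```typescript
// Hypothetical data shapes; the real types live in @skytwin/assistant.
type Confidence = 'speculative' | 'low' | 'moderate' | 'high' | 'confirmed';

interface TwinFact {
  domain: string;
  key: string;
  value: unknown;
  confidence: Confidence;
  reasoning?: string;
}

interface TwinContext {
  trustTier: string;
  preferences: TwinFact[];
  inferences: TwinFact[];
}

interface MemoryHit {
  date: string;
  domain: string;
  summary: string;
  outcome: string;
}

// The two ports: the assistant package depends only on these interfaces.
interface TwinContextProvider {
  fetch(userId: string): Promise<TwinContext | null>;
}

interface MemoryContextProvider {
  search(userId: string, query: string, limit?: number): Promise<MemoryHit[]>;
}

// In-memory stubs: no database or Memory Palace needed in a unit test.
const stubTwin: TwinContextProvider = {
  async fetch(_userId) {
    return {
      trustTier: 'moderate_autonomy',
      preferences: [
        { domain: 'email', key: 'auto_archive', value: true, confidence: 'high' },
      ],
      inferences: [],
    };
  },
};

const stubMemory: MemoryContextProvider = {
  async search(_userId, _query, limit = 5) {
    const hits: MemoryHit[] = [
      {
        date: '2026-04-12',
        domain: 'email',
        summary: 'Archived a Stripe receipt without asking',
        outcome: 'auto-archive (approved)',
      },
    ];
    return hits.slice(0, limit);
  },
};
```

In the API route, adapters with the same interfaces would wrap the real services listed in the table.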

Design choices:

  • Hard cap at 2KB with UTF-8-clean ellipsis truncation. Noisy profile + long memory hits can't dominate the model's token budget.
  • Confidence floor — only confirmed / high / moderate surface. Speculative + low entries stay in the twin model but don't broadcast (would make the assistant look unsure of itself).
  • Confidence-ranked truncation — highest-confidence wins MAX_PREFERENCES = 12 / MAX_INFERENCES = 6 / MAX_MEMORIES = 5 slots.
  • Boolean values render as yes/no instead of true/false. More readable for both the model and anyone debugging via prompt logs.
  • Partial-context fallback — if either provider throws, the other still renders. console.warn records which side failed; the request continues with whatever was retrievable.
  • No-op on empty — empty result from both providers returns '', which AssistantService treats as "use the default system prompt unchanged" — same behavior as phase 1, no surprises for early bring-up paths without context.
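
The confidence floor, confidence-ranked slot filling, and yes/no rendering described above could look roughly like the following. This is a sketch under assumptions: the numeric ranking, the `Fact` shape, and the `selectFacts` / `renderFact` helpers are hypothetical, and `renderValue` borrows only its name from the code excerpted in the review.

```typescript
type Confidence = 'speculative' | 'low' | 'moderate' | 'high' | 'confirmed';

interface Fact {
  domain: string;
  key: string;
  value: unknown;
  confidence: Confidence;
}

// Assumed ordering, weakest to strongest. The relative order of the two
// levels below the floor does not affect the result.
const RANK: Record<Confidence, number> = {
  speculative: 0,
  low: 1,
  moderate: 2,
  high: 3,
  confirmed: 4,
};
const FLOOR = RANK.moderate;
const MAX_PREFERENCES = 12; // slot count quoted in the PR description

// Drop entries below the floor, then let the strongest entries win the slots.
function selectFacts(facts: Fact[], max: number = MAX_PREFERENCES): Fact[] {
  return facts
    .filter((f) => RANK[f.confidence] >= FLOOR) // filter() copies, so sort() is safe
    .sort((a, b) => RANK[b.confidence] - RANK[a.confidence])
    .slice(0, max);
}

// Booleans read better as yes/no in prompt logs.
function renderValue(value: unknown): string {
  if (typeof value === 'boolean') return value ? 'yes' : 'no';
  if (typeof value === 'string') return value;
  return JSON.stringify(value) ?? String(value);
}

function renderFact(f: Fact): string {
  return `- ${f.domain}/${f.key} = ${renderValue(f.value)} (${f.confidence})`;
}
```

With this shape, a `low` or `speculative` entry never reaches the prompt regardless of how many slots remain free.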

AssistantService.reply() now takes optional enrichment

New optional 2nd param enrichment?: { userId, query }. Backward-compatible: omitting it OR omitting the ctor builder falls back to the bare default system prompt — phase 1 callers compile unchanged.
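
A minimal sketch of how the optional second parameter might behave, assuming a builder with a `build(userId, query)` method and an injected completion function. Every name here except `reply` and the `{ userId, query }` shape is hypothetical, including the placeholder default prompt.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface Enrichment {
  userId: string;
  query: string;
}

// Hypothetical stand-ins for the real ContextBuilder and LLM client.
interface ContextBuilderLike {
  build(userId: string, query: string): Promise<string>;
}
type Complete = (messages: ChatMessage[]) => Promise<string>;

const DEFAULT_SYSTEM_PROMPT = "You are the user's personal twin assistant.";

class AssistantServiceSketch {
  private readonly complete: Complete;
  private readonly builder?: ContextBuilderLike;

  constructor(complete: Complete, builder?: ContextBuilderLike) {
    this.complete = complete;
    this.builder = builder;
  }

  async reply(history: ChatMessage[], enrichment?: Enrichment): Promise<string> {
    let system = DEFAULT_SYSTEM_PROMPT;
    // Enrich only when the caller asked for it AND a builder was supplied.
    if (enrichment && this.builder) {
      const context = await this.builder.build(enrichment.userId, enrichment.query);
      // An empty string means "nothing to add": keep the default prompt as-is.
      if (context !== '') system = `${context}\n\n${system}`;
    }
    return this.complete([{ role: 'system', content: system }, ...history]);
  }
}
```

A phase 1 caller that writes `service.reply(history)` takes the first branch untouched, which is what keeps the change backward-compatible.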

Wiring

  • apps/api/src/routes/assistant.ts constructs a ContextBuilder once per process. enrichment.query is the just-sent user message — that's what the assistant is about to answer, so the most-relevant memories are the ones that match it.
  • The memory adapter splits the query into ≥3-char tokens and calls mempalaceRepository.searchEpisodes — same backing as MemoryStack.search L3. Stop-words and short tokens drop so a query like "the plan for X" doesn't ILIKE-match every episode containing "the".
  • Episode outcome JSON blobs collapse to a one-line label (kind / status / result field if present, else short stringification) so the rendered context stays compact.
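
The tokenizing and label-collapsing steps can be sketched as two small pure functions. The length filter mirrors the excerpt quoted in the review, which shows only the `t.length >= 3` check. The `STOP_WORDS` set, the field priority in `outcomeLabel`, and the 60-character cap are assumptions added for illustration.

```typescript
// Hypothetical stop-word list; a 3-letter word such as "the" passes the
// length filter on its own, so something like this is needed to drop it.
const STOP_WORDS = new Set(['the', 'and', 'for', 'with', 'what', 'did', 'about', 'you', 'was']);

// Lowercase, split on non-alphanumerics, keep tokens of 3+ chars that are
// not stop-words.
function queryTerms(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length >= 3 && !STOP_WORDS.has(t));
}

// Collapse an outcome blob to one line: the first non-empty string among
// kind / status / result, else a shortened stringification.
function outcomeLabel(outcome: unknown, maxLen = 60): string {
  if (outcome !== null && typeof outcome === 'object') {
    const record = outcome as Record<string, unknown>;
    for (const field of ['kind', 'status', 'result']) {
      const v = record[field];
      if (typeof v === 'string' && v !== '') return v;
    }
  }
  const text = typeof outcome === 'string' ? outcome : (JSON.stringify(outcome) ?? '');
  // Character-based cut; long blobs get a trailing ellipsis.
  return text.length > maxLen ? `${text.slice(0, maxLen - 1)}…` : text;
}
```

For the query "the plan for X", this sketch keeps only `plan`, so the search no longer matches every episode containing "the".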

Safety / invariants

  • Trust boundary — the rendered context is text only, prepended to the system prompt. No user-controlled strings flow through it: preferences and inferences come from the twin model (which already tracks source and evidence), and episodic memory summary strings are written by the system itself, not by external senders. So even though we expand what the LLM sees, we don't expand the attack surface.
  • Information leak — same don't-leak-existence semantics as phase 1; ownership middleware gates userId access; the new context endpoint isn't a thing — context is built server-side per request.
  • Cost / latency — the 2KB cap puts a hard ceiling on token cost per request from this feature. Profile + memory fetches happen in parallel with Promise.all.
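
One way to combine parallel fetches with the partial-context fallback is to attach a `catch` to each promise before handing both to `Promise.all`, since a bare `Promise.all` rejects as soon as either side fails. The helper below is a hypothetical sketch, not the code in `ContextBuilder`; it assumes both callbacks are async so that failures surface as rejections.

```typescript
// Run both lookups concurrently. A failure on one side degrades to an empty
// result for that side and logs a warning, instead of rejecting the request.
async function fetchBoth<T, M>(
  fetchTwin: () => Promise<T>,
  searchMemory: () => Promise<M[]>,
): Promise<{ twin: T | null; memories: M[] }> {
  const [twin, memories] = await Promise.all([
    fetchTwin().catch((err: unknown): T | null => {
      console.warn('[context] twin provider failed', err);
      return null;
    }),
    searchMemory().catch((err: unknown): M[] => {
      console.warn('[context] memory provider failed', err);
      return [];
    }),
  ]);
  return { twin, memories };
}
```

`Promise.allSettled` would be an equivalent alternative; the per-promise `catch` simply keeps the happy-path types free of status wrappers.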

Test plan

  • pnpm build — 20 packages green
  • pnpm test — 40 packages green; 15 new tests (12 ContextBuilder + 4 AssistantService enrichment paths) on top of phase-1's 24
  • pnpm lint — clean
  • Manual: load #/assistant, ask the twin about its preferences (e.g. "do I auto-archive emails?") and verify it answers from the profile, not from generic priors
  • Manual: with no profile data, verify the assistant still works (returns generic ChatGPT-style answers)
  • Manual: pull the worker log during a request and verify the rendered system prompt contains the expected context block

Phase 2+ still in flight

🤖 Generated with Claude Code

…mpt (#147)

Phase 2b of issue #135; closes #147. The assistant now reads the user's
twin profile (preferences + inferences above the moderate confidence
floor) and recent episodic memories before composing each reply. Two
ports keep @skytwin/assistant free of @skytwin/db / @skytwin/mempalace
deps; adapters wire to real backings in apps/api/src/routes/assistant.ts.

ContextBuilder
- Hard cap at MAX_CONTEXT_BYTES = 2000 with UTF-8-clean ellipsis
  truncation. Noisy profiles can't dominate the token budget.
- Confidence floor: only confirmed/high/moderate surface. Speculative
  + low entries stay in the model but don't broadcast.
- Confidence-ranked truncation: highest-confidence wins MAX_PREFERENCES=12
  / MAX_INFERENCES=6 / MAX_MEMORIES=5 slots.
- Booleans render as yes/no for readability.
- Partial-context fallback on either provider failure.
- Empty-on-both-empty so AssistantService can short-circuit unchanged.
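
A UTF-8-clean cut at the byte cap can be sketched as follows. `MAX_CONTEXT_BYTES = 2000` is the value stated above; the `truncateUtf8` helper and its exact behavior (ellipsis counted inside the budget) are assumptions, not the shipped implementation.

```typescript
const MAX_CONTEXT_BYTES = 2000;

// Truncate `text` so its UTF-8 encoding fits in `maxBytes`, ellipsis
// included, without cutting a multi-byte character in half.
function truncateUtf8(text: string, maxBytes: number = MAX_CONTEXT_BYTES): string {
  const encoder = new TextEncoder();
  if (encoder.encode(text).length <= maxBytes) return text;

  const ellipsis = '…';
  const budget = maxBytes - encoder.encode(ellipsis).length; // '…' is 3 bytes
  if (budget < 0) return ''; // cap too small to hold even the ellipsis

  let used = 0;
  let out = '';
  // for...of walks code points, so surrogate pairs stay together.
  for (const ch of text) {
    const size = encoder.encode(ch).length;
    if (used + size > budget) break;
    used += size;
    out += ch;
  }
  return out + ellipsis;
}
```

Counting bytes rather than characters matters here because an emoji occupies four bytes, so a character-based cap could overshoot the budget several times over.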

AssistantService.reply now takes optional enrichment={userId, query}.
Backward-compatible: omitting it OR omitting the ctor builder falls back
to the bare default system prompt — phase 1 callers untouched.

Wiring
- TwinContextProvider: TwinService.getOrCreateProfile + userRepository.findById
  in parallel for preferences/inferences/trust-tier.
- MemoryContextProvider: query → ≥3-char tokens →
  mempalaceRepository.searchEpisodes (same backing as MemoryStack L3).
- Episode outcome JSON collapses to a one-line label (kind/status/result
  field if present, else short stringification).

Tests: 15 new (12 ContextBuilder + 4 AssistantService enrichment paths).
Full suite green across 40 packages.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 5, 2026 18:26

Copilot AI left a comment

Pull request overview

Adds phase 2b assistant enrichment so @skytwin/assistant can prepend twin-profile facts and relevant episodic memories to the system prompt before generating a reply. This fits the assistant pipeline by keeping enrichment logic package-local/testable while wiring real data sources in the API route.

Changes:

  • Added a new ContextBuilder plus exported port/types for twin-profile and memory-context enrichment.
  • Extended AssistantService.reply() with optional enrichment input and prepended context support.
  • Wired the API assistant route to build/query twin + mempalace context, and added unit tests/changelog/version updates.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `VERSION` | Bumps release to 0.6.1.0. |
| `packages/assistant/src/index.ts` | Exports new enrichment types and ContextBuilder. |
| `packages/assistant/src/context-builder.ts` | Implements context rendering, filtering, truncation, and provider fallback. |
| `packages/assistant/src/assistant-service.ts` | Adds optional enrichment flow to assistant reply generation. |
| `packages/assistant/src/__tests__/context-builder.test.ts` | Covers context rendering/truncation/fallback behavior. |
| `packages/assistant/src/__tests__/assistant-service.test.ts` | Covers enrichment prompt-prepending behavior in the service. |
| `CHANGELOG.md` | Documents the new assistant enrichment release. |
| `apps/api/src/routes/assistant.ts` | Wires real twin/memory providers into the assistant API route. |

Comment on lines +257 to +259

    for (const i of infs) {
      const reason = i.reasoning ? ` — ${i.reasoning}` : '';
      lines.push(`- ${i.domain}/${i.key} = ${renderValue(i.value)} (${i.confidence})${reason}`);

Comment on lines +129 to +132

    const terms = query
      .toLowerCase()
      .split(/[^a-z0-9]+/i)
      .filter((t) => t.length >= 3);

    async search(userId, query, limit = 5) {
      const terms = query
        .toLowerCase()
        .split(/[^a-z0-9]+/i)

Comment thread CHANGELOG.md
- `MemoryContextProvider` adapter splits the query into ≥3-char tokens and calls `mempalaceRepository.searchEpisodes` — same backing call that `MemoryStack.search` uses for L3 deep-search. Stop-words and short tokens drop so a query like "the plan for X" doesn't ILIKE-match every episode containing "the".
- Episode `outcome` JSON blobs collapse to a one-line label (`kind` / `status` / `result` field if present, else short stringification) so the rendered context stays compact.

### Tests (15 new)

    let reply;
    try {
    -  reply = await service.reply(history);
    +  reply = await service.reply(history, { userId, query: content });

jayzalowitz merged commit 4b0aed0 into main on May 5, 2026
12 checks passed
Development

Successfully merging this pull request may close these issues.

Assistant phase 2b: twin profile + Memory Palace context in system prompt

2 participants