
GBrain v0.4.0 — production agent documentation + reference architecture #10

Merged
garrytan merged 24 commits into master from garrytan/supabase-db-setup
Apr 9, 2026

Conversation

garrytan (Owner) commented Apr 9, 2026

Summary

GBrain v0.4.0: documentation upgrade to show what a production agent actually looks like when gbrain is its knowledge backbone.

New: GBRAIN_SKILLPACK.md (789 lines)
Reference architecture document based on patterns from a real production deployment with 14,700+ brain files, 40+ skills, 20+ cron jobs. Covers:

  • Entity detection on every message (original thinking capture + entity mentions)
  • Brain-first lookup protocol (search before external APIs)
  • 7-step enrichment pipeline with tiered API spend
  • Compiled truth + timeline page pattern
  • Source attribution with mandatory citations
  • Meeting ingestion with entity propagation
  • Cron schedule with quiet hours and travel-aware timezone
  • YouTube/media ingestion via Diarize.io
  • Integration guides: ClawVisor, Circleback webhooks, Quo/OpenPhone

README rewrite

  • New memex opener (Vannevar Bush framing, production numbers, "ask it anything" examples)
  • Credits Karpathy's Knowledge LLM vision as inspiration
  • OpenClaw positioning up front
  • New sections: Compounding Thesis, Architecture diagram, Production Agent teaser, OpenClaw complement

5 skill updates

  • setup: brain-first lookup protocol (gbrain search → query → get → grep fallback)
  • query: token-budget awareness + source precedence hierarchy
  • ingest: entity detection on every message
  • maintain: heartbeat integration (doctor, embed --stale, sync verification)
  • briefing: gbrain-native context loading (search attendees, sender, deals)

Infrastructure (already shipped in prior commits)

  • Doctor command, pluggable storage backends, parallel import, resume checkpoints
  • RLS on all tables, schema migration runner, bulk chunk INSERT
  • Version bumped 0.3.0 → 0.4.0, CHANGELOG entry, 56 new unit tests

Test Coverage

All 207 unit tests pass. 0 failures. Docs-only changes in this session — no new code paths.

Pre-Landing Review

Eng review CLEAR (2 runs). CEO review CLEAR. DX review CLEAR (score 6/10 → 8/10).

Test plan

  • All unit tests pass (207 pass, 0 fail, 22 test files)
  • Sanitization sweep: 0 hits for banned terms
  • All gbrain CLI commands referenced in docs verified against operations.ts

🤖 Generated with Claude Code

garrytan and others added 24 commits April 9, 2026 05:56
Git is the system of record. Slugs are lowercased repo-relative paths.
The restrictive regex rejected spaces, parens, and special chars, blocking
5,861 Apple Notes files from importing. Now only rejects empty slugs,
path traversal (..), and leading slash.
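A minimal sketch of the relaxed check described above. The function name and exact rules are illustrative assumptions, not the actual gbrain implementation; the point is that only empty slugs, path traversal, and leading slashes are rejected, so spaces, parens, and unicode pass through.

```typescript
// Sketch: relaxed slug validation. Rejects only the three unsafe cases;
// everything else (spaces, parens, unicode) is accepted.
function isValidSlug(slug: string): boolean {
  if (slug.length === 0) return false;               // empty slug
  if (slug.startsWith("/")) return false;            // leading slash
  if (slug.split("/").includes("..")) return false;  // path traversal
  return true;
}
```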

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without RLS, the Supabase anon key gives full read access to the DB.
Enable RLS on all 10 tables with no policies — the postgres role
(used by gbrain via pooler) has BYPASSRLS and is unaffected. Only
enables if the current role actually has BYPASSRLS privilege to
avoid locking ourselves out on non-Supabase setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gress

Raise MAX_FILE_SIZE from 1MB to 5MB for Apple Notes with attachments.
Track error patterns and suppress after 5 identical errors to prevent
5,861 identical warnings from killing the agent process. Replace \r
progress bar with structured log lines (rate, ETA) for agent parsing.
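The suppression logic can be sketched as a per-message counter; the threshold of 5 comes from the commit above, while the keying scheme (exact message string) is an assumption.

```typescript
// Sketch: suppress repeated identical errors after a threshold so thousands
// of identical warnings cannot flood the agent's context.
const SUPPRESS_AFTER = 5;
const errorCounts = new Map<string, number>();

// Returns true if this occurrence should still be logged.
function shouldLog(message: string): boolean {
  const n = (errorCounts.get(message) ?? 0) + 1;
  errorCounts.set(message, n);
  if (n === SUPPRESS_AFTER) {
    console.warn(`(suppressing further "${message}" errors)`);
  }
  return n <= SUPPRESS_AFTER;
}
```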

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detect db.*.supabase.co direct URLs and warn about IPv6 failure.
On ECONNREFUSED/ETIMEDOUT to Supabase, suggest the Session pooler
connection string with exact dashboard click path. Check for pgvector
extension after connecting and fail with clear instructions if missing.
Update wizard hints to show pooler URL format.
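The direct-URL detection can be as simple as a hostname check; this regex is an illustrative assumption, not gbrain's exact code.

```typescript
// Sketch: detect a direct db.<ref>.supabase.co hostname (IPv6-only DNS),
// which should trigger the "use the Session pooler instead" hint.
function isDirectSupabaseHost(connStr: string): boolean {
  return /\bdb\.[a-z0-9]+\.supabase\.co\b/.test(connStr);
}
```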

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
E2E tests against real Postgres+pgvector must pass before /ship or
/review. Adds the requirement to CLAUDE.md so all agents enforce it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refactor PostgresEngine to support instance-level DB connections instead
of only the module-global singleton. Each worker gets its own connection
with poolSize:2 (vs 10 for the main engine), so 8 workers = 16 connections.

Add --workers N flag to gbrain import. Workers pull from a shared queue
and use independent engine instances — no transaction context corruption.

The bottleneck is network round-trips to Supabase (one per page upsert).
Parallel workers cut import time proportionally.
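The shared-queue pattern can be sketched as below. This uses an index counter to hand out work (the approach a follow-up fix in this PR's history settled on, rather than `array.shift()`); the function shape is illustrative, not the actual import code.

```typescript
// Sketch: N workers drain a shared item list. In this model each worker
// would hold its own engine instance; here `process` stands in for the
// per-file upsert. The index counter is incremented synchronously between
// awaits, so items are claimed exactly once.
async function runWorkers<T>(
  items: T[],
  workerCount: number,
  process: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const worker = async () => {
    while (true) {
      const i = next++; // claim next item
      if (i >= items.length) return;
      await process(items[i]);
    }
  };
  await Promise.all(Array.from({ length: workerCount }, worker));
}
```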

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Migrations are embedded as string constants in migrate.ts (survives
Bun --compile). Each migration runs in a transaction for clean rollback
on failure. Runs automatically on initSchema() — no manual step needed
when a user updates the gbrain binary against an older DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add StorageBackend interface with three implementations:
- S3Storage: works with AWS S3, Cloudflare R2, MinIO (any S3-compatible)
- SupabaseStorage: uses Supabase Storage REST API with service role key
- LocalStorage: filesystem-based, for testing

Add file-resolver.ts with fallback chain: local file → .redirect
breadcrumb → .supabase marker → storage backend. Supports the
three-stage migration (mirror → redirect → clean).
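The fallback chain can be sketched as below. This collapses the `.supabase` marker and storage-backend stages into one for brevity, and the interfaces are assumptions rather than the actual `StorageBackend` API.

```typescript
// Sketch: resolve a file via local copy → .redirect breadcrumb → storage
// backend. The breadcrumb file is assumed to contain the remote key.
interface Backend {
  get(key: string): Promise<Uint8Array | null>;
}

async function resolveFile(
  path: string,
  localRead: (p: string) => Promise<Uint8Array | null>,
  backend: Backend,
): Promise<Uint8Array | null> {
  const local = await localRead(path);
  if (local) return local;                              // stage 1: local file
  const redirect = await localRead(path + ".redirect");
  if (redirect) {                                       // stage 2: breadcrumb
    const key = new TextDecoder().decode(redirect).trim();
    return backend.get(key);
  }
  return backend.get(path);                             // stage 3: backend
}
```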

Add yaml-lite.ts for parsing marker and breadcrumb files without
adding a YAML dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Checks: connection, pgvector extension, RLS on all tables, schema
version, embedding coverage. Outputs structured JSON with --json flag
for agent parsing. Exit code 0 if healthy, 1 if issues found.

Agents should run gbrain doctor --json when any command fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Setup skill: add Why Supabase, step-by-step project creation, explicit
agent instructions (nohup for large imports, doctor on failure, don't
ask for anon key), available init flags, file migration offer after
first import. Remove ClawHub references.

README: simplify to single OpenClaw install path, remove ClawHub, fix
squatted npm name to github:garrytan/gbrain, add Supabase settings
note about Session pooler.

Add Apple Notes test fixtures with spaces and parens in filenames for
E2E testing of the slug fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…n skill

Maintenance skill now checks RLS status and schema version as part of
periodic health checks. Adds nohup pattern for large embedding refreshes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Import resume: saves checkpoint every 100 files to ~/.gbrain/import-checkpoint.json.
On restart with same directory and file count, skips already-processed files.
Use --fresh to ignore checkpoint and start over. Cleared on successful completion.

Supabase admin: extractProjectRef() parses any Supabase URL format (dashboard,
direct, pooler, project URL) to extract the project ref. discoverPoolerUrl()
uses the Management API to find the correct pooler connection string (including
the exact region prefix). checkRls() verifies RLS status via the API.
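The multi-format ref extraction can be sketched with ordered pattern matching; these regexes are assumptions inferred from the URL shapes named above, not the actual `extractProjectRef()` code.

```typescript
// Sketch: pull the Supabase project ref out of dashboard, pooler, direct-DB,
// and plain project URLs by trying the most specific pattern first.
function extractProjectRef(url: string): string | null {
  const m =
    url.match(/supabase\.com\/dashboard\/project\/([a-z0-9]+)/) ?? // dashboard
    url.match(/postgres\.([a-z0-9]+)[:@]/) ??                      // pooler user
    url.match(/db\.([a-z0-9]+)\.supabase\.co/) ??                  // direct DB
    url.match(/([a-z0-9]+)\.supabase\.co/);                        // project URL
  return m ? m[1] : null;
}
```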

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 new test files covering every feature added in this branch:
- slug-validation.test.ts: spaces, parens, unicode, path traversal (10 tests)
- yaml-lite.test.ts: parse + stringify, marker/redirect formats (9 tests)
- supabase-admin.test.ts: extractProjectRef for 4 URL formats (7 tests)
- migrate.test.ts: version export, runMigrations callable (2 tests)
- storage.test.ts: LocalStorage CRUD + createStorage factory (14 tests)
- file-resolver.test.ts: fallback chain, redirect, marker parsing (6 tests)
- import-resume.test.ts: checkpoint save/load/resume/fresh (6 tests)
- doctor.test.ts: module export, CLI registration (3 tests)

Total: 184 pass, 0 fail (up from 128).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bulk INSERT: upsertChunks now builds a multi-row VALUES query instead
of inserting chunks one-by-one. Reduces DB round-trips by ~50x per page.
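The multi-row query construction looks roughly like this; column names and Postgres-style `$n` placeholders are assumptions, not the actual `upsertChunks` code.

```typescript
// Sketch: build one parameterized multi-row INSERT for all chunks of a page,
// so a page costs one round-trip instead of one per chunk.
function buildChunkInsert(
  pageId: number,
  chunks: string[],
): { sql: string; params: unknown[] } {
  const params: unknown[] = [];
  const rows = chunks.map((text, i) => {
    params.push(pageId, i, text);
    const base = i * 3; // 3 placeholders per row
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  });
  return {
    sql: `INSERT INTO content_chunks (page_id, chunk_index, content) VALUES ${rows.join(", ")}`,
    params,
  };
}
```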

E2E tests added to mechanical.test.ts:
- Slug with special chars: import Apple Notes fixtures with spaces/parens,
  verify search finds them, verify idempotency
- RLS verification: check pg_tables.rowsecurity on all tables, verify
  current user has BYPASSRLS
- Doctor command: verify exit 0 on healthy DB, --json produces valid JSON
  with check structure
- Parallel import: --workers 2 produces same page count as sequential

Unit tests added:
- setup-branching.test.ts: IPv6 detection, defaultWorkers auto-tuning,
  smart URL parsing across all Supabase URL formats

Fixtures added:
- large/big-file.md (2.1MB) for testing raised file size limit
- apple-notes/ fixtures already existed

Total: 200 pass, 0 fail (up from 184).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
--json flag: init and import now support --json for structured output.
Agents get parseable JSON instead of human-readable text.

File migration CLI: implement mirror, unmirror, redirect, restore,
clean, and status subcommands for the three-stage file migration
lifecycle (local → mirrored → redirected → cloud-only).

File migration tests: full lifecycle test covering every transition
in the state machine (LOCAL → MIRROR → UNMIRROR → REDIRECT → RESTORE
→ CLEAN), including edge cases and file resolver at each stage.

Bulk chunk INSERT: upsertChunks now builds multi-row parameterized
VALUES query, reducing round-trips per page from ~50 to 1.

Total: 207 pass, 0 fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the weak single-comparison parallel import test with 7 tests:
- Sequential baseline: capture page count, chunk count, and all slugs
- --workers 2: verify page count matches sequential
- Chunk count matches (no duplicates from concurrent writes)
- Page slugs match exactly
- No duplicate pages (SQL GROUP BY HAVING count > 1)
- No duplicate chunks (SQL GROUP BY page_id, chunk_index)
- --workers 4: also works correctly
- Re-import with workers is idempotent

These tests catch the exact bug Codex found (db.ts singleton causing
concurrent transaction corruption) by verifying data integrity after
parallel writes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deferred during eng review (per-worker embedding is good enough for now).
Revisit after profiling real imports to confirm embedding is the bottleneck.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix fixture count assertions: 13 → 16 pages (added apple-notes + large file),
companies 2 → 3 (ohmygreen), concepts 3 → 5 (notes, big-file).

Fix --workers arg parsing: the worker count value (e.g. "2") was being
picked up as the directory arg. Skip flag values when finding the dir.

Fix doctor exit code: warnings (like missing embeddings) should exit 0,
only actual failures exit 1. E2E tests import with --no-embed, so
embeddings are always WARN.

Fix E2E CLI tests: add initCli() before doctor and parallel import
tests so ~/.gbrain/config.json exists for the subprocess.

All E2E tests pass: 63 pass, 0 fail.
All unit tests pass: 207 pass, 0 fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New CHANGELOG entry for all post-0.3.0 features (doctor, storage backends,
parallel import, resume checkpoints, RLS, schema migrations, --json output).
Version bumped 0.3.0 → 0.4.0 across all manifests.

CLAUDE.md: test count 9→19, skill count 8→7, added key files.
CONTRIBUTING.md: fixture count 13→16, added missing source files.
README.md: added gbrain doctor to commands, fixed stale welcome PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Production agent patterns from a real deployment with 14,700+ brain files.
Covers: entity detection on every message, brain-first lookup protocol,
7-step enrichment pipeline with tiered API spend, compiled truth + timeline,
source attribution with mandatory citations, meeting ingestion with entity
propagation, cron schedule with quiet hours and travel-aware timezone,
YouTube/media ingestion via Diarize.io, integration guides for ClawVisor,
Circleback webhooks, and Quo/OpenPhone SMS. Opens with the Vannevar Bush
memex framing and the originals folder for capturing intellectual capital.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace code-first opener with mimetic-desire pitch: Vannevar Bush memex
tagline, production brain numbers (10K+ files, 3K+ people, 13 years of
calendar), "ask it anything" examples, compounding thesis.

New sections: The Compounding Thesis (read-write loop), Architecture
(three-column diagram), What a Production Agent Looks Like (SKILLPACK
reference), How gbrain fits with OpenClaw (three-layer complement).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setup: Phase D rewritten with brain-first lookup protocol (gbrain search
→ query → get → grep fallback), sync-after-write rule, memory_search
complement table.

query: token-budget awareness (chunks not full pages), source precedence
hierarchy (user > compiled truth > timeline > external).

ingest: entity detection on every message (scan, check brain, create or
enrich, commit and sync).

maintain: heartbeat integration (doctor, embed --stale, sync verification,
stale compiled truth detection).

briefing: gbrain-native context loading (search attendees before meetings,
search sender before email, daily deal/meeting/commitment queries).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make it clear up top that GBrain is built for OpenClaw agents and
works with any OpenClaw deployment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GBrain started as Karpathy's Knowledge LLM wiki idea, built for real. It worked great until the brain hit thousands of files and grep fell apart. GBrain is the search layer that had to exist once the brain outgrew grep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@garrytan garrytan merged commit 912a321 into master Apr 9, 2026
3 checks passed
garrytan added a commit that referenced this pull request Apr 10, 2026
- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 11, 2026
* fix: 7 bug fixes from Issue #9 and #22

- fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25)
- fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11)
- fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8)
- fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12)
- fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2)
- fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1)
- fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4)
- schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support
- schema: add access_tokens and mcp_request_log tables via migration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed schema.sql at build time, remove fs dependency from initSchema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 5 more bug fixes from Issue #22

- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: remote MCP server via Supabase Edge Functions

Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase
instance. One brain, accessible from Claude Desktop, Claude Code, Cowork,
Perplexity Computer, and any MCP client. Zero new infrastructure.

New files:
- supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK
- supabase/functions/gbrain-mcp/deno.json — Deno import map
- src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules)
- src/commands/auth.ts — standalone token management (create/list/revoke/test)
- scripts/deploy-remote.sh — one-script deployment
- .env.production.example — 3-value config template

Changes:
- config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope)
- schema.sql: add access_tokens + mcp_request_log tables
- package.json: add build:edge script

Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable)
Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP)
Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks)
Excluded from remote: sync_brain, file_upload (may exceed 60s timeout)

Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: per-client MCP setup guides

- docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table
- docs/mcp/CLAUDE_CODE.md — claude mcp add command
- docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!)
- docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths
- docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup
- docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO)
- docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.0)

GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Remote MCP Server section to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: make document-release mandatory in CLAUDE.md, add MCP key files

Post-ship requirements section: document-release is NOT optional. Lists every
file that must be checked on every ship. A ship without updated docs is incomplete.

Also adds remote MCP server files to Key files section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: batch upsertChunks into single statement to prevent deadlocks

The per-chunk UPSERT loop caused deadlocks under parallel workers because
each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple
workers upserting different pages could deadlock on the shared unique index.

Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement.
One round-trip, one lock acquisition. COALESCE preserves existing embeddings
when the new value is NULL.
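The shape of that single statement is roughly as follows; table and column names follow the schema mentioned elsewhere in this PR, but treat the SQL as a sketch rather than the shipped query.

```typescript
// Sketch: one batched INSERT ... ON CONFLICT for all chunks of a page.
// COALESCE keeps the stored embedding when the incoming value is NULL,
// so a --no-embed re-import does not wipe existing vectors.
const batchedUpsertSql = `
  INSERT INTO content_chunks (page_id, chunk_index, content, embedding)
  VALUES ($1, $2, $3, $4), ($5, $6, $7, $8)  -- one value group per chunk
  ON CONFLICT (page_id, chunk_index) DO UPDATE SET
    content   = EXCLUDED.content,
    embedding = COALESCE(EXCLUDED.embedding, content_chunks.embedding)
`;
```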

Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: advisory lock in initSchema() prevents deadlock on concurrent DDL

When multiple processes call initSchema() concurrently (e.g., test setup +
CLI subprocess, or parallel workers during E2E tests), the schema SQL's
DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on
different tables, causing deadlocks.

Fix: pg_advisory_lock(42) serializes all initSchema() calls within the
same database. The lock is session-scoped and released in a finally block.
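The serialization pattern is a simple lock/finally wrapper; `query` here stands in for the real Postgres client, and the wrapper name is illustrative.

```typescript
// Sketch: serialize schema DDL across processes with a Postgres advisory
// lock (key 42, per the commit above). The unlock runs in finally so the
// lock is released even when the schema SQL throws.
async function withSchemaLock(
  query: (sql: string) => Promise<unknown>,
  body: () => Promise<void>,
): Promise<void> {
  await query("SELECT pg_advisory_lock(42)");
  try {
    await body();
  } finally {
    await query("SELECT pg_advisory_unlock(42)");
  }
}
```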

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add explicit test timeouts for CLI subprocess E2E tests

CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import)
spawn `bun run src/cli.ts` which takes several seconds to JIT compile +
connect. The Bun test framework default 5000ms per-test timeout is too
tight for CI. Added 30-60s timeouts matching each subprocess's own
timeout to prevent false failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: infinite recursion in config.ts exported getConfigDir/getConfigPath

The replace_all refactor created recursive functions: the exported
getConfigDir() called the private getConfigDir() which called itself.
Renamed exports to configDir()/configPath() to avoid shadowing.

Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls
work against a real Postgres database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 15, 2026
- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 15, 2026
* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

skills-conformance.test.ts validates every skill has YAML frontmatter with
required fields, Contract, Anti-Patterns, and Output Format sections, and
manifest.json coverage. resolver.test.ts validates routing table categories,
skill path existence, and manifest-to-resolver coverage. 50 new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 9 brain skills from Wintermute (Phase 2)

Generalized from Wintermute's battle-tested skills:
- signal-detector: always-on idea+entity capture on every message
- brain-ops: brain-first lookup, read-enrich-write loop, source attribution
- idea-ingest: links/articles/tweets with author people page mandatory
- media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book)
- meeting-ingestion: transcripts with attendee enrichment chaining
- citation-fixer: audit and fix citation formatting
- repo-architecture: filing rules by primary subject
- skill-creator: create skills with conformance standard + MECE check
- daily-task-manager: task lifecycle with priority levels

All Garry-specific references generalized. Core workflows preserved.
Updated RESOLVER.md and manifest.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add operational infrastructure + identity layer (Phase 3)

Operational skills:
- daily-task-prep: morning prep with calendar context and open threads
- cross-modal-review: quality gate via second model with refusal routing
- cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency
- reports: timestamped reports with keyword routing
- testing: skill validation framework (conformance checks)
- soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md
- webhook-transforms: external events to brain signals with dead-letter queue

Identity layer:
- SOUL.md template (agent identity, generated by soul-audit)
- USER.md template (user profile, generated by soul-audit)
- ACCESS_POLICY.md template (4-tier access control)
- HEARTBEAT.md template (operational cadence)
- cross-modal.yaml convention (review pairs, refusal routing chain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates

GBrain is now a GStack mod for agent platforms. Updated architecture description,
key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills
section (24 skills organized by resolver categories), and testing section (new
conformance and resolver tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GStack detection + mod status to gbrain init (Phase 4)

After brain initialization, gbrain init now reports:
- Number of skills loaded (from manifest.json)
- GStack detection (checks known host paths, uses gstack-global-discover if available)
- GStack install instructions if not found
- Resolver and soul-audit pointers

Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md
deployment, and detectGStack() using gstack-global-discover with fallback to known paths
(DRY: doesn't reimplement GStack's host detection logic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: v0.10.0 release documentation

- CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control,
  conventions, conformance standard, GStack detection in init
- README: updated skill section with 24 skills, resolver, conventions
- TODOS: added runtime MCP access control (P1)
- VERSION: 0.9.2 → 0.10.0
- package.json + manifest.json version bumped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to CHANGELOG v0.10.0

16-row table detailing every new skill, what it does, and why it matters.
Written to sell the upgrade, not document the implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore package.json version after merge conflict resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 → 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: expand doctor with resolver validation, filesystem-first architecture

Doctor now runs filesystem checks (resolver health, skill conformance) before
connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only
when DB is unavailable. Adds schema_version: 2 to JSON output, composite health
score (0-100), and structured issues array with action strings for agent parsing.

Resolver health check calls checkResolvable() and surfaces actionable fix
instructions. Link integrity check uses engine.getHealth() dead_links count.

CLI routing split: doctor dispatched before connectEngine() so filesystem
checks always run. Fixes Codex-identified blocker where doctor required DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add adaptive load-aware throttling and fail-improve loop

backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem),
exponential backoff with 20-attempt max guard, active hours multiplier (2x
slower during waking hours), concurrent process limit (max 2). Windows-safe:
defaults to "proceed" when os.loadavg returns zeros.
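The delay policy can be sketched as below. The 20-attempt guard and 2x active-hours multiplier come from the commit above; the base delay and cap are illustrative numbers, not backoff.ts's actual values.

```typescript
// Sketch: exponential backoff delay with a hard attempt guard and a 2x
// slowdown during active (waking) hours.
function backoffDelayMs(attempt: number, activeHours: boolean): number {
  const MAX_ATTEMPTS = 20;
  if (attempt >= MAX_ATTEMPTS) throw new Error("max backoff attempts exceeded");
  const base = 1000 * 2 ** attempt;          // 1s, 2s, 4s, ...
  const capped = Math.min(base, 5 * 60_000); // cap at 5 minutes (assumption)
  return activeHours ? capped * 2 : capped;
}
```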

fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure
logging. Cascade failure handling: when both paths fail, throws LLM error and
logs both. Log rotation at 1000 entries. Call count tracking for deterministic
hit rate metrics. Auto-generates test cases from successful LLM fallbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcription service and enrichment-as-a-service

transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB
segmented via ffmpeg. Provider auto-detection from env vars. Clear error
messages for missing API keys and unsupported formats.

enrichment-service.ts: Global enrichment service callable from any ingest
pathway. Entity slug generation (people/jane-doe, companies/acme-corp),
mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based
on mention frequency and source diversity), batch enrichment with backoff
throttling, regex-based entity extraction from text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
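Entity slug generation in the people/jane-doe, companies/acme-corp shape described above, sketched with assumed normalization rules:

```typescript
// Lowercase, collapse non-alphanumeric runs to "-", trim stray dashes.
function entitySlug(kind: "people" | "companies", name: string): string {
  const slug = name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${kind}/${slug}`;
}
```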

* feat: add data-research skill with recipe system, extraction, dedup, tracker

New skill: data-research — one parameterized pipeline for any email-to-
structured-data workflow (investor updates, donations, company metrics).
7-phase pipeline: define recipe, search, classify, extract (with extraction
integrity rule), archive, deduplicate, update tracker.

data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex
extraction (battle-tested patterns), dedup with configurable tolerance,
markdown tracker parsing/appending, quarterly/monthly date windowing,
6-phase HTML email stripping with 500KB ReDoS cap.

Registers data-research in manifest.json (25th skill) and RESOLVER.md.
Fixes backoff test robustness for high-load systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
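An MRR extractor in the spirit of the above; this regex is a simplified stand-in, not the battle-tested pattern shipped in data-research.ts:

```typescript
// Matches "MRR: $12.5k", "MRR $1,200", etc.; scales k/m suffixes.
const MRR_RE = /\bMRR[:\s]*\$?\s*([\d,]+(?:\.\d+)?)\s*([kKmM])?\b/;

function extractMrr(text: string): number | null {
  const m = MRR_RE.exec(text);
  if (!m) return null;
  let value = parseFloat(m[1].replace(/,/g, ""));
  const unit = m[2]?.toLowerCase();
  if (unit === "k") value *= 1_000;
  if (unit === "m") value *= 1_000_000;
  return value;
}
```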

* docs: update project documentation for v0.10.0 infrastructure additions

CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve,
transcription, enrichment-service, data-research), 6 new test files, updated
skill count to 25, test file count to 34.

README.md: updated skill count to 25, added data-research to skills table.

CHANGELOG.md: added Infrastructure section documenting resolver validation,
doctor expansion, adaptive throttling, fail-improve loop, voice transcription,
enrichment service, and data-research skill.

TODOS.md: anonymized personal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: doctor.ts use ES module imports, harden backoff test

Replace require('fs') with ES module import in doctor.ts for consistency
with the rest of the file. Backoff test made resilient to parallel test
execution leaking module-level state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync --watch routing, dead_links parity, doctor command, embed --slugs

- Move sync to CLI_ONLY so --watch flag reaches runSync() (was routed through
  operation layer which only calls performSync single-pass)
- Hide sync_brain from CLI help (MCP still exposes it)
- Fix performFullSync missing sync state persistence (C1)
- Align Postgres dead_links query to match PGLite (count dangling links, not
  empty-content chunks) (C3)
- Fix doctor recommending nonexistent 'gbrain embed refresh' (C4)
- Refactor doctor outputResults to not call process.exit directly
- Add --slugs flag to embed for targeted page embedding
- Add sync auto-extract + auto-embed after performSync
- Add noExtract to SyncOpts
- Route extract, features, autopilot in CLI_ONLY
- Update help text with new commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract, features, and autopilot commands

- gbrain extract <links|timeline|all> — batch extraction of links and timeline
  entries from brain markdown files. Broad regex for all .md links (C7: filters
  external URLs). Frontmatter field parsing (company, investors, attendees).
  Directory-based link type inference. JSONL progress on stderr for agents.
  Sync integration hooks (extractLinksForSlugs, extractTimelineForSlugs).

- gbrain features [--json] [--auto-fix] — scan brain usage, pitch unused features
  with the user's own numbers. Priority 1 (data quality): missing embeddings,
  dead links. Priority 2 (unused features): zero links, zero timeline, low
  coverage, unconfigured integrations, no sync. Embedded recipe metadata for
  binary-safe integration detection. Persistence in ~/.gbrain/feature-offers.json.
  Doctor teaser hook. Upgrade hook.

- gbrain autopilot [--repo] [--interval N] — self-maintaining brain daemon.
  Pipeline: sync → extract → embed. Health-based adaptive scheduling
  (brain_score >= 90 doubles interval, < 70 halves it). --install/--uninstall
  for launchd (macOS) and crontab (Linux). Signal handling. Consecutive error
  tracking (stops at 5). Log to ~/.gbrain/autopilot.log.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
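The autopilot's health-based adaptive scheduling reduces to a sketch like this (the floor is an assumption; the 90/70 thresholds are from the commit above):

```typescript
// brain_score >= 90 doubles the interval; < 70 halves it.
function nextIntervalMs(currentMs: number, brainScore: number): number {
  if (brainScore >= 90) return currentMs * 2;                  // healthy: back off
  if (brainScore < 70) return Math.max(currentMs / 2, 60_000); // unhealthy: speed up
  return currentMs;
}
```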

* feat: hook features scan into post-upgrade flow

After gbrain post-upgrade completes, automatically run gbrain features to show
the user what's new and what to fix. Best-effort (doesn't fail the upgrade).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: brain_score (0-100) in BrainHealth

Weighted composite score computed in getHealth() for both Postgres and PGLite:
  embed_coverage: 0.35, link_density: 0.25, timeline_coverage: 0.15,
  no_orphans: 0.15, no_dead_links: 0.10

Returns 0 for empty brains. Agents use brain_score as a health gate.
Autopilot uses it for adaptive scheduling (>=90 slows down, <70 speeds up).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
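The weighted composite, sketched (metric names and weights are from the commit above; the 0..1 normalization of each input is an assumption):

```typescript
interface HealthRatios {
  embed_coverage: number; // each ratio in 0..1
  link_density: number;
  timeline_coverage: number;
  no_orphans: number;
  no_dead_links: number;
}

function brainScore(r: HealthRatios, totalPages: number): number {
  if (totalPages === 0) return 0; // empty brains score 0
  const weighted =
    0.35 * r.embed_coverage +
    0.25 * r.link_density +
    0.15 * r.timeline_coverage +
    0.15 * r.no_orphans +
    0.10 * r.no_dead_links;
  return Math.round(weighted * 100); // 0-100
}
```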

* test: extract and features unit tests

25 tests covering:
- extractMarkdownLinks: relative links, external URL filtering, edge cases
- extractLinksFromFile: slug resolution, frontmatter parsing, directory-based
  type inference (works_at, deal_for, invested_in)
- extractTimelineFromContent: bullet format, header format with detail,
  em/en dash handling, empty content
- features: module exports, brain_score calculation weights, CLI routing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
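A minimal version of the link-extraction behavior those tests describe: match markdown links, keep relative .md targets, filter external URLs (simplified relative to the real extract.ts):

```typescript
function extractMarkdownLinks(md: string): string[] {
  const out: string[] = [];
  for (const m of md.matchAll(/\[[^\]]*\]\(([^)]+)\)/g)) {
    const target = m[1];
    if (/^[a-z][a-z0-9+.-]*:\/\//i.test(target)) continue; // external URL filter
    if (!target.endsWith(".md")) continue;                 // broad .md-only rule
    out.push(target);
  }
  return out;
}
```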

* docs: instruction layer for extract, features, autopilot

Agent-facing tools are invisible without instruction-layer coverage.
- RESOLVER.md: add routing for extract, features, autopilot
- maintain/SKILL.md: add link graph extraction, timeline extraction,
  autopilot check sections

Without these, agents reading skills/ will never discover or run the
new commands. This is the #1 DX finding from the devex review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.10.1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sync CLAUDE.md with v0.10.1 additions

Add extract.ts, features.ts, autopilot.ts to key files.
Add extract.test.ts, features.test.ts to test list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: adversarial review fixes — 7 issues

- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
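A lock-file guard like the one #14 describes might look like this sketch (the shipped autopilot implementation may differ). The "wx" flag makes creation atomic: open fails if the file already exists, so only one instance proceeds.

```typescript
import { closeSync, openSync, unlinkSync } from "node:fs";

function acquireLock(lockPath: string): boolean {
  try {
    closeSync(openSync(lockPath, "wx")); // atomic create-or-fail
    return true;
  } catch {
    return false; // another instance holds the lock
  }
}

function releaseLock(lockPath: string): void {
  try { unlinkSync(lockPath); } catch { /* already gone */ }
}
```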

* security: fix command injection + plaintext API key in daemon install

- #1: Crontab install used echo pipe with shell-interpolated values.
  Now uses a temp file via crontab(1) and single-quote escaping on all
  interpolated paths. No shell expansion possible.

- #2: OPENAI_API_KEY was baked as plaintext into the launchd plist
  (readable by any local process, backed up by Time Machine). Now uses
  a wrapper script (~/.gbrain/autopilot-run.sh) that sources ~/.zshrc
  at runtime. No secrets in plist or crontab.

- #16: extract.ts used a custom 20-line YAML parser that only handled
  single-line key:value pairs. Multi-line arrays (attendees list with
  - items) were silently ignored. Now uses the project's gray-matter
  parser via parseMarkdown() from src/core/markdown.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
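The single-quote escaping in #1 follows the standard POSIX trick, sketched here (function name is illustrative): single quotes disable all shell expansion, and an embedded quote becomes `'\''` (close, escaped quote, reopen), so an interpolated path can never break out into the shell.

```typescript
function shellQuote(value: string): string {
  // Wrap in single quotes; escape embedded quotes as '\''
  return `'${value.replace(/'/g, `'\\''`)}'`;
}
```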

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TFITZ57 added a commit to TFITZ57/gbrain that referenced this pull request Apr 23, 2026
* feat: GStackBrain — 16 new skills, resolver, conventions, identity layer (v0.10.0) (#120)

* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

skills-conformance.test.ts validates every skill has YAML frontmatter with
required fields, Contract, Anti-Patterns, and Output Format sections, and
manifest.json coverage. resolver.test.ts validates routing table categories,
skill path existence, and manifest-to-resolver coverage. 50 new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 9 brain skills from Wintermute (Phase 2)

Generalized from Wintermute's battle-tested skills:
- signal-detector: always-on idea+entity capture on every message
- brain-ops: brain-first lookup, read-enrich-write loop, source attribution
- idea-ingest: links/articles/tweets with author people page mandatory
- media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book)
- meeting-ingestion: transcripts with attendee enrichment chaining
- citation-fixer: audit and fix citation formatting
- repo-architecture: filing rules by primary subject
- skill-creator: create skills with conformance standard + MECE check
- daily-task-manager: task lifecycle with priority levels

All Garry-specific references generalized. Core workflows preserved.
Updated RESOLVER.md and manifest.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add operational infrastructure + identity layer (Phase 3)

Operational skills:
- daily-task-prep: morning prep with calendar context and open threads
- cross-modal-review: quality gate via second model with refusal routing
- cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency
- reports: timestamped reports with keyword routing
- testing: skill validation framework (conformance checks)
- soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md
- webhook-transforms: external events to brain signals with dead-letter queue

Identity layer:
- SOUL.md template (agent identity, generated by soul-audit)
- USER.md template (user profile, generated by soul-audit)
- ACCESS_POLICY.md template (4-tier access control)
- HEARTBEAT.md template (operational cadence)
- cross-modal.yaml convention (review pairs, refusal routing chain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates

GBrain is now a GStack mod for agent platforms. Updated architecture description,
key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills
section (24 skills organized by resolver categories), and testing section (new
conformance and resolver tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GStack detection + mod status to gbrain init (Phase 4)

After brain initialization, gbrain init now reports:
- Number of skills loaded (from manifest.json)
- GStack detection (checks known host paths, uses gstack-global-discover if available)
- GStack install instructions if not found
- Resolver and soul-audit pointers

Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md
deployment, and detectGStack() using gstack-global-discover with fallback to known paths
(DRY: doesn't reimplement GStack's host detection logic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: v0.10.0 release documentation

- CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control,
  conventions, conformance standard, GStack detection in init
- README: updated skill section with 24 skills, resolver, conventions
- TODOS: added runtime MCP access control (P1)
- VERSION: 0.9.2 → 0.10.0
- package.json + manifest.json version bumped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to CHANGELOG v0.10.0

16-row table detailing every new skill, what it does, and why it matters.
Written to sell the upgrade, not document the implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore package.json version after merge conflict resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 -> 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
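The structured issues checkResolvable() returns could be shaped roughly like this (field names beyond the action string and fix object are illustrative assumptions):

```typescript
interface ResolvableIssue {
  kind: "unreachable" | "mece_overlap" | "trigger_gap" | "dry_violation";
  skill: string;
  action: string;                         // human-readable fix instruction
  fix?: { file: string; change: string }; // machine-parseable fix object
}

// Example consumer: flatten issues into one line each for agent output.
function summarize(issues: ResolvableIssue[]): string {
  return issues.map((i) => `${i.kind}: ${i.skill} -> ${i.action}`).join("\n");
}
```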

* docs: README rewrite with production brain stats, sample output, new infrastructure

Lead with the flex: 17,888 pages, 4,383 people, 723 companies, 526 meeting
transcripts built in 12 days. Show sample query output so readers see what
they'll get. Document self-improving infrastructure (tier auto-escalation,
fail-improve loop, doctor trajectory). Add data-research recipes to Getting
Data In. Update commands section with doctor --fix, transcribe, research
init/list. Fix stale "24" references to "25".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with YC President origin and production agent deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with skill philosophy and link to Thin Harness Fat Skills

Skills section now explains: skill files are code, they encode entire
workflows, they call deterministic TypeScript for the parts that shouldn't
be LLM judgment. Links to the tweet and the architecture essay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: link GStack repo, add 70K stars and 30K daily users

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove meeting transcript count from README (sensitive)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename political-donations recipe to expense-tracker (sensitivity)

Renamed the built-in data-research recipe from political-donations to
expense-tracker across README, CHANGELOG, SKILL.md, and reports routing.
Same extraction patterns (amounts, dates, recipients), neutral framing.
Also renamed social-radar keyword route to social-mentions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync pipeline, extract, features, autopilot (v0.10.1) (#129)

* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

skills-conformance.test.ts validates every skill has YAML frontmatter with
required fields, Contract, Anti-Patterns, and Output Format sections, and
manifest.json coverage. resolver.test.ts validates routing table categories,
skill path existence, and manifest-to-resolver coverage. 50 new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 9 brain skills from Wintermute (Phase 2)

Generalized from Wintermute's battle-tested skills:
- signal-detector: always-on idea+entity capture on every message
- brain-ops: brain-first lookup, read-enrich-write loop, source attribution
- idea-ingest: links/articles/tweets with author people page mandatory
- media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book)
- meeting-ingestion: transcripts with attendee enrichment chaining
- citation-fixer: audit and fix citation formatting
- repo-architecture: filing rules by primary subject
- skill-creator: create skills with conformance standard + MECE check
- daily-task-manager: task lifecycle with priority levels

All Garry-specific references generalized. Core workflows preserved.
Updated RESOLVER.md and manifest.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add operational infrastructure + identity layer (Phase 3)

Operational skills:
- daily-task-prep: morning prep with calendar context and open threads
- cross-modal-review: quality gate via second model with refusal routing
- cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency
- reports: timestamped reports with keyword routing
- testing: skill validation framework (conformance checks)
- soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md
- webhook-transforms: external events to brain signals with dead-letter queue

Identity layer:
- SOUL.md template (agent identity, generated by soul-audit)
- USER.md template (user profile, generated by soul-audit)
- ACCESS_POLICY.md template (4-tier access control)
- HEARTBEAT.md template (operational cadence)
- cross-modal.yaml convention (review pairs, refusal routing chain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates

GBrain is now a GStack mod for agent platforms. Updated architecture description,
key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills
section (24 skills organized by resolver categories), and testing section (new
conformance and resolver tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GStack detection + mod status to gbrain init (Phase 4)

After brain initialization, gbrain init now reports:
- Number of skills loaded (from manifest.json)
- GStack detection (checks known host paths, uses gstack-global-discover if available)
- GStack install instructions if not found
- Resolver and soul-audit pointers

Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md
deployment, and detectGStack() using gstack-global-discover with fallback to known paths
(DRY: doesn't reimplement GStack's host detection logic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: v0.10.0 release documentation

- CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control,
  conventions, conformance standard, GStack detection in init
- README: updated skill section with 24 skills, resolver, conventions
- TODOS: added runtime MCP access control (P1)
- VERSION: 0.9.2 → 0.10.0
- package.json + manifest.json version bumped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to CHANGELOG v0.10.0

16-row table detailing every new skill, what it does, and why it matters.
Written to sell the upgrade, not document the implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore package.json version after merge conflict resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 -> 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: expand doctor with resolver validation, filesystem-first architecture

Doctor now runs filesystem checks (resolver health, skill conformance) before
connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only
when DB is unavailable. Adds schema_version: 2 to JSON output, composite health
score (0-100), and structured issues array with action strings for agent parsing.

Resolver health check calls checkResolvable() and surfaces actionable fix
instructions. Link integrity check uses engine.getHealth() dead_links count.

CLI routing split: doctor dispatched before connectEngine() so filesystem
checks always run. Fixes Codex-identified blocker where doctor required DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add adaptive load-aware throttling and fail-improve loop

backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem),
exponential backoff with 20-attempt max guard, active hours multiplier (2x
slower during waking hours), concurrent process limit (max 2). Windows-safe:
defaults to "proceed" when os.loadavg returns zeros.

fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure
logging. Cascade failure handling: when both paths fail, throws LLM error and
logs both. Log rotation at 1000 entries. Call count tracking for deterministic
hit rate metrics. Auto-generates test cases from successful LLM fallbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcription service and enrichment-as-a-service

transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB
segmented via ffmpeg. Provider auto-detection from env vars. Clear error
messages for missing API keys and unsupported formats.

enrichment-service.ts: Global enrichment service callable from any ingest
pathway. Entity slug generation (people/jane-doe, companies/acme-corp),
mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based
on mention frequency and source diversity), batch enrichment with backoff
throttling, regex-based entity extraction from text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add data-research skill with recipe system, extraction, dedup, tracker

New skill: data-research — one parameterized pipeline for any email-to-
structured-data workflow (investor updates, donations, company metrics).
7-phase pipeline: define recipe, search, classify, extract (with extraction
integrity rule), archive, deduplicate, update tracker.

data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex
extraction (battle-tested patterns), dedup with configurable tolerance,
markdown tracker parsing/appending, quarterly/monthly date windowing,
6-phase HTML email stripping with 500KB ReDoS cap.

Registers data-research in manifest.json (25th skill) and RESOLVER.md.
Fixes backoff test robustness for high-load systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.10.0 infrastructure additions

CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve,
transcription, enrichment-service, data-research), 6 new test files, updated
skill count to 25, test file count to 34.

README.md: updated skill count to 25, added data-research to skills table.

CHANGELOG.md: added Infrastructure section documenting resolver validation,
doctor expansion, adaptive throttling, fail-improve loop, voice transcription,
enrichment service, and data-research skill.

TODOS.md: anonymized personal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: doctor.ts use ES module imports, harden backoff test

Replace require('fs') with ES module import in doctor.ts for consistency
with the rest of the file. Backoff test made resilient to parallel test
execution leaking module-level state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync --watch routing, dead_links parity, doctor command, embed --slugs

- Move sync to CLI_ONLY so the --watch flag reaches runSync() (it was routed
  through the operation layer, which only calls performSync for a single pass)
- Hide sync_brain from CLI help (MCP still exposes it)
- Fix performFullSync missing sync state persistence (C1)
- Align Postgres dead_links query to match PGLite (count dangling links, not
  empty-content chunks) (C3)
- Fix doctor recommending nonexistent 'gbrain embed refresh' (C4)
- Refactor doctor outputResults to not call process.exit directly
- Add --slugs flag to embed for targeted page embedding
- Add sync auto-extract + auto-embed after performSync
- Add noExtract to SyncOpts
- Route extract, features, autopilot in CLI_ONLY
- Update help text with new commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract, features, and autopilot commands

- gbrain extract <links|timeline|all> — batch extraction of links and timeline
  entries from brain markdown files. Broad regex for all .md links (C7: filters
  external URLs). Frontmatter field parsing (company, investors, attendees).
  Directory-based link type inference. JSONL progress on stderr for agents.
  Sync integration hooks (extractLinksForSlugs, extractTimelineForSlugs).

- gbrain features [--json] [--auto-fix] — scan brain usage, pitch unused features
  with the user's own numbers. Priority 1 (data quality): missing embeddings,
  dead links. Priority 2 (unused features): zero links, zero timeline, low
  coverage, unconfigured integrations, no sync. Embedded recipe metadata for
  binary-safe integration detection. Persistence in ~/.gbrain/feature-offers.json.
  Doctor teaser hook. Upgrade hook.

- gbrain autopilot [--repo] [--interval N] — self-maintaining brain daemon.
  Pipeline: sync → extract → embed. Health-based adaptive scheduling
  (brain_score >= 90 doubles interval, < 70 halves it). --install/--uninstall
  for launchd (macOS) and crontab (Linux). Signal handling. Consecutive error
  tracking (stops at 5). Log to ~/.gbrain/autopilot.log.
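The health-based adaptive scheduling rule above can be sketched as follows. Function name, clamp bounds, and defaults are assumptions; only the double/halve thresholds come from the commit.

```typescript
// Sketch of autopilot's adaptive scheduling: brain_score >= 90 doubles
// the interval, < 70 halves it. Min/max clamps are illustrative.
function nextIntervalMs(
  currentMs: number,
  brainScore: number,
  minMs = 5 * 60_000,        // assumed floor: 5 minutes
  maxMs = 24 * 60 * 60_000,  // assumed ceiling: 24 hours
): number {
  if (brainScore >= 90) return Math.min(currentMs * 2, maxMs);
  if (brainScore < 70) return Math.max(Math.floor(currentMs / 2), minMs);
  return currentMs;
}
```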

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: hook features scan into post-upgrade flow

After gbrain post-upgrade completes, automatically run gbrain features to show
the user what's new and what to fix. Best-effort (doesn't fail the upgrade).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: brain_score (0-100) in BrainHealth

Weighted composite score computed in getHealth() for both Postgres and PGLite:
  embed_coverage: 0.35, link_density: 0.25, timeline_coverage: 0.15,
  no_orphans: 0.15, no_dead_links: 0.10

Returns 0 for empty brains. Agents use brain_score as a health gate.
Autopilot uses it for adaptive scheduling (>=90 slows down, <70 speeds up).
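The weighted composite can be sketched as below. The shipped getHealth() derives the input ratios from DB queries, which this sketch omits; the interface shape is an assumption.

```typescript
// Sketch of the brain_score composite. Inputs are assumed 0..1 ratios.
interface HealthRatios {
  embed_coverage: number;
  link_density: number;
  timeline_coverage: number;
  no_orphans: number;
  no_dead_links: number;
  total_pages: number;
}

function brainScore(h: HealthRatios): number {
  if (h.total_pages === 0) return 0; // empty brains score 0
  const score =
    h.embed_coverage * 0.35 +
    h.link_density * 0.25 +
    h.timeline_coverage * 0.15 +
    h.no_orphans * 0.15 +
    h.no_dead_links * 0.10;
  return Math.round(score * 100);
}
```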

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: extract and features unit tests

25 tests covering:
- extractMarkdownLinks: relative links, external URL filtering, edge cases
- extractLinksFromFile: slug resolution, frontmatter parsing, directory-based
  type inference (works_at, deal_for, invested_in)
- extractTimelineFromContent: bullet format, header format with detail,
  em/en dash handling, empty content
- features: module exports, brain_score calculation weights, CLI routing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: instruction layer for extract, features, autopilot

Agent-facing tools are invisible without instruction-layer coverage.
- RESOLVER.md: add routing for extract, features, autopilot
- maintain/SKILL.md: add link graph extraction, timeline extraction,
  autopilot check sections

Without these, agents reading skills/ will never discover or run the
new commands. This is the #1 DX finding from the devex review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.10.1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sync CLAUDE.md with v0.10.1 additions

Add extract.ts, features.ts, autopilot.ts to key files.
Add extract.test.ts, features.test.ts to test list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: adversarial review fixes — 7 issues

- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* security: fix command injection + plaintext API key in daemon install

- #1: Crontab install used an echo pipe with shell-interpolated values.
  Now uses a temp file passed to crontab(1) and single-quote escaping on
  all interpolated paths. No shell expansion possible.

- #2: OPENAI_API_KEY was baked as plaintext into the launchd plist
  (readable by any local process, backed up by Time Machine). Now uses
  a wrapper script (~/.gbrain/autopilot-run.sh) that sources ~/.zshrc
  at runtime. No secrets in plist or crontab.

- #16: extract.ts used a custom 20-line YAML parser that only handled
  single-line key:value pairs. Multi-line arrays (attendees list with
  - items) were silently ignored. Now uses the project's gray-matter
  parser via parseMarkdown() from src/core/markdown.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* security: fix wave 3 — 9 vulns (file_upload, SSRF, recipe trust, prompt injection) (#174)

* feat(engine): add cap parameter to clampSearchLimit (H6)

clampSearchLimit(limit, defaultLimit, cap = MAX_SEARCH_LIMIT) — third arg
is a caller-specified cap so operation handlers can enforce limits below
MAX_SEARCH_LIMIT. Backward compatible: existing two-arg callers still cap
at MAX_SEARCH_LIMIT.

This fixes a Codex-caught semantics bug: the prior signature took (limit,
defaultLimit) where the second arg was misread as a cap. clampSearchLimit(x, 20)
was actually allowing values up to 100, not 20.
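The corrected semantics can be sketched as below. The constant's value and the validation details are assumptions; the point is that the second arg is a default, not a cap, and the new third arg is the cap.

```typescript
// Sketch of the three-arg clampSearchLimit (H6). MAX_SEARCH_LIMIT value
// is assumed for illustration.
const MAX_SEARCH_LIMIT = 100;

function clampSearchLimit(
  limit: number | undefined,
  defaultLimit: number,
  cap: number = MAX_SEARCH_LIMIT, // caller-specified cap, new in H6
): number {
  if (limit === undefined || !Number.isFinite(limit) || limit <= 0) {
    return Math.min(defaultLimit, cap);
  }
  return Math.min(Math.floor(limit), cap);
}
```

Two-arg callers keep the old behavior (cap at MAX_SEARCH_LIMIT); handlers that need a tighter bound now pass it explicitly.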

* feat(integrations): SSRF defense + recipe trust boundary (B1, B2, Fix 2, Fix 4, B3, B4)

- B1: split loadAllRecipes into trusted (package-bundled) and untrusted
  (cwd/recipes, $GBRAIN_RECIPES_DIR) tiers. Only package-bundled recipes
  get embedded=true. Closes the fake trust boundary that let any cwd-local
  recipe bypass health-check gates.
- B2: hard-block string health_checks for non-embedded recipes (was previously
  only blocked when isUnsafeHealthCheck regex matched, which the cwd recipe
  exploit bypassed). Embedded recipes still get the regex defense.
- Fix 2: gate command DSL health_checks on isEmbedded. Non-embedded
  recipes cannot spawnSync.
- Fix 4 + B3 + B4: gate http DSL health_checks on isEmbedded; for embedded
  recipes, validate URLs via new isInternalUrl() before fetch:
  - Scheme allowlist (http/https only): blocks file:, data:, blob:, ftp:, javascript:
  - IPv4 range check covering hex/octal/decimal/single-integer bypass forms
  - IPv6 loopback ::1 + IPv4-mapped ::ffff: (canonicalized hex hextets handled)
  - Metadata hostnames (AWS, GCP, instance-data) blocked
  - fetch with redirect: 'manual' + per-hop re-validation up to 3 hops
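The IPv4 range check at the core of this defense can be sketched as below. This simplified version handles dotted-quad form only; the shipped isInternalUrl() also normalizes hex/octal/decimal/single-integer forms and covers IPv6, per the bullets above.

```typescript
// Simplified sketch of the private/loopback IPv4 check. Dotted-quad only;
// real defense must also canonicalize bypass encodings first.
function isPrivateIpv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => Number(p));
  if (octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) return false;
  const [a, b] = octets;
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, incl. 169.254.169.254 metadata
  );
}
```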

Original PRs #105-109 by @garagon. Wave 3 collector branch reimplemented
the fixes after Codex outside-voice review found that PRs #106/#108 alone
did not actually gate cwd-local recipes (B1) and that PR #108 missed
redirect-following SSRF (B3) and non-http schemes (B4).

* feat(file_upload): path/slug/filename validation + remote-caller confinement (Fix 1, B5, H5, M4, Fix 5)

- Fix 1 + B5 + H1: validateUploadPath uses realpathSync + path.relative
  to defeat symlink-parent traversal. lstatSync alone (the original PR #105
  approach) only catches final-component symlinks; a symlinked parent dir
  still followed to /etc/passwd. Now the entire path chain is resolved.
- H5: validatePageSlug uses an allowlist regex (alphanumeric + hyphens,
  slash-separated segments). Closes URL-encoded traversal (%2e%2e%2f),
  Unicode lookalikes, backslashes, control chars implicitly.
- M4: validateFilename allowlist regex. Rejects control chars, backslash,
  RTL override (\u202E), leading dot/dash. Filename flows into storage_path
  so this matters for every storage backend.
- Fix 5: clamp list_pages and get_ingest_log limits at the operation layer
  via new clampSearchLimit cap parameter (list_pages caps at 100,
  get_ingest_log at 50). Internal bulk commands bypass the operation
  layer and remain uncapped.
- New OperationContext.remote flag distinguishes trusted local CLI from
  untrusted MCP callers. file_upload uses strict cwd confinement when
  remote=true (default), loose mode when remote=false (CLI). MCP stdio
  server sets remote=true; cli.ts and handleToolCall (gbrain call) set
  remote=false.
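The allowlist approach from H5/M4 can be sketched as below. The exact regexes are assumptions; the design point is allowlisting valid shapes rather than blocklisting bad characters, which rejects encoded traversal, Unicode lookalikes, and control chars without enumerating them.

```typescript
// Sketch of allowlist validation for slugs and filenames. Regexes are
// illustrative, not the shipped patterns.
const SLUG_RE = /^[a-z0-9-]+(\/[a-z0-9-]+)*$/i;      // slash-separated segments
const FILENAME_RE = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;  // no leading dot/dash

function isValidPageSlug(slug: string): boolean {
  return SLUG_RE.test(slug); // "%2e%2e%2f" fails: '%' is not in the allowlist
}

function isValidFilename(name: string): boolean {
  return FILENAME_RE.test(name); // rejects backslash, control chars, "\u202E"
}
```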

Original PR #105 by @garagon. Issue #139 reported by @Hybirdss.

* feat(search): query sanitization + structural prompt boundary (Fix 3, M1, M2, M3)

- M1: restructure callHaikuForExpansion to use a system message that declares
  the user query as untrusted data, plus an XML-tagged <user_query> boundary
  in the user message. Layered defense with the existing tool_choice constraint
  (3 layers vs 1).
- Fix 3 (regex sanitizer, defense-in-depth): sanitizeQueryForPrompt strips
  triple-backtick code fences, XML/HTML tags, leading injection prefixes,
  and caps at 500 chars. Original query is still used for downstream search;
  only the LLM-facing copy is sanitized.
- M2: sanitizeExpansionOutput validates the model's alternative_queries array
  before it flows into search. Strips control chars, caps length, dedupes
  case-insensitively, drops empty/non-string items, caps to 2 items.
- M3: console.warn on stripped content NEVER logs the query text — privacy-safe
  debug signal only.
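The M2 output validation can be sketched as below. The length cap is an assumption; the filter steps mirror the bullet above.

```typescript
// Sketch of sanitizeExpansionOutput (M2): validate the model's
// alternative_queries before they flow into search.
function sanitizeExpansionOutput(raw: unknown): string[] {
  if (!Array.isArray(raw)) return [];
  const seen = new Set<string>();
  const out: string[] = [];
  for (const item of raw) {
    if (typeof item !== "string") continue;     // drop non-string items
    const cleaned = item
      .replace(/[\u0000-\u001f\u007f]/g, "")    // strip control chars
      .trim()
      .slice(0, 200);                           // cap length (assumed 200)
    if (!cleaned) continue;                     // drop empties
    const key = cleaned.toLowerCase();
    if (seen.has(key)) continue;                // dedupe case-insensitively
    seen.add(key);
    out.push(cleaned);
    if (out.length === 2) break;                // cap to 2 items
  }
  return out;
}
```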

Original PR #107 by @garagon. M1/M2/M3 are wave 3 hardening per Codex review.

* chore: bump version and changelog (v0.10.2)

Security wave 3: 9 vulnerabilities closed across file_upload, recipe trust
boundary, SSRF defense, prompt injection, and limit clamping. See CHANGELOG
for full details.

Contributors:
- @garagon (PRs #105-109)
- @Hybirdss (Issue #139)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync documentation with v0.10.2 security wave 3

- CLAUDE.md: document OperationContext.remote, new security helpers
  (validateUploadPath, validatePageSlug, validateFilename, isInternalUrl,
  parseOctet, hostnameToOctets, isPrivateIpv4, getRecipeDirs,
  sanitizeQueryForPrompt, sanitizeExpansionOutput), updated clampSearchLimit
  signature, recipe trust boundary, new test files
- docs/integrations/README.md: replace string-form health_check example
  with typed DSL (string checks now hard-block for non-embedded recipes);
  add recipe trust boundary subsection
- docs/mcp/DEPLOY.md: document file_upload remote-caller cwd confinement,
  symlink rejection, slug/filename allowlists

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Minions v7 + v0.11.1 canonical migration + skillify (#130)

* feat: add minion_jobs schema, migration v5, and executeRaw to BrainEngine

Foundation for the Minions job queue system. Adds:
- minion_jobs table (20 columns) with CHECK constraints, partial indexes,
  and RLS. Inspired by BullMQ's job model, adapted for Postgres.
- Migration v5 creates the table for existing databases.
- executeRaw<T>() method on BrainEngine interface for raw SQL access,
  needed by the Minions module for claim queries (FOR UPDATE SKIP LOCKED),
  token-fenced writes, and atomic stall detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions job queue — queue, worker, backoff, types

BullMQ-inspired Postgres-native job queue built into GBrain. No Redis.
No external dependencies. Postgres transactions replace Lua scripts.

- MinionQueue: submit, claim (FOR UPDATE SKIP LOCKED), complete/fail
  (token-fenced), atomic stall detection (CTE), delayed promotion,
  parent-child resolution, prune, stats
- MinionWorker: handler registry, lock renewal, graceful SIGTERM,
  exponential backoff with jitter, UnrecoverableError bypass
- MinionJobContext: updateProgress(), log(), isActive() for handlers
- 8-state machine: waiting/active/completed/failed/delayed/dead/
  cancelled/waiting-children
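The exponential backoff with jitter can be sketched as below. The base, cap, and jitter fraction are assumptions, not the shipped constants.

```typescript
// Sketch of Sidekiq-style exponential backoff with jitter for retries.
// Constants are illustrative; rand is injectable for testing.
function backoffDelayMs(
  attemptsMade: number,
  baseMs = 1_000,
  maxMs = 60_000,
  rand: () => number = Math.random,
): number {
  const exp = Math.min(baseMs * 2 ** attemptsMade, maxMs); // capped doubling
  const jitter = exp * 0.2 * rand();                       // up to +20% jitter
  return Math.floor(exp + jitter);
}
```

Jitter prevents a batch of jobs failed at the same instant from all retrying at the same instant.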

Patterns stolen from: BullMQ (lock tokens, stall detection, flows),
Sidekiq (dead set, backoff formula), Inngest (checkpoint/resume).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 43 tests for Minions job queue

Full coverage of the Minions module against PGLite in-memory:
- Queue CRUD (9): submit, get, list, remove, cancel, retry, duplicate
- State machine (6): waiting→active→completed/failed, retry→delayed→waiting
- Backoff (4): exponential, fixed, jitter range, attempts_made=0 edge
- Stall detection (3): detect stalled, counter increment, max→dead
- Dependencies (5): parent waits, fail_parent, continue, remove_dep, orphan
- Worker lifecycle (5): register, start-without-handlers, claim+execute,
  non-Error throws, UnrecoverableError bypass
- Lock management (3): renewal, token mismatch, claim sets lock fields
- Claim mechanics (4): empty queue, priority ordering, name filtering,
  delayed promotion timing
- Cancel & retry (2): cancel active, retry dead

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions CLI commands and MCP operations

Wire Minions into the GBrain CLI and MCP layer:

CLI (gbrain jobs):
  submit <name> [--params JSON] [--follow] [--dry-run]
  list [--status S] [--queue Q] [--limit N]
  get <id> — detailed view with attempt history
  cancel/retry/delete <id>
  prune [--older-than 30d]
  stats — job health dashboard
  work [--queue Q] [--concurrency N] — Postgres-only worker daemon

6 MCP operations (contract-first, auto-exposed via MCP server):
  submit_job, get_job, list_jobs, cancel_job, retry_job, get_job_progress

Built-in handlers: sync, embed, lint, import. --follow runs inline.
Worker daemon blocked on PGLite (exclusive file lock).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for Minions job queue

CLAUDE.md: added Minions files to key files, updated operation count (36),
BrainEngine method count (38), test file count (45), added jobs CLI commands.
CHANGELOG.md: added Minions entry to v0.10.0 (background jobs, retry, stall
detection, worker daemon).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions v2 — agent orchestration primitives (pause/resume, inbox, tokens, replay)

Adds the foundation for Minions as universal agent orchestration infrastructure.
GBrain's Postgres-native job queue now supports durable, observable, steerable
background agents. The OpenClaw plugin (separate repo) will consume these via
library import, not MCP, for zero-latency local integration.

## New capabilities

- **Concurrent worker** — Promise pool replaces sequential loop. Per-job
  AbortController for cooperative cancellation. Graceful shutdown waits for
  all in-flight jobs via Promise.allSettled.
- **Pause/resume** — pauseJob clears the lock and fires AbortSignal on active
  jobs. Handlers check ctx.signal.aborted and exit cleanly. resumeJob returns
  paused jobs to waiting. Catch block skips failJob when signal.aborted.
- **Inbox (separate table)** — minion_inbox table for sidechannel messages.
  sendMessage with sender validation (parent job or admin). readInbox is
  token-fenced and marks read_at atomically. Separate table avoids row bloat
  from rewriting JSONB on every send.
- **Token accounting** — tokens_input/tokens_output/tokens_cache_read columns.
  updateTokens accumulates; completeJob rolls child tokens up to parent.
  USD cost computed at read time (no cost_usd column — pricing too volatile).
- **Job replay** — replayJob clones a terminal job with optional data overrides.
  New job, fresh attempts, no parent link.

## Handler contract additions

MinionJobContext now provides:
- `signal: AbortSignal` — cooperative cancellation
- `updateTokens(tokens)` — accumulate token usage
- `readInbox()` — check for sidechannel messages
- `log()` — now accepts string or TranscriptEntry

## MCP operations added

pause_job, resume_job, replay_job, send_job_message — all auto-generate CLI
commands and MCP server endpoints.

## Library exports

package.json exports map adds ./minions and ./engine-factory paths so plugins
can `import { MinionQueue } from 'gbrain/minions'` for direct library use.

## Instruction layer (the teaching)

- skills/minion-orchestrator/SKILL.md — when/how to use Minions, decision
  matrix, lifecycle management, anti-patterns
- skills/conventions/subagent-routing.md — cross-cutting rule: all background
  work goes through Minions
- RESOLVER.md — trigger entries for agent orchestration
- manifest.json — registered

## Schema migration v6

Additive: 3 token columns, paused status, minion_inbox table with unread index.
Full Postgres + PGLite support. No backfill needed.

## Tests

65 tests (was 43): pause/resume (5), inbox (6), tokens (4), replay (4),
concurrent worker context (3), plus all existing coverage.

## What's NOT in this commit

Deferred to follow-up PRs:
- LISTEN/NOTIFY subscribe (needs real Postgres E2E)
- Resource governor (depends on concurrent worker stress testing)
- Routing eval harness (needs API keys + benchmark data)
- OpenClaw plugin (separate @gbrain/openclaw-minions-plugin repo)

See docs/designs/MINIONS_AGENT_ORCHESTRATION.md for full CEO-approved design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): migration v7 — agent_parity_layer schema

Adds columns on minion_jobs (depth, max_children, timeout_ms, timeout_at,
remove_on_complete, remove_on_fail, idempotency_key) plus the new
minion_attachments table. Three partial indexes for bounded scans:
idx_minion_jobs_timeout, idx_minion_jobs_parent_status, and
uniq_minion_jobs_idempotency. Check constraints enforce non-negative depth
and positive child cap / timeout.

Additive migration — existing installs pick it up via ensureSchema on next
use. No user action required.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): extend types for v7 parity layer

Extends MinionJob with depth/max_children/timeout_ms/timeout_at/
remove_on_complete/remove_on_fail/idempotency_key. Extends MinionJobInput
with the same options plus max_spawn_depth override. Adds MinionQueueOpts
(maxSpawnDepth default 5, maxAttachmentBytes default 5 MiB). Adds
AttachmentInput/Attachment shapes and ChildDoneMessage in the InboxMessage
union. rowToMinionJob updated to pick up the new columns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): attachments validator

New module validateAttachment() gates every attachment write. Rejects empty
filenames, path traversal (.., /, \), null bytes, oversized content (5 MiB
default, per-queue override), invalid base64, and implausible content_type
headers. Returns normalized { filename, content_type, content (Buffer),
sha256, size } on success.

The DB also enforces UNIQUE (job_id, filename) as defense-in-depth for
concurrent addAttachment races — JS-only checks are not sufficient.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): queue v7 — depth, child cap, timeouts, cascade, idempotency, child_done

Wraps completeJob and failJob in engine.transaction() so parent hook
invocations (resolveParent, failParent, removeChildDependency) fold into
the same transaction as the child update. A process crash between child
and parent can't strand the parent in waiting-children anymore.

Adds v7 behaviors:
- Depth tracking. add() computes depth = parent.depth + 1 and rejects
  past maxSpawnDepth (default 5).
- Per-parent child cap. add() takes SELECT ... FOR UPDATE on the parent,
  counts non-terminal children, rejects when count >= max_children.
  NULL max_children = no cap.
- Per-job wall-clock timeout. claim() populates timeout_at when
  timeout_ms is set. New handleTimeouts() dead-letters expired rows with
  error_text='timeout exceeded'. Terminal — no retry.
- Cascade cancel. cancelJob() walks descendants via recursive CTE with
  depth-100 runaway cap. Returns the root row. Re-parented descendants
  (parent_job_id NULL) are naturally excluded.
- Idempotency. add() uses INSERT ... ON CONFLICT (idempotency_key) DO
  NOTHING RETURNING; falls back to SELECT when RETURNING is empty. Same
  key always yields the same job id.
- child_done inbox. completeJob inserts {type:'child_done', child_id,
  job_name, result} into the parent's inbox in the same transaction as
  the token rollup, guarded by EXISTS so terminal/deleted parents skip
  without FK violation. New readChildCompletions(parent_id, lock_token,
  since?) helper; token-fenced like readInbox.
- removeOnComplete / removeOnFail. Deletes the row after the parent hook
  fires, so parent policy sees consistent state.
- Attachment methods. addAttachment validates via validateAttachment
  then INSERTs; UNIQUE (job_id, filename) backs the JS dup check.
  listAttachments, getAttachment, deleteAttachment round out the API.
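The idempotency flow can be sketched as below. The engine interface and SQL are simplified assumptions; the shipped add() does considerably more (depth, child cap, parent lock) in the same transaction.

```typescript
// Sketch of the idempotent add(): INSERT ... ON CONFLICT DO NOTHING
// RETURNING, then SELECT when RETURNING comes back empty.
interface Engine {
  executeRaw(sql: string, params: unknown[]): Promise<any[]>;
}

async function addIdempotent(
  engine: Engine,
  name: string,
  key: string,
): Promise<{ id: number }> {
  const inserted = await engine.executeRaw(
    `INSERT INTO minion_jobs (name, idempotency_key)
     VALUES ($1, $2)
     ON CONFLICT (idempotency_key) DO NOTHING
     RETURNING id`,
    [name, key],
  );
  if (inserted.length > 0) return inserted[0];
  // Conflict: another add() with the same key won. Fetch the existing row
  // so the same key always yields the same job id.
  const existing = await engine.executeRaw(
    `SELECT id FROM minion_jobs WHERE idempotency_key = $1`,
    [key],
  );
  return existing[0];
}
```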

Fixes pre-existing inverted status bug: add() now puts children in
waiting/delayed (not waiting-children) and atomically flips the parent
to waiting-children in the same transaction. Tests no longer need
manual UPDATE workarounds.

Two correctness fixes:
- Sibling completion race. Under READ COMMITTED, two grandchildren
  completing concurrently each saw the other as still-active in the
  pre-commit snapshot and neither flipped the parent. Fixed by taking
  SELECT ... FOR UPDATE on the parent row at the start of completeJob
  and failJob transactions, serializing siblings on the parent lock.
- JSONB double-encode. postgres.js conn.unsafe(sql, params) auto-
  JSON-encodes parameters. Calling JSON.stringify(obj) first stored a
  JSON string literal (jsonb_typeof=string) and broke payload->>'key'
  queries silently. Removed JSON.stringify from three call sites
  (child_done inbox post, updateProgress, sendMessage). PGLite tolerated
  both forms so unit tests missed it — real-PG E2E caught it.
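The double-encode failure mode can be demonstrated without a database: plain JSON stands in for the driver's auto-encoding here.

```typescript
// Illustration of the JSONB double-encode bug: if the driver already
// JSON-encodes parameters, stringifying first stores a JSON string
// literal instead of an object.
const payload = { key: "value" };

// Correct: pass the raw object; the driver encodes once.
const storedOk = JSON.stringify(payload);

// Bug: pre-stringify, then the driver encodes again.
const storedBad = JSON.stringify(JSON.stringify(payload));

console.log(typeof JSON.parse(storedOk));  // "object": payload->>'key' works
console.log(typeof JSON.parse(storedBad)); // "string": jsonb_typeof = 'string'
```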

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): worker — timeout safety net + handleTimeouts tick

Worker tick now calls handleStalled() first, then handleTimeouts() — stall
requeue wins over timeout dead-letter when both could fire in the same
cycle. handleTimeouts() guards on lock_until > now() so stalled jobs take
the retryable path.

launchJob schedules a per-job setTimeout(timeout_ms) that fires ctx.signal
as a best-effort handler interrupt. The timer is always cleared in .finally
so process exit isn't delayed by a dangling timer. Handlers that respect
AbortSignal stop cleanly; handlers that ignore it still get dead-lettered
by the DB-side handleTimeouts.

Removed post-completeJob and post-failJob parent-hook calls from the worker
— those are now inside the queue method transactions. Worker becomes
simpler and crash-safer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(minions): 33 new unit tests for v7 parity layer

Covers depth cap, per-parent child cap, timeout dead-letter, cascade
cancel (including the re-parent edge case), removeOnComplete /
removeOnFail, idempotency (single + concurrent), child_done inbox
(posted in txn + survives child removeOnComplete + since cursor),
attachment validation (oversize, path traversal, null byte, duplicates,
base64), AbortSignal firing on pause mid-handler, catch-block skipping
failJob when aborted, worker in-flight bookkeeping, token-rollup guard
when parent already terminal, and setTimeout safety-net cleanup.

Existing tests updated to remove the inverted-status manual UPDATE
workarounds that the add() fix made obsolete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): Minions v7 concurrency + OpenClaw resilience coverage

minions-concurrency.test.ts spins two MinionWorker instances against the
test Postgres, submits 20 jobs, and asserts zero double-claims (every job
runs exactly once). This is the only test that actually proves FOR UPDATE
SKIP LOCKED under real concurrency — PGLite runs on a single connection
and can't exercise the race.

minions-resilience.test.ts covers the six OpenClaw daily pains:
1. Spawn storm caps enforce under concurrent submit. 2. Agent stall →
handleStalled() requeues; handleTimeouts() skips (lock_until guard).
3. Forgotten dispatches recoverable via child_done inbox. 4. Cascade
cancel stops grandchildren mid-flight. 5. Deep tree fan-in
(parent → 3 children → 2 grandchildren each) completes with the full
inbox chain. 6. Parent crash/recovery resumes from persisted state.

helpers.ts extends ALL_TABLES with minion_attachments, minion_inbox, and
minion_jobs (FK dependents first) so E2E teardown doesn't leak rows
between runs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: release v0.11.0 — Minions v7 agent orchestration primitives

Bumps VERSION / package.json to 0.11.0. Adds CHANGELOG entry covering
depth tracking, max_children, per-job timeouts, cascade cancel,
idempotency keys, child_done inbox, removeOnComplete/Fail, attachments,
migration v7, plus the two correctness fixes (sibling completion race
and JSONB double-encode).

TODOS.md captures the four v7 follow-ups: per-queue rate limiting,
repeat/cron scheduler, worker event emitter, and waitForChildren
convenience helpers.

1066 unit + 105 E2E = 1171 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): unify JSONB inserts, tighten nullish coalescing

Three non-blocker cleanups from post-ship review of v0.11.0:

- queue.ts add() and completeJob(): pre-stringifying with JSON.stringify
  while other sites pass raw objects with $n::jsonb casts. postgres.js
  double-encodes if you stringify first — works on PGLite (text→JSONB
  auto-cast), fails silently on real PG. Unify on raw object + explicit
  $n::jsonb cast.
- queue.ts readChildCompletions: since clause used sent_at > $2 relying
  on PG's implicit text→TIMESTAMPTZ coercion. Explicit $2::timestamptz
  is safer and clearer.
- types.ts rowToMinionJob: parent_job_id used || which coerces 0 to null.
  Harmless today (SERIAL IDs start at 1) but ?? is semantically correct.
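The || vs ?? distinction in isolation:

```typescript
// || treats 0 as falsy and swaps in the fallback; ?? falls back only on
// null/undefined. rowToMinionJob's parent_job_id wants the latter.
function parentIdWithOr(v: number | null): number | null {
  return v || null; // 0 is falsy: coerced to null
}

function parentIdWithNullish(v: number | null): number | null {
  return v ?? null; // 0 is not nullish: preserved
}
```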

All 110 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): updateProgress missed $1::jsonb cast in unification

Residual from c502b7e — updateProgress was the only remaining JSONB write
without the explicit ::jsonb cast. Not broken (implicit cast works) but
breaks the convention the prior commit unified everywhere else.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc: Minions v7 skill count + jobs subcommands (26 skills)

README: bump skill count 25 → 26, add minion-orchestrator row, add
`gbrain jobs` command family block so v0.11.0's headline feature is
actually discoverable from the top-level commands reference.

CLAUDE.md: unit test count 48 → 49 (minions.test.ts expanded), skill
count 25 → 26, add minion-orchestrator to Key files + skills categorization,
expand MinionQueue one-liner to cover v7 primitives (depth/child-cap,
timeouts, idempotency, child_done inbox, removeOnComplete/Fail).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: Minions adoption UX — smoke test + migration + pain-triggered routing

Teach OpenClaw when to reach for Minions vs native subagents. Ship three
pieces so upgrading from v0.10.x actually lands for real users:

- `gbrain jobs smoke` — one-command health check that submits a `noop` job,
  runs a worker, verifies completion, and prints engine-aware guidance
  (PGLite installs get the "daemon needs Postgres, use --follow" note).
  Fails loud if schema's below v7 so the user knows to `gbrain init`.

- `skills/migrations/v0.11.0.md` — post-upgrade migration file the
  auto-update agent reads. Six steps: apply schema, run smoke, ask user
  via AskUserQuestion which mode they want (always / pain_triggered / off),
  write to `~/.gbrain/preferences.json`, sanity-check handlers, mark done.
  Completeness scores on each option so the recommendation is explicit.

- `skills/conventions/subagent-routing.md` rewritten — was a "MUST use
  Minions for ALL background work" mandate, now reads preferences.json
  on every routing decision and branches on three modes. Mode B
  (pain_triggered) is the default: keep subagents until gateway drops
  state, parallel > 3, runtime > 5min, or user expresses frustration.
  Then pitch the switch in-session with a specific script.

Rename pass: "Minions v7" → "Minions" in README (JOBS block), TODOS.md
(P1 section header + depends-on), CHANGELOG.md v0.11.0 entry. v7 stays
as the internal schema version in code/migration contexts. The product
name is just Minions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc(readme): promote Minions — 6 OpenClaw pains + how each is fixed

The one-line mention in the skills table wasn't doing the work. Added a
dedicated section between "How It Works" and "Getting Data In" that leads
with the six multi-agent failures every OpenClaw user hits daily (spawn
storms, hung handlers, forgotten dispatches, unstructured debugging,
gateway crashes, runaway grandchildren) and maps each pain to the
specific Minions primitive that fixes it.

Includes the smoke test command, the adoption default (pain_triggered),
and a pointer to skills/minion-orchestrator for the full patterns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(bench): add harness for Minions vs OpenClaw subagent dispatch

Shared harness (openclawDispatch + minionsHandler) using matching
claude-haiku-4-5 calls on both sides so the delta measures queue+
dispatch overhead on top of identical LLM work. Includes
statsFromResults (p50/p95/p99) and formatStats helpers. Uses
`openclaw agent --local` embedded mode; does not test gateway
multi-agent fan-out (documented in the harness header).

* test(bench): durability under SIGKILL — Minions vs OpenClaw --local

Headline bench for the claim: when the orchestrator dies mid-dispatch,
Minions rescues via PG state + stall detection; OpenClaw --local loses
in-flight work outright.

Minions side: seed 10 active+expired-lock rows (exact state a SIGKILLed
worker leaves) then run a rescue worker. Expect 10/10 completed.
OpenClaw side: spawn 10 `openclaw agent --local` in parallel, SIGKILL
each at 500ms, count pre-kill delivered output. Expect 0/10 — no
persistence layer, nothing to recover.

Budget: ~$0 (Minions handlers sleep 10ms; OC calls die at 500ms so
partial LLM billing is negligible).

* test(bench): per-dispatch throughput — Minions vs OpenClaw --local

20 serial dispatches each side, identical claude-haiku-4-5 call with the
same trivial prompt. p50/p95/p99 reported via statsFromResults. Serial
(not parallel) so the per-dispatch cost is measured honestly and LLM
token spend stays bounded (~$0.08 total).

Minions: one queue, one worker, one concurrency. Submit → poll to
completion before next submit. OpenClaw: N sequential
`openclaw agent --local` spawns.

* test(bench): fan-out — Minions 10-wide concurrency vs 10 parallel OC spawns

Parent dispatches 10 children, waits for all to return. Minions uses
worker concurrency=10 sharing one warm process; OpenClaw parallel
`openclaw agent --local` spawns, each boots its own runtime.

3 runs × 10 children per run. Reports ok count and wall time per run
plus summary. Honest caveat documented: does not test OC gateway
multi-agent fan-out — that needs a custom WS client and LLM-backed
parent agent. This measures what users script today.

Budget: ~$0.12 LLM spend.

* test(bench): memory — 10 in-flight subagents, single-proc vs 10-proc cost

Measures resident memory for keeping 10 subagents in flight. Minions:
one worker process, concurrency=10 with handlers that park on a
promise — sample RSS of the test process via process.memoryUsage().
OpenClaw: 10 parallel `openclaw agent --local` processes, sum their
RSS via `ps -o rss=`.

Handlers are cheap sleeps, no LLM — we want harness memory, not LLM
client state. Budget: $0.
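The two measurement paths above can be sketched as follows (POSIX-ish
systems only; helper names are assumptions, not the bench's real code):

```typescript
import { execSync } from "node:child_process";

// Hypothetical helper mirroring the multi-process side: sum the
// resident set size (KiB) of a list of PIDs via `ps -o rss=`.
function totalRssKiB(pids: number[]): number {
  if (pids.length === 0) return 0;
  const out = execSync(`ps -o rss= -p ${pids.join(",")}`, { encoding: "utf8" });
  return out
    .trim()
    .split("\n")
    .filter(Boolean)
    .reduce((sum, line) => sum + parseInt(line.trim(), 10), 0);
}

// The single-process side just samples its own resident memory:
const selfRssBytes = process.memoryUsage().rss;
```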

* test(bench): fan-out — don't gate on OC success rate, report numbers

Initial run showed OC parallel `--local` at 10-wide hits 40% failure
rate (17/30 across 3 runs). That's the finding, not a test bug —
process startup stampede + LLM rate limits. Bench now prints error
samples and reports the numbers instead of gating.

Minions side still gates at 90% (30/30 observed in practice).

* doc(benchmarks): Minions vs OpenClaw --local subagent dispatch

Real numbers on four claims: durability, throughput, fan-out, memory.
Same claude-haiku-4-5 call on both sides so the delta is queue+dispatch+
process cost on top of identical LLM work.

Headline: Minions rescues 10/10 from a SIGKILLed worker in 458ms while
OpenClaw --local loses all 10; ~10× faster per dispatch (778ms p50 vs
8086ms p50); ~21× faster at 10-wide fan-out AND 100% reliable vs OC's
43% failure rate; 2 MB vs 814 MB to keep 10 subagents in flight.

Honest caveats section covers what this doesn't test (OC gateway
multi-agent, load tests, other models). Fully reproducible via
test/e2e/bench-vs-openclaw/.

* doc(readme): inject Minions vs OpenClaw bench numbers

Headline deltas now in the Minions section: 10/10 vs 0/10 on crash,
~10× faster per dispatch, ~21× faster fan-out at 10-wide with 0%
failure vs 43%, ~400× less memory. Links to the full bench doc.

The prose previously claimed Minions "fixes all six pains." Now it
shows the numbers that prove it.

* bench: production Wintermute benchmark — Minions 753ms vs sub-agent timeout

Real deployment: 45K-page brain on Render+Supabase. Task: pull 99 tweets,
write brain page, commit, sync. Minions: 753ms, $0. Sub-agent: gateway
timeout (>10s, couldn't even spawn under production load).

Also: 19,240 tweets backfilled across 36 months in 15 min at $0.
Sub-agents would cost $1.08 and fail 40% of spawns.

* bench: tweet ingestion — Minions 719ms vs OpenClaw 12.5s (17×)

Production benchmark with runnable test code:
- test/e2e/bench-vs-openclaw/tweet-ingest.bench.ts (reusable)
- docs/benchmarks/2026-04-18-tweet-ingestion.md (publishable)

Task: pull 100 tweets from X API, write brain page, commit, sync.
Minions: 719ms mean, $0, 100% success.
OpenClaw: 12,480ms mean, $0.03/run, 60% success (gateway timeouts).
At scale: 36-month backfill, 19K tweets, 15 min, $0 vs est. $1.08.

* doc(benchmarks): Wintermute production data point for Minions vs OpenClaw

Adds a production-environment data point to the Minions README section:
one month of tweet ingest on Wintermute (Render + Supabase + 45K-page brain)
ran end-to-end in 753ms for $0.00 via Minions, while the equivalent
sessions_spawn hit the 10s gateway timeout and produced nothing.

Full methodology + logs in docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): preferences.ts + cli-util.ts — foundations for v0.11.1

Adds two foundational modules that apply-migrations (Lane A-4), the
v0.11.0 orchestrator (Lane C-1), and the stopgap script (Lane C-4) all
depend on.

- src/core/preferences.ts: atomic-write ~/.gbrain/preferences.json
  (mktemp + rename, 0o600, forward-compatible for unknown keys) with
  validateMinionMode, loadPreferences, savePreferences. Plus
  appendCompletedMigration + loadCompletedMigrations for the
  ~/.gbrain/migrations/completed.jsonl log (tolerates malformed lines).
  Uses process.env.HOME || homedir() so $HOME overrides work in CI and
  tests; Bun's os.homedir() caches the initial value and ignores later
  mutations.
- src/core/cli-util.ts: promptLine(prompt) helper, extracted from
  src/commands/init.ts:212-224. Shared so init, apply-migrations, and
  the v0.11.0 orchestrator's mode prompt don't each reinvent it.
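The mktemp + rename atomic-write pattern described above can be
sketched like this. The real module is src/core/preferences.ts; this
body is an assumed shape, not the shipped code.

```typescript
import {
  writeFileSync, renameSync, mkdirSync, readFileSync, statSync,
} from "node:fs";
import { dirname, join } from "node:path";
import { homedir, tmpdir } from "node:os";

// Write JSON atomically: temp file in the SAME directory (rename() is
// only atomic within one filesystem), 0o600 perms, then rename over
// the target so readers never see a partial file.
function atomicWriteJson(path: string, value: unknown): void {
  mkdirSync(dirname(path), { recursive: true });
  const tmp = `${path}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(value, null, 2), { mode: 0o600 });
  renameSync(tmp, path); // atomic replace on POSIX
}

// $HOME override works in CI because the env var is consulted before
// os.homedir(), which Bun caches at startup:
const home = process.env.HOME || homedir();
```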

test/preferences.test.ts: 21 unit tests covering load/save atomicity,
0o600 perms, forward-compat for unknown keys, minion_mode validation,
completed.jsonl JSONL append idempotence, auto-ts population, malformed-
line tolerance in loadCompletedMigrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(init): add --migrate-only flag (schema-only, no saveConfig)

Context: v0.11.0 migration orchestrators need a safe way to re-apply the
schema against an existing brain without risking a config flip. Today
running bare `gbrain init` with no flags defaults to PGLite and calls
saveConfig, which would silently overwrite an existing Postgres
database_url — caught by Codex in the v0.11.1 plan review as a
show-stopper data-loss bug.

The new --migrate-only path:
  - loadConfig() reads the existing config (does NOT call saveConfig)
  - errors out with a clear "run gbrain init first" if no config exists
  - connects via the already-configured engine, calls engine.initSchema(),
    disconnects
  - --json emits structured success/error payloads

Everything downstream in the v0.11.1 migration chain (apply-migrations,
the stopgap bash script, the package.json postinstall hook) will invoke
this flag rather than bare gbrain init.

test/init-migrate-only.test.ts: 4 tests covering the no-config error
path, --json error payload shape, happy-path with a PGLite fixture
(verifies config.json content is byte-identical after the call — the
real invariant), and idempotent rerun.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): TS registry replaces filesystem migration scan

Context: Codex flagged that bun build --compile produces a self-contained
binary, and the existing findMigrationsDir() in upgrade.ts:145 walks
skills/migrations/v*.md on disk — which fails on a compiled install
because the markdown files aren't bundled. The plan's fix is a TS
registry: migrations are code, imported directly, visible to both source
installs and compiled binaries.

- src/commands/migrations/types.ts: shared Migration, OrchestratorOpts,
  OrchestratorResult types.
- src/commands/migrations/index.ts: exports the migrations[] array,
  getMigration(version), and compareVersions() (semver comparator).
  The feature_pitch data that lived in the MD file frontmatter now
  lives here as a code constant on each Migration, so runPostUpgrade's
  post-upgrade pitch printer can consume it without a filesystem read.
- src/commands/migrations/v0_11_0.ts: stub orchestrator + pitch. The
  full phase implementation lands in Lane C-1; for now the stub throws
  a clear "not yet implemented" so apply-migrations --list (Lane A-4)
  can still enumerate the migration.

test/migrations-registry.test.ts: 9 tests covering ascending-semver
ordering, feature_pitch shape invariants, getMigration lookup, and
compareVersions edge cases (equal / newer / older / single-digit
across major bumps).
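A minimal semver comparator in the spirit of the registry's
`compareVersions()` might look like this (assumes plain X.Y.Z strings
with an optional leading `v`, no prerelease tags; the real comparator
may differ):

```typescript
// Returns -1 / 0 / 1 for a < b, a == b, a > b under numeric
// per-component comparison, so "0.11.0" correctly beats "0.9.9".
function compareVersions(a: string, b: string): number {
  const pa = a.replace(/^v/, "").split(".").map(Number);
  const pb = b.replace(/^v/, "").split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  return 0;
}
```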

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): gbrain apply-migrations — migration runner CLI

Reads ~/.gbrain/migrations/completed.jsonl, diffs against the TS migration
registry, runs pending orchestrators. Resumes status:"partial" entries
(the stopgap bash script writes these so v0.11.1 apply-migrations can
pick up where it left off). Idempotent: rerunning when up-to-date exits 0.

Flags:
  --list                    Show applied + partial + pending + future.
  --dry-run                 Print the plan; take no action.
  --yes / --non-interactive Skip prompts (used by runPostUpgrade + postinstall).
  --mode <a|p|o>            Preset minion_mode (bypasses the Phase C TTY prompt).
  --migration vX.Y.Z        Force-run one specific version.
  --host-dir <path>         Include $PWD in host-file walk (default is
                            $HOME/.claude + $HOME/.openclaw only).
  --no-autopilot-install    Skip Phase F.

Diff rule (Codex H9): apply when no status:"complete" entry exists AND
migration.version ≤ installed VERSION. Previously proposed rule was
"version > currentVersion", which would SKIP v0.11.0 when running v0.11.1;
regression test in apply-migrations.test.ts pins the correct semantics.
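The H9 diff rule reduces to a small predicate. This is a sketch of the
semantics only; `CompletedEntry` and the inline comparator are assumed
shapes, not the shipped code.

```typescript
interface CompletedEntry { version: string; status: "complete" | "partial" }

// Hypothetical X.Y.Z comparator standing in for the registry's.
const cmp = (a: string, b: string): number => {
  const pa = a.split(".").map(Number), pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    if ((pa[i] ?? 0) !== (pb[i] ?? 0)) return Math.sign((pa[i] ?? 0) - (pb[i] ?? 0));
  }
  return 0;
};

// Apply when no status:"complete" entry exists for the migration AND
// its version is <= the installed VERSION (not "> current", which
// would skip v0.11.0 when running v0.11.1).
function shouldApply(migVersion: string, installed: string, completed: CompletedEntry[]): boolean {
  const done = completed.some((e) => e.version === migVersion && e.status === "complete");
  return !done && cmp(migVersion, installed) <= 0;
}
```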

Registered in src/cli.ts CLI_ONLY Set; dispatched before connectEngine so
each phase owns its own engine/subprocess lifecycle (no double-connect
when the orchestrator shells out to init --migrate-only or jobs smoke).

test/apply-migrations.test.ts: 18 unit tests covering parseArgs for every
flag, indexCompleted/statusForVersion correctness (including stopgap-then-
complete transition), and buildPlan's four buckets (applied / par…
garrytan added a commit that referenced this pull request Apr 24, 2026
… exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)
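The atomic lock with stale-file liveness check can be sketched as
follows. The real implementation lives in
src/core/minions/supervisor.ts; this is an assumed shape.

```typescript
import {
  openSync, writeSync, closeSync, readFileSync, unlinkSync, writeFileSync,
} from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Returns true if this process now holds the lock. 'wx' maps to
// O_CREAT|O_EXCL, so two racing supervisors cannot both create the
// PID file; the loser checks whether the holder is still alive.
function acquirePidLock(pidFile: string): boolean {
  try {
    const fd = openSync(pidFile, "wx"); // fails with EEXIST if held
    writeSync(fd, String(process.pid));
    closeSync(fd);
    return true;
  } catch (err: any) {
    if (err.code !== "EEXIST") throw err;
    const oldPid = parseInt(readFileSync(pidFile, "utf8"), 10);
    try {
      process.kill(oldPid, 0); // signal 0 = liveness probe, no signal sent
      return false; // a live supervisor holds the lock
    } catch {
      unlinkSync(pidFile); // stale — previous holder is dead
      return acquirePidLock(pidFile); // retry the atomic create
    }
  }
}
```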

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 24, 2026
…nager (#364)

* feat: add `gbrain jobs supervisor` — self-healing worker process manager

Adds a first-class supervisor command that:
- Spawns `gbrain jobs work` as a child process
- Restarts on crash with exponential backoff (1s→60s cap)
- Resets crash counter after 5min of stable operation
- PID file locking prevents duplicate supervisors
- Periodic health checks (stalled jobs, completion gaps)
- Graceful shutdown (SIGTERM→35s→SIGKILL)

Usage:
  gbrain jobs supervisor --concurrency 4

Replaces ad-hoc nohup patterns in bootstrap scripts.
The autopilot command's internal supervisor can be migrated
to use this in a follow-up.

Tests: 7 pass (backoff calc, PID management, crash tracking)
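The backoff curve above (1s floor, doubling per consecutive crash,
60s cap) can be sketched like this. Assumed shape only, not the
shipped calculateBackoffMs.

```typescript
// crashCount=1 -> 1s, 2 -> 2s, 3 -> 4s ... capped at 60s.
function calculateBackoffMs(crashCount: number, floorMs = 1000, capMs = 60_000): number {
  return Math.min(capMs, floorMs * 2 ** Math.max(0, crashCount - 1));
}
```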

* supervisor: atomic PID lock, queue-scoped health, env safety, unified exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: supervisor as canonical worker deployment pattern

Lane B of PR #364 review fixes. Reframes docs/guides/minions-deployment.md
around `gbrain jobs supervisor` as the default answer (blocker 7), deletes
the 68-line legacy bash watchdog (F10), and updates README + deployment
snippets to match.

docs/guides/minions-deployment.md:
- New 'Worker supervision' section at the top with the canonical 3-command
  agent pattern (start --detach / status --json / stop) and a documented
  exit-code table (0 clean, 1 max-crashes, 2 lock-held, 3 PID-unwritable).
- 'Which supervisor when?' decision table: container = supervisor as
  PID 1, Linux VM = systemd-over-supervisor, dev laptop = bare terminal.
- New 'Agent usage' section for OpenClaw / Hermes / Cursor / Codex — the
  3-turn discover-start-maintain workflow that replaces shell archaeology
  with machine-parseable JSON events + an audit file at
  ~/.gbrain/audit/supervisor-YYYY-Www.jsonl.
- Demoted the 'Option 1: watchdog cron' path entirely; replaced with a
  straightforward upgrade migration block (stop script, remove cron line,
  start supervisor, verify via doctor).
- Preconditions now check Postgres connectivity directly (supervisor is
  Postgres-only; the CLI rejects PGLite with a clear error).

Snippets:
- systemd.service: ExecStart now invokes `gbrain jobs supervisor` instead
  of raw `gbrain jobs work`. Two-layer supervision (systemd → supervisor
  → worker) buys automatic restart on reboot plus fast crash recovery.
  ReadWritePaths expanded to cover $HOME/.gbrain (supervisor PID + audit).
- Procfile + fly.toml.partial: same change — platform restarts the
  container on host events, supervisor restarts the worker on crashes.
- minion-watchdog.sh: deleted (git history retains it for anyone in an
  exotic deployment). Supervisor subsumes every capability it had plus
  atomic PID locking, structured audit events, queue-scoped health
  checks, and graceful drain on SIGTERM.

README.md:
- Added a paragraph under the Minions section pointing to `gbrain jobs
  supervisor` as canonical, noting the --detach / status / stop surface
  and the audit file path, with a link to the full deployment guide.
  Kept `gbrain jobs work` documented for direct raw invocation but
  flagged 'prefer supervisor' for any long-running use.

The supervisor `--help` body itself (3 examples + exit-code table in
src/commands/jobs.ts) landed with Lane A — this lane finishes the
discoverability story by making the supervisor findable via doc grep,
README landing, and deployment-guide landing paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* supervisor: daemon-manager subcommands + JSONL audit writer

Lane C of PR #364 review fixes. Adds the daemon-manager CLI surface so
agents can drive `gbrain jobs supervisor` in 3 turns instead of 10, and
the audit writer that makes lifecycle events inspectable across process
restarts. (Blocker 8, closes DX Fix A/B/C.)

New: src/core/minions/handlers/supervisor-audit.ts
  - writeSupervisorEvent(emission, supervisorPid) appends JSONL to
    `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`.
    ISO-week rotation via a `computeSupervisorAuditFilename()` helper
    that mirrors `shell-audit.ts` exactly (year-boundary ISO week math,
    Thursday anchor, etc).
  - readSupervisorEvents({sinceMs}) returns parsed events from the
    current week's file, oldest-first, for Lane D's doctor check.
    Malformed lines are skipped silently (disk-full truncation is
    already best-effort at write time).
  - Reuses `resolveAuditDir()` from shell-audit.ts so the
    `GBRAIN_AUDIT_DIR` env var override works identically across all
    gbrain audit trails.
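The Thursday-anchor ISO-week math can be sketched as follows: the ISO
year/week of a date is the calendar year and week of the Thursday in
that date's Mon–Sun week. The helper name mirrors the one described
above, but the body is an assumption, not the shipped code.

```typescript
// ISO-week audit filename, e.g. supervisor-2026-W53.jsonl.
function computeSupervisorAuditFilename(d: Date): string {
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const day = t.getUTCDay() || 7;         // Mon=1 ... Sun=7
  t.setUTCDate(t.getUTCDate() + 4 - day); // jump to this week's Thursday
  const year = t.getUTCFullYear();        // ISO year = Thursday's year
  const jan1 = Date.UTC(year, 0, 1);
  const week = Math.ceil(((t.getTime() - jan1) / 86_400_000 + 1) / 7);
  return `supervisor-${year}-W${String(week).padStart(2, "0")}.jsonl`;
}
```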

src/commands/jobs.ts: supervisor subcommand dispatcher
  - `gbrain jobs supervisor [start] [--detach] [--json] ...` — default
    subcommand. Without --detach, runs foreground as before. With
    --detach, forks a background child (inheriting stderr so the caller
    can still tail JSONL events), writes a stdout payload:
      {"event":"started","supervisor_pid":N,"pid_file":"...","detached":true}
    and exits 0. Stdin/stdout on the detached child are /dev/null so
    the parent shell isn't held open.
  - `gbrain jobs supervisor status [--json]` — reads the PID file,
    checks liveness via `kill -0`, then reads the last 24h from the
    supervisor audit file to compute crashes_24h / last_start /
    max_crashes_exceeded. Exits 0 if running, 1 if not. JSON output
    is machine-parseable; human output is a 5-line ASCII report.
  - `gbrain jobs supervisor stop [--json]` — reads PID, sends SIGTERM,
    polls `kill -0` every 250ms for up to 40s (supervisor's own 35s
    worker-drain + 5s slack). Reports outcome: drained / timeout_40s
    / pid_file_missing / pid_file_corrupt / process_gone. Exit 0 on
    clean stop.
  - `--json` flag is already plumbed through to the supervisor opts
    from Lane A — this lane adds the onEvent audit-writer callback
    so every supervisor emission (started, worker_spawned,
    worker_exited, worker_spawn_failed, backoff, health_warn,
    health_error, max_crashes_exceeded, shutting_down, stopped) lands
    in the JSONL file with the supervisor's PID attached.
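The stop-poll loop can be sketched as: send SIGTERM, then probe
liveness with signal 0 every 250ms until the process is gone or the
timeout elapses. Assumed shape of the logic behind `gbrain jobs
supervisor stop`; outcome names beyond these two are omitted.

```typescript
// Resolves "drained" when the pid is gone, "timeout_40s" otherwise.
async function stopByPid(pid: number, timeoutMs = 40_000): Promise<"drained" | "timeout_40s"> {
  try { process.kill(pid, "SIGTERM"); } catch { return "drained"; } // already gone
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    await new Promise((r) => setTimeout(r, 250));
    try {
      process.kill(pid, 0); // throws ESRCH once the process exits
    } catch {
      return "drained";
    }
  }
  return "timeout_40s";
}
```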

--help body updated:
  - Three separate usage lines (start / status / stop).
  - SUBCOMMANDS block with one-line summaries each.
  - EXIT CODES block (unchanged from Lane A, moved under SUBCOMMANDS).
  - EXAMPLES block updated with status --json + stop + --detach forms.

Tests: existing 127 supervisor + minions tests continue to pass.
Integration tests for the new subcommands + audit writer land with
Lane E.

Follow-up (Lane D): `gbrain doctor` will read readSupervisorEvents()
from this module to surface a `supervisor` health check alongside its
existing checks (DB connectivity, schema version, queue health).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doctor: add supervisor health check

Lane D of PR #364 review fixes. Closes the observability loop: now that
Lane C writes supervisor lifecycle events to
`${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`,
`gbrain doctor` surfaces a `supervisor` check alongside its existing
health indicators.

Implementation (src/commands/doctor.ts, filesystem-only block 3b-bis):
- Resolves DEFAULT_PID_FILE via the same three-tier logic as the start
  path (--pid-file > GBRAIN_SUPERVISOR_PID_FILE > ~/.gbrain/supervisor.pid).
- Reads the PID file + `kill -0 <pid>` for liveness.
- Calls readSupervisorEvents({sinceMs: 24h}) from the audit module to
  derive last_start / crashes_24h / max_crashes_exceeded.
- Suppresses the check entirely when the user has never invoked the
  supervisor (no PID file AND no audit events) — avoids noise on
  installs that don't use the feature.

Status thresholds:
  fail   max_crashes_exceeded event seen in last 24h
         (supervisor gave up; operator needs to restart or triage)
  warn   supervisor not running but audit shows prior use
         (unexpected stop — likely crash or manual kill)
  warn   running but > 3 crashes in last 24h
         (supervisor recovering but worker is unstable)
  ok     running + ≤ 3 crashes + no max_crashes event

All failure paths emit a paste-ready recovery command. Read/import
errors are swallowed (best-effort like the other doctor checks).

Tests: all 127 supervisor + minions tests still green; 13 existing
doctor tests unaffected.

F3 done. All four lanes A/B/C/D are now committed; Lane E (integration
tests) and Lane F (/ship v0.20.2) remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: 4 critical integration tests for supervisor lifecycle

Lane E of PR #364 review fixes (blocker 10). Fills the ~15% coverage
gap flagged in the eng review by actually exercising the code paths
that will break in production — crash-restart loop, max-crashes exit,
SIGTERM-during-backoff, env-var inheritance — via real spawn() calls
against fake shell-script workers. No mocks: real fork, real signals,
real env propagation, real audit file writes.

test/fixtures/supervisor-runner.ts (new, 55 lines):
  A standalone bun script that constructs a MinionSupervisor from env
  vars (SUP_PID_FILE / SUP_CLI_PATH / SUP_MAX_CRASHES / SUP_BACKOFF_FLOOR_MS
  / SUP_HEALTH_INTERVAL_MS / SUP_ALLOW_SHELL_JOBS / SUP_AUDIT_DIR) and
  calls start(). Mock engine returns empty rows for executeRaw (health
  check path still exercised without Postgres). Tests spawn this as a
  subprocess because MinionSupervisor.start() calls process.exit() on
  shutdown — can't run it in the test runner's own process.

test/supervisor.test.ts (existing; 91 → 300 lines):
  - Added IntegrationHarness helper: creates a unique tmpdir per test,
    a fake worker shell script, a PID-file path, and an audit-dir path;
    cleanup runs in finally.
  - spawnSupervisor() forks bun on the runner with env vars set.
  - readAudit() reads the supervisor-YYYY-Www.jsonl file via the
    existing readSupervisorEvents() helper (Lane C), threading
    GBRAIN_AUDIT_DIR through so tests don't collide on ~/.gbrain.
  - waitFor(pred, timeoutMs) polling helper for event-driven tests.
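A waitFor-style polling helper of this shape is a short function; the
signature matches the description above, the body is an assumption.

```typescript
// Polls pred() until it holds or timeoutMs elapses, then throws.
async function waitFor(pred: () => boolean, timeoutMs: number, intervalMs = 25): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!pred()) {
    if (Date.now() > deadline) throw new Error(`waitFor: timed out after ${timeoutMs}ms`);
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}
```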

Four integration tests (with _backoffFloorMs=5 for <1s suite runs):

  1. "respawns the worker after a crash and eventually exits with
     max-crashes code=1"
     Worker always `exit 1`. maxCrashes=3. Asserts: exit code 1, PID
     file cleaned up, audit contains started + 3x worker_spawned +
     3x worker_exited + max_crashes_exceeded + shutting_down + stopped,
     and the stopped event carries {reason:'max_crashes', exit_code:1}.
     Locks in blockers 1 (PID lock), 2+3+6 (health SQL doesn't 500),
     5 (unified shutdown emits right events), F8 (spawn errors counted).

  2. "receives SIGTERM while sleeping between crashes and exits 0 cleanly"
     Worker always `exit 1`, backoff floor 800ms to catch the sleep.
     Asserts: SIGTERM during backoff → exit code 0 (not 1) in <5s,
     no signal kill (process.exit via shutdown), audit contains
     shutting_down {reason:'SIGTERM'} + stopped, PID file cleaned up.
     Locks in eng Issue 1 (unified exit path), eng Issue 3 (signal
     handlers don't accumulate across shutdowns).

  3. "strips inherited GBRAIN_ALLOW_SHELL_JOBS when allowShellJobs=false,
     even if parent has it set"  ⚠ CRITICAL regression test
     Parent env has GBRAIN_ALLOW_SHELL_JOBS=1. SUP_ALLOW_SHELL_JOBS=0.
     Worker writes $GBRAIN_ALLOW_SHELL_JOBS (or 'UNSET' if absent) to
     an OUT_FILE. Asserts child sees 'UNSET'. Locks in codex #9 + eng
     #8: the `else delete env.GBRAIN_ALLOW_SHELL_JOBS` branch from
     Lane A is load-bearing for the supervisor's security posture;
     this test prevents a future refactor silently re-opening the
     inheritance hole.

  4. "DOES pass GBRAIN_ALLOW_SHELL_JOBS to child when allowShellJobs=true"
     Positive-path companion to #3. SUP_ALLOW_SHELL_JOBS=1 → worker
     sees '1'. Confirms the else-branch doesn't over-strip and that
     operators who explicitly opt in still get shell-exec enabled.

Plus two audit-format unit tests:
  - computeSupervisorAuditFilename format (regex match)
  - Year-boundary ISO week: 2027-01-01 → supervisor-2026-W53.jsonl
    (matches the shell-audit.ts pattern exactly)

Before: 7 tests covering backoff math + PID helpers (~15% behavioral
coverage per eng review).
After: 13 tests across all critical lifecycle paths (crash-restart,
max-crashes, SIGTERM, env-inheritance, audit rotation).

All 146 tests in supervisor + minions + doctor suites green in ~8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.20.2)

Lane F of PR #364 review fixes. Closes the multi-lane plan with release
hygiene: VERSION bump 0.19.0 → 0.20.2, package.json sync, CHANGELOG entry
in GStack voice with release summary + "numbers that matter" table +
"To take advantage of v0.20.2" migration block + itemized changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: escape template-literal interpolation in supervisor --help

The --help body in src/commands/jobs.ts is one big backtick template
literal. The supervisor subcommand description I added in Lane B used
both `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}` (parsed as a template
interpolation into an undefined variable) and inline `code` backticks
(parsed as nested template literals). CI caught it with ~200 tsc parse
errors across the file.

Fix:
- Escape `${...}` → `\${...}` so the audit-file path renders literally.
- Replace prose inline-code backticks with plain single-quote fences
  (`gbrain jobs work` → 'gbrain jobs work', `~/.gbrain/supervisor.pid`
  → ~/.gbrain/supervisor.pid). `--help` output is human prose; the
  single-quote form reads cleanly in a terminal without needing to
  smuggle nested backticks through a template literal.

`bunx tsc --noEmit` is clean. 146 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt after Lane B doc rewrite

CI drift guard caught that `llms-full.txt` didn't match the current
generator output. Root cause: the Lane B rewrite of
`docs/guides/minions-deployment.md` (supervisor as canonical, watchdog
deleted) changed content that gets inlined into `llms-full.txt`, but I
didn't run `bun run build:llms` to regenerate.

`bun test test/build-llms.test.ts` now clean (7/7 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 28, 2026
…step 10/15)

Issue #10 of the eng review: getStorageStatus and runStorageStatus mixed
data gathering, JSON serialization, and human-readable output in one
function. Hard to test, hard to reuse, and out of step with the orphans.ts
pattern that CLAUDE.md cites as the precedent.

Now three pure functions + a thin dispatcher:

  getStorageStatus(engine, repoPath) — async, returns StorageStatusResult.
    Side effects: engine.listPages + one walkBrainRepo (Issue #14).
    Exported so MCP exposure (D14) and gbrain doctor (D13) can consume the
    same data without re-running the loop.

  formatStorageStatusJson(result) — pure, returns indented JSON. Stable
    contract on the StorageStatusResult shape, suitable for orchestrators.

  formatStorageStatusHuman(result) — pure, returns ASCII text (D10 — no
    unicode box-drawing). Composable into other commands later.

  runStorageStatus(engine, args) — thin dispatcher: parses --repo /
    --json, calls getStorageStatus, picks a formatter, prints.

8 new test cases on the formatters: JSON parse round-trip, null-config
fallback, missing-files capped at 10 with rollup, ASCII-only assertion
(D10 regression guard), warnings inline, configuration listing, and
disk-usage block omitted when zero bytes.

The StorageStatusResult interface is now exported as a public type, so
gbrain doctor's storage_tiering check can build its own findings from
the same shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 30, 2026
)

* feat: storage tiering — git-tracked vs supabase-only directories

Brain repos are scaling to 200K+ files. Bulk data (tweets, articles, transcripts)
bloats git repos and slows operations. New storage config in gbrain.yml lets
users declare git-tracked and supabase-only directories.

Changes:
- New config: storage.git_tracked and storage.supabase_only in gbrain.yml
- gbrain sync auto-manages .gitignore for supabase-only paths
- gbrain export --restore-only restores missing supabase-only files from DB
- New gbrain storage status command shows tier breakdown
- Config validation warns on conflicts
- 8 tests passing, full docs at docs/storage-tiering.md

Backward compatible — systems without gbrain.yml work unchanged.

* feat: add getDefaultSourcePath() typed accessor (step 1/15)

Single source of truth for "what brain repo are we operating against?"
Replaces ad-hoc raw SQL in storage.ts:38 (Issue #3 of eng review). Used by
both gbrain storage status and gbrain export --restore-only.

Returns null on miss, throws on DB error. Composes with the existing
resolveSourceId chain so it honors --source flag / GBRAIN_SOURCE env /
.gbrain-source dotfile / longest-prefix CWD match / brain-level default.

4 new test cases covering happy path, missing local_path, DB error
propagation, and CWD-prefix resolution priority.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: replace gray-matter with dedicated YAML parser (step 2/15)

The original storage-config.ts called gray-matter on a delimiter-less YAML
file. Gray-matter only parses YAML inside `---` frontmatter blocks; without
delimiters, it returns `{data: {}}`. Result: loadStorageConfig() always
returned null, so the entire feature was a silent no-op for every user.

Original eng review's P0 confidence-9 finding (Issue #1).

Replaces gray-matter with a small dedicated parser for the gbrain.yml shape
(top-level `storage:` section, two array-valued nested keys). Yaml-lite was
considered first, but its flat key:value design doesn't handle nested
arrays. The dedicated parser is ~50 lines and trades expressiveness for
zero-dep, predictable parsing of a file format we control.

Adds the Issue #1B sanity warning (locked B): when gbrain.yml exists but
has no storage section (or empty arrays), warn once-per-process so the
user sees their config didn't take. The single test that would have caught
the original P0 — write a real gbrain.yml, call loadStorageConfig, assert
non-null — now exists.

Also tightens loadStorageConfig per D36: distinguishes "absent" (silent
null) from "unreadable" (throws). The previous code silently swallowed
read errors, hiding broken installs.

8 new test cases: real-disk happy path, comments + blank lines, quoted
values, missing storage section warning, empty section warning,
once-per-process warning suppression, unreadable file behavior, and the
existing helper tests (validation, tier matching, edge cases) all still
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
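The dedicated-parser trade-off reads clearly in a sketch. This is a simplified illustration of the idea, not the shipped storage-config.ts — it uses the later canonical key names and skips the warning and deprecation machinery:

```typescript
// Minimal parser for exactly the gbrain.yml shape: a top-level `storage:`
// section with two array-valued keys. Illustrative names and behavior.
interface StorageConfig { db_tracked: string[]; db_only: string[] }

function parseStorageYaml(text: string): StorageConfig | null {
  const config: StorageConfig = { db_tracked: [], db_only: [] };
  let inStorage = false;
  let currentKey: keyof StorageConfig | null = null;
  for (const raw of text.split("\n")) {
    const line = raw.replace(/#.*$/, "").trimEnd(); // strip comments
    if (!line.trim()) continue;                      // skip blank lines
    if (/^storage:\s*$/.test(line)) { inStorage = true; continue; }
    if (!/^\s/.test(line)) { inStorage = false; continue; } // other top-level key
    if (!inStorage) continue;
    const keyMatch = line.match(/^\s+(db_tracked|db_only):\s*$/);
    if (keyMatch) { currentKey = keyMatch[1] as keyof StorageConfig; continue; }
    const itemMatch = line.match(/^\s+-\s+['"]?([^'"]+)['"]?\s*$/);
    if (itemMatch && currentKey) config[currentKey].push(itemMatch[1]);
  }
  const empty = config.db_tracked.length === 0 && config.db_only.length === 0;
  return empty ? null : config; // mirrors "no storage section" → null
}
```

About 25 lines for a format the project controls — the "trades expressiveness for zero-dep, predictable parsing" argument in concrete form.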

* refactor: rename storage keys to db_tracked/db_only (step 3/15)

The vendor-specific names "supabase_only" and "git_tracked" hardcoded a
backend (Supabase) into the config schema. gbrain ships two engines —
PGLite and Postgres-via-Supabase. The canonical distinction is "lives in
the brain DB only" vs "lives in the brain DB and on disk under git." Both
work on either engine.

Renamed throughout (Issue #4 of eng review):
  git_tracked    → db_tracked
  supabase_only  → db_only
  isGitTracked() → isDbTracked()
  isSupabaseOnly() → isDbOnly()
  StorageTier 'git_tracked'/'supabase_only' → 'db_tracked'/'db_only'

Backward compatibility (D3 lock):
  loadStorageConfig accepts both shapes. Loader resolution order per the
  eng-review pass-2 finding: parse YAML → if canonical keys present use
  them, else if deprecated keys present map to canonical AND emit
  once-per-process deprecation warning → THEN run validation.
  Validation always sees the canonical shape so error messages reference
  db_tracked/db_only regardless of which keys the user wrote.

  The deprecation warning suggests `gbrain doctor --fix` for an automated
  rename (D72 — fix path lands in step 7).

  When both shapes coexist in one file, canonical wins and a stronger
  warning fires ("deprecated keys ignored — remove them").

Aliases isGitTracked/isSupabaseOnly kept for now to avoid churning the
sync.ts / export.ts / storage.ts call sites in this commit; they'll be
removed in a follow-up step. Storage.ts's tier-bucket initializers and
output strings updated. ASCII output replaces unicode box-drawing per D10.

gbrain.yml example file updated to canonical keys with explanatory
comments.

2 new test cases: deprecated-key fallback (asserts both shapes load
correctly with warning), canonical-wins-over-deprecated (asserts the
"both shapes coexist" path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add slugPrefix to PageFilters with engine-side filter (step 4/15)

Issue #13 of the eng review: storage.ts and export.ts loaded every page
in the brain (limit: 1_000_000) to check tier membership. On the 200K-page
brains this feature targets, that's the wall-clock and memory landmine
the feature exists to fix.

Adds an optional `slugPrefix` field to PageFilters. Both engines implement
it as `WHERE slug LIKE prefix || '%' ESCAPE '\'`, with literal escaping of
LIKE metacharacters (%, _, \) so user-supplied prefixes like `media/x/`
are treated as exact string prefixes.

Performance: the (source_id, slug) UNIQUE constraint on the pages table
gives both engines a btree index that supports LIKE-prefix range scans.
An EXPLAIN on Postgres confirms the index range scan rather than a seq
scan. PGLite has the same index shape via pglite-schema.ts.

Consumers updated:
  - export.ts: --slug-prefix flag now goes engine-side (no in-memory
    .filter(...)). The --restore-only path queries each db_only directory
    with slugPrefix in a loop instead of one full-table scan, with seen-set
    deduplication and disk-existence check inline.
  - storage.ts: keeps the full-scan path because storage-status needs the
    "unspecified" bucket count, which can't be computed without enumerating
    every page. Comment notes that step 5 (single-walk filesystem scan)
    will reduce per-page disk syscall cost.

2 new test cases on PGLiteEngine: slugPrefix happy path (3 tier dirs,
asserts only matching slugs return) and metacharacter escape regression
(asserts safe/ doesn't match unrelated slugs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
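The metacharacter escaping is the subtle part of the change above. A minimal sketch of the idea — escapeLikePrefix is an illustrative name, not necessarily the real helper:

```typescript
// Escape LIKE metacharacters (%, _, \) in a user-supplied slug prefix so
// the engine-side `WHERE slug LIKE $1 || '%' ESCAPE '\'` query treats the
// prefix as an exact string prefix rather than a pattern.
function escapeLikePrefix(prefix: string): string {
  return prefix.replace(/[\\%_]/g, (ch) => "\\" + ch);
}
```

Without this, a prefix like `media/x_` would match `media/xa/...` too, since `_` is a single-character wildcard in LIKE.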

* perf: single-walk filesystem scan via walkBrainRepo() (step 5/15)

Issue #14 of the eng review: storage.ts called existsSync + statSync
per-page in a synchronous loop. On a 200K-page brain that's 400K syscalls
serialized. Wall-clock landmine.

Adds src/core/disk-walk.ts with walkBrainRepo(repoPath) — one recursive
readdirSync walk, builds a Map<slug, {size, mtimeMs}>. Storage.ts looks
up each DB page in the map (O(1)) instead of stat-checking on demand.
Slug derivation matches the pages-table convention: people/alice.md on
disk becomes people/alice as the map key.

Skipped during walk:
  - dot-directories (.git, .gbrain, .vscode, etc) — not part of the brain
    namespace
  - node_modules — guards against accidentally walking into imported repos
  - non-.md files (sidecar JSON, binaries) — tracked by the brain through
    the files table, not by slug

Reusable: future commands (gbrain doctor's storage_tiering check, the
optional autopilot tier-fix path) get the same walk for free.

9 new test cases: empty dir, nonexistent dir, top-level files, nested
dirs, dot-dir skipping, node_modules skipping, non-.md filtering, size
capture, mtimeMs capture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: path-segment matching for tier directories (step 6/15)

Issue #5 + D6 of the eng review: tier matching used slug.startsWith(dir),
which falsely matches 'media/xerox/foo' against 'media/x' if a user wrote
the directory without a trailing slash.

The new matcher requires the configured directory to end with `/` and
treats it as a canonical path-segment ancestor:

  media/x/   matches  media/x/tweet-1       ✓
  media/x/   doesn't  media/xerox/foo       ✗
  media/x    refused  media/x/tweet-1       (matcher requires trailing /)

Non-canonical input (no trailing slash) is refused outright. Step 7's
auto-normalizing validator converts user-written 'media/x' → 'media/x/'
on load, so the matcher never sees non-canonical input from real configs.
The behavior tested here is the strict matcher's contract.

Regression test pins the media/xerox collision case explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
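The strict matcher's contract fits in a few lines. A sketch under the commit's stated rules (function name is illustrative):

```typescript
// Configured tier directory must end with '/'; matching is then a plain
// prefix test on the canonical form, which rules out the media/x vs
// media/xerox collision by construction.
function matchesTier(slug: string, dir: string): boolean {
  if (!dir.endsWith("/")) {
    // Non-canonical input refused outright; step 7's validator normalizes
    // user-written 'media/x' → 'media/x/' before the matcher ever runs.
    throw new Error(`tier directory must end with '/': ${dir}`);
  }
  return slug.startsWith(dir);
}
```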

* feat: auto-normalize trailing-slash, throw on tier overlap (step 7/15)

D7+D8 of the eng review: validation was warnings-only. Users miss warnings.
Now:

  - Cosmetic: missing trailing slash auto-corrected, one-time info note
    showing what changed ("normalized 2 storage paths: 'people' →
    'people/', 'media/x' → 'media/x/'"). Once-per-process to keep noise low.

  - Semantic: same directory in both tiers throws StorageConfigError.
    Ambiguous routing — does media/ win as db_tracked or db_only? — is a
    real bug the user must fix. Caller propagates to the CLI for a clean
    exit-1 with actionable message.

loadStorageConfig now applies normalize+validate after merging deprecated
keys, so the path-segment matcher (step 6) only ever sees canonical
trailing-slash directories.

The pure validateStorageConfig kept for callers who want the warnings list
without the auto-fix side effects (gbrain doctor's reporting path).

2 new test cases: auto-normalize round-trip with warning text assertion,
overlap throws StorageConfigError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
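The cosmetic-vs-semantic split above can be sketched directly. Illustrative shapes — the real validator also threads warnings and once-per-process notes:

```typescript
class StorageConfigError extends Error {}

// Cosmetic: missing trailing slash auto-corrected.
// Semantic: same directory in both tiers throws — ambiguous routing is a
// real bug the user must fix.
function normalizeAndValidate(
  dbTracked: string[],
  dbOnly: string[],
): { db_tracked: string[]; db_only: string[] } {
  const norm = (d: string) => (d.endsWith("/") ? d : d + "/");
  const tracked = dbTracked.map(norm);
  const only = dbOnly.map(norm);
  for (const dir of tracked) {
    if (only.includes(dir)) {
      throw new StorageConfigError(`'${dir}' appears in both db_tracked and db_only`);
    }
  }
  return { db_tracked: tracked, db_only: only };
}
```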

* fix: wire manageGitignore into runSync, only on success (step 8/15)

Issue #2 of the eng review: manageGitignore was defined and never
invoked. Docs claimed "auto-managed by gbrain" — false. Users hit a
.gitignore that never updated and committed db_only directories anyway.

Wire-up: runSync now calls manageGitignore after each successful
performSync return, in both watch and one-shot modes.

Eng review pass-2 finding #1: skip on dry_run AND blocked_by_failures
status. A sync that aborted partway has stale state; mutating .gitignore
based on a partially-loaded config invites drift. Failure-skip test
added (uses .gitignore-as-a-directory to simulate write failure;
asserts warning fired and disk wasn't corrupted).

Hardened manageGitignore itself with three additional behaviors:

  - GBRAIN_NO_GITIGNORE=1 escape hatch (D23) for shared-repo setups
    where a maintainer wants gbrain to leave .gitignore alone.

  - Submodule detection (D49). When repoPath/.git is a regular file
    (gitdir: ... pointer), the repo is a git submodule. Submodule
    .gitignore changes don't survive parent submodule updates, so we
    skip with an actionable warning ("add db_only directories to your
    parent repo's .gitignore manually").

  - Graceful failure (D9). Read errors, write errors, and
    StorageConfigError (overlap from step 7) all log a warning and
    return — sync's primary job (moving data) shouldn't die because of
    a side-effect on .gitignore.

manageGitignore is now exported (previously private) so the
storage-sync test file can hit it directly without spinning up sync.

9 new test cases: no-op without gbrain.yml, no-op with empty db_only,
happy-path append, idempotency (run twice, single entry), preservation
of user-written rules, GBRAIN_NO_GITIGNORE skip, submodule skip,
.git-directory normal path, write-failure graceful warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: D5 resolution chain for --restore-only and storage status (step 9/15)

D5 of the eng review: gbrain export --restore-only without --repo
silently fell through to the regular export path, dumping every page in
the database to the wrong directory. Hard regression risk.

Now exits 1 with an actionable message when --restore-only has no
--repo AND no configured default source. Resolution order:
  1. Explicit --repo flag
  2. Typed sources.getDefault() (reuses step 1's accessor)
  3. Hard error — never fall through to cwd

storage.ts:38 also bypassed BrainEngine with raw SQL and a bare
try/catch (Issue #3 + Issue #9). Replaced with the same typed
getDefaultSourcePath() — single source of truth, errors propagate
cleanly to the user, no silent cwd fallback.

Regular export (no --restore-only) keeps its current behavior per D26:
exports include everything, --repo is optional.

4 new test cases on PGLite in-memory:
  - hard-errors with no --repo + no default
  - explicit --repo wins
  - falls back to sources default local_path
  - non-restore export does not require --repo

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: split storage.ts into pure data + JSON + human formatters (step 10/15)

Issue #10 of the eng review: getStorageStatus and runStorageStatus mixed
data gathering, JSON serialization, and human-readable output in one
function. Hard to test, hard to reuse, and out of step with the orphans.ts
pattern that CLAUDE.md cites as the precedent.

Now three pure functions + a thin dispatcher:

  getStorageStatus(engine, repoPath) — async, returns StorageStatusResult.
    Side effects: engine.listPages + one walkBrainRepo (Issue #14).
    Exported so MCP exposure (D14) and gbrain doctor (D13) can consume the
    same data without re-running the loop.

  formatStorageStatusJson(result) — pure, returns indented JSON. Stable
    contract on the StorageStatusResult shape, suitable for orchestrators.

  formatStorageStatusHuman(result) — pure, returns ASCII text (D10 — no
    unicode box-drawing). Composable into other commands later.

  runStorageStatus(engine, args) — thin dispatcher: parses --repo /
    --json, calls getStorageStatus, picks a formatter, prints.

8 new test cases on the formatters: JSON parse round-trip, null-config
fallback, missing-files capped at 10 with rollup, ASCII-only assertion
(D10 regression guard), warnings inline, configuration listing, and
disk-usage block omitted when zero bytes.

The StorageStatusResult interface is now exported as a public type, so
gbrain doctor's storage_tiering check can build its own findings from
the same shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* types: distinct PageCountsByTier and DiskUsageByTier (step 11/15)

Issue #11 of the eng review: pagesByTier (page counts) and
diskUsageByTier (byte totals) shared the same structural type
(Record<StorageTier, number>). Both are tier-keyed numeric maps but
carry semantically different units. A future bug that swaps them at a
call site (e.g., displaying disk bytes where the count belongs) wouldn't
trip the compiler.

Replaced with distinct nominal types via a brand field. Structurally
identical at runtime (no overhead) but compile-time disjoint —
TypeScript catches accidental cross-assignment.

  PageCountsByTier   { db_tracked, db_only, unspecified } : numbers (count)
  DiskUsageByTier    { db_tracked, db_only, unspecified } : numbers (bytes)

Both initialized in getStorageStatus, both threaded into
StorageStatusResult, both consumed by formatStorageStatusHuman /
formatStorageStatusJson without further changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
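One common way to get nominal types via a brand field — a sketch of the pattern, not necessarily the exact shipped shape:

```typescript
type StorageTier = "db_tracked" | "db_only" | "unspecified";

// Phantom __brand fields: structurally identical Records at runtime, but
// compile-time disjoint, so swapping counts for bytes at a call site is a
// type error instead of a silent display bug.
type PageCountsByTier = Record<StorageTier, number> & { readonly __brand?: "pageCounts" };
type DiskUsageByTier = Record<StorageTier, number> & { readonly __brand?: "diskUsage" };

const counts: PageCountsByTier = { db_tracked: 3, db_only: 12, unspecified: 1 };
const bytes: DiskUsageByTier = { db_tracked: 40_960, db_only: 9_812_000, unspecified: 512 };
// const oops: DiskUsageByTier = counts; // compile error — brands differ
```

The optional brand keeps construction ergonomic (plain object literals still assign) while blocking cross-assignment between the two aliases.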

* feat: PGLite soft-warn + full lifecycle test (step 12/15)

D4: storage tiering on PGLite is a partial feature. The "DB" the pages
live in IS the local file gbrain uses for everything else, so "db_only"
has no real offload effect. The .gitignore management still helps
(keeps bulk content out of git history), so we warn and proceed —
not refuse.

Two warning sites (once-per-process each via module-local flags):
  - storage status: warns at runStorageStatus entry
  - sync: warns inside manageGitignore when engineKind='pglite' and
    config has db_only entries

Both phrased actionably ("To get full tiering, migrate to Postgres
with `gbrain migrate --to supabase`").

manageGitignore signature now takes an optional `engineKind` param.
runSync passes engine.kind. Stand-alone callers (tests, future
gbrain doctor --fix path) can omit it.

New test: test/storage-pglite.test.ts — D8 + D4 lifecycle. 6 cases:
engine.kind assertion, getStorageStatus loading gbrain.yml + reporting
tier counts, manageGitignore PGLite-warn (once per process), Postgres
no-warn, slugPrefix on PGLite, end-to-end (config + putPage + status
+ gitignore).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: add trailing-newline CI guard (step 14/15)

Issue #7 of the eng review: all four new files in the original
storage-tiering branch lacked POSIX trailing newlines. Linters complain,
git diffs phantom-flag every future edit. We've been adding newlines as
each file landed; this commit catches the regression class.

scripts/check-trailing-newline.sh:
  - sibling to check-jsonb-pattern.sh / check-progress-to-stdout.sh per
    CLAUDE.md's CI guard pattern
  - portable to bash 3.2 (macOS default; no mapfile, no associative arrays)
  - covers src/**, test/**, gbrain.yml, top-level *.md
  - reports each missing file by path and exits 1

Wired into `bun run test` between progress-to-stdout and typecheck.

Also fixed docs/storage-tiering.md (pre-existing missing newline from
the original branch — caught by the new guard on first run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.23.0 — VERSION, CHANGELOG, README, CLAUDE.md, storage-tiering.md (step 15/15)

VERSION → 0.23.0 (minor bump for new feature surface).

CHANGELOG entry in Garry voice with the canonical format:
  - Two-line bold headline ("Storage tiering, finally working...")
  - Lead paragraph naming what was broken before and what users get now
  - "Numbers that matter" before/after table for the 6 things that
    actually changed
  - "What this means for your brain" closer
  - "To take advantage of v0.23.0" self-repair block (per CLAUDE.md
    convention) — 6 numbered steps users can follow
  - Itemized changes split into critical fixes / new+renamed surface /
    architecture cleanup / tests + CI guards

CLAUDE.md "Key files" gains four new entries: storage-config.ts,
disk-walk.ts, the v0.23.0 storage.ts shape, and gbrain.yml itself.

README.md gains a new "Storage tiering" section between Skillify and
Getting Data In with the canonical example + commands + link to the
full guide.

docs/storage-tiering.md rewritten end-to-end with canonical key names
(db_tracked / db_only), v0.23.0 hardening details (idempotency,
submodule detection, GBRAIN_NO_GITIGNORE, dry-run gating), the
resolution chain for --restore-only, the auto-normalize +
throw-on-overlap validator, and the PGLite engine note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: e2e Postgres lifecycle for storage tiering (step 16/16)

Per the v0.23.0 plan: full lifecycle E2E against real Postgres.

  - engine.kind === 'postgres' assertion
  - Full lifecycle: write 4 pages (1 db_tracked, 2 db_only, 1 unspecified)
    → getStorageStatus reports correct tier counts → human formatter
    renders → manageGitignore writes managed block → idempotency check
    → getDefaultSourcePath() resolves the configured local_path.
  - Container restart simulation: 2 db_only pages in DB, files missing
    on disk → status.missingFiles.length === 2 → slugPrefix engine
    filter on Postgres returns exactly the tier slugs.
  - slugPrefix index-based range scan regression: 50 media/x/* + 50
    people/p-* pages → slugPrefix='media/x/' returns exactly 50.
  - getDefaultSourcePath returns null when default source has no
    local_path (the hard-error path that replaces the original silent
    cwd fallback).
  - manageGitignore on Postgres engine does NOT emit the PGLite
    soft-warn (cross-engine assertion).

Skips gracefully when DATABASE_URL is unset, per CLAUDE.md E2E pattern.
Run via: DATABASE_URL=... bun test test/e2e/storage-tiering.test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump version 0.23.0 → 0.22.9

Reverts the minor bump back to a patch-style version on the v0.22 line.
Storage tiering ships within the v0.22.x train alongside the recent
fix waves. Updates VERSION, package.json, CHANGELOG header + body refs,
CLAUDE.md Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* chore: bump version 0.22.9 → 0.22.11

Sibling workspaces claimed v0.22.10 in the queue. This branch advances
to v0.22.11 to keep the version monotonic on master.

Updates VERSION, package.json, CHANGELOG header + body refs, CLAUDE.md
Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* fix: address Codex pre-landing review findings (4 fixes)

Codex found 4 real issues during pre-landing review of v0.22.11 diff:

[P0] export --restore-only fell through to full export when
storageConfig was null (no gbrain.yml present). On older or
misconfigured brains, the recovery command would silently dump the
entire database. src/commands/export.ts now refuses with an actionable
error before any page query fires — matches the D5 lock spirit
("never silently fall through").

[P1] manageGitignore wire-up only fired when --repo was passed
explicitly. performSync resolves the repo from sync.repo_path or
sources.local_path, so the common `gbrain sync` path (after
setup, no flag) never updated .gitignore. src/commands/sync.ts now
uses the same source-resolver chain as the rest of /ship: opts.repoPath
→ getDefaultSourcePath → null. Fires in both watch and one-shot modes.

[P2] getDefaultSourcePath only consulted sources.local_path, missing
the legacy global sync.repo_path config key that pre-v0.18 brains use.
Added a fallback to engine.getConfig('sync.repo_path') when the
sources row has NULL local_path. Pre-v0.18 brains now work without
forcing a `gbrain sources add . --path .` migration.

[P2] sync --all multi-source loop never called manageGitignore even
though src.local_path was already known. Each source now gets its own
gitignore update on successful sync.

Tests:
  - test/storage-export.test.ts: replaced the old "falls through to
    full export" test with one that asserts the new refusal path
    (storage-tiering config required for --restore-only).
  - test/source-resolver.test.ts: added a fallback test exercising the
    legacy sync.repo_path code path for pre-v0.18 brains.
  - All 78 storage-tiering tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms.txt + llms-full.txt for v0.22.11

Per CLAUDE.md: "Run `bun run build:llms` after adding a new doc."
The README's new Storage tiering section + the rewritten
docs/storage-tiering.md changed the inlined bundle. test/build-llms.test.ts
catches the drift and was failing on master pre-regen.

* fix: typecheck error in disk-walk.ts (CI #73350475897)

tsc --noEmit failed in CI because ReturnType<typeof readdirSync> with
withFileTypes:true picks an overload union that includes
Dirent<Buffer<ArrayBufferLike>>. Strict tsc treats entry.name as Buffer,
so .startsWith / .endsWith / string comparisons all blew up.

Annotate the variable as Dirent[] (string-based) and cast through unknown,
matching the pattern sync.ts already uses for its own filesystem walk.
Same runtime behavior; clean typecheck.

Tests still 9/9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 1, 2026
…old)

src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.

src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.

src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.

src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.

Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
  questions empty, success → cooldown ts written, cooldown blocks rerun,
  budget exhausted → partial. drift not_enabled, soft-band candidate
  detection, complete + dry-run paths.

All pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
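The meter's gate logic is small enough to sketch. Prices, model names, and shapes below are illustrative only — the real budget-meter.ts also writes the JSONL audit ledger and warns once on unpriced models:

```typescript
// Hypothetical USD-per-1M-token prices; not real provider pricing.
const PRICE_PER_MTOK_USD: Record<string, { input: number; output: number }> = {
  "example-model": { input: 3, output: 15 },
};

class BudgetMeter {
  private spentUsd = 0;
  constructor(private capUsd: number) {}

  /** True if the submit fits. Budget 0 disables the gate; unpriced models bypass. */
  trySubmit(model: string, inputTokens: number, maxOutputTokens: number): boolean {
    const price = PRICE_PER_MTOK_USD[model];
    if (!price) return true; // unpriced bypass (warn-once in the real meter)
    const maxCost =
      (inputTokens * price.input + maxOutputTokens * price.output) / 1_000_000;
    if (this.capUsd > 0 && this.spentUsd + maxCost > this.capUsd) return false;
    this.spentUsd += maxCost; // cumulative per-cycle ledger
    return true;
  }
}
```

The key property: the gate refuses *before* the call fires, on the projected max cost, so a cycle can never overshoot its cap by one runaway submit.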
garrytan added a commit that referenced this pull request May 7, 2026
…low-list (#563)

* v0.28 schema: takes + synthesis_evidence (v31) + access_tokens.permissions (v32)

Migration v31 adds the takes table (typed/weighted/attributed claims) and
synthesis_evidence (provenance for `gbrain think` outputs). Page-scoped via
page_id FK (slug isn't unique alone in v0.18+ multi-source). HNSW partial
index on embedding for active rows. ON DELETE CASCADE on synthesis_evidence
so deleting a source take cascades the provenance row.

Migration v32 adds access_tokens.permissions JSONB with safe-default
backfill (`{"takes_holders":["world"]}`). Default keeps non-world holders
hidden from MCP-bound tokens until the operator explicitly grants access
via the v0.28 auth permissions CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 engine: addTakesBatch, listTakes, searchTakes/Vector, supersede, resolve, synthesis_evidence

Extends BrainEngine with the takes domain object. Both engines implement the
same surface; PGLite uses manual `$N` placeholders, Postgres uses postgres-js
unnest() — same shape as addLinksBatch and addTimelineEntriesBatch.

Methods:
- addTakesBatch (upsert via ON CONFLICT (page_id, row_num) DO UPDATE)
- listTakes (filter by holder/kind/active/resolved, takesHoldersAllowList
  for MCP-bound calls, sortBy weight/since_date/created_at)
- searchTakes / searchTakesVector (pg_trgm + cosine; honor allow-list)
- countStaleTakes / listStaleTakes (mirror countStaleChunks pattern;
  embedding column intentionally omitted from listStale payload)
- updateTake (mutable fields only; throws TAKE_ROW_NOT_FOUND)
- supersedeTake (transactional: insert new at next row_num, mark old
  active=false, set superseded_by; throws TAKE_RESOLVED_IMMUTABLE on
  resolved bets)
- resolveTake (sets resolved_*; throws TAKE_ALREADY_RESOLVED on re-resolve;
  resolution is immutable per Codex P1 #13 fold)
- addSynthesisEvidence (provenance persist; ON CONFLICT DO NOTHING)
- getTakeEmbeddings (parallel to getEmbeddingsByChunkIds)

Types live in src/core/engine.ts adjacent to LinkBatchInput. Page-scoped
via page_id (slug not unique in v0.18+ multi-source). PageType gains
'synthesis'. takeRowToTake mapper in utils.ts handles Date → ISO string
normalization.

Tests: test/takes-engine.test.ts — 16 cases against PGLite covering
upsert/list/filter/search happy paths, takesHoldersAllowList isolation,
the four invariant errors (TAKE_ROW_NOT_FOUND, TAKES_WEIGHT_CLAMPED,
TAKE_RESOLVED_IMMUTABLE, TAKE_ALREADY_RESOLVED), supersede flow, resolve
metadata round-trip, FK CASCADE on synthesis_evidence when source take
deletes. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 model-config: unified resolveModel with 6-tier precedence + alias resolution

Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:

  1. CLI flag (--model)
  2. New-key config (e.g. models.dream.synthesize)
  3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
     — read with stderr deprecation warning, one-per-process
  4. Global default (models.default)
  5. Env var (GBRAIN_MODEL or caller-supplied)
  6. Hardcoded fallback

Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.

When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.

Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts to be migrated later in v0.28
(staying compatible until then).

Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes-fence: parser/renderer/upserter + chunker strip (privacy P0 fix)

src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from `<!--- gbrain:takes:begin/end -->`
  blocks. Strict on canonical form, lenient on hand-edits with warnings
  (TAKES_FENCE_UNBALANCED, TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION).
  Strikethrough `~~claim~~` → active=false; date ranges `since → until`
  split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
  fresh `## Takes` section if no fence present. row_num is monotonic
  (max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
  stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through old row's claim
  AND appends the new row at end. Both rows preserved in markdown for
  git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
  chunker so takes content lives ONLY in the takes table.
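A minimal sketch of the parse side, assuming the fence markers above and a `| # | claim | ... |` row shape. The real parser also emits the warning codes, splits date ranges, and carries kind/holder/weight columns; this shows only row extraction and strikethrough handling:

```typescript
// Sketch of fenced-takes parsing. Marker strings and row shape are
// assumptions based on the commit message, not gbrain's exact format.
const BEGIN = "<!--- gbrain:takes:begin -->";
const END = "<!--- gbrain:takes:end -->";

interface ParsedTake {
  rowNum: number;
  claim: string;
  active: boolean; // strikethrough `~~claim~~` marks superseded rows
}

function parseTakesFence(body: string): ParsedTake[] {
  const start = body.indexOf(BEGIN);
  const end = body.indexOf(END);
  // Unbalanced fence: real impl warns TAKES_FENCE_UNBALANCED; sketch returns [].
  if (start === -1 || end === -1 || end < start) return [];
  const takes: ParsedTake[] = [];
  for (const line of body.slice(start + BEGIN.length, end).split("\n")) {
    // Table rows look like: | 3 | some claim | ...  Header/separator rows
    // have no leading row number and are skipped.
    const m = line.match(/^\|\s*(\d+)\s*\|\s*(.+?)\s*\|/);
    if (!m) continue;
    const struck = m[2].startsWith("~~") && m[2].endsWith("~~");
    takes.push({
      rowNum: Number(m[1]),
      claim: struck ? m[2].slice(2, -2) : m[2],
      active: !struck,
    });
  }
  return takes;
}
```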

Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be
bypassed at the index layer (token bound to takes_holders=['world'] would
see garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.

Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 page-lock: PID-liveness file lock for atomic markdown read-modify-write

src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent `gbrain takes
add` calls or `takes seed --refresh` from autopilot can't race on the
same `<slug>.md` read-modify-write. Eng-review fold: reuses the v0.17
cycle.lock pattern (mtime + PID liveness) but per-slug.

Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
  reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
  in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout

API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally

Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release
is no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode.
All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 extract-takes: dual-path phase (fs|db) + since/until_date as TEXT

src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:

- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
  compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)

Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
  slate when markdown is canonical and DB has drifted)

Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.

Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.

Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes CLI: list, search, add, update, supersede, resolve

src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:

  takes <slug>                          list with filters + sort
  takes search "<query>"                pg_trgm keyword search across all takes
  takes add <slug> --claim ... ...      append (markdown + DB, atomic via lock)
  takes update <slug> --row N ...       mutable-fields update (markdown + DB)
  takes supersede <slug> --row N ...    strikethrough old + append new
  takes resolve <slug> --row N --outcome  record bet resolution (immutable)

Markdown is canonical. Every mutate command:
  1. acquires the per-page file lock (withPageLock)
  2. re-reads the .md file
  3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
  4. writes the .md file back
  5. mirrors to the DB via the engine method
  6. releases the lock (auto via finally)

Resolve currently writes only to DB — surfacing resolved_* in the markdown
table is deferred to v0.29 (the takes-fence renderer's column set is
fixed at # | claim | kind | who | weight | since | source per spec).

Wired into src/cli.ts dispatch + CLI_ONLY allowlist. Help text follows the
project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 MCP + auth: takes_list / takes_search / think ops + per-token allow-list

OperationContext gains takesHoldersAllowList — server-side filter for
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (chunker
strip already closed the page-content side in the previous commit).

src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
  ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors allow-list
- think: op surface registered (returns not_implemented envelope until
  Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.
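The server-side filter amounts to ANDing the allow-list into the SQL so private holders never leave the server; a sketch with assumed table and column names:

```typescript
// Sketch of the allow-list threading. An undefined allow-list models the
// local-CLI path (remote=false): no filter, operator sees all holders.
// An MCP-bound call always carries a list; empty overlap returns empty
// rows, never falls back to public. Table/column names are assumptions.
function buildTakesListQuery(
  allowList: string[] | undefined,
): { sql: string; params: unknown[] } {
  if (allowList === undefined) {
    return { sql: "SELECT * FROM takes WHERE active", params: [] };
  }
  return {
    sql: "SELECT * FROM takes WHERE active AND holder = ANY($1)",
    params: [allowList],
  };
}
```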

src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.

src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList passed to dispatchToolCall.

src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).

src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.

Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving
the threading: local-CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors allow-list, remote save/take on think rejected with
not_implemented envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.0: ship-prep — VERSION, CHANGELOG, migration orchestrator, skill

Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:

- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
  applied by the schema runner during gbrain upgrade); fails clean if not.
- Backfill takes: inline runs `extractTakes(engine, { source: 'db' })` so
  any pre-existing fenced takes tables in markdown populate the takes
  index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
  to re-import pages with takes content so the v0.28 chunker-strip rule
  (Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
  already have takes content stripped from chunks at index time; this
  TODO catches up legacy pages.

skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.

CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).

Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 think pipeline: gather → sanitize → synthesize → cite-render → CLI

src/core/think/sanitize.ts — prompt-injection defense for take claims:
14 jailbreak patterns (ignore-prior, role-jailbreak, close-take tag,
DAN, system-prompt overrides, eval-shell hooks) plus structural framing
(takes wrapped in <take id="..."> tags the model is told to treat as
DATA). Length-cap at 500 chars. Renders evidence blocks for the prompt.
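A sketch of the shape of this defense; the three patterns here are illustrative stand-ins for the real 14-entry list, and the `<take>` rendering is a simplified version of the structural framing:

```typescript
// Sketch of claim sanitization: pattern flagging, 500-char cap, and
// <take id> framing so the model treats claim text as DATA. The pattern
// list is illustrative, not the actual 14 entries.
const INJECTION_PATTERNS = [
  /ignore (all )?(prior|previous) instructions/i,
  /<\/take>/i, // attempt to close the data framing early
  /you are now (DAN|.*jailbroken)/i,
];

function sanitizeClaim(claim: string): { text: string; flagged: boolean } {
  const flagged = INJECTION_PATTERNS.some((p) => p.test(claim));
  return { text: claim.slice(0, 500), flagged }; // length cap at 500 chars
}

function renderTakeBlock(id: string, claim: string): string {
  const { text } = sanitizeClaim(claim);
  // Strip any embedded take tags so claim text cannot break out of the
  // structural framing the prompt relies on.
  return `<take id="${id}">${text.replace(/<\/?take[^>]*>/gi, "")}</take>`;
}
```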

src/core/think/prompt.ts — system prompt + structured-output schema.
Hard rules: cite every claim, mark hunches/low-weight explicitly,
surface conflicts (never silently pick), surface gaps. JSON schema
with answer + citations[] + gaps[]. Prompt adapts to anchor / time
window / save flag.

src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when
the model omits the structured field. resolveCitations dispatches and
records CITATIONS_REGEX_FALLBACK warning when used.

src/core/think/gather.ts — 4-stream parallel retrieval:
  1. hybridSearch (pages, existing primitive)
  2. searchTakes (keyword, pg_trgm)
  3. searchTakesVector (vector, when embedQuestion fn supplied)
  4. traversePaths (graph, when --anchor set)
RRF fusion (k=60). Each stream wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
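RRF with k=60 reduces to summing reciprocal ranks per id across streams; a sketch where each stream is already an ordered list of result ids (the real gather fuses page hits, takes, and path nodes):

```typescript
// Reciprocal Rank Fusion over N ranked streams.
// score(id) = Σ over streams of 1 / (k + rank), rank 1-based.
// Items appearing high in multiple streams float to the top.
function rrfFuse(streams: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const stream of streams) {
    stream.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```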

src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis
page + synthesis_evidence rows (page_id resolved per slug; page-level
citations skip evidence). Degrades gracefully without ANTHROPIC_API_KEY.
Round-loop scaffolding in place (rounds=1 only path exercised in v0.28).

src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take honored. Human-readable
output by default; --json for agent consumption.

operations.ts — `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns full
ThinkResult plus saved_slug + evidence_inserted.

cli.ts — wired into dispatch + CLI_ONLY allowlist.

Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with stub client, malformed-LLM
fallback path, no-API-key graceful degradation, persistSynthesis writes
page + evidence rows. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: auto-think + drift + budget meter (Codex P1 #10 fold)

src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.

src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
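The gate logic reduces to a cumulative ledger; a sketch omitting the audit trail and warn-once bookkeeping:

```typescript
// Sketch of the cumulative budget gate. Each submit is pre-checked
// against the cap using its estimated max cost; null models the
// unpriced-model case (non-Anthropic), which bypasses with a warning
// in the real implementation. Budget=0 disables the gate entirely.
class BudgetMeter {
  private spentUsd = 0;
  constructor(private capUsd: number) {}

  /** Returns true if the submit may proceed (and accrues its cost). */
  trySubmit(estimatedMaxCostUsd: number | null): boolean {
    if (this.capUsd === 0) return true;            // gate disabled
    if (estimatedMaxCostUsd === null) return true; // unpriced: bypass
    if (this.spentUsd + estimatedMaxCostUsd > this.capUsd) return false;
    this.spentUsd += estimatedMaxCostUsd;
    return true;
  }
}
```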

src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.

src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.

Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
  questions empty, success → cooldown ts written, cooldown blocks rerun,
  budget exhausted → partial. drift not_enabled, soft-band candidate
  detection, complete + dry-run paths.

All pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e Postgres: takes engine + extract + MCP allow-list (12 cases)

test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors allow-list
- think op forces remote_persisted_blocked even for save+take

postgres-engine.ts: addTakesBatch boolean[] serialization fix.
postgres-js auto-detects element type from JS arrays; for booleans it
mis-detects as scalar. Cast through text[] (`'true' | 'false'`) then
SQL-cast to boolean[] — same pattern other batch methods rely on for
type-stable bind shapes.

test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created takes/synthesis_evidence)
and (b) calls engine.initSchema() to actually run migrations.

test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match
Lane D's landed pipeline. They previously asserted not_implemented
envelopes; now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.

Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test test/e2e/takes-postgres.test.ts
Result: 12/12 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: local DreamPhaseResult type (avoid premature CyclePhase enum extension)

cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
would have forced premature enum extension.

Introduces DreamPhaseResult exported from auto-think.ts:
  { name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
    detail: string; totals?: Record<string,number>; duration_ms: number }

drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


* v0.28 e2e: access_tokens.permissions JSONB end-to-end (5 cases)

test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:

- Migration v32 default backfill: new tokens created without a permissions
  column get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
  holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
  via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.

Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.

All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: chunker takes-strip integration test (Codex P0 #3 verification)

test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually
strips fenced takes content end-to-end through the import pipeline.
This is the Codex P0 #3 fix's verification path: takes content lives
ONLY in the takes table for retrieval, never duplicated in
content_chunks where the per-token MCP allow-list cannot reach.

5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: imported page has chunks but none
  contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
  chunk_text matches `<!--- gbrain:takes:%`

Final v0.28 test sweep:
  121 pass, 0 fail, 336 expect() calls, 12 files
  Coverage: schema migrations, engine methods (PGLite + Postgres),
  takes-fence parser, page-lock, extract phase, takes CLI engine
  surface, model config 6-tier resolver, MCP+auth allow-list,
  think pipeline (gather + sanitize + cite-render + synthesize),
  auto-think + drift + budget meter, JSONB end-to-end, chunker
  strip integration. ~95% of v0.28 surface area covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: apply-migrations skippedFuture arrays + http-transport SQL mock

Two CI failures from PR #563:

test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Append '0.28.0' to both
hardcoded arrays.

test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. Mock returned [] on the
new query → validateToken treated every token as invalid → 401.

Fix: mock now matches both query shapes. validTokens row gets a default
`{takes_holders: ['world']}` permission injected when caller didn't
supply one (mirrors the migration v32 column DEFAULT). Updated
FakeEngineConfig type to allow tests to pass explicit permissions.

Verification:
  bun test test/apply-migrations.test.ts → 18/18 pass
  bun test test/http-transport.test.ts   → 24/24 pass
  bun run typecheck                       → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: add scope annotations to v0.28 ops (takes_list/takes_search/think)

test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were added
before master shipped v0.26 + the new invariant; the merge surfaced the gap.

Annotations:
- takes_list   → read
- takes_search → read
- think        → write (mutating: true; --save persists synthesis page)

Verification:
  bun test test/oauth.test.ts → 42/42 pass
  bun run typecheck            → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.2 feat: remote-source MCP + scope hierarchy + whoami (#690)

* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts

src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).

Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.
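For a flavor of what these helpers cover, a dotted-quad-only sketch; the real module additionally handles IPv6, v4-mapped, hex/octal/single-integer bypass forms, and metadata hostnames:

```typescript
// Sketch of private/special-range IPv4 classification (dotted quads only).
// Covers the ranges named in the surrounding commits; not the full
// url-safety.ts surface.
function isPrivateIpv4(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false; // hostnames and exotic encodings handled elsewhere
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||                          // 10.0.0.0/8 (RFC1918)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 (RFC1918)
    (a === 192 && b === 168) ||          // 192.168.0.0/16 (RFC1918)
    a === 127 ||                         // loopback
    (a === 169 && b === 254) ||          // link-local incl. 169.254.169.254 metadata
    (a === 100 && b >= 64 && b <= 127)   // CGNAT 100.64.0.0/10
  );
}
```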

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe

New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:

- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
  -c protocol.file.allow=never, -c protocol.ext.allow=never,
  --no-recurse-submodules. Single source of truth shared by cloneRepo
  and pullRepo so a future flag added to one path lands on both.
  Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
  .gitmodules as a second-fetch surface, file:// scheme in remotes.

- parseRemoteUrl: https-only, rejects embedded credentials and path
  traversal, delegates internal-target classification to isInternalUrl
  from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
  100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
  A GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch (with stderr warning)
  covers self-hosted git over Tailscale, whose CGNAT addresses trip the gate.

- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
  non-empty destDirs; spawns git via execFileSync (no shell injection)
  with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
  prompts. timeoutMs default 600s.

- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.

- validateRepoState: 6-state decision tree (missing | not-a-dir |
  no-git | corrupted | url-drift | healthy). Used by performSync's
  re-clone branch to recover from rm'd clone dirs and refuse syncs on
  url-drift or corruption.
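A sketch of the hardened spawn: the flag values come from the list above, while the wrapper shape and the placement of --no-recurse-submodules among the clone args are assumptions:

```typescript
// Sketch of the SSRF-hardening argv and a credential-prompt-free git spawn.
import { execFileSync } from "node:child_process";

const GIT_SSRF_FLAGS = [
  "-c", "http.followRedirects=false", // blocks DNS rebinding via redirects
  "-c", "protocol.file.allow=never",  // no file:// remotes
  "-c", "protocol.ext.allow=never",   // no ext:: command execution
];

function gitClone(url: string, dest: string, timeoutMs = 600_000): void {
  // execFileSync with an argv array: no shell, no injection surface.
  execFileSync(
    "git",
    [...GIT_SSRF_FLAGS, "clone", "--depth=1", "--no-recurse-submodules", url, dest],
    {
      timeout: timeoutMs,
      env: {
        ...process.env,
        GIT_TERMINAL_PROMPT: "0",  // never block on a credential prompt
        GIT_ASKPASS: "/bin/false", // fail fast if credentials are demanded
      },
    },
  );
}
```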

test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist

New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.

Hierarchy:
  - admin implies all (escape hatch)
  - write implies read
  - sources_admin and users_admin are siblings (different axes —
    sources-mgmt vs user-account-mgmt; neither implies the other)

Exported:
  - hasScope(grantedScopes, requiredScope): the canonical scope check.
    Replaces exact-string-match at three call sites in upcoming commits
    (serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
    token issuance). Without this rewrite, an admin-grant token would
    fail to refresh down to sources_admin (codex finding).
  - ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
    for OAuth metadata wire format and drift-check output).
  - assertAllowedScopes / InvalidScopeError: registration-time gate so
    tokens with bogus scope strings (read flying-unicorn) get rejected
    with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
    registerClientManual. Today's behavior accepts any string silently.
  - parseScopeString: space-separated wire format → array.

Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
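The check itself is small; a sketch with the hierarchy table spelled out. Only the implications named above are encoded (admin implies all, write implies read, *_admin siblings imply nothing), and the table shape is an assumption:

```typescript
// Sketch of hasScope: hierarchy-aware scope check. Unknown granted
// scopes are ignored rather than throwing (registration is the gate,
// runtime is best-effort).
const IMPLIES: Record<string, string[]> = {
  admin: ["read", "write", "sources_admin", "users_admin"], // admin implies all
  write: ["read"],                                          // write implies read
  // sources_admin and users_admin: siblings, neither implies the other
};

function hasScope(granted: string[], required: string): boolean {
  return granted.some(
    (g) => g === required || (IMPLIES[g] ?? []).includes(required),
  );
}
```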

test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope-constants mirror + drift CI for src/core/scope.ts

The admin React SPA's tsconfig.json sets include: ['src'], scoping compilation
to admin/src/, so it cannot directly import ../../src/core/scope.ts. The plan
widening the include or generating a single source of truth; both options
either couple the SPA to the gbrain monorepo or add a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.

Files:
  - admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
    duplicate, sorted alphabetically to match src/core/scope.ts.
  - scripts/check-admin-scope-drift.sh: extracts the list from each file
    via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
    (with full breakdown of which scopes diverged), 2 on internal error.
    Tested both passing and corrupted paths.
  - package.json: wires check:admin-scope-drift into both `verify` and
    `check:all` so any update to src/core/scope.ts that forgets the
    admin-side mirror fails the build.

The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration

Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:

- F3 refresh-token subset enforcement at line 365: previously rejected
  admin → sources_admin refresh because exact-match treated them as
  unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
  refresh down to least-privilege sources_admin scope; this fix lands
  that path.

- Token issuance intersection at line 498 (client_credentials grant):
  same hasScope swap so a client whose stored grant is `admin` can mint
  tokens including any implied scope.

- registerClient (DCR /register) and registerClientManual: validate
  every scope string against ALLOWED_SCOPES via assertAllowedScopes.
  Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
  and persisted the bogus string in oauth_clients.scope. Post-fix the
  caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
  pre-allowlist scopes keep working (allowlist gates registration only).

Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union

62 OAuth tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES

Two changes against src/commands/serve-http.ts:

- Line 195: scopesSupported on the mcpAuthRouter options switches from the
  hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
  Without this, /.well-known/oauth-authorization-server keeps reporting
  the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
  cannot discover the v0.28 sources_admin and users_admin scopes via
  standard discovery — they would have to be pre-configured out of band.

- Line 673: request-time scope check on /mcp swaps
  authInfo.scopes.includes(requiredScope) for hasScope(...). This was
  the most-cited codex finding: without it, sources_admin tokens could
  not even satisfy a `read`-scoped op (sources_admin doesn't include
  the literal string "read"). hasScope routes through the hierarchy
  table in src/core/scope.ts so admin implies all and write implies
  read at the gate too.

T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): sources-ops module — atomic clone + symlink-safe cleanup

src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.

addSource: D3 atomicity contract from the eng review.
  1. Validate id (matches existing SOURCE_ID_RE).
  2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
     before any clone work. Pre-fix the existing CLI used INSERT…ON
     CONFLICT DO NOTHING which silently no-op'd; with clone-first that
     would orphan the temp dir.
  3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
  4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
     git-remote helpers.
  5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
  6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
     the temp dir; rename-failed path also DELETEs the just-INSERTed row
     best-effort.
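The contract above can be sketched as one function, with the git helper and DB layer stubbed out. This is a hedged simplification: cloneFn, insertRow, and deleteRow stand in for the real git-remote helpers and engine methods, and the real addSource is async.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch of the D3 clone-then-rename atomicity contract (stubs, not the real API).
function addSourceAtomic(
  home: string,
  id: string,
  cloneFn: (dest: string) => void,
  insertRow: (id: string, localPath: string) => void,
  deleteRow: (id: string) => void,
): string {
  const tmp = path.join(home, "clones", ".tmp", `${id}-${Math.random().toString(36).slice(2)}`);
  const final = path.join(home, "clones", id);
  fs.mkdirSync(tmp, { recursive: true });
  try {
    cloneFn(tmp);           // step 4: clone into the temp dir
    insertRow(id, final);   // step 5: INSERT row pointing at the final path
  } catch (err) {
    fs.rmSync(tmp, { recursive: true, force: true }); // rollback: unlink the temp dir
    throw err;
  }
  try {
    fs.renameSync(tmp, final); // step 6: atomic publish
  } catch (err) {
    fs.rmSync(tmp, { recursive: true, force: true });
    try { deleteRow(id); } catch {} // best-effort cleanup of the just-INSERTed row
    throw err;
  }
  return final;
}
```

The point of the shape: nothing under clones/&lt;id&gt; ever exists half-written, and every failure path leaves neither a temp dir nor a dangling row behind (modulo the SIGKILL window the orphan sweep later covers).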

removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
  - isPathContained (realpath-resolves both sides + parent-with-sep
    string check) rejects symlinks whose target falls outside the
    confine.
  - lstat-then-isSymbolicLink check refuses symlinks whose realpath
    happens to land back inside the confine (defense in depth).
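A minimal sketch of the two layers, with names assumed (the real isPathContained lives alongside validateUploadPath in src/core/operations.ts and may differ in detail):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Layer 1: realpath-resolve both sides, then a parent-with-separator check.
// Plain startsWith on unresolved strings would let a symlink escape the confine.
function isPathContained(confine: string, candidate: string): boolean {
  try {
    const realConfine = fs.realpathSync(confine);
    const realCandidate = fs.realpathSync(candidate);
    return realCandidate === realConfine ||
      realCandidate.startsWith(realConfine + path.sep);
  } catch {
    return false; // fail closed when either side can't be resolved
  }
}

// Layer 2 (defense in depth): refuse symlinks outright, even ones whose
// realpath happens to land back inside the confine.
function isSafeCloneDir(confine: string, candidate: string): boolean {
  try {
    if (fs.lstatSync(candidate).isSymbolicLink()) return false;
  } catch {
    return false;
  }
  return isPathContained(confine, candidate);
}
```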

getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.

recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).

test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): whoami + sources_{add,list,remove,status} MCP ops

Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.

Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).

whoami (scope: read): introspect calling identity over MCP.
  - Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
    for OAuth clients (clientId starts with gbrain_cl_).
  - Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
    for grandfathered access_tokens.
  - Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
    Empty scopes (NOT ['read','write','admin']) is the D2 decision —
    returning OAuth-shaped scopes for local callers would resurrect the
    v0.26.9 footgun where code conditionally trusted on
    `auth.scopes.includes('admin')` instead of `ctx.remote === false`.
  - Q3 fail-closed: throws unknown_transport when remote=true AND auth is
    missing OR ctx.remote is the literal `undefined` (cast bypass guard).
    A future transport that forgets to thread auth doesn't get a free
    pass.
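The dispatch above can be sketched as follows. The return shapes and the gbrain_cl_ prefix come straight from this description; the Ctx/AuthInfo types are simplifications of the real operation context.

```typescript
// Simplified stand-ins for the real operation context types.
type AuthInfo = { clientId?: string; tokenName?: string; scopes: string[]; expiresAt: string | null };
type Ctx = { remote: boolean | undefined; auth?: AuthInfo };

function whoami(ctx: Ctx) {
  if (ctx.remote === false) {
    // D2: local callers get EMPTY scopes, never OAuth-shaped ones.
    return { transport: "local", scopes: [] as string[] };
  }
  if (ctx.remote === undefined || !ctx.auth) {
    // Q3 fail-closed: missing auth or a cast-bypassed ctx.remote throws.
    throw new Error("unknown_transport");
  }
  if (ctx.auth.clientId?.startsWith("gbrain_cl_")) {
    return { transport: "oauth", client_id: ctx.auth.clientId, scopes: ctx.auth.scopes, expires_at: ctx.auth.expiresAt };
  }
  return { transport: "legacy", token_name: ctx.auth.tokenName, scopes: ctx.auth.scopes, expires_at: null };
}
```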

sources_add (sources_admin, mutating): register a source by --path
  (existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
  Calls into addSource from sources-ops.ts which owns the temp-dir +
  rename atomicity.

sources_list (read): list registered sources with page counts, federated
  flag, and remote_url. The remote_url field is new — lets a remote MCP
  caller see which sources are auto-managed.

sources_remove (sources_admin, mutating): cascade-delete a source +
  symlink-safe clone cleanup. Requires confirm_destructive: true when the
  source has data.

sources_status (read): per-source diagnostic returning clone_state
  ('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
  'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
  busted clone without SSH access to the brain host.

test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.

test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sync): re-clone fallback when clone is missing/no-git/corrupted

src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:

  - 'healthy'    → fall through to existing pull (unchanged)
  - 'missing', 'no-git', 'not-a-dir'
                 → loud stderr "auto-recovery: re-cloning <id>", then
                   recloneIfMissing handles the temp-dir + rename. Sync
                   continues from the freshly-cloned head.
  - 'corrupted'  → throw with structured hint pointing at sources remove
                   + add (no syncing from a wrong state).
  - 'url-drift'  → throw with hint pointing at the (deferred) sources
                   rebase-clone command.

Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.
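The front-half dispatch can be sketched as a pure decision function. This is a simplification: the real performSync threads the source row, logger, and structured error hints through each branch.

```typescript
type CloneState = "healthy" | "missing" | "not-a-dir" | "no-git" | "url-drift" | "corrupted";

// Sketch of the v0.28-aware sync front-half (names assumed).
function planSyncAction(state: CloneState): "pull" | "reclone" | "abort" {
  switch (state) {
    case "healthy":
      return "pull";      // fall through to the existing fast-forward pull
    case "missing":
    case "no-git":
    case "not-a-dir":
      return "reclone";   // auto-recovery via recloneIfMissing
    case "corrupted":
    case "url-drift":
      return "abort";     // structured error with a recovery hint
  }
}
```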

src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path types check.

test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.

test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.

test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor

src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.

Two new flags on `gbrain sources add`:
  - `--url <https-url>` : federated remote-clone path (clone + INSERT +
    rename, atomic rollback on failure).
  - `--clone-dir <path>` : override the default
    $GBRAIN_HOME/clones/<id>/ destination.

Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.

`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.

54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)

addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.

Two layers:

1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
   walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
   prints a warn with disk-byte estimate. Operators see the leak before
   `df` complains.

2. The autopilot cycle's existing `purge` phase grows a substep that
   nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
   uses. Operator behavior stays uniform across all soft-delete-style
   surfaces.

Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.
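The doctor-side scan can be sketched like this. The directory layout follows the description above; the helper name and exact stat semantics are assumptions, and the real check also estimates disk bytes.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// List $GBRAIN_HOME/clones/.tmp/ entries older than ttlMs (24h for the doctor
// warn, 72h for the autopilot purge). Filesystem-only; no DB involved.
function findOrphanTmpDirs(home: string, ttlMs: number, nowMs = Date.now()): string[] {
  const tmpRoot = path.join(home, "clones", ".tmp");
  if (!fs.existsSync(tmpRoot)) return []; // no-op on brains that never used --url
  return fs.readdirSync(tmpRoot)
    .map((name) => path.join(tmpRoot, name))
    .filter((p) => nowMs - fs.statSync(p).mtimeMs > ttlMs);
}
```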

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope checkboxes source from scope-constants mirror + dist

admin/src/pages/Agents.tsx Register Client modal:
  - useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
    to true, others false; unchanged UX for the common case).
  - Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
    hardcoded ['read','write','admin'].

Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes — which defeats the whole gstack
/setup-gbrain Path 4 unblock.

The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.

admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami

VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).

CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).

TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.

README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).

llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip

Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.

12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir

Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun does NOT inherit process.env mutations, so every CLI
subprocess call passes env: { ...process.env } explicitly.

Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.

The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.

Skipped gracefully when DATABASE_URL is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps

Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.

CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.

HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.

MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.

PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl bypass list. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.
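A hedged sketch of the IPv6 host checks described here (the regexes are my assumptions about the shape, not the real url-safety.ts code, which has more branches and also handles IPv4 ranges):

```typescript
// Reject loopback/unspecified, ULA fc00::/7 (fc.. and fd.. prefixes), and
// link-local fe80::/10 (fe80..febf) hosts. Lexical only: DNS rebinding is
// explicitly out of scope here, as noted in the follow-up.
function isInternalIpv6(host: string): boolean {
  const h = host.replace(/^\[|\]$/g, "").toLowerCase(); // strip URL brackets
  if (h === "::1" || h === "::") return true;
  if (/^f[cd][0-9a-f]{2}:/.test(h)) return true;  // ULA fc00::/7
  if (/^fe[89ab][0-9a-f]:/.test(h)) return true;  // link-local fe80::/10
  return false;
}
```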

Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
  link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
  asymmetry (clone_dir override silently ignored over MCP, path nulled,
  local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.

DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.

323 tests pass (9 files); 4071 unit tests pass (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump v0.28.1 → v0.28.2 (master collision)

Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).

Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.

Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated

PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

garrytan added a commit that referenced this pull request May 7, 2026
#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (the planner can't use it for
rows where effective_date is NULL and the filter falls back to updated_at).
The expression index ((COALESCE(effective_date, updated_at))) is what
actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally a NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
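A simplified sketch of the chain (Date.parse stands in for the real parseDateLoose, and frontmatter is reduced to the three fields named above):

```typescript
type Frontmatter = { event_date?: string; date?: string; published?: string };

// Precedence chain with the per-prefix override for daily/ and meetings/,
// range validation [1990-01-01, NOW + 1 year], and fall-through on bad values.
function computeEffectiveDate(
  slug: string,
  fm: Frontmatter,
  filename: string | null,
  updatedAt: Date,
  createdAt: Date,
): { date: Date; source: string } {
  const fromFilename = filename?.match(/^(\d{4}-\d{2}-\d{2})/)?.[1];
  const datedPrefix = slug.startsWith("daily/") || slug.startsWith("meetings/");
  const chain: Array<[string, string | undefined]> = datedPrefix
    ? [["filename", fromFilename], ["event_date", fm.event_date], ["date", fm.date], ["published", fm.published]]
    : [["event_date", fm.event_date], ["date", fm.date], ["published", fm.published], ["filename", fromFilename]];
  const min = Date.parse("1990-01-01");
  const max = Date.now() + 365 * 24 * 3600 * 1000;
  for (const [source, raw] of chain) {
    if (!raw) continue;
    const t = Date.parse(raw);
    if (Number.isNaN(t) || t < min || t > max) continue; // drop to next element
    return { date: new Date(t), source };
  }
  return { date: updatedAt ?? createdAt, source: "fallback" };
}
```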

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.
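The keyset + checkpoint shape can be sketched with an in-memory stand-in for the pages table. The filter/sort/slice below models `WHERE id > $last_id ORDER BY id LIMIT batchSize`; the real function persists the checkpoint to the config table rather than mutating an object.

```typescript
type Row = { id: number };

// Resumable keyset-paginated walk: a killed process re-runs from the saved
// checkpoint without re-doing rows; a full re-walk produces the same writes.
function backfillBatches(
  rows: Row[],
  batchSize: number,
  checkpoint: { lastId: number },
  processRow: (r: Row) => void,
): void {
  for (;;) {
    const batch = rows
      .filter((r) => r.id > checkpoint.lastId)
      .sort((a, b) => a.id - b.id)
      .slice(0, batchSize);
    if (batch.length === 0) return;
    for (const r of batch) processRow(r);
    checkpoint.lastId = batch[batch.length - 1].id; // persist after each batch
  }
}
```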

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.
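A sketch of that parser, with RecencyDecayParseError represented as a plain Error for brevity:

```typescript
type Decay = { halflifeDays: number; coefficient: number };

// Parse GBRAIN_RECENCY_DECAY: comma-separated prefix:halflifeDays:coefficient
// triples. Refuses loudly on any malformed triple; no silent fallback.
function parseRecencyDecayEnv(raw: string): Map<string, Decay> {
  const out = new Map<string, Decay>();
  for (const triple of raw.split(",")) {
    const parts = triple.split(":");
    const [prefix, h, c] = parts;
    const halflifeDays = Number(h), coefficient = Number(c);
    if (parts.length !== 3 || !prefix || !Number.isFinite(halflifeDays) || !Number.isFinite(coefficient)) {
      throw new Error(`RecencyDecayParseError: bad triple '${triple}'`);
    }
    out.set(prefix, { halflifeDays, coefficient });
  }
  return out;
}
```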

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE longest-
prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
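A toy sketch of the two recency-axis suggestions, using only a few of the patterns named above. The real classifier covers far more cases (and also emits intent + suggestedDetail); the regexes here are illustrative.

```typescript
type Level = "off" | "on" | "strong";

// Canonical patterns suppress both axes UNLESS an explicit temporal bound
// also matches, in which case the temporal bound wins.
function suggestRecencyAxes(q: string): { recency: Level; salience: Level } {
  const s = q.toLowerCase();
  const temporalBound = /\b(today|right now|this week|this morning|just now|last \d+ days)\b/.test(s);
  const canonical = /^\s*(who is|what is)\b/.test(s);
  if (canonical && !temporalBound) return { recency: "off", salience: "off" };
  const strongRecency = /\b(today|right now|this morning|just now)\b/.test(s);
  const recencyOn = /\b(latest|recent|this week|catch up|remind me|status update)\b/.test(s);
  const salienceOn = /\b(catch up|remind me|status update|what's going on|what matters)\b/.test(s);
  return {
    recency: strongRecency ? "strong" : recencyOn || temporalBound ? "on" : "off",
    salience: salienceOn ? "on" : "off",
  };
}
```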

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + score) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.
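
The two boost formulas reduce to per-result multipliers (a minimal sketch of the math only; the shipped functions operate on result arrays keyed by `${source_id}::${slug}` and thread through runPostFusionStages):

```typescript
// Salience: pure mattering signal, no time component.
function salienceFactor(salienceScore: number, strength: "on" | "strong"): number {
  const k = strength === "strong" ? 0.30 : 0.15;
  return 1 + k * Math.log(1 + salienceScore);
}

// Recency: pure age decay, no mattering signal.
function recencyFactor(
  daysOld: number,
  halflifeDays: number,
  coefficient: number,
  strength: "on" | "strong",
): number {
  if (halflifeDays === 0) return 1.0; // evergreen prefix: skipped
  const strengthMul = strength === "strong" ? 1.5 : 1.0;
  return 1 + strengthMul * coefficient * halflifeDays / (halflifeDays + daysOld);
}
```

Both factors bottom out at 1.0, so an 'on' axis can only promote, never demote.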

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).
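
That resolution order can be pinned down per axis (an illustrative sketch for the recency axis; field names mirror SearchOpts but the shipped resolver also handles the salience axis and deprecation warnings):

```typescript
type Axis = "off" | "on" | "strong";

function resolveRecency(
  opts: { recency?: Axis; recencyBoost?: 0 | 1 | 2 },
  heuristic: Axis, // classifyQuery's suggestedRecency
): Axis {
  if (opts.recency !== undefined) return opts.recency;     // 1. caller-explicit
  if (opts.recencyBoost !== undefined)                     // 2. legacy alias
    return (["off", "on", "strong"] as const)[opts.recencyBoost];
  return heuristic;                                        // 3. heuristic
}
```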

Deleted: src/core/search/recency.ts (from PR #618, now replaced) +
test/recency-boost.test.ts (its coverage is superseded by
query-intent.test.ts + future post-fusion tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.
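
The filter shape looks roughly like this (illustrative only: the real engine binds parameters instead of naming placeholders like `$since`, and splices the clause into the HNSW inner CTE):

```typescript
// Hypothetical builder for the effective-date filter clause.
function effectiveDateFilter(since?: string, until?: string): string {
  const eff = "COALESCE(p.effective_date, p.updated_at, p.created_at)";
  const clauses: string[] = [];
  if (since) clauses.push(`${eff} >= $since`);
  if (until) clauses.push(`${eff} <= $until`);
  // Empty string when no bound set: the CTE stays unfiltered.
  return clauses.length ? "AND " + clauses.join(" AND ") : "";
}
```

COALESCE makes the content date win when present, so a re-imported meeting still filters by its event date, not its import date.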

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date  — page fell back to updated_at even though
                           frontmatter has parseable event_date / date /
                           published. The "wrong but populated" residual
                           that earlier review iterations missed.
  future_dated            — effective_date > NOW() + 1 year (corrupt
                            or typo'd century).
  pre_1990                — effective_date < 1990-01-01 (epoch math gone
                            wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (column doesn't
exist → 42703) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 8, 2026
* v0.28 schema: takes + synthesis_evidence (v31) + access_tokens.permissions (v32)

Migration v31 adds the takes table (typed/weighted/attributed claims) and
synthesis_evidence (provenance for `gbrain think` outputs). Page-scoped via
page_id FK (slug isn't unique alone in v0.18+ multi-source). HNSW partial
index on embedding for active rows. ON DELETE CASCADE on synthesis_evidence
so deleting a source take cascades the provenance row.

Migration v32 adds access_tokens.permissions JSONB with safe-default
backfill (`{"takes_holders":["world"]}`). Default keeps non-world holders
hidden from MCP-bound tokens until the operator explicitly grants access
via the v0.28 auth permissions CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 engine: addTakesBatch, listTakes, searchTakes/Vector, supersede, resolve, synthesis_evidence

Extends BrainEngine with the takes domain object. Both engines implement the
same surface; PGLite uses manual `$N` placeholders, Postgres uses postgres-js
unnest() — same shape as addLinksBatch and addTimelineEntriesBatch.

Methods:
- addTakesBatch (upsert via ON CONFLICT (page_id, row_num) DO UPDATE)
- listTakes (filter by holder/kind/active/resolved, takesHoldersAllowList
  for MCP-bound calls, sortBy weight/since_date/created_at)
- searchTakes / searchTakesVector (pg_trgm + cosine; honor allow-list)
- countStaleTakes / listStaleTakes (mirror countStaleChunks pattern;
  embedding column intentionally omitted from listStale payload)
- updateTake (mutable fields only; throws TAKE_ROW_NOT_FOUND)
- supersedeTake (transactional: insert new at next row_num, mark old
  active=false, set superseded_by; throws TAKE_RESOLVED_IMMUTABLE on
  resolved bets)
- resolveTake (sets resolved_*; throws TAKE_ALREADY_RESOLVED on re-resolve;
  resolution is immutable per Codex P1 #13 fold)
- addSynthesisEvidence (provenance persist; ON CONFLICT DO NOTHING)
- getTakeEmbeddings (parallel to getEmbeddingsByChunkIds)

Types live in src/core/engine.ts adjacent to LinkBatchInput. Page-scoped
via page_id (slug not unique in v0.18+ multi-source). PageType gains
'synthesis'. takeRowToTake mapper in utils.ts handles Date → ISO string
normalization.

Tests: test/takes-engine.test.ts — 16 cases against PGLite covering
upsert/list/filter/search happy paths, takesHoldersAllowList isolation,
the four invariant errors (TAKE_ROW_NOT_FOUND, TAKES_WEIGHT_CLAMPED,
TAKE_RESOLVED_IMMUTABLE, TAKE_ALREADY_RESOLVED), supersede flow, resolve
metadata round-trip, FK CASCADE on synthesis_evidence when source take
deletes. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 model-config: unified resolveModel with 6-tier precedence + alias resolution

Replaces every hardcoded `claude-*-X` and per-phase `dream.<phase>.model`
config key with a single resolver. Hierarchy:

  1. CLI flag (--model)
  2. New-key config (e.g. models.dream.synthesize)
  3. Old-key config (deprecated dream.synthesize.model, dream.patterns.model)
     — read with stderr deprecation warning, one-per-process
  4. Global default (models.default)
  5. Env var (GBRAIN_MODEL or caller-supplied)
  6. Hardcoded fallback

Aliases (`opus`, `sonnet`, `haiku`, `gemini`, `gpt`) resolve at the end so
any tier can use a short name. User-defined `models.aliases.<name>` config
overrides built-ins. Cycle-safe (depth 2 break). Unknown alias passes
through unchanged so users can pass full provider IDs without registering.
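
The whole resolver condenses to a nullish-coalescing chain plus a bounded alias walk (a sketch under assumed option names; the real resolver also reads config files and emits the deprecation warnings described below):

```typescript
// Hypothetical condensed resolveModel.
function resolveModel(opts: {
  cliFlag?: string;      // 1. --model
  newKey?: string;       // 2. e.g. models.dream.synthesize
  oldKey?: string;       // 3. deprecated dream.<phase>.model
  globalDefault?: string;// 4. models.default
  envVar?: string;       // 5. GBRAIN_MODEL
  fallback: string;      // 6. hardcoded
  aliases?: Record<string, string>;
}): string {
  let name = opts.cliFlag ?? opts.newKey ?? opts.oldKey ??
             opts.globalDefault ?? opts.envVar ?? opts.fallback;
  const aliases = opts.aliases ?? {};
  // Depth-2 break keeps alias cycles from looping forever.
  for (let depth = 0; depth < 2 && aliases[name]; depth++) {
    name = aliases[name];
  }
  return name; // unknown names pass through unchanged
}
```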

When new-key + old-key are BOTH set (Codex P1 #11 fix), new-key wins and
stderr warns "deprecated config X ignored; Y is set and wins". When only
old-key is set, it's honored with a softer "rename to Y before v0.30"
warning. Both warnings emit once per (key, process) — a Set memo prevents
log spam in long-running daemons.

Migrated call sites: synthesize.ts (model + verdictModel), patterns.ts
(model). subagent.ts and search/expansion.ts to be migrated later in v0.28
(staying compatible until then).

Tests: test/model-config.test.ts — 11 cases pinning the 6-tier ordering,
alias resolution + cycle break, deprecated-key warning emit-once, and
unknown-alias pass-through. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes-fence: parser/renderer/upserter + chunker strip (privacy P0 fix)

src/core/takes-fence.ts — pure functions for the fenced markdown surface:
- parseTakesFence(body) — extracts ParsedTake[] from `<!--- gbrain:takes:begin/end -->`
  blocks. Strict on canonical form, lenient on hand-edits with warnings
  (TAKES_FENCE_UNBALANCED, TAKES_TABLE_MALFORMED, TAKES_ROW_NUM_COLLISION).
  Strikethrough `~~claim~~` → active=false; date ranges `since → until`
  split into sinceDate/untilDate.
- renderTakesFence(takes) — round-trip safe with parseTakesFence.
- upsertTakeRow(body, row) — append-only per CEO-D6 + eng-D9. Creates a
  fresh `## Takes` section if no fence present. row_num is monotonic
  (max + 1, never gap-filled — keeps cross-page refs and synthesis_evidence
  stable forever).
- supersedeRow(body, oldRow, replacement) — strikes through old row's claim
  AND appends the new row at end. Both rows preserved in markdown for
  git-blame archaeology.
- stripTakesFence(body) — removes the fenced block entirely. Used by the
  chunker so takes content lives ONLY in the takes table.
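
A minimal version of the strip step looks like this (the marker spelling is assumed from the commit text; the shipped parser additionally validates balance and emits TAKES_FENCE_UNBALANCED on hand-edit damage):

```typescript
const BEGIN = "<!--- gbrain:takes:begin -->";
const END = "<!--- gbrain:takes:end -->";

// Drop everything between the fence markers, keep surrounding prose.
function stripTakesFence(body: string): string {
  const out: string[] = [];
  let inFence = false;
  for (const line of body.split("\n")) {
    if (line.trim() === BEGIN) { inFence = true; continue; }
    if (line.trim() === END) { inFence = false; continue; }
    if (!inFence) out.push(line);
  }
  return out.join("\n");
}
```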

Codex P0 #3 fix: src/core/chunkers/recursive.ts now calls stripTakesFence()
before computing chunk boundaries. Without this, page chunks would contain
the rendered takes table and the per-token MCP allow-list would be
bypassed at the index layer (token bound to takes_holders=['world'] would
see garry's hunches via page hits). Doctor's takes_fence_chunk_leak check
(plan-side) asserts no chunk contains the begin marker.

Tests: 15 cases covering canonical parse, strikethrough, date range, fence
unbalanced detection, malformed-row skip + warning, row_num collision
detection, round-trip render, append-only upsert into existing fence,
fresh-section creation, monotonic row_num under hand-edit gaps, supersede
flow, stripTakesFence verifying takes content removed AND surrounding
prose preserved. Existing chunker tests still pass (15 + 15 = 30).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 page-lock: PID-liveness file lock for atomic markdown read-modify-write

src/core/page-lock.ts — per-page file lock at
~/.gbrain/page-locks/<sha256-of-slug>.lock so two concurrent `gbrain takes
add` calls or `takes seed --refresh` from autopilot can't race on the
same `<slug>.md` read-modify-write. Eng-review fold: reuses the v0.17
cycle.lock pattern (mtime + PID liveness) but per-slug.

Differences from cycle.ts's lock:
- SHA-256 of slug for safe filenames (slashes, unicode, etc.)
- Same-pid + fresh mtime = LIVE (cycle.ts assumes one lock per process and
  reclaims same-pid; page-lock allows concurrent locks for DIFFERENT slugs
  in one process). mtime expiry still rescues post-crash leftovers.
- 5-min TTL (vs cycle's 30 min — page edits are short)
- `withPageLock(slug, fn)` convenience wrapper with default 30s timeout

API:
- acquirePageLock(slug, opts) → handle | null (poll-with-timeout)
- handle.refresh() / handle.release() (idempotent — only releases if pid matches)
- withPageLock(slug, fn, opts) — acquire + run + release-in-finally
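
The liveness rule itself is small (a sketch with the PID check injected so it stays testable; the real lock also allows same-pid reclaim and polls with the 30s default timeout on acquire):

```typescript
const TTL_MS = 5 * 60 * 1000; // 5-min TTL: page edits are short

// A lock file is reclaimable when its mtime has expired OR its holder
// process is gone. mtime expiry rescues post-crash leftovers even if
// the PID got recycled.
function isStale(
  lock: { pid: number; mtimeMs: number },
  nowMs: number,
  pidAlive: (pid: number) => boolean,
): boolean {
  if (nowMs - lock.mtimeMs > TTL_MS) return true;
  return !pidAlive(lock.pid);
}
```

In the real module `pidAlive` would be a `process.kill(pid, 0)` existence probe rather than an injected callback.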

Tests: 10 cases — fresh acquire, live holder returns null, stale-mtime
reclaim, dead-PID reclaim, refresh updates timestamp, foreign-pid release
is no-op, withPageLock callback runs and releases on success/failure,
timeout-throws when held, SHA-256 filename safety for slashes/unicode.
All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 extract-takes: dual-path phase (fs|db) + since/until_date as TEXT

src/core/cycle/extract-takes.ts — new phase that materializes the takes
table from fenced markdown blocks. Two paths mirror src/commands/extract.ts:

- extractTakesFromFs: walk *.md under repoPath, parse fences, batch upsert
- extractTakesFromDb: iterate engine.getAllSlugs(), parse each page's
  compiled_truth+timeline, batch upsert (mutation-immune snapshot iteration)

Single dispatcher extractTakes(opts) routes by source. Honors:
- slugs filter for incremental re-extract (pipes from sync→extract)
- dryRun: count would-be upserts, write nothing
- rebuild: DELETE FROM takes WHERE page_id = $1 before re-insert (clean
  slate when markdown is canonical and DB has drifted)

Schema fix: since_date/until_date were DATE in the original v31 migration.
Spec uses partial dates ('2017-01', '2026-04-29 → 2026-06') that Postgres
DATE rejects. Changed to TEXT in both the Postgres and PGLite blocks so
parser-rendered ranges round-trip cleanly. Loses the ability to do
date-range arithmetic in SQL, but date math on opinion timelines is
out of scope for v0.28 anyway. utils.ts dateOrNull now annotated as
v0.28 TEXT-aware.

Migration v31 has not been deployed yet (this branch is the v0.28 release
candidate), so the type swap is free. No data migration needed.

Tests: test/extract-takes.test.ts — 5 cases against PGLite covering full
walk + fence-skip on no-fence pages, takes-table populated post-extract,
incremental slugs filter, dry-run no-write, rebuild=true clears + re-inserts
ad-hoc rows. test/takes-engine.test.ts (16), test/takes-fence.test.ts (15)
all still pass — 36/36 takes tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 takes CLI: list, search, add, update, supersede, resolve

src/commands/takes.ts — surfaces the engine methods + takes-fence library
through a single `gbrain takes <subcommand>` entrypoint:

  takes <slug>                          list with filters + sort
  takes search "<query>"                pg_trgm keyword search across all takes
  takes add <slug> --claim ... ...      append (markdown + DB, atomic via lock)
  takes update <slug> --row N ...       mutable-fields update (markdown + DB)
  takes supersede <slug> --row N ...    strikethrough old + append new
  takes resolve <slug> --row N --outcome  record bet resolution (immutable)

Markdown is canonical. Every mutate command:
  1. acquires the per-page file lock (withPageLock)
  2. re-reads the .md file
  3. applies the edit via takes-fence (upsertTakeRow / supersedeRow)
  4. writes the .md file back
  5. mirrors to the DB via the engine method
  6. releases the lock (auto via finally)
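
The sequence can be shown end-to-end with in-memory stand-ins (everything here is stubbed and synchronous for illustration; the real `withPageLock`, `upsertTakeRow`, and engine mirror live in page-lock.ts, takes-fence.ts, and the engine):

```typescript
const pages = new Map<string, string>();                 // stands in for *.md files
const db: Array<{ slug: string; claim: string }> = [];   // stands in for the takes table

function withPageLock(slug: string, fn: () => void) {
  // real version: acquire the PID-liveness lock, run, release in finally
  try { fn(); } finally { /* release lock for slug */ }
}

function takesAdd(slug: string, claim: string) {
  withPageLock(slug, () => {
    const body = pages.get(slug) ?? "";                  // 2. re-read under the lock
    pages.set(slug, body + `\n| ${claim} |`);            // 3+4. edit via fence, write back
    db.push({ slug, claim });                            // 5. mirror to the DB
  });
}
```

The point of the ordering: markdown is canonical, so the file write happens before the DB mirror, and both happen while the lock is held.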

Resolve currently writes only to DB — surfacing resolved_* in the markdown
table is deferred to v0.29 (the takes-fence renderer's column set is
fixed at # | claim | kind | who | weight | since | source per spec).

Wired into src/cli.ts dispatch + CLI_ONLY allowlist. Help text follows the
project convention (orphans/embed/extract pattern). --dir flag overrides
sync.repo_path config when working outside the configured brain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 MCP + auth: takes_list / takes_search / think ops + per-token allow-list

OperationContext gains takesHoldersAllowList — server-side filter for
takes.holder field threaded from access_tokens.permissions through dispatch
into the engine SQL. Closes Codex P0 #3 at the dispatch layer (chunker
strip already closed the page-content side in the previous commit).

src/core/operations.ts — three new ops:
- takes_list: lists takes with holder/kind/active/resolved filters; honors
  ctx.takesHoldersAllowList for MCP-bound calls
- takes_search: pg_trgm keyword search; honors allow-list
- think: op surface registered (returns not_implemented envelope until
  Lane D's pipeline lands). Remote callers cannot save/take per Codex P1 #7.

src/mcp/dispatch.ts — DispatchOpts.takesHoldersAllowList threads into
buildOperationContext.

src/mcp/http-transport.ts — validateToken now reads
access_tokens.permissions.takes_holders, defaults to ['world'] when the
column is absent or malformed (default-deny on private hunches).
auth.takesHoldersAllowList passed to dispatchToolCall.
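
The default-deny parse is roughly this shape (a sketch assuming the `{takes_holders: string[]}` field layout from migration v32; any other shape collapses to the public-only list):

```typescript
// Anything that isn't a clean string[] under takes_holders degrades
// to ['world'] so private hunches stay hidden by default.
function parseTakesHolders(permissions: unknown): string[] {
  if (
    typeof permissions === "object" && permissions !== null &&
    Array.isArray((permissions as any).takes_holders) &&
    (permissions as any).takes_holders.every((h: unknown) => typeof h === "string")
  ) {
    return (permissions as any).takes_holders;
  }
  return ["world"];
}
```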

src/mcp/server.ts (stdio) — defaults to takesHoldersAllowList: ['world']
since stdio has no per-token auth. Operators wanting full visibility use
`gbrain call <op>` directly (sets remote=false).

src/commands/auth.ts — `gbrain auth create <name> --takes-holders w,g,b`
flag persists the per-token list; new `auth permissions <name>
set-takes-holders <list>` updates an existing token.

Tests: test/takes-mcp-allowlist.test.ts — 8 cases against PGLite proving
the threading: local-CLI sees all holders, ['world'] returns only public,
['world','garry'] returns 2/3, no-overlap returns empty (no fallback),
search honors allow-list, remote save/take on think rejected with
not_implemented envelope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.0: ship-prep — VERSION, CHANGELOG, migration orchestrator, skill

Closes the v0.28 ship-prep cycle. Bumps VERSION + package.json + bun.lock
to 0.28.0. v0_28_0 migration orchestrator runs three idempotent phases on
upgrade:

- Schema verify: asserts schema_version >= 32 (migrations v31 + v32 already
  applied by the schema runner during gbrain upgrade); fails clean if not.
- Backfill takes: inline runs `extractTakes(engine, { source: 'db' })` so
  any pre-existing fenced takes tables in markdown populate the takes
  index. Idempotent; ON CONFLICT DO UPDATE keeps the table in sync.
- Re-chunk TODO: queues a pending-host-work entry asking the host agent
  to re-import pages with takes content so the v0.28 chunker-strip rule
  (Codex P0 #3 fix) applies retroactively. Pages imported under v0.28+
  already have takes content stripped from chunks at index time; this
  TODO catches up legacy pages.

skills/migrations/v0.28.0.md — agent-readable upgrade guide. Walks
through doctor verification, deprecated-key migration, MCP token
visibility configuration, and a "try the takes layer" smoke test.

CHANGELOG.md — v0.28.0 release-summary in the GStack voice (no AI
vocabulary, no em dashes, real numbers from git diff stat) + the
mandatory "To take advantage of v0.28.0" block + itemized changes by
subsystem (schema, engine, markdown surface, model config, MCP+auth,
CLI, tests, accepted risks).

Final test sweep: 65/65 v0.28 tests pass across 6 files. typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 think pipeline: gather → sanitize → synthesize → cite-render → CLI

src/core/think/sanitize.ts — prompt-injection defense for take claims:
14 jailbreak patterns (ignore-prior, role-jailbreak, close-take tag,
DAN, system-prompt overrides, eval-shell hooks) plus structural framing
(takes wrapped in <take id="..."> tags the model is told to treat as
DATA). Length-cap at 500 chars. Renders evidence blocks for the prompt.

src/core/think/prompt.ts — system prompt + structured-output schema.
Hard rules: cite every claim, mark hunches/low-weight explicitly,
surface conflicts (never silently pick), surface gaps. JSON schema
with answer + citations[] + gaps[]. Prompt adapts to anchor / time
window / save flag.

src/core/think/cite-render.ts — structured citations + regex fallback
(Codex P1 #4 fold). normalizeStructuredCitations validates the model's
structured output; parseInlineCitations is the body-scan fallback when
the model omits the structured field. resolveCitations dispatches and
records CITATIONS_REGEX_FALLBACK warning when used.

src/core/think/gather.ts — 4-stream parallel retrieval:
  1. hybridSearch (pages, existing primitive)
  2. searchTakes (keyword, pg_trgm)
  3. searchTakesVector (vector, when embedQuestion fn supplied)
  4. traversePaths (graph, when --anchor set)
RRF fusion (k=60). Each stream wrapped in try/catch — partial gather
beats no synthesis. Honors takesHoldersAllowList for MCP-bound calls.
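
The fusion step is standard Reciprocal Rank Fusion over the four ranked lists (a minimal sketch; the real gather also carries scores and per-stream metadata, not bare IDs):

```typescript
// RRF: each stream contributes 1/(k + rank) per item; items ranked
// highly in multiple streams rise to the top. k = 60 per the gather step.
function rrfFuse(streams: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranked of streams) {
    ranked.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```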

src/core/think/index.ts — runThink orchestrator + persistSynthesis:
INTENT (regex classify) → GATHER → render evidence blocks → resolveModel
('models.think' → 'models.default' → GBRAIN_MODEL → opus) → LLM call
(injectable client) → JSON parse with code-fence + fallback strip →
resolveCitations → ThinkResult. persistSynthesis writes a synthesis
page + synthesis_evidence rows (page_id resolved per slug; page-level
citations skip evidence). Degrades gracefully without ANTHROPIC_API_KEY.
Round-loop scaffolding in place (rounds=1 only path exercised in v0.28).

src/commands/think.ts — `gbrain think "<question>"` CLI. Flag parsing
strips --anchor, --rounds, --save, --take, --model, --since, --until,
--json. Local CLI = remote=false, so save/take honored. Human-readable
output by default; --json for agent consumption.

operations.ts — `think` op now calls runThink (was a not_implemented
stub). Remote callers can't save/take per Codex P1 #7. Returns full
ThinkResult plus saved_slug + evidence_inserted.

cli.ts — wired into dispatch + CLI_ONLY allowlist.

Tests: test/think-pipeline.test.ts — 18 cases against PGLite covering
sanitize patterns, structural rendering, citation parsing (structured +
regex fallback + dedup + invalid-slug rejection), gather streams +
allow-list filter, full pipeline with stub client, malformed-LLM
fallback path, no-API-key graceful degradation, persistSynthesis writes
page + evidence rows. All pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: auto-think + drift + budget meter (Codex P1 #10 fold)

src/core/anthropic-pricing.ts — USD/1M-tokens map for Claude 4.7 family
plus older aliases. estimateMaxCostUsd returns null on unpriced models so
the meter caller can warn-once and bypass the gate.

src/core/cycle/budget-meter.ts — cumulative cost ledger. Each submit
estimates max-cost from (model + estimatedInputTokens + maxOutputTokens),
accumulates per-cycle, refuses next submit when projected > cap. Codex
P1 #10 fold: non-Anthropic models (gemini, gpt) bypass with one stderr
warn per process and `unpriced=true` on the result. Budget=0 disables
the gate. Audit trail at ~/.gbrain/audit/dream-budget-YYYY-Www.jsonl.
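
The gate logic condenses to a small cumulative ledger (a hypothetical API shape; the shipped meter also estimates max-cost from token counts, warns once per process on unpriced models, and writes the JSONL audit trail):

```typescript
class BudgetMeter {
  private spentUsd = 0;
  constructor(private capUsd: number) {}

  submit(maxCostUsd: number | null): { allowed: boolean; unpriced?: boolean } {
    if (maxCostUsd === null) return { allowed: true, unpriced: true }; // unpriced model: bypass
    if (this.capUsd === 0) { this.spentUsd += maxCostUsd; return { allowed: true }; } // gate disabled
    if (this.spentUsd + maxCostUsd > this.capUsd) return { allowed: false }; // projected over cap
    this.spentUsd += maxCostUsd;
    return { allowed: true };
  }
}
```

Note the gate refuses on the *projected* total before spending, so a cycle never overshoots its cap by one call.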

src/core/cycle/auto-think.ts — auto_think dream phase. Reads
dream.auto_think.{enabled,questions,max_per_cycle,budget,cooldown_days,
auto_commit}. Iterates configured questions through runThink with the
BudgetMeter pre-checking each submit. Cooldown timestamp written ONLY on
success (matches v0.23 synthesize pattern — retries after partial
failures pick back up). When auto_commit=true, persists synthesis pages
via persistSynthesis. Default-disabled.

src/core/cycle/drift.ts — drift dream phase scaffold. Reads
dream.drift.{enabled,lookback_days,budget,auto_update}. Surfaces takes
in the soft band (weight 0.3-0.85, unresolved) that have recent timeline
evidence on the same page. v0.28 ships the orchestration; the LLM judge
that proposes weight adjustments lands in v0.29. modelId + meter wired
now so the ledger captures gate state for callers that opt in.

Tests:
- test/budget-meter.test.ts (7 cases) — pricing-map coverage, allow path,
  cumulative-deny, budget=0 disabled, unpriced bypass+warn-once, ledger
  captures all events, ISO-week filename branch.
- test/auto-think-phase.test.ts (9 cases) — auto_think enable/skip,
  questions empty, success → cooldown ts written, cooldown blocks rerun,
  budget exhausted → partial. drift not_enabled, soft-band candidate
  detection, complete + dry-run paths.

All pass. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e Postgres: takes engine + extract + MCP allow-list (12 cases)

test/e2e/takes-postgres.test.ts — full v0.28 takes pipeline against real
Postgres (gated on DATABASE_URL). 12 cases:
- addTakesBatch upsert via unnest() bind path (Postgres-specific)
- listTakes filters: holder, kind, sort=weight, takesHoldersAllowList
- searchTakes pg_trgm + allow-list filter
- supersedeTake transactional path (BEGIN/COMMIT semantics)
- resolveTake immutability — second resolve throws TAKE_ALREADY_RESOLVED
- synthesis_evidence FK CASCADE on take delete
- countStaleTakes + listStaleTakes filter active+null
- extractTakesFromDb populates takes from fenced markdown
- MCP dispatch with takesHoldersAllowList=['world'] returns only world
- MCP dispatch local-CLI path returns all holders
- MCP dispatch takes_search honors allow-list
- think op forces remote_persisted_blocked even for save+take

postgres-engine.ts: addTakesBatch boolean[] serialization fix.
postgres-js auto-detects element type from JS arrays; for booleans it
mis-detects as scalar. Cast through text[] (`'true' | 'false'`) then
SQL-cast to boolean[] — same pattern other batch methods rely on for
type-stable bind shapes.

test/e2e/helpers.ts: setupDB now (a) tolerates non-existent tables in
TRUNCATE (for fresh DBs where v31 hasn't yet created takes/synthesis_evidence)
and (b) calls engine.initSchema() to actually run migrations.

test/takes-mcp-allowlist.test.ts: updated 2 think-op cases to match
Lane D's landed pipeline. They previously asserted not_implemented
envelopes; now they assert remote_persisted_blocked + NO_ANTHROPIC_API_KEY
graceful-degrade behavior.

Run: DATABASE_URL=postgres://localhost:5435/gbrain_test bun test test/e2e/takes-postgres.test.ts
Result: 12/12 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 dream phases: local DreamPhaseResult type (avoid premature CyclePhase enum extension)

cycle.ts's PhaseResult is shaped {phase, status, summary, details} with a
narrow PhaseStatus enum ('ok'|'warn'|'fail'|'skipped') and CyclePhase enum
that doesn't yet include 'auto_think'/'drift'. The phases ship standalone
in v0.28 (cycle.ts dispatcher integration is v0.28.x); using PhaseResult
forced premature enum extension.

Introduces DreamPhaseResult exported from auto-think.ts:
  { name: 'auto_think'|'drift'; status: 'complete'|'partial'|'failed'|'skipped';
    detail: string; totals?: Record<string,number>; duration_ms: number }

drift.ts re-exports the same type. When v0.28.x wires the dispatcher, the
adapter at the call site can map DreamPhaseResult → PhaseResult cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: access_tokens.permissions JSONB end-to-end (5 cases)

test/e2e/auth-permissions.test.ts — closes the v0.28 token-allow-list
verification loop against real Postgres. Exercises:

- Migration v32 default backfill: new tokens created without a permissions
  column get {takes_holders: ["world"]} via the schema DEFAULT clause.
- Explicit ["world","garry"] → dispatch.takes_list filters to those
  holders only; brain hunches stay hidden from this token.
- ["world"] default-deny token → takes_search hits filtered to public claims.
- {} permissions row (operator tampered) gracefully defaults to ["world"]
  via the HTTP transport's validateToken parsing.
- revoked_at IS NOT NULL → token excluded from active token query.

Avoids the postgres-js JSONB double-encode trap (CLAUDE.md memory): pass
the object directly to executeRaw, no JSON.stringify, no ::jsonb cast.

All 5 pass against pgvector/pgvector:pg16 on port 5435. Combined v0.28
test sweep: 116/116 across 11 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28 e2e: chunker takes-strip integration test (Codex P0 #3 verification)

test/e2e/chunker-takes-strip.test.ts — verifies the chunker actually
strips fenced takes content end-to-end through the import pipeline.
This is the Codex P0 #3 fix's verification path: takes content lives
ONLY in the takes table for retrieval, never duplicated in
content_chunks where the per-token MCP allow-list cannot reach.

5 cases:
- chunkText (unit) output never contains TAKES_FENCE_BEGIN/END markers
- chunkText output never contains fenced claim text
- chunkText output retains non-fence prose (no over-stripping)
- importFromContent end-to-end: imported page has chunks but none
  contain fenced content
- takes_fence_chunk_leak doctor invariant: zero rows globally where
  chunk_text matches `<!--- gbrain:takes:%`

Final v0.28 test sweep:
  121 pass, 0 fail, 336 expect() calls, 12 files
  Coverage: schema migrations, engine methods (PGLite + Postgres),
  takes-fence parser, page-lock, extract phase, takes CLI engine
  surface, model config 6-tier resolver, MCP+auth allow-list,
  think pipeline (gather + sanitize + cite-render + synthesize),
  auto-think + drift + budget meter, JSONB end-to-end, chunker
  strip integration. ~95% of v0.28 surface area covered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: apply-migrations skippedFuture arrays + http-transport SQL mock

Two CI failures from PR #563:

test/apply-migrations.test.ts (2 fails) — `buildPlan` tests assert exact
skippedFuture arrays at fixed installed-version stamps. Adding v0.28.0 to
the migration registry means it shows up in skippedFuture when the test
runs at installed=0.11.1 / installed=0.12.0. Append '0.28.0' to both
hardcoded arrays.

test/http-transport.test.ts (8 fails) — the FakeEngine mock string-prefix
matches `SELECT id, name FROM access_tokens` to return a row. v0.28's
validateToken now selects `SELECT id, name, permissions FROM access_tokens`
to read the per-token takes_holders allow-list. Mock returned [] on the
new query → validateToken treated every token as invalid → 401.

Fix: mock now matches both query shapes. validTokens row gets a default
`{takes_holders: ['world']}` permission injected when caller didn't
supply one (mirrors the migration v33 column DEFAULT). Updated
FakeEngineConfig type to allow tests to pass explicit permissions.
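The shape of the fix, sketched (FakeEngine and the exact query strings come from this description; the reconstruction is illustrative, not the actual test file):

```typescript
// A prefix-matching fake that answers BOTH validateToken query shapes,
// injecting the migration-v33 column DEFAULT when a test didn't supply
// explicit permissions.
type TokenRow = { id: number; name: string; permissions?: unknown };

function fakeQuery(sql: string, validTokens: TokenRow[]): TokenRow[] {
  const oldShape = sql.startsWith("SELECT id, name FROM access_tokens");
  const newShape = sql.startsWith("SELECT id, name, permissions FROM access_tokens");
  if (!oldShape && !newShape) return [];
  return validTokens.map((t) => ({
    permissions: { takes_holders: ["world"] }, // default, mirrors the v33 column DEFAULT
    ...t, // caller-supplied permissions (if any) win
  }));
}
```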

Verification:
  bun test test/apply-migrations.test.ts → 18/18 pass
  bun test test/http-transport.test.ts   → 24/24 pass
  bun run typecheck                       → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix CI: add scope annotations to v0.28 ops (takes_list/takes_search/think)

test/oauth.test.ts enforces an invariant from master's v0.26 OAuth landing:
every Operation must have `scope: 'read' | 'write' | 'admin'`, and any op
flagged `mutating: true` must be 'write' or 'admin'. My v0.28 ops were added
before master shipped v0.26 + the new invariant; the merge surfaced the gap.

Annotations:
- takes_list   → read
- takes_search → read
- think        → write (mutating: true; --save persists synthesis page)
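The invariant reads roughly like this as code (a sketch reconstructed from this description, not the actual oauth test):

```typescript
// Every op must carry a scope; any mutating op must be 'write' or 'admin'.
type Scope = "read" | "write" | "admin";
interface Operation {
  name: string;
  scope: Scope;
  mutating?: boolean;
}

function scopeViolations(ops: Operation[]): string[] {
  return ops
    .filter((op) => op.mutating && op.scope === "read")
    .map((op) => `${op.name}: mutating ops need write or admin scope`);
}

const v028Ops: Operation[] = [
  { name: "takes_list", scope: "read" },
  { name: "takes_search", scope: "read" },
  { name: "think", scope: "write", mutating: true }, // --save persists a page
];
```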

Verification:
  bun test test/oauth.test.ts → 42/42 pass
  bun run typecheck            → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.28.1): export INJECTION_PATTERNS for shared sanitization

The same pattern set protects takes from prompt-injection (think/sanitize.ts)
and now retrieved chat content in the LongMemEval harness. One source of
truth for both surfaces; adding a new pattern in this file automatically
covers benchmarks too.

Existing consumers (sanitizeTakeForPrompt, renderTakesBlock) keep working
unchanged. Verified via test/think-pipeline.test.ts (18 pass, 0 fail).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.28.1): longmemeval harness — reset-in-place over in-memory PGLite

One in-memory PGLiteEngine per benchmark run; TRUNCATE between questions
with runtime-enumerated tables via pg_tables so future schema migrations
don't silently leak across questions. Infrastructure tables (sources,
config, gbrain_cycle_locks, subagent_rate_leases) preserved across resets
so initSchema-seeded rows like sources.'default' survive (FK target for
pages.source_id).
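The reset step can be sketched as follows (preserved-table names from this description; the SQL-building helper itself is illustrative):

```typescript
// Enumerate tables at runtime (the real code reads pg_tables) and
// TRUNCATE everything except the preserved infrastructure set, so a
// future migration's new table cannot silently leak across questions.
const PRESERVED = new Set([
  "sources",
  "config",
  "gbrain_cycle_locks",
  "subagent_rate_leases",
]);

function buildResetSql(allTables: string[]): string | null {
  const targets = allTables.filter((t) => !PRESERVED.has(t));
  if (targets.length === 0) return null;
  // Single statement + CASCADE so FK-linked tables reset together.
  return `TRUNCATE ${targets.map((t) => `"${t}"`).join(", ")} CASCADE`;
}
```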

Files:
- src/eval/longmemeval/harness.ts: createBenchmarkBrain + resetTables +
  withBenchmarkBrain. ~50 lines, no class wrapper.
- src/eval/longmemeval/adapter.ts: pure haystackToPages() converter.
  Slug prefix `chat/` (verified non-matching against DEFAULT_SOURCE_BOOSTS).
- src/eval/longmemeval/sanitize.ts: re-uses INJECTION_PATTERNS from
  think/sanitize.ts; wraps each session in <chat_session id date> tags;
  4000-char cap.
- test/longmemeval-sanitize.test.ts: 12 cases pinning the F8 contract.

Hermetic: no DATABASE_URL, no API keys.
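The session-framing contract sketched (tag shape and 4000-char cap from this commit; the helper name is made up):

```typescript
// Wrap one chat session in <chat_session> structural tags, capping the
// body at 4000 chars so a single long session can't blow the prompt.
function frameSession(id: string, date: string, body: string): string {
  const capped = body.length > 4000 ? body.slice(0, 4000) : body;
  return `<chat_session id="${id}" date="${date}">\n${capped}\n</chat_session>`;
}

const framed = frameSession("s1", "2026-04-01", "x".repeat(5000));
```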

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.28.1): gbrain eval longmemeval CLI command

Run the LongMemEval public benchmark against gbrain's hybrid retrieval.
Dataset is a positional path (download from xiaowu0162/longmemeval on HF).
Per-question loop wraps everything in try/catch; one bad question doesn't
kill the run, error JSONL line emitted instead.

Wiring:
- src/cli.ts: pre-dispatch bypass for `eval longmemeval` so the user's
  ~/.gbrain brain is never opened. Hermeticity gate verified: --help works
  on machines with no gbrain config.
- src/commands/eval-longmemeval.ts: arg parsing, JSONL emit (LF + UTF-8
  pinned), hybridSearch with optional expandQuery from search/expansion.ts,
  resolveModel from model-config.ts (6-tier chain), ThinkLLMClient injection
  seam from think/index.ts, structural <chat_session> framing.
- test/eval-longmemeval.test.ts: 12 cases covering harness lifecycle,
  reset clears all tables, schema-migration robustness, p50/p99 speed gate
  (warm reset+import+search target <500ms), adapter shape, source-boost
  regression guard, end-to-end with stubbed LLM, JSONL format guard,
  per-question failure handling.
- test/fixtures/longmemeval-mini.jsonl: 5 hand-authored questions with
  keyword-friendly overlap so --keyword-only works in CI.
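The per-question isolation above, as a sketch (synchronous here for brevity; the real loop is async and emits to a JSONL file):

```typescript
// One bad question becomes an error JSONL line instead of killing the run.
function runQuestions<T>(
  questions: T[],
  answer: (q: T) => object,
  emit: (jsonlLine: string) => void,
): void {
  for (const q of questions) {
    try {
      emit(JSON.stringify(answer(q)));
    } catch (err) {
      emit(JSON.stringify({ error: String(err) }));
    }
  }
}

const lines: string[] = [];
runQuestions(
  [1, 2, 3],
  (q) => {
    if (q === 2) throw new Error("bad question");
    return { question: q, answer: "ok" };
  },
  (line) => lines.push(line),
);
```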

Speed: warm reset+import 5 pages+search p50=25.9ms p99=30.3ms locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.28.1): bump VERSION + CHANGELOG

VERSION + package.json synchronized at 0.28.1. CHANGELOG entry uses the
release-summary voice + "To take advantage of v0.28.1" block per CLAUDE.md.

Sequential release on garrytan/v0.28-release; lands after v0.28.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: surface v0.28.1 LongMemEval CLI across project docs

- README.md: add EVAL section to Commands reference (eval --qrels, export,
  prune, replay, longmemeval); add v0.28.1 announce paragraph next to the
  v0.25.0 BrainBench-Real intro.
- CLAUDE.md: add Key files entry for src/eval/longmemeval/ +
  src/commands/eval-longmemeval.ts; add "Key commands added in v0.28.1"
  subsection (mirrors the v0.26.5 / v0.25.0 pattern); inventory
  test/eval-longmemeval.test.ts + test/longmemeval-sanitize.test.ts under
  the unit-test list.
- docs/eval-bench.md: cross-link from the "What it actually does" section
  to LongMemEval as the third evaluation axis (public benchmark,
  ground-truth labels, full QA pipeline); append "Public benchmarks:
  LongMemEval (v0.28.1)" section with architecture, flags table, and
  perf numbers.
- CONTRIBUTING.md: append a paragraph after the eval-replay block pointing
  contributors at gbrain eval longmemeval for public-benchmark coverage.
- AGENTS.md: extend the existing eval-retrieval bullet with a one-line
  mention of gbrain eval longmemeval.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.28.2 feat: remote-source MCP + scope hierarchy + whoami (#690)

* refactor(core): extract SSRF helpers from integrations.ts to core/url-safety.ts

src/core/git-remote.ts (next commit) needs isInternalUrl etc. but importing
from src/commands/ would invert the layering boundary (no existing
src/core/ file imports from src/commands/). Extract the SSRF helpers
(parseOctet, hostnameToOctets, isPrivateIpv4, isInternalUrl) into a new
src/core/url-safety.ts and have integrations.ts re-export for backward
compat. test/integrations.test.ts continues to pass without changes (110
existing tests, 214 expects).

Why this matters for v0.28: the upcoming sources --url feature reuses
this SSRF gate for git-clone URL validation. Codex review caught that
re-rolling weaker URL classification would regress on the IPv6/v4-mapped/
metadata/CGNAT bypass forms that integrations.ts already handles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add git-remote module — SSRF-defensive clone/pull + state probe

New src/core/git-remote.ts (~210 lines) for v0.28's remote-source feature:

- GIT_SSRF_FLAGS exported const: -c http.followRedirects=false,
  -c protocol.file.allow=never, -c protocol.ext.allow=never,
  --no-recurse-submodules. Single source of truth shared by cloneRepo
  and pullRepo so a future flag added to one path lands on both.
  Closes the SSRF surfaces codex flagged: DNS rebinding via redirects,
  .gitmodules as a second-fetch surface, file:// scheme in remotes.

- parseRemoteUrl: https-only, rejects embedded credentials and path
  traversal, delegates internal-target classification to isInternalUrl
  from url-safety.ts (covers RFC1918, link-local, loopback, IPv6, CGNAT
  100.64/10, metadata hostnames, hex/octal/single-int bypass forms).
  A GBRAIN_ALLOW_PRIVATE_REMOTES=1 escape hatch (with a stderr warning)
  covers self-hosted git over Tailscale, whose CGNAT range trips the gate.
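For flavor, a tiny dotted-decimal subset of that classification (the shipped url-safety.ts helpers also normalize hex, octal, and single-integer host forms, plus IPv6 and metadata hostnames; this sketch is not the real gate):

```typescript
// Illustrative subset only: classify a dotted-decimal IPv4 host as
// internal. The real helpers cover far more bypass forms.
function isPrivateIpv4Sketch(host: string): boolean {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 10 || a === 127) return true;            // RFC1918 10/8, loopback
  if (a === 192 && b === 168) return true;           // RFC1918 192.168/16
  if (a === 172 && b >= 16 && b <= 31) return true;  // RFC1918 172.16/12
  if (a === 169 && b === 254) return true;           // link-local
  if (a === 100 && b >= 64 && b <= 127) return true; // CGNAT 100.64/10
  return false;
}
```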

- cloneRepo: --depth=1 default (full clone via depth: 0); refuses
  non-empty destDirs; spawns git via execFileSync (no shell injection)
  with GIT_TERMINAL_PROMPT=0 + askpass=/bin/false to prevent credential
  prompts. timeoutMs default 600s.

- pullRepo: -C path + GIT_SSRF_FLAGS + pull --ff-only, same env confine.

- validateRepoState: 6-state decision tree (missing | not-a-dir |
  no-git | corrupted | url-drift | healthy). Used by performSync's
  re-clone branch to recover from rm'd clone dirs and refuse syncs on
  url-drift or corruption.

test/git-remote.test.ts (304 lines, 32 tests): GIT_SSRF_FLAGS exact
shape, all parseRemoteUrl rejection cases including dedicated CGNAT
100.64/10 with/without GBRAIN_ALLOW_PRIVATE_REMOTES (codex T3 case),
fake-git harness for argv assertions on cloneRepo/pullRepo, all 6
validateRepoState branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): add scope hierarchy + ALLOWED_SCOPES allowlist

New src/core/scope.ts (~120 lines) for v0.28's scoped MCP feature.

Hierarchy:
  - admin implies all (escape hatch)
  - write implies read
  - sources_admin and users_admin are siblings (different axes —
    sources-mgmt vs user-account-mgmt; neither implies the other)

Exported:
  - hasScope(grantedScopes, requiredScope): the canonical scope check.
    Replaces exact-string-match at three call sites in upcoming commits
    (serve-http.ts:673, oauth-provider.ts:365 F3 refresh, oauth-provider.ts:498
    token issuance). Without this rewrite, an admin-grant token would
    fail to refresh down to sources_admin (codex finding).
  - ALLOWED_SCOPES set + ALLOWED_SCOPES_LIST sorted array (deterministic
    for OAuth metadata wire format and drift-check output).
  - assertAllowedScopes / InvalidScopeError: registration-time gate so
    tokens with bogus scope strings (read flying-unicorn) get rejected
    with RFC 6749 §5.2 invalid_scope at auth.ts:296 + DCR /register +
    registerClientManual. Today's behavior accepts any string silently.
  - parseScopeString: space-separated wire format → array.

Forward-compat: hasScope ignores unknown granted scopes rather than
throwing, so pre-allowlist tokens with weird scope strings continue
working without crashes (registration is the gate, runtime is best-effort).
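The hierarchy and check, sketched (a reconstruction from this description, not src/core/scope.ts itself):

```typescript
// admin implies all; write implies read; *_admin scopes are siblings.
const IMPLIES: Record<string, string[]> = {
  admin: ["read", "write", "sources_admin", "users_admin"],
  write: ["read"],
};

function hasScopeSketch(granted: string[], required: string): boolean {
  // Unknown granted scopes are ignored rather than thrown on
  // (registration is the gate; runtime is best-effort).
  return granted.some(
    (g) => g === required || (IMPLIES[g] ?? []).includes(required),
  );
}
```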

test/scope.test.ts (178 lines, 35 tests): hierarchy table including
all-implies for admin, sibling non-implication of *_admin scopes,
write→read but not the reverse, F3 refresh-token subset semantics
under hasScope, ALLOWED_SCOPES_LIST sorted-pinning, allowlist
rejection cases, parseScopeString edge cases (undefined/null/empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope-constants mirror + drift CI for src/core/scope.ts

The admin React SPA's tsconfig.json limits `include` to ['src'] (i.e.
admin/src/), so it cannot directly import ../../src/core/scope.ts. The plan considered
widening the include or generating a single source of truth; both options
either couple the SPA to the gbrain monorepo or add a build step. Eng
review picked the boring choice: hand-maintained mirror at
admin/src/lib/scope-constants.ts plus a CI drift check.

Files:
  - admin/src/lib/scope-constants.ts: hand-maintained ALLOWED_SCOPES_LIST
    duplicate, sorted alphabetically to match src/core/scope.ts.
  - scripts/check-admin-scope-drift.sh: extracts the list from each file
    via awk, normalizes via tr/sort, diffs. Exits 0 on match, 1 on drift
    (with full breakdown of which scopes diverged), 2 on internal error.
    Tested both passing and corrupted paths.
  - package.json: wires check:admin-scope-drift into both `verify` and
    `check:all` so any update to src/core/scope.ts that forgets the
    admin-side mirror fails the build.

The Agents.tsx scope-checkbox sites (5 hardcoded locations) get updated
in a later commit to import from this constants file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(oauth): hasScope hierarchy + ALLOWED_SCOPES allowlist at registration

Switch three call sites in oauth-provider.ts from exact-string-match to
hasScope() so the v0.28 sources_admin and users_admin scopes — and the
admin-implies-all + write-implies-read hierarchy in src/core/scope.ts —
work end to end:

- F3 refresh-token subset enforcement at line 365: previously rejected
  admin → sources_admin refresh because exact-match treated them as
  unrelated scopes. gstack /setup-gbrain Path 4 needs admin tokens to
  refresh down to least-privilege sources_admin scope; this fix lands
  that path.

- Token issuance intersection at line 498 (client_credentials grant):
  same hasScope swap so a client whose stored grant is `admin` can mint
  tokens including any implied scope.

- registerClient (DCR /register) and registerClientManual: validate
  every scope string against ALLOWED_SCOPES via assertAllowedScopes.
  Pre-fix the system silently accepted `--scopes "read flying-unicorn"`
  and persisted the bogus string in oauth_clients.scope. Post-fix the
  caller gets RFC 6749 §5.2 invalid_scope. Existing rows with
  pre-allowlist scopes keep working (allowlist gates registration only).

Tests amended in test/oauth.test.ts:
- T1 (eng-review): admin grant CAN refresh down to sources_admin
- T1 sibling: write grant CANNOT refresh up to sources_admin
- ALLOWED_SCOPES allowlist coverage (manual + DCR paths, all 5 valid)
- Scope-annotation contract tests widened to accept the v0.28 union

62 OAuth tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(serve-http): hasScope at /mcp + advertise full ALLOWED_SCOPES

Two changes against src/commands/serve-http.ts:

- Line 195: scopesSupported on the mcpAuthRouter options switches from the
  hardcoded ['read','write','admin'] to Array.from(ALLOWED_SCOPES_LIST).
  Without this, /.well-known/oauth-authorization-server keeps reporting
  the old triple, so MCP clients (Claude Desktop, ChatGPT, Perplexity)
  cannot discover the v0.28 sources_admin and users_admin scopes via
  standard discovery — they would have to be pre-configured out of band.

- Line 673: request-time scope check on /mcp swaps
  authInfo.scopes.includes(requiredScope) for hasScope(...). This was
  the most-cited codex finding: without it, sources_admin tokens could
  not even satisfy a `read`-scoped op (sources_admin doesn't include
  the literal string "read"). hasScope routes through the hierarchy
  table in src/core/scope.ts so admin implies all and write implies
  read at the gate too.

T2 amendment in test/e2e/serve-http-oauth.test.ts: assert
/.well-known/oauth-authorization-server includes all 5 scopes in
scopes_supported. Pre-v0.28 the list was hardcoded to ['read','write',
'admin'] and this assertion would have failed. (The test is
Postgres-gated; runs under bun run test:e2e with DATABASE_URL set.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): sources-ops module — atomic clone + symlink-safe cleanup

src/core/sources-ops.ts (~470 lines): pure async functions extracted from
src/commands/sources.ts so the CLI handlers and the new MCP ops share
one implementation.

addSource: D3 atomicity contract from the eng review.
  1. Validate id (matches existing SOURCE_ID_RE).
  2. Q4 pre-flight SELECT — fail loudly with structured `source_id_taken`
     before any clone work. Pre-fix the existing CLI used INSERT…ON
     CONFLICT DO NOTHING which silently no-op'd; with clone-first that
     would orphan the temp dir.
  3. parseRemoteUrl gate (delegates to isInternalUrl from url-safety.ts).
  4. Clone into $GBRAIN_HOME/clones/.tmp/<id>-<rand>/ via the new
     git-remote helpers.
  5. INSERT row with local_path=<final clone dir>, config.remote_url=<url>.
  6. fs.renameSync(tmp/, final/). Rollback on either-side failure unlinks
     the temp dir; rename-failed path also DELETEs the just-INSERTed row
     best-effort.
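The six steps above, as a dependency-injected sketch (synchronous for brevity; the real addSource is async and lives in sources-ops.ts):

```typescript
// D3 atomicity: clone into temp, INSERT, rename; roll back whichever
// side already landed when the other fails.
interface AddSourceDeps {
  preflightTaken: (id: string) => boolean; // Q4 pre-flight SELECT
  clone: (url: string, tmpDir: string) => void;
  insertRow: (id: string, finalDir: string) => void;
  deleteRow: (id: string) => void; // best-effort rollback
  rename: (from: string, to: string) => void;
  rmTmp: (dir: string) => void;
}

function addSourceSketch(
  deps: AddSourceDeps,
  id: string,
  url: string,
  tmpDir: string,
  finalDir: string,
): void {
  if (deps.preflightTaken(id)) throw new Error("source_id_taken");
  deps.clone(url, tmpDir);
  try {
    deps.insertRow(id, finalDir);
  } catch (e) {
    deps.rmTmp(tmpDir); // clone landed, row didn't: drop the temp dir
    throw e;
  }
  try {
    deps.rename(tmpDir, finalDir);
  } catch (e) {
    deps.rmTmp(tmpDir);
    try {
      deps.deleteRow(id); // row landed, rename didn't: best-effort delete
    } catch {}
    throw e;
  }
}
```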

removeSource: clone-cleanup with realpath+lstat confinement matching
validateUploadPath() shape at src/core/operations.ts:61. String startsWith
is symlink-unsafe and would let $GBRAIN_HOME/clones/<id> → /etc resolve
out of the confine. Two defenses layered:
  - isPathContained (realpath-resolves both sides + parent-with-sep
    string check) rejects symlinks whose target falls outside the
    confine.
  - lstat-then-isSymbolicLink check refuses symlinks whose realpath
    happens to land back inside the confine (defense in depth).

getSourceStatus: returns clone_state via validateRepoState (the 6-state
decision tree from git-remote.ts). Lets a remote MCP caller diagnose
"healthy | missing | not-a-dir | no-git | url-drift | corrupted" without
SSH access to the brain host. listSources additionally exposes
remote_url so callers can see which sources are auto-managed.

recloneIfMissing: T4 follow-up for `gbrain sources restore` after the
clone dir was autopurged — re-clones via the same temp + rename
atomicity contract. Idempotent (returns false when clone is already
healthy).

test/sources-ops.test.ts (~470 lines, 24 tests): pre-flight collision
(Q4), happy paths for both --path and --url, all four D3 rollback paths
(clone-fail before INSERT, INSERT-fail after clone, rename-fail
post-INSERT, atomic temp-dir cleanup), symlink-target-OUTSIDE-clones
(realpath confinement), symlink-target-INSIDE-clones (lstat-check),
removeSource refuses to delete user-supplied paths, refuses "default"
source, getSourceStatus clone_state branches, T4 recloneIfMissing
recovery + idempotent + no-op for path-only sources, isPathContained
unit tests covering subtree / outside / symlink-escape / fail-closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): whoami + sources_{add,list,remove,status} MCP ops

Five new ops in src/core/operations.ts auto-flow through src/mcp/tool-defs.ts
so MCP clients (Claude Desktop, ChatGPT, Perplexity, OpenClaw) get them via
standard tools/list discovery — no SDK or transport code changes needed.

Operation.scope union widened to add 'sources_admin' and 'users_admin' (the
v0.28 hierarchy from src/core/scope.ts).

whoami (scope: read): introspect calling identity over MCP.
  - Returns `{transport: 'oauth', client_id, client_name, scopes, expires_at}`
    for OAuth clients (clientId starts with gbrain_cl_).
  - Returns `{transport: 'legacy', token_name, scopes, expires_at: null}`
    for grandfathered access_tokens.
  - Returns `{transport: 'local', scopes: []}` when ctx.remote === false.
    Empty scopes (NOT ['read','write','admin']) is the D2 decision —
    returning OAuth-shaped scopes for local callers would resurrect the
    v0.26.9 footgun where code conditioned trust on
    `auth.scopes.includes('admin')` instead of `ctx.remote === false`.
  - Q3 fail-closed: throws unknown_transport when remote=true AND auth is
    missing OR ctx.remote is the literal `undefined` (cast bypass guard).
    A future transport that forgets to thread auth doesn't get a free
    pass.
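The decision tree as a sketch (return shapes copied from this description; field plumbing is illustrative):

```typescript
interface AuthInfo {
  clientId?: string;
  tokenName?: string;
  scopes: string[];
}
interface OpCtx {
  remote?: boolean;
  auth?: AuthInfo;
}

// Transport detection: local gets empty scopes (D2); a missing or
// undefined context fails closed (Q3) so a future transport that
// forgets to thread auth can't get a free pass.
function whoamiSketch(ctx: OpCtx): Record<string, unknown> {
  if (ctx.remote === false) {
    return { transport: "local", scopes: [] }; // NOT ['read','write','admin']
  }
  if (ctx.remote !== true || !ctx.auth) {
    throw new Error("unknown_transport"); // Q3 fail-closed
  }
  if (ctx.auth.clientId?.startsWith("gbrain_cl_")) {
    return { transport: "oauth", client_id: ctx.auth.clientId, scopes: ctx.auth.scopes };
  }
  return { transport: "legacy", token_name: ctx.auth.tokenName ?? "", scopes: ctx.auth.scopes, expires_at: null };
}
```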

sources_add (sources_admin, mutating): register a source by --path
  (existing v0.17 behavior) or --url (v0.28 federated remote-clone path).
  Calls into addSource from sources-ops.ts which owns the temp-dir +
  rename atomicity.

sources_list (read): list registered sources with page counts, federated
  flag, and remote_url. The remote_url field is new — lets a remote MCP
  caller see which sources are auto-managed.

sources_remove (sources_admin, mutating): cascade-delete a source +
  symlink-safe clone cleanup. Requires confirm_destructive: true when the
  source has data.

sources_status (read): per-source diagnostic returning clone_state
  ('healthy' | 'missing' | 'not-a-dir' | 'no-git' | 'url-drift' |
  'corrupted' | 'not-applicable') — lets a remote MCP caller diagnose a
  busted clone without SSH access to the brain host.

test/whoami.test.ts (9 tests): pinned transport-detection for all four
return shapes including Q3 fail-closed throw under both auth=undefined
and remote=undefined cast-bypass paths.

test/sources-mcp.test.ts (16 tests): op-metadata pins (scope, mutating,
localOnly), functional handler shape against PGLite, hasScope-driven
scope-enforcement smoke test simulating the serve-http.ts:673 gate
(read-only token rejected for sources_add; sources_admin token allowed;
admin token allowed for everything; gstack /setup-gbrain Path 4 token
covers all 4 ops), SSRF gate at the op layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sync): re-clone fallback when clone is missing/no-git/corrupted

src/commands/sync.ts gets a v0.28-aware front-half. When the source has
config.remote_url, performSync calls validateRepoState before the existing
fast-forward pull path:

  - 'healthy'   → fall through to existing pull (unchanged)
  - 'missing' / 'no-git' / 'not-a-dir'
                → loud stderr "auto-recovery: re-cloning <id>", then
                  recloneIfMissing handles the temp-dir + rename; sync
                  continues from the freshly-cloned head.
  - 'corrupted' → throw with structured hint pointing at sources remove
                  + add (no syncing from wrong state).
  - 'url-drift' → throw with hint pointing at the (deferred) sources
                  rebase-clone command.

Closes the operator-confidence gap: rm -rf $GBRAIN_HOME/clones/<id>/ no
longer breaks future syncs. The next sync sees the missing dir and
recovers via the recorded URL.

src/core/operations.ts: extend ErrorCode with 'unknown_transport' so
whoami's Q3 fail-closed path types check.

test/sources-resync-recovery.test.ts (12 tests): full validateRepoState
state matrix exercised under fake-git, recloneIfMissing recovery from
each degraded state, idempotent on healthy clones, the sync.ts:320
integration path that drives the recovery.

test/sources-ops.test.ts + test/sources-mcp.test.ts: drop the
GBRAIN_PGLITE_SNAPSHOT-disable line so these tests stop forcing cold
init across the parallel-shard runner. With snapshot allowed, init time
drops from 6+s to ~50ms and parallel runs stay under the 5s hook
timeout.

test/sources-mcp.test.ts: tighten scope literal-type so tsc keeps the
union narrow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): sources add --url + restore re-clone, thin-wrapper refactor

src/commands/sources.ts now delegates the data-mutation work to
src/core/sources-ops.ts (added in the previous commit). The CLI handler
parses argv, calls into addSource, and formats output.

Two new flags on `gbrain sources add`:
  - `--url <https-url>` : federated remote-clone path (clone + INSERT +
    rename, atomic rollback on failure).
  - `--clone-dir <path>` : override the default
    $GBRAIN_HOME/clones/<id>/ destination.

Validation rejects mutually-exclusive `--url` + `--path`. Errors from
the ops layer (SourceOpError) propagate through the CLI's standard
error wrapper in src/cli.ts so existing tests that assert throw shape
keep passing.

`gbrain sources restore <id>` (T4 from eng review): if the source has a
remote_url AND the on-disk clone was autopurged, call recloneIfMissing
before declaring success. Clone errors print a WARN with recovery
hints rather than failing the restore — the DB row is what restore
guarantees; the clone is best-effort.

54 sources-related tests pass (existing test/sources.test.ts +
sources-ops + sources-mcp).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor,cycle): orphan-clones surface + autopilot purge phase (P1)

addSource's atomicity contract uses a temp dir that gets renamed to the
final clone path. If the process is SIGKILL'd between clone-finish and
rename, the temp dir orphans on disk. Without sweeping these, a brain
server accumulates gigabytes over months of failed `sources add --url`
attempts.

Two layers:

1. `gbrain doctor` now surfaces stale entries. A new orphan_clones check
   walks $GBRAIN_HOME/clones/.tmp/, names anything older than 24h, and
   prints a warn with disk-byte estimate. Operators see the leak before
   `df` complains.

2. The autopilot cycle's existing `purge` phase grows a substep that
   nukes .tmp/ entries past the same 72h TTL the page-soft-delete purge
   uses. Operator behavior stays uniform across all soft-delete-style
   surfaces.

Both layers are filesystem-only (no DB). On a brain that never used
--url cloning, both are no-ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* build(admin): scope checkboxes source from scope-constants mirror + dist

admin/src/pages/Agents.tsx Register Client modal:
  - useState default sources from ALLOWED_SCOPES_LIST (defaulting `read`
    to true, others false; unchanged UX for the common case).
  - Scope checkbox map iterates ALLOWED_SCOPES_LIST instead of the old
    hardcoded ['read','write','admin'].

Without this commit, even with the v0.28.1 server-side scope hierarchy,
operators registering an OAuth client from the admin UI cannot tick the
new sources_admin / users_admin scopes — which defeats the whole
gstack /setup-gbrain Path 4 unblock.

The drift-check CI gate (scripts/check-admin-scope-drift.sh) ensures
this list stays in sync with src/core/scope.ts going forward.

admin/dist/* rebuilt via `cd admin && bun run build`. Old hash bundle
removed; new bundle (224.96 kB / 68.70 kB gzip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.28.1 — remote-source MCP + scope hierarchy + whoami

VERSION + package.json: bump to 0.28.1 (per CLAUDE.md branch-scoped
versioning rule — this branch adds substantial new features on top of
v0.28.0).

CHANGELOG.md: new top-level entry for v0.28.1 in the gstack/Garry voice
(no AI vocabulary, no em dashes, real numbers + commands). Lead
paragraph names what the user can now do that they couldn't before.
"Numbers that matter" table calls out the +5 MCP ops, +2 OAuth scopes,
and the 4-to-0 SSH-step number for gstack /setup-gbrain Path 4. "What
this means for you" closer ties the work to the operator workflow shift.
"To take advantage of v0.28.1" block has paste-ready upgrade commands
including the admin SPA rebuild step. Itemized changes section
describes the architecture cleanly without exposing scope-string
internals to public attack-surface enumeration (per CLAUDE.md
responsible-disclosure rule).

TODOS.md: file 6 follow-ups under a new "Remote-source MCP follow-ups
(v0.28.1)" section: token rotation, migration introspection in
get_health, Accept-header friendliness, sources rebase-clone for
URL-drift recovery, --filter=blob:none partial-clone option, and the
chunker_version PGLite-schema parity codex caught.

README.md: short subsection under the existing sources CLI listing
that names the new --url flag and what auto-recovery does. Capability
framing (no scope-string enumeration).

llms.txt + llms-full.txt: regenerated via `bun run build:llms` so the
documentation bundle reflects the v0.28.1 entry. The build-llms
generator's drift check passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): sources-remote-mcp — full gstack /setup-gbrain Path 4 round-trip

Spins up `gbrain serve --http` against real Postgres with a fake-git binary
in PATH (so `git clone` is exercised end-to-end without network), registers
two OAuth clients (sources_admin + read-only), mints tokens, calls the new
v0.28.1 MCP ops via /mcp, and asserts the gstack /setup-gbrain Path 4 flow
works end to end.

12 tests cover the full lifecycle:
- whoami over HTTP MCP returns transport=oauth + the right scopes
- /.well-known/oauth-authorization-server advertises all 5 scopes
- sources_add: clone fires, INSERT lands, row carries config.remote_url
- sources_status: clone_state=healthy after add
- sources_list: surfaces remote_url for the new source
- SSRF rejection: sources_add with RFC1918 URL fails at parseRemoteUrl gate
- Scope enforcement: read-only token gets insufficient_scope on sources_add
- Read-only token CAN call sources_list (read-scoped op)
- ALLOWED_SCOPES allowlist: CLI register-client rejects bogus scope
- Recovery: rm clone dir + sources_status reports clone_state=missing
- sources_remove: cascades + cleans up the auto-managed clone dir

Subprocess env threading replicates the v0.26.2 bun execSync inheritance
pattern — bun subprocesses do NOT inherit process.env mutations, so every
CLI subprocess call passes env: { ...process.env } explicitly.

Cleanup contract mirrors test/e2e/serve-http-oauth.test.ts: revoke any
clients we registered, force-kill the server subprocess on SIGTERM
timeout, surface cleanup failures to stderr without throwing so real
test failures aren't masked.

The base table list in helpers.ts (ALL_TABLES) doesn't include sources
or oauth_clients, so this test explicitly truncates them in beforeAll
to avoid Q4 pre-flight collisions on re-run.

Skipped gracefully when DATABASE_URL is unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: codex adversarial review — confine remote sources_admin + close SSRF gaps

Pre-ship adversarial review (codex exec) caught five issues. Four ship in
this commit; the fifth (DNS rebinding) is filed as v0.28.x follow-up.

CRITICAL — `sources_admin` tokens over HTTP MCP could plant content at any
host path. The MCP op exposed `path` and `clone_dir` to remote callers; the
op layer trusted them verbatim, then auto-recovery's rm -rf on degraded
state turned that into arbitrary delete primitives. src/core/operations.ts
sources_add handler now drops both fields when ctx.remote !== false. Local
CLI keeps the override (operator trust). Loud logger.warn when a remote
caller tries — visible in the SSE feed without leaking values.

HIGH — Steady-state `git pull --ff-only` bypassed GIT_SSRF_FLAGS entirely.
The legacy helper at src/commands/sync.ts:192 spawned git without the
-c http.followRedirects=false -c protocol.{file,ext}.allow=never
--no-recurse-submodules set that cloneRepo applies. Every recurring sync
was reopening the redirect/submodule/protocol bypass. Routed the call site
at sync.ts:381 through pullRepo from git-remote.ts so initial clone and
ongoing pull share one defensive flag set.

MEDIUM — listSources ignored its `include_archived` flag. The op
advertised the param but the function destructured it as `_opts` and
queried every row. Archived sources' ids, local_paths, and remote_urls
were leaking to read-scoped MCP callers by default. Filter in SQL
(`WHERE archived IS NOT TRUE` unless the flag is set) so archived rows
never reach the wire.

PARTIAL HIGH — IPv6 ULA fc00::/7 and link-local fe80::/10 were not in
the isInternalUrl bypass list. Only ::1/:: and IPv4-mapped IPv6 were
blocked. Added regex-based ULA + link-local rejection to url-safety.ts.

Test coverage:
- test/git-remote.test.ts: 4 new IPv6 cases (ULA fc-prefix + fd-prefix,
  link-local fe80::, public IPv6 still allowed).
- test/sources-mcp.test.ts: 3 new cases pinning the remote/local
  asymmetry (clone_dir override silently ignored over MCP, path nulled,
  local CLI keeps the override).
- test/sources-mcp.test.ts: 2 new cases for include_archived honored.

DNS rebinding (codex finding #3): the current gate is lexical only.
A deliberate attacker who controls a hostname's A/AAAA records can still
resolve to an internal IP. Closing this requires async DNS resolution +
revalidation; filed as v0.28.x follow-up in TODOS.md so the API change
surface (parseRemoteUrl becomes async, every caller updates) lands in
its own PR.

323 tests pass (9 files); 4071 unit tests pass (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump v0.28.1 → v0.28.2 (master collision)

Caught after PR creation. master is at v0.28.1 already; this branch
forked from garrytan/v0.28-release at v0.28.0 and naively bumped to
v0.28.1 without checking the master queue. CI version-gate would have
rejected at merge time (requires VERSION strictly greater than
master's).

Root cause: I bumped VERSION mechanically during plan implementation
(echo "0.28.1" > VERSION) without consulting the queue-aware allocator
at bin/gstack-next-version. /ship Step 12's idempotency check then
classified state as ALREADY_BUMPED and the workflow's "queue drift"
comparison was the safety net I should have hit — but I skipped it.

Files updated:
- VERSION + package.json: 0.28.1 → 0.28.2
- CHANGELOG.md: header + "To take advantage of v0.28.2" subsection
- README.md: sources --url note version reference
- TODOS.md: 7 follow-up entries' version references
- llms.txt + llms-full.txt: regenerated

PR title rewrite via gstack-pr-title-rewrite.sh handled in a separate
gh pr edit call; CI version-gate now passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(todos): close longmemeval-publication, file 4 follow-up TODOs

Full 500-question 4-adapter LongMemEval _s benchmark landed at
github.com/garrytan/gbrain-evals#main:ced01f0. gbrain-hybrid 97.60% R@5,
+1.0pt over MemPal raw 96.6%. Replacing the now-stale "needs full run"
TODO with closure + 4 grounded follow-ups:

  1. Timeline-aware retrieval signal for temporal-reasoning questions
     (P2 — closes the only category we lose to MemPal-raw)
  2. Per-question batch consolidation for ~10x cold-cache speedup
     (P3 — makes daily benchmark CI gate practical)
  3. LongMemEval _m split run (P3 — differentiated, not yet published
     by MemPal)
  4. Cheaper-embedding-model recipe (P4 — recall-cost tradeoff curve)

Each TODO has the standard What/Why/Pros/Cons/Context/Depends-on shape per
the gbrain TODOS-format convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(llms): regenerate llms-full.txt to match merged CLAUDE.md

CI test/build-llms.test.ts asserts the committed llms.txt/llms-full.txt
are byte-for-byte identical to what scripts/build-llms.ts produces. The
master merge brought in v0.28.9/v0.28.10/v0.28.11 + multimodal embedding
notes that updated CLAUDE.md; the bundle was stale.

No content changes. Pure regeneration via `bun run build:llms`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): rewrite v0.28.12 entry — lead with the LongMemEval result

Old entry buried the headline ("LongMemEval lands in the box…") under
process detail (hermetic CI test count, 25.9ms p50, schema-table
runtime enumeration). The reader cares what gbrain DOES — not how we
plumbed the harness.

New entry leads with the actual number — 97.60% R@5 on the public
LongMemEval _s split, beating MemPalace raw by 1.0pt — followed by
the per-category win table that proves gbrain ties or beats MemPal in
5 of 6 question types and shows the +7.1pt assistant-voice lift.

Links to the full gbrain-evals report (97.60% headline + full
methodology + reproducible runner) so curious readers can dig deeper.

Two honest findings published in plain text: vector-only is
essentially tied with hybrid at K=5, and query expansion via Haiku is
a clean null result on this dataset. Better to publish the null than
hide it.

Reproduction block updated to match the actual gbrain-evals workflow
(clone + bun install + dataset download + bash batch runner). The
prior "download / run / hand to evaluate_qa.py" block stayed for the
in-tree CLI path.

Regenerated llms-full.txt to keep the build-llms regen-drift guard
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 8, 2026
…t without being asked (#592)

* v0.29 foundation: emotional_weight column + formula + anomaly stats

Migration v34 adds pages.emotional_weight REAL DEFAULT 0.0 (column-only,
no index — salience query orders by computed score, not raw weight).
Embedded DDL (schema.sql + pglite-schema.ts + schema-embedded.ts)
mirrors the column so fresh installs don't need migration replay.

types.ts gains: PageFilters.sort enum + PAGE_SORT_SQL whitelist (engines
hardcoded ORDER BY updated_at DESC; threading lands in the next commit);
SalienceOpts/SalienceResult, AnomaliesOpts/AnomalyResult,
EmotionalWeightInputRow/EmotionalWeightWriteRow contracts.

cycle/emotional-weight.ts: pure-function score in [0..1] from tags +
takes (anglocentric default seed list; user-overridable via config key
emotional_weight.high_tags). cycle/anomaly.ts: meanStddev + cohort
threshold helpers with zero-stddev fallback (count > mean + 1) so rare
cohorts don't produce NaN sigmas.

Test coverage: migrate v34 structural assertions + 14-case formula
unit + 13-case anomaly stats unit. Codex review fixes baked in:
formula clamped to [0,1]; per-take weight clamped to [0,1] before
averaging; zero-stddev fallback finite, never NaN.
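A minimal sketch of the clamping invariants and the zero-stddev fallback. The actual scoring formula in cycle/emotional-weight.ts combines tag hits and take weights its own way; this only pins the [0,1] contracts and the finite-sigma behavior named above:

```typescript
const clamp01 = (x: number) => Math.min(1, Math.max(0, x));

// Per-take weights are clamped BEFORE averaging, then the combined score
// is clamped again so tag contributions can't push past 1.
function combineWeights(tagScore: number, takeWeights: number[]): number {
  const clamped = takeWeights.map(clamp01);
  const takeAvg = clamped.length
    ? clamped.reduce((a, b) => a + b, 0) / clamped.length
    : 0;
  return clamp01(tagScore + takeAvg);
}

function meanStddev(counts: number[]): { mean: number; stddev: number } {
  const n = counts.length;
  const mean = n ? counts.reduce((a, b) => a + b, 0) / n : 0;
  const variance = n
    ? counts.reduce((a, c) => a + (c - mean) ** 2, 0) / n
    : 0;
  return { mean, stddev: Math.sqrt(variance) };
}

// Zero-stddev fallback: a constant-history cohort would divide by zero
// when computing sigma; instead flag anything above mean + 1. Never NaN.
function isAnomalous(count: number, mean: number, stddev: number, sigma: number): boolean {
  if (stddev === 0) return count > mean + 1;
  return (count - mean) / stddev > sigma;
}
```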

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 engine: batch emotional-weight methods + listPages sort

BrainEngine adds 4 methods, both engines implement:

- batchLoadEmotionalInputs(slugs?): CTE-shaped read with per-table
  pre-aggregates. A page with N tags + M takes never produces N×M rows
  (codex C4#4) — page_tags + page_takes CTEs aggregate independently,
  then LEFT JOIN to pages.

- setEmotionalWeightBatch(rows): UPDATE FROM unnest($1::text[],
  $2::text[], $3::real[]) composite-keyed on (slug, source_id). Multi-
  source brains can't fan out (codex C4#3) — pages.slug is unique only
  within source_id. Same shape that v0.18 link batches use.

- getRecentSalience: time boundary computed in JS, bound as TIMESTAMPTZ.
  SQL identical across engines (codex C5/D5 — avoids dialect drift on
  $1::interval binding which has zero current uses on PGLite).

- findAnomalies: tag + type cohort baselines via generate_series-
  densified daily-count CTEs (codex C4#6). Sparse-day rare cohorts get
  correct (mean, stddev) instead of biased upward by zero-omission.
  Year cohort deferred to v0.30.

listPages threads the new PageFilters.sort enum through both engines.
Was hardcoded ORDER BY updated_at DESC; now PAGE_SORT_SQL whitelist
maps the 4 enum values to literal SQL fragments — no injection surface.
postgres.js uses sql.unsafe; PGLite splices the fragment directly.

Regression tests (PGLite, no DATABASE_URL needed):

- multi-source-emotional-weight: same slug under two source_ids,
  setEmotionalWeightBatch on one of them, asserts the other survives
  untouched. Direct codex C4#3 guard.

- list-pages-regression (IRON RULE): old call shape (type, tag, limit)
  still returns updated_desc default; new sort=updated_asc reverses;
  sort=created_desc orders by created_at; sort=slug alphabetical;
  unsupported sort enum falls back to default (defense in depth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 cycle: new recompute_emotional_weight phase

Adds a 9th cycle phase between extract and embed. Sees the union of
syncPagesAffected + synthesizeWrittenSlugs for incremental mode (so
synthesize-written pages get their weight computed too — codex C2 caught
that the prior plan threaded only sync). Full mode (no incremental
anchors) walks every page; users hit this path on first upgrade via
gbrain dream --phase recompute_emotional_weight.

Phase orchestrator (cycle/recompute-emotional-weight.ts) is two SQL
round-trips total regardless of brain size:
  1. batchLoadEmotionalInputs(slugs?) → per-page tag/take inputs.
  2. computeEmotionalWeight in memory (pure function).
  3. setEmotionalWeightBatch(rows) → composite-keyed UPDATE FROM unnest.

Empty affectedSlugs short-circuits (no DB read, no write). Dry-run
computes weights and reports the would-write count without touching
the DB. Engine throw bubbles into status:fail with code
RECOMPUTE_EMOTIONAL_WEIGHT_FAIL — cycle continues to the next phase.

Plumbing:
- CyclePhase type adds 'recompute_emotional_weight'.
- ALL_PHASES + NEEDS_LOCK_PHASES include it.
- CycleReport.totals adds pages_emotional_weight_recomputed (additive,
  schema_version stays "1").
- runCycle's totals rollup + status derivation honor the new field.
- synthesize.ts emits writtenSlugs in details so cycle.ts can union
  with syncPagesAffected for incremental backfill.

Tests: 7-case unit (fake-engine), 3-case PGLite e2e (full mode + dry-
run + ALL_PHASES position), 1000-page perf budget (<5s on PGLite).

Codex C2 → A: clean separation. Phase doesn't modify runExtractCore;
runs on its own seam after the existing 8 phases plus synthesize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 ops: get_recent_salience + find_anomalies + get_recent_transcripts

Three new MCP operations + a transcripts library:

- get_recent_salience: pages ranked by emotional + activity salience.
  Subagent-allow-listed. params: days (default 14), limit (default 20,
  capped 100), slugPrefix (renamed from `kind` per codex C4#10 to
  avoid collision with PageKind/TakeKind).

- find_anomalies: cohort-level activity outliers (tag + type).
  Subagent-allow-listed. Year cohort deferred to v0.30.

- get_recent_transcripts: raw .txt transcripts from the dream-cycle
  corpus dirs. LOCAL-ONLY: rejects ctx.remote === true with
  permission_denied (codex C3). NOT in the subagent allow-list — all
  subagent calls run with remote=true, would always reject (footgun if
  visible). Cycle's synthesize phase calls discoverTranscripts
  directly, so subagents that need transcripts go through the library
  function, not the op.

Tool descriptions extracted to src/core/operations-descriptions.ts so
they're pinnable in tests and stable for the Tier-2 LLM routing eval.
Redirects on query/search/list_pages: personal/emotional questions
should reach the new ops, not semantic search. Anti-flattery hint on
query: "Do NOT assume words like crazy, notable, or big mean
impressive — they often mean difficult or emotionally charged."

list_pages gains updated_after (string ISO) and sort enum params,
surfacing the engine threading from the prior commit.

src/core/transcripts.ts: filesystem walk shared by the gated MCP op
and the (commit 5) CLI command. Reuses discoverTranscripts corpus-dir
resolution + isDreamOutput from cycle/transcript-discovery.ts. Trust
gate lives in the op handler, not the library — the library is
trusted by both the gated op and the local CLI.

Allow-list: 11 → 13 (add salience + anomalies; transcripts excluded
per codex C3, with a comment explaining why).

Tests: 21-case description pin (catches accidental edits that change
LLM-facing surface); 11-case transcripts unit covering trust gate,
mtime window, dream-output skip, summary truncation, no corpus_dir;
2-case salience type-contract smoke (full Garry-test fixture in commit
6's e2e suite).

Codex C1: routing-eval fixtures (skills/<x>/routing-eval.jsonl)
deliberately NOT shipped — routing-eval.ts is substring-match on
resolver triggers, not MCP tool routing. Real coverage lands as
test/e2e/salience-llm-routing.test.ts in commit 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CLI: gbrain salience / anomalies / transcripts

Three new CLI commands wired into src/cli.ts dispatch + CLI_ONLY set +
help text:

- gbrain salience [--days N] [--limit N] [--kind PREFIX] [--json]
- gbrain anomalies [--since YYYY-MM-DD] [--lookback-days N] [--sigma N] [--json]
- gbrain transcripts recent [--days N] [--full] [--json]

Each command file mirrors src/commands/orphans.ts shape: pure data fn
+ JSON formatter + human formatter. Calls into engine.getRecentSalience
/ findAnomalies (already shipped) and src/core/transcripts.ts.

salience and anomalies show ranked rows with per-cohort
mean/stddev/sigma. transcripts honors `--full` (caps at 100KB/file)
vs default summary (first non-empty line + ~250 chars). All three
emit JSON with --json for agent consumption.

`--kind` is accepted as a slug-prefix shorthand on `gbrain salience`
even though the underlying op param is `slugPrefix` (kept the CLI
flag short; the MCP-facing param uses the more-explicit name to
align with PageKind/TakeKind/slugPrefix vocabulary).

CLI_ONLY set in src/cli.ts gains the three new command names so
they don't get forwarded to MCP-only routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: Garry-test fixtures + Postgres parity + LLM routing eval

PGLite e2e (no DATABASE_URL needed):

- salience-pglite: the Garry test. 7 wedding-tagged pages updated today
  + 100 background pages backdated across 30 days via raw SQL UPDATE
  (codex C4#7 — engine.putPage stamps updated_at = now(), so seeding
  via the engine alone can't reproduce historical recency windows).
  Asserts wedding pages outrank random-tag noise in the 7-day window;
  slugPrefix filter narrows correctly; days=0 boundary case; limit cap.

- anomalies-pglite: same fixture shape (7 wedding pages today, 100
  background backdated). findAnomalies with sigma=3 returns the
  wedding-tag cohort with sigma_observed > 3 vs near-zero baseline;
  page_slugs sample carries the wedding pages; date with no activity
  returns []; high sigma threshold suppresses borderline cohorts
  (zero-stddev fallback stays finite — no NaN sigma).

Postgres-gated e2e:

- engine-parity-salience: PGLite ↔ Postgres parity for getRecentSalience
  and findAnomalies. Same fixture into both engines; top-result and
  cohort-set match. Closes the v0.22.0-style parity gap for the new
  v0.29 SQL idioms (EXTRACT(EPOCH ...), generate_series, CTE chain).

Tier-2 LLM routing eval (ANTHROPIC_API_KEY-gated):

- salience-llm-routing: calls Claude with v0.29 tool descriptions and
  12 personal-query phrasings ("anything crazy lately", "what's been
  going on with me", etc.). Asserts the chosen tool is in the v0.29
  set, not query() / search(). ~$0.10 per CI run on Haiku. Tests the
  ACTUAL ship criterion — replaces the discarded fake-coverage
  routing-eval.jsonl fixtures (codex C1 → B).

This is the only test that proves the description edits drive routing.
Without it, we'd ship description changes and only learn from
production behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.0: ship-prep — VERSION + CHANGELOG + CLAUDE Key Files

VERSION + package.json bump 0.28.0 → 0.29.0.

CHANGELOG.md adds a v0.29.0 release-summary in the GStack/Garry voice
plus the "To take advantage of v0.29.0" block. Headline two-liner:
"The brain tells you what's hot without being asked. Salience +
anomaly detection ship. Search rewards hypotheses; salience surfaces
them." Numbers-that-matter table covers engine surface delta, MCP op
delta, allow-list delta, cycle-phase delta, schema migration, list_pages
param surface, and test count. Itemized changes section lists the
schema migration + new cycle phase + new MCP ops + redirect
descriptions + subagent allow-list rules + new tests + a contributor
note clarifying that routing-eval.ts is not the right surface for
testing MCP tool routing (use the Tier-2 LLM eval pattern instead).

CLAUDE.md Key Files updated for the v0.29 surface:

- src/core/engine.ts: notes the 4 new methods + PageFilters.sort threading.
- src/core/migrate.ts: v34 (pages_emotional_weight) entry.
- src/core/cycle.ts: 8 → 9 phases, recompute_emotional_weight inserted
  between patterns and embed; totals.pages_emotional_weight_recomputed.
- src/core/cycle/emotional-weight.ts (NEW): formula + override path.
- src/core/cycle/anomaly.ts (NEW): stats helpers + zero-stddev fallback.
- src/core/cycle/recompute-emotional-weight.ts (NEW): phase orchestrator.
- src/core/transcripts.ts (NEW): library shared by gated MCP op + CLI.
- src/core/operations-descriptions.ts (NEW): pinned tool descriptions.
- src/core/minions/tools/brain-allowlist.ts: 11 → 13 entries; comment
  on why get_recent_transcripts is excluded.
- src/commands/salience.ts / anomalies.ts / transcripts.ts (NEW): CLI surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1 feat: recency + salience as two orthogonal options on query op (#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)
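The decay shape, sketched (coefficients beyond the two halflives are assumptions; the real applyRecencyBoost lives in the search pipeline). Hyperbolic means the boost factor is exactly halved when a page's age equals the halflife:

```typescript
// strength 0 = off, 1 = moderate (30-day halflife), 2 = aggressive (7-day).
function recencyFactor(daysOld: number, strength: 0 | 1 | 2): number {
  if (strength === 0) return 1; // boost disabled
  const halflife = strength === 1 ? 30 : 7;
  return halflife / (halflife + daysOld); // 1.0 when fresh, 0.5 at halflife
}
```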

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (planner can't use it for the
negative side). The expression index ((COALESCE(effective_date, updated_at)))
is what actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
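The precedence chain, sketched with simplified parsing (the real helper uses parseDateLoose; field and type names here mirror the commit but are placeholders):

```typescript
type EffectiveDateSource = "event_date" | "date" | "published" | "filename" | "fallback";

interface Frontmatter {
  event_date?: string;
  date?: string;
  published?: string;
}

const MIN_MS = Date.parse("1990-01-01");

// Range check: [1990-01-01, NOW + 1 year]; out-of-range drops to next element.
function inRange(ms: number, nowMs: number): boolean {
  return ms >= MIN_MS && ms <= nowMs + 365 * 86400_000;
}

function computeEffectiveDate(
  slug: string,
  fm: Frontmatter,
  filenameDate: string | undefined, // leading YYYY-MM-DD from the basename
  updatedAt: string,
  nowMs: number,
): { date: string; source: EffectiveDateSource } {
  const chain: Array<[EffectiveDateSource, string | undefined]> = [
    ["event_date", fm.event_date],
    ["date", fm.date],
    ["published", fm.published],
    ["filename", filenameDate],
  ];
  // daily/ and meetings/: filename-date jumps to position 1.
  if (slug.startsWith("daily/") || slug.startsWith("meetings/")) {
    chain.unshift(["filename", filenameDate]);
  }
  for (const [source, value] of chain) {
    if (!value) continue;
    const ms = Date.parse(value);
    // Unparseable or out-of-range values fall through to the next element.
    if (!Number.isNaN(ms) && inRange(ms, nowMs)) return { date: value, source };
  }
  return { date: updatedAt, source: "fallback" };
}
```

The `source` label is what makes the doctor check possible: "fallback" rows are the ones that look populated but carry no content date.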

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.
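The keyset loop, sketched against in-memory stand-ins (the real library persists the checkpoint under `backfill.effective_date.last_id` in the config table; names here are placeholders):

```typescript
interface PageRow { id: number; slug: string }

async function backfillBatches(
  fetchBatch: (afterId: number, limit: number) => Promise<PageRow[]>, // id > last_id ORDER BY id LIMIT n
  processRow: (row: PageRow) => Promise<void>,
  loadCheckpoint: () => Promise<number>,
  saveCheckpoint: (lastId: number) => Promise<void>,
  batchSize = 1000,
): Promise<number> {
  let lastId = await loadCheckpoint(); // 0 on a fresh run
  let total = 0;
  for (;;) {
    const batch = await fetchBatch(lastId, batchSize);
    if (batch.length === 0) break;
    for (const row of batch) await processRow(row);
    lastId = batch[batch.length - 1].id;
    await saveCheckpoint(lastId); // a killed process resumes from here
    total += batch.length;
  }
  return total;
}
```

Because each row's write is a pure function of the row, re-walking already-processed ids produces identical writes — that's the idempotence claim above.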

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.
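A condensed sketch of the CASE emission (the real buildRecencyComponentSql in sql-ranking.ts also handles the NowExpr 'fixed' branch and literal escaping; this shows only longest-prefix ordering and the evergreen short-circuit):

```typescript
interface Decay { halflifeDays: number; coefficient: number }

function buildRecencyCaseSql(
  decayMap: Record<string, Decay>,
  fallback: Decay,
  dateExpr: string, // e.g. "p.updated_at" or "COALESCE(p.effective_date, p.updated_at)"
): string {
  const age = `EXTRACT(EPOCH FROM (NOW() - ${dateExpr}))/86400.0`;
  const branch = (d: Decay): string =>
    // Evergreen short-circuit: literal 0, no date math emitted at all.
    d.halflifeDays === 0 || d.coefficient === 0
      ? "0"
      : `${d.coefficient} * ${d.halflifeDays}.0 / (${d.halflifeDays}.0 + ${age})`;
  // Longest prefix first so 'media/x/' matches before 'media/'.
  const prefixes = Object.keys(decayMap).sort((a, b) => b.length - a.length);
  const whens = prefixes
    .map((p) => `WHEN p.slug LIKE '${p}%' THEN ${branch(decayMap[p])}`)
    .join(" ");
  return `(CASE ${whens} ELSE ${branch(fallback)} END)`;
}
```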

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE longest-
prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.
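Why IS DISTINCT FROM rather than <>: plain inequality returns NULL when either side is NULL, so a pre-v0.29.1 row (emotional_weight effectively unset) would never trigger the bump. The null-safe semantics, sketched:

```typescript
// Mirrors SQL IS DISTINCT FROM: two NULLs compare equal,
// NULL vs a value compares distinct, values compare normally.
function isDistinctFrom(a: number | null, b: number | null): boolean {
  if (a === null && b === null) return false;
  if (a === null || b === null) return true;
  return a !== b;
}
```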

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
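The resolution order, sketched. The regexes below are illustrative stand-ins (the real query-intent.ts patterns are richer, e.g. "catch me up" matching with gaps); only the precedence structure mirrors the rules above:

```typescript
type Axis = "off" | "on" | "strong";

const CANONICAL = /^(who|what) is /i;
const TEMPORAL_BOUND = /\b(today|right now|this week|last \d+ days)\b/i;
const STRONG_RECENCY = /\b(today|right now|this morning|just now)\b/i;
const RECENCY_ON = /\b(latest|recent|this week|catch up|remind me|status update)\b/i;
const SALIENCE_ON = /\b(catch up|remind me|status update|what's going on|what matters)\b/i;

function classifyAxes(q: string): { salience: Axis; recency: Axis } {
  // Canonical lookups suppress both axes UNLESS an explicit temporal
  // bound also matches — the temporal bound wins.
  if (CANONICAL.test(q) && !TEMPORAL_BOUND.test(q)) {
    return { salience: "off", recency: "off" };
  }
  const recency: Axis = STRONG_RECENCY.test(q) ? "strong" : RECENCY_ON.test(q) ? "on" : "off";
  const salience: Axis = SALIENCE_ON.test(q) ? "on" : "off";
  return { salience, recency }; // default off for both: pure opt-in
}
```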

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + score) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.
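The two multipliers, sketched. One assumption flagged: the commit's shorthand reuses "score" on both sides of the salience formula; this reading takes the log argument to be the page's salience score (emotional_weight × 5 + ln(1 + take_count) per the engine method below), not the search score being boosted:

```typescript
type Strength = "off" | "on" | "strong";

function salienceMultiplier(salienceScore: number, strength: Strength): number {
  if (strength === "off") return 1;
  const k = strength === "on" ? 0.15 : 0.3;
  return 1 + k * Math.log(1 + salienceScore); // no time component: pure mattering
}

function recencyMultiplier(
  daysOld: number,
  strength: Strength,
  halflifeDays: number,
  coefficient: number,
): number {
  // Evergreen prefixes (halflifeDays = 0) are skipped: factor 1.0.
  if (strength === "off" || halflifeDays === 0 || coefficient === 0) return 1;
  const strengthMul = strength === "on" ? 1.0 : 1.5;
  return 1 + (strengthMul * coefficient * halflifeDays) / (halflifeDays + daysOld);
}
```

Because both stages multiply the fused score independently, either can run without the other — which is what lets keyword-only and embed-failure paths share the same boost surface.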

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).

Deleted: src/core/search/recency.ts (PR #618's implementation, now
replaced) + test/recency-boost.test.ts (its coverage is superseded by
query-intent.test.ts + future post-fusion tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date — page fell back to updated_at even though
                          frontmatter has parseable event_date / date /
                          published. The "wrong but populated" residual
                          that earlier review iterations missed.
  future_dated          — effective_date > NOW() + 1 year (corrupt
                          or typo'd century).
  pre_1990              — effective_date < 1990-01-01 (epoch math gone
                          wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (column doesn't
exist → 42703) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

garrytan added a commit that referenced this pull request May 8, 2026
… what's hot without being asked (#730)

* v0.29 foundation: emotional_weight column + formula + anomaly stats

Migration v34 adds pages.emotional_weight REAL DEFAULT 0.0 (column-only,
no index — salience query orders by computed score, not raw weight).
Embedded DDL (schema.sql + pglite-schema.ts + schema-embedded.ts)
mirrors the column so fresh installs don't need migration replay.

types.ts gains: PageFilters.sort enum + PAGE_SORT_SQL whitelist (engines
hardcoded ORDER BY updated_at DESC; threading lands in the next commit);
SalienceOpts/SalienceResult, AnomaliesOpts/AnomalyResult,
EmotionalWeightInputRow/EmotionalWeightWriteRow contracts.

cycle/emotional-weight.ts: pure-function score in [0..1] from tags +
takes (anglocentric default seed list; user-overridable via config key
emotional_weight.high_tags). cycle/anomaly.ts: meanStddev + cohort
threshold helpers with zero-stddev fallback (count > mean + 1) so rare
cohorts don't produce NaN sigmas.
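
A hedged sketch of that zero-stddev guard (helper names assumed; the real helpers live in cycle/anomaly.ts):

```typescript
// Population mean/stddev plus the finite fallback described above.
function meanStddev(xs: number[]): { mean: number; stddev: number } {
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  const variance = xs.reduce((a, b) => a + (b - mean) ** 2, 0) / xs.length;
  return { mean, stddev: Math.sqrt(variance) };
}

// stddev = 0 (constant history) would make sigma NaN/Infinity, so a rare
// cohort falls back to the simple "count > mean + 1" threshold instead.
function isAnomalous(count: number, mean: number, stddev: number, sigma: number): boolean {
  if (stddev === 0) return count > mean + 1;
  return (count - mean) / stddev > sigma;
}
```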

Test coverage: migrate v34 structural assertions + 14-case formula
unit + 13-case anomaly stats unit. Codex review fixes baked in:
formula clamped to [0,1]; per-take weight clamped to [0,1] before
averaging; zero-stddev fallback finite, never NaN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 engine: batch emotional-weight methods + listPages sort

BrainEngine adds 4 methods, both engines implement:

- batchLoadEmotionalInputs(slugs?): CTE-shaped read with per-table
  pre-aggregates. A page with N tags + M takes never produces N×M rows
  (codex C4#4) — page_tags + page_takes CTEs aggregate independently,
  then LEFT JOIN to pages.

- setEmotionalWeightBatch(rows): UPDATE FROM unnest($1::text[],
  $2::text[], $3::real[]) composite-keyed on (slug, source_id). Multi-
  source brains can't fan out (codex C4#3) — pages.slug is unique only
  within source_id. Same shape that v0.18 link batches use.

- getRecentSalience: time boundary computed in JS, bound as TIMESTAMPTZ.
  SQL identical across engines (codex C5/D5 — avoids dialect drift on
  $1::interval binding which has zero current uses on PGLite).

- findAnomalies: tag + type cohort baselines via generate_series-
  densified daily-count CTEs (codex C4#6). Sparse-day rare cohorts get
  correct (mean, stddev) instead of biased upward by zero-omission.
  Year cohort deferred to v0.30.

listPages threads the new PageFilters.sort enum through both engines.
Was hardcoded ORDER BY updated_at DESC; now PAGE_SORT_SQL whitelist
maps the 4 enum values to literal SQL fragments — no injection surface.
postgres.js uses sql.unsafe; PGLite splices the fragment directly.

Regression tests (PGLite, no DATABASE_URL needed):

- multi-source-emotional-weight: same slug under two source_ids,
  setEmotionalWeightBatch on one of them, asserts the other survives
  untouched. Direct codex C4#3 guard.

- list-pages-regression (IRON RULE): old call shape (type, tag, limit)
  still returns updated_desc default; new sort=updated_asc reverses;
  sort=created_desc orders by created_at; sort=slug alphabetical;
  unsupported sort enum falls back to default (defense in depth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 cycle: new recompute_emotional_weight phase

Adds a 9th cycle phase between extract and embed. Sees the union of
syncPagesAffected + synthesizeWrittenSlugs for incremental mode (so
synthesize-written pages get their weight computed too — codex C2 caught
that the prior plan threaded only sync). Full mode (no incremental
anchors) walks every page; users hit this path on first upgrade via
gbrain dream --phase recompute_emotional_weight.

Phase orchestrator (cycle/recompute-emotional-weight.ts) is two SQL
round-trips total regardless of brain size:
  1. batchLoadEmotionalInputs(slugs?) → per-page tag/take inputs.
  2. computeEmotionalWeight in memory (pure function).
  3. setEmotionalWeightBatch(rows) → composite-keyed UPDATE FROM unnest.

Empty affectedSlugs short-circuits (no DB read, no write). Dry-run
computes weights and reports the would-write count without touching
the DB. Engine throw bubbles into status:fail with code
RECOMPUTE_EMOTIONAL_WEIGHT_FAIL — cycle continues to the next phase.

Plumbing:
- CyclePhase type adds 'recompute_emotional_weight'.
- ALL_PHASES + NEEDS_LOCK_PHASES include it.
- CycleReport.totals adds pages_emotional_weight_recomputed (additive,
  schema_version stays "1").
- runCycle's totals rollup + status derivation honor the new field.
- synthesize.ts emits writtenSlugs in details so cycle.ts can union
  with syncPagesAffected for incremental backfill.

Tests: 7-case unit (fake-engine), 3-case PGLite e2e (full mode + dry-
run + ALL_PHASES position), 1000-page perf budget (<5s on PGLite).

Codex C2 → A: clean separation. Phase doesn't modify runExtractCore;
runs on its own seam after the existing 8 phases plus synthesize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 ops: get_recent_salience + find_anomalies + get_recent_transcripts

Three new MCP operations + a transcripts library:

- get_recent_salience: pages ranked by emotional + activity salience.
  Subagent-allow-listed. params: days (default 14), limit (default 20,
  capped 100), slugPrefix (renamed from `kind` per codex C4#10 to
  avoid collision with PageKind/TakeKind).

- find_anomalies: cohort-level activity outliers (tag + type).
  Subagent-allow-listed. Year cohort deferred to v0.30.

- get_recent_transcripts: raw .txt transcripts from the dream-cycle
  corpus dirs. LOCAL-ONLY: rejects ctx.remote === true with
  permission_denied (codex C3). NOT in the subagent allow-list — all
  subagent calls run with remote=true, would always reject (footgun if
  visible). Cycle's synthesize phase calls discoverTranscripts
  directly, so subagents that need transcripts go through the library
  function, not the op.

Tool descriptions extracted to src/core/operations-descriptions.ts so
they're pinnable in tests and stable for the Tier-2 LLM routing eval.
Redirects on query/search/list_pages: personal/emotional questions
should reach the new ops, not semantic search. Anti-flattery hint on
query: "Do NOT assume words like crazy, notable, or big mean
impressive — they often mean difficult or emotionally charged."

list_pages gains updated_after (string ISO) and sort enum params,
surfacing the engine threading from the prior commit.

src/core/transcripts.ts: filesystem walk shared by the gated MCP op
and the (commit 5) CLI command. Reuses discoverTranscripts corpus-dir
resolution + isDreamOutput from cycle/transcript-discovery.ts. Trust
gate lives in the op handler, not the library — the library is
trusted by both the gated op and the local CLI.

Allow-list: 11 → 13 (add salience + anomalies; transcripts excluded
per codex C3, with a comment explaining why).

Tests: 21-case description pin (catches accidental edits that change
LLM-facing surface); 11-case transcripts unit covering trust gate,
mtime window, dream-output skip, summary truncation, no corpus_dir;
2-case salience type-contract smoke (full Garry-test fixture in commit
6's e2e suite).

Codex C1: routing-eval fixtures (skills/<x>/routing-eval.jsonl)
deliberately NOT shipped — routing-eval.ts is substring-match on
resolver triggers, not MCP tool routing. Real coverage lands as
test/e2e/salience-llm-routing.test.ts in commit 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CLI: gbrain salience / anomalies / transcripts

Three new CLI commands wired into src/cli.ts dispatch + CLI_ONLY set +
help text:

- gbrain salience [--days N] [--limit N] [--kind PREFIX] [--json]
- gbrain anomalies [--since YYYY-MM-DD] [--lookback-days N] [--sigma N] [--json]
- gbrain transcripts recent [--days N] [--full] [--json]

Each command file mirrors src/commands/orphans.ts shape: pure data fn
+ JSON formatter + human formatter. Calls into engine.getRecentSalience
/ findAnomalies (already shipped) and src/core/transcripts.ts.

salience and anomalies show ranked rows with per-cohort
mean/stddev/sigma. transcripts honors `--full` (caps at 100KB/file)
vs default summary (first non-empty line + ~250 chars). All three
emit JSON with --json for agent consumption.

`--kind` is accepted as a slug-prefix shorthand on `gbrain salience`
even though the underlying op param is `slugPrefix` (kept the CLI
flag short; the MCP-facing param uses the more-explicit name to
align with PageKind/TakeKind/slugPrefix vocabulary).

CLI_ONLY set in src/cli.ts gains the three new command names so
they don't get forwarded to MCP-only routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: Garry-test fixtures + Postgres parity + LLM routing eval

PGLite e2e (no DATABASE_URL needed):

- salience-pglite: the Garry test. 7 wedding-tagged pages updated today
  + 100 background pages backdated across 30 days via raw SQL UPDATE
  (codex C4#7 — engine.putPage stamps updated_at = now(), so seeding
  via the engine alone can't reproduce historical recency windows).
  Asserts wedding pages outrank random-tag noise in the 7-day window;
  slugPrefix filter narrows correctly; days=0 boundary case; limit cap.

- anomalies-pglite: same fixture shape (7 wedding pages today, 100
  background backdated). findAnomalies with sigma=3 returns the
  wedding-tag cohort with sigma_observed > 3 vs near-zero baseline;
  page_slugs sample carries the wedding pages; date with no activity
  returns []; high sigma threshold suppresses borderline cohorts
  (zero-stddev fallback stays finite — no NaN sigma).

Postgres-gated e2e:

- engine-parity-salience: PGLite ↔ Postgres parity for getRecentSalience
  and findAnomalies. Same fixture into both engines; top-result and
  cohort-set match. Closes the v0.22.0-style parity gap for the new
  v0.29 SQL idioms (EXTRACT(EPOCH ...), generate_series, CTE chain).

Tier-2 LLM routing eval (ANTHROPIC_API_KEY-gated):

- salience-llm-routing: calls Claude with v0.29 tool descriptions and
  12 personal-query phrasings ("anything crazy lately", "what's been
  going on with me", etc.). Asserts the chosen tool is in the v0.29
  set, not query() / search(). ~$0.10 per CI run on Haiku. Tests the
  ACTUAL ship criterion — replaces the discarded fake-coverage
  routing-eval.jsonl fixtures (codex C1 → B).

This is the only test that proves the description edits drive routing.
Without it, we'd ship description changes and only learn from
production behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.0: ship-prep — VERSION + CHANGELOG + CLAUDE Key Files

VERSION + package.json bump 0.28.0 → 0.29.0.

CHANGELOG.md adds a v0.29.0 release-summary in the GStack/Garry voice
plus the "To take advantage of v0.29.0" block. Headline two-liner:
"The brain tells you what's hot without being asked. Salience +
anomaly detection ship. Search rewards hypotheses; salience surfaces
them." Numbers-that-matter table covers engine surface delta, MCP op
delta, allow-list delta, cycle-phase delta, schema migration, list_pages
param surface, and test count. Itemized changes section lists the
schema migration + new cycle phase + new MCP ops + redirect
descriptions + subagent allow-list rules + new tests + a contributor
note clarifying that routing-eval.ts is not the right surface for
testing MCP tool routing (use the Tier-2 LLM eval pattern instead).

CLAUDE.md Key Files updated for the v0.29 surface:

- src/core/engine.ts: notes the 4 new methods + PageFilters.sort threading.
- src/core/migrate.ts: v34 (pages_emotional_weight) entry.
- src/core/cycle.ts: 8 → 9 phases, recompute_emotional_weight inserted
  between patterns and embed; totals.pages_emotional_weight_recomputed.
- src/core/cycle/emotional-weight.ts (NEW): formula + override path.
- src/core/cycle/anomaly.ts (NEW): stats helpers + zero-stddev fallback.
- src/core/cycle/recompute-emotional-weight.ts (NEW): phase orchestrator.
- src/core/transcripts.ts (NEW): library shared by gated MCP op + CLI.
- src/core/operations-descriptions.ts (NEW): pinned tool descriptions.
- src/core/minions/tools/brain-allowlist.ts: 11 → 13 entries; comment
  on why get_recent_transcripts is excluded.
- src/commands/salience.ts / anomalies.ts / transcripts.ts (NEW): CLI surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1 feat: recency + salience as two orthogonal options on query op (#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (planner can't use it for the
negative side). The expression index ((COALESCE(effective_date, updated_at)))
is what actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally a NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
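
A loose sketch of that precedence chain with the range gate — frontmatter field names come from this commit, but the helper shape is an assumption, not the real computeEffectiveDate:

```typescript
type Frontmatter = { event_date?: string; date?: string; published?: string };

const MIN = Date.parse('1990-01-01');
const inRange = (ms: number, nowMs: number): boolean =>
  ms >= MIN && ms <= nowMs + 365 * 86400_000;  // [1990-01-01, NOW + 1y]

function effectiveDate(
  fm: Frontmatter,
  filename: string | null,   // e.g. "2024-03-15-standup"
  slug: string,
  updatedAtMs: number,
  nowMs: number,
): { date: number; source: string } {
  const filenameDate = filename?.match(/^(\d{4}-\d{2}-\d{2})/)?.[1];
  // daily/ and meetings/ promote the filename date to position 1.
  const chain: Array<[string, string | undefined]> =
    /^(daily|meetings)\//.test(slug)
      ? [['filename', filenameDate], ['event_date', fm.event_date],
         ['date', fm.date], ['published', fm.published]]
      : [['event_date', fm.event_date], ['date', fm.date],
         ['published', fm.published], ['filename', filenameDate]];
  for (const [source, raw] of chain) {
    if (!raw) continue;
    const ms = Date.parse(raw);
    if (!Number.isNaN(ms) && inRange(ms, nowMs)) return { date: ms, source };
  }
  return { date: updatedAtMs, source: 'fallback' };  // updated_at fallback
}
```

Out-of-range or unparseable values fall through to the next chain element, so a typo'd 1980 date lands on 'fallback' rather than poisoning recency math.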

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.
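
A synchronous sketch of the keyset-paginated, checkpointed walk (the real library is async SQL; the fetch/process/checkpoint names here are illustrative):

```typescript
type Row = { id: number };

function backfillWalk(
  fetchBatch: (afterId: number, limit: number) => Row[],  // id > $1 ORDER BY id LIMIT $2
  processRow: (row: Row) => void,
  checkpoint: { lastId: number },  // persisted as backfill.effective_date.last_id
  batchSize = 1000,
): number {
  let processed = 0;
  for (;;) {
    const rows = fetchBatch(checkpoint.lastId, batchSize);
    if (rows.length === 0) break;                  // nothing past the checkpoint
    for (const row of rows) processRow(row);       // computeEffectiveDate + UPDATE
    checkpoint.lastId = rows[rows.length - 1].id;  // a killed run resumes here
    processed += rows.length;
  }
  return processed;
}
```

Because the checkpoint advances only after a batch is written, re-running after a kill repeats at most one batch and produces identical writes.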

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.
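
A hypothetical re-sketch of the documented triple format and its refuse-loud behavior (the real parser lives in src/core/search/recency-decay.ts):

```typescript
// e.g. GBRAIN_RECENCY_DECAY="daily/:14:1.5,concepts/:0:0"
class RecencyDecayParseError extends Error {}

type Decay = { halflifeDays: number; coefficient: number };

function parseDecayTriples(raw: string): Map<string, Decay> {
  const map = new Map<string, Decay>();
  for (const triple of raw.split(',')) {
    const parts = triple.split(':');  // prefix:halflifeDays:coefficient
    if (parts.length !== 3) throw new RecencyDecayParseError(triple);
    const [prefix, h, c] = parts;
    const halflifeDays = Number(h);
    const coefficient = Number(c);
    if (!prefix || Number.isNaN(halflifeDays) || Number.isNaN(coefficient)) {
      throw new RecencyDecayParseError(triple);  // refuse LOUD, no silent fallback
    }
    map.set(prefix, { halflifeDays, coefficient });
  }
  return map;
}
```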

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE longest-
prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
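The resolution rules above can be sketched as a single-pass classifier. This is a compressed illustration, not the shipped query-intent.ts — the pattern lists are abbreviated stand-ins for the real regexes (the real "catch me up" pattern, for instance, tolerates up to 15 chars between the words):

```typescript
type Axis = 'off' | 'on' | 'strong';

// Abbreviated stand-ins for the real classifier's pattern sets.
const CANONICAL = /\b(who is|what is)\b/i;
const TEMPORAL_BOUND = /\b(today|right now|this week|last \d+ days)\b/i;
const STRONG_RECENCY = /\b(today|right now|this morning|just now)\b/i;
const RECENCY_ON = /\b(latest|recent|this week|catch up|remind me|status update)\b/i;
const SALIENCE_ON = /\b(catch up|remind me|status update|what matters)\b/i;

function suggestAxes(q: string): { salience: Axis; recency: Axis } {
  // Canonical lookups want stable truth — both axes off —
  // unless an explicit temporal bound overrides.
  if (CANONICAL.test(q) && !TEMPORAL_BOUND.test(q)) {
    return { salience: 'off', recency: 'off' };
  }
  const recency: Axis = STRONG_RECENCY.test(q) ? 'strong'
    : RECENCY_ON.test(q) ? 'on' : 'off';
  const salience: Axis = SALIENCE_ON.test(q) ? 'on' : 'off';
  return { salience, recency }; // default: pure opt-in, both off
}
```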

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + score) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.
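The two boost formulas, isolated as pure functions (a sketch of the math stated above; the k and strengthMul constants come from the commit, while halflife/coefficient values per prefix live in the decay map and are illustrative here):

```typescript
// Pure mattering signal — no time term.
function salienceFactor(score: number, strength: 'on' | 'strong'): number {
  const k = strength === 'strong' ? 0.30 : 0.15;
  return 1 + k * Math.log(1 + score);
}

// Pure recency signal — no mattering term. halflifeDays=0 marks an
// evergreen prefix, which is skipped entirely (factor 1.0).
function recencyFactor(
  daysOld: number,
  strength: 'on' | 'strong',
  halflifeDays: number,
  coefficient: number,
): number {
  if (halflifeDays === 0) return 1.0;
  const strengthMul = strength === 'strong' ? 1.5 : 1.0;
  return 1 + strengthMul * coefficient * halflifeDays / (halflifeDays + daysOld);
}
```

Both factors are multiplicative on the fused score, which is what lets the stages compose in either order.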

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.
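The composite-key shape and the salience formula, as an in-memory analogue (hypothetical helper names; the real methods run this as SQL against the pages table):

```typescript
type PageRef = { slug: string; source_id: string };

// Composite key format from the commit: `${source_id}::${slug}`,
// so same-slug pages from different sources stay isolated.
const key = (r: PageRef) => `${r.source_id}::${r.slug}`;

// emotional_weight × 5 + ln(1 + take_count)
function salienceScore(emotionalWeight: number, takeCount: number): number {
  return emotionalWeight * 5 + Math.log(1 + takeCount);
}

// In-memory analogue of getSalienceScores's Map<key, number> return shape.
function scoreMap(
  rows: Array<PageRef & { emotional_weight: number; take_count: number }>,
): Map<string, number> {
  const m = new Map<string, number>();
  for (const r of rows) m.set(key(r), salienceScore(r.emotional_weight, r.take_count));
  return m;
}
```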

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).
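That precedence chain, sketched for the recency axis (illustrative function, not the shipped resolver; the salience axis resolves the same way minus the legacy alias):

```typescript
type Axis = 'off' | 'on' | 'strong';

// caller-explicit > legacy recencyBoost alias (0|1|2) > heuristic suggestion
function resolveRecency(
  explicit: Axis | undefined,
  legacyBoost: 0 | 1 | 2 | undefined,
  suggested: Axis,
): Axis {
  if (explicit !== undefined) return explicit;
  if (legacyBoost !== undefined) return (['off', 'on', 'strong'] as const)[legacyBoost];
  return suggested;
}
```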

Deleted: src/core/search/recency.ts (introduced in PR #618, now superseded)
and test/recency-boost.test.ts (its coverage moves to query-intent.test.ts
plus future post-fusion tests).


Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.
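The COALESCE precedence is the whole point of the filter — content date wins over mechanical timestamps. An in-memory sketch of the same fallback chain (types and names are illustrative):

```typescript
type PageDates = {
  effective_date: Date | null; // content date (meeting's event_date, etc.)
  updated_at: Date | null;
  created_at: Date;
};

// COALESCE(effective_date, updated_at, created_at)
function effectiveDate(p: PageDates): Date {
  return p.effective_date ?? p.updated_at ?? p.created_at;
}

function inRange(p: PageDates, since?: Date, until?: Date): boolean {
  const d = effectiveDate(p);
  if (since && d < since) return false;
  if (until && d > until) return false;
  return true;
}
```

A meeting page reimported yesterday but held on 2020-06-01 filters as 2020, which is what a `since: "2025-01-01"` caller expects.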

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date  — page fell back to updated_at even though
                           frontmatter has parseable event_date / date /
                           published. The "wrong but populated" residual
                           that earlier review iterations missed.
  future_dated            — effective_date > NOW() + 1 year (corrupt
                            or typo'd century).
  pre_1990                — effective_date < 1990-01-01 (epoch math gone
                            wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.
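The three pathology classes reduce to a small classifier. This is an illustrative sketch, assuming a sampled row shape with the effective_date_source sentinel and a precomputed frontmatter-parseability flag (both field names hypothetical):

```typescript
type SampledRow = {
  effective_date: Date;
  effective_date_source: 'frontmatter' | 'fallback';
  hasParseableFrontmatterDate: boolean; // event_date / date / published parses
};

function classifyEffectiveDate(row: SampledRow, now: Date): string | null {
  const oneYearAhead = new Date(now.getTime() + 365 * 24 * 3600 * 1000);
  if (row.effective_date > oneYearAhead) return 'future_dated';     // typo'd century
  if (row.effective_date < new Date('1990-01-01')) return 'pre_1990'; // epoch math gone wrong
  if (row.effective_date_source === 'fallback' && row.hasParseableFrontmatterDate) {
    return 'fallback_with_fm_date'; // "wrong but populated" residual
  }
  return null; // healthy
}
```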

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (column doesn't
exist → 42703) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 master-rebase fixups: renumber + drift cleanup

- v0.29.1 migrations renumber v38/v39 → v41/v42 (master shipped takes_table at
  v37 + access_tokens_permissions at v38; v0.27.1 took v39). My v0.29.0
  emotional_weight slots in at v40; v0.29.1's pages_recency_columns lands at
  v41 and eval_candidates_recency_capture at v42.
- src/core/utils.ts comment refs updated v37 → v40 (emotional_weight) and
  v38 → v41 (effective_date/etc).
- test/brain-allowlist.test.ts: size assertion 11 → 13 + the new
  get_recent_salience / find_anomalies positive checks + the explicit
  get_recent_transcripts negative check (v0.29 added the salience pair to
  the allow-list; transcripts are deliberately excluded because all
  subagent calls have remote=true and the v0.29 trust gate rejects them —
  visibility would be a footgun).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixups: privacy allow-list + cycle phase count + migration plan

Three CI test failures on PR #730, all caused by master-side state the
v0.29 cherry-picks didn't yet account for:

1. scripts/check-privacy.sh allow-lists test/recency-decay.test.ts
   The v0.29.1 recency-decay test asserts that DEFAULT_RECENCY_DECAY's
   keys do NOT include fork-specific path prefixes. Because the assertion
   has to name the banned tokens to assert their absence, the privacy
   guard flagged the literal occurrence. Same exception class as
   CHANGELOG.md, CLAUDE.md, and scripts/check-privacy.sh itself —
   meta-rule enforcement requires mentioning what the rule forbids.

2. test/core/cycle.serial.test.ts: 9 → 10 phases.
   The yieldBetweenPhases test was written for v0.26.5 (9 phases incl.
   purge). v0.29 added a 10th phase (recompute_emotional_weight)
   between patterns and embed; the test's expected hookCalls and
   report.phases.length needed bumping.

3. test/apply-migrations.test.ts: append '0.29.1' to skippedFuture lists.
   v0.29.1 added a new entry to src/commands/migrations/index.ts; the
   buildPlan test snapshots the exact ordered list of versions, so it
   needs the new entry in both the fresh-install case and the Codex H9
   regression case.

All three verified locally:
  - bash scripts/check-privacy.sh → exit 0
  - bun test test/apply-migrations.test.ts → 18/18 pass
  - bun test test/core/cycle.serial.test.ts → 28/28 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixup: regenerate llms-full.txt to match CLAUDE.md state

build-llms test asserts the committed llms.txt + llms-full.txt match
what the generator produces from the current source tree. CLAUDE.md
got new v0.29 Key Files entries (recompute_emotional_weight phase,
emotional-weight formula, anomaly stats, transcripts library, salience
ops, etc.) without a corresponding regen. `bun run build:llms` brings
llms-full.txt back in sync; llms.txt is byte-for-byte identical so
only the larger inline bundle changed.

Verified locally: bun test test/build-llms.test.ts → 7/7 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: cover tool-surfaces + MCP dispatch path

Two gaps were uncovered when reviewing v0.29 coverage against the new
contracts the cherry-picks landed onto master.

1. test/v0_29-tool-surfaces.test.ts (unit, 9 cases)

   Existing tests pin the description constants module and the
   BRAIN_TOOL_ALLOWLIST set membership, but nothing checked the two
   filters that ACT on those constants:

   - serve-http.ts:745 filters operations by !op.localOnly to build the
     HTTP MCP tool list. Without a test, anyone removing `localOnly: true`
     from get_recent_transcripts would silently expose it to remote
     callers — defense-in-depth on top of the in-handler ctx.remote check
     would be the only guard. Now pinned: get_recent_transcripts is
     hidden, salience + anomalies stay visible.

   - buildBrainTools surfaces the v0.29 ops as `brain_get_recent_salience`
     and `brain_find_anomalies`, and EXCLUDES `brain_get_recent_transcripts`
     (codex C3 footgun gate — all subagent calls are remote=true, the op
     would always reject). Now pinned.

   Both filters are pure functions; no DB / engine.connect needed.

2. test/e2e/v0_29-mcp-dispatch-pglite.test.ts (e2e, 5 cases)

   Existing v0.29 e2e tests call engine methods directly. None went
   through the full dispatchToolCall pipeline that stdio MCP and HTTP
   MCP both use. The new file covers:

   - get_recent_salience returns ranked rows via dispatch (top result
     is the wedding-tagged page from the seeded fixture).
   - find_anomalies returns the AnomalyResult shape via dispatch.
   - get_recent_transcripts rejects with permission_denied when
     ctx.remote === true (the in-handler trust gate is the last line if
     localOnly ever drops).
   - get_recent_transcripts succeeds with ctx.remote === false (CLI
     path) and returns [] when no corpus dir is configured.
   - Unknown tool name returns the standard isError + "Unknown tool"
     envelope (regression guard for dispatch shape).

Verified locally — all 14 cases pass:
  bun test test/v0_29-tool-surfaces.test.ts                          → 9 pass
  bun test test/e2e/v0_29-mcp-dispatch-pglite.test.ts                → 5 pass

Re-ran the full v0.29 PGLite e2e suite to confirm no regressions:
  salience-pglite.test.ts                       5 pass
  anomalies-pglite.test.ts                      4 pass
  cycle-recompute-emotional-weight-pglite.test  3 pass
  list-pages-regression.test.ts                 6 pass
  multi-source-emotional-weight-pglite.test     4 pass
  backfill-perf-pglite.test.ts                  1 pass
  v0_29-mcp-dispatch-pglite.test.ts             5 pass
  -----
  Total: 28 pass / 0 fail
  Postgres parity test (DATABASE_URL gated)     7 skip (correct)
  LLM routing eval (ANTHROPIC_API_KEY gated)   12 skip (correct)
  bun run typecheck                             clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixup: drop unused PGLiteEngine in tool-surfaces test

scripts/check-test-isolation.sh's R3 + R4 lints flagged the new
test/v0_29-tool-surfaces.test.ts for instantiating PGLiteEngine outside
a beforeAll() block (R3) and lacking the matching afterAll(disconnect)
(R4). The intent of those rules is to prevent engine leaks across the
shard process — every PGLiteEngine must follow the canonical
beforeAll(connect+initSchema) / afterAll(disconnect) pattern.

The fix here is upstream of the rule, not a workaround: this test never
needed an engine. buildBrainTools doesn't issue any SQL at registry-build
time — it only reads `engine.kind` for the put_page namespace-wrap
branch. A `{ kind: 'pglite' } as unknown as BrainEngine` fake-engine
literal keeps the test a pure-function test: no WASM cold-start, no
connect lifecycle, no test-isolation rule fired.

Verified locally:
  bash scripts/check-test-isolation.sh → OK (257 non-serial unit files)
  bun test test/v0_29-tool-surfaces.test.ts → 9 pass
  bun run typecheck → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Wintermute <wintermute@garrytan.com>
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 10, 2026
… (EXP-5)

Reproducible cross-modal quality eval for the takes layer. Three frontier
models score a sample against the 5-dim rubric, the runner aggregates to
PASS/FAIL/INCONCLUSIVE, the receipt persists to eval_takes_quality_runs.
Trend mode segregates by rubric_version; regress mode is a CI gate that
exits 1 when any dim regresses past --threshold.

Subcommands:
  run     [--limit N --cycles N --budget-usd N --slug-prefix P --models a,b,c]
  replay  <receipt-path> [--json]                 # NO BRAIN required
  trend   [--limit N --rubric-version V --json]
  regress --against <receipt> [--threshold T --json]

Codex review integrations (D7 — all 10 findings landed):

  garrytan#1 json-repair shim re-exports BOTH parseModelJSON AND the
     ParsedScore + ParsedModelResult types. The original plan only
     re-exported the function, which would have compile-broken
     cross-modal-eval/aggregate.ts:19's type import.

  garrytan#3 Receipt name binds (corpus_sha8, prompt_sha8, models_sha8,
     rubric_sha8) so a future rubric tweak segregates trend rows
     instead of silently corrupting the quality-over-time graph.
     RUBRIC_VERSION + rubric_sha8 are persisted in every receipt.

  garrytan#4 Pricing fail-closed: any model not in pricing.ts produces an
     actionable PricingNotFoundError before any HTTP call fires.
     Same drift problem as cross-modal-eval/runner.ts:estimateCost(),
     but explicit instead of silent zero.

  garrytan#5 Aggregate requires ALL 5 declared rubric dimensions per model.
     Cross-modal-eval v1's union-of-whatever-parsed pattern allowed a
     model to omit a dim and still PASS — that's a regression-gate
     hole. Now: missing-dim drops the contribution, treated identically
     to a parse failure. Empty-scores PASS regression guard preserved.

  garrytan#6 DB-authoritative receipt persistence. Original two-phase plan had
     a split-brain reconciliation gap (disk-success/DB-fail vanishes
     from trend; DB-success/disk-fail unreplayable). Now DB row is the
     source of truth (carries full receipt JSON in a JSONB column);
     disk artifact is best-effort. replay reads disk first; loadReceiptFromDb
     reconstructs from DB when the disk file is missing.

  garrytan#10 Brain-routing: replay is the only sub-subcommand that doesn't
      need a brain. cli.ts no-DB bypass routes "eval takes-quality replay"
      directly to runReplayNoBrain, which exits 0/1/2 cleanly without
      ever touching the engine. Other modes go through connectEngine.
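The finding-#5 rule above ("all declared dims or the contribution drops") is the load-bearing change to the aggregate. A sketch of that gate, with placeholder dimension names (the real rubric's 5 dims are defined in rubric.ts):

```typescript
const RUBRIC_DIMS = ['d1', 'd2', 'd3', 'd4', 'd5'] as const; // placeholders

type ModelScores = Record<string, number>; // dim → score

// A model's scores count only if EVERY declared dim is present;
// a missing dim is treated identically to a parse failure.
function usableContributions(
  perModel: Map<string, ModelScores>,
): Map<string, ModelScores> {
  const out = new Map<string, ModelScores>();
  for (const [model, scores] of perModel) {
    if (RUBRIC_DIMS.every((d) => typeof scores[d] === 'number')) {
      out.set(model, scores);
    }
  }
  return out;
}
```

Under the old union-of-whatever-parsed pattern, the 4-dim model below would still have counted toward PASS; now it drops.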

Files added:
  src/core/eval-shared/json-repair.ts (hoisted from cross-modal-eval)
  src/core/takes-quality-eval/{rubric,pricing,aggregate,receipt-name,
                                receipt-write,receipt,replay,regress,trend,runner}.ts
  src/commands/eval-takes-quality.ts
  docs/eval-takes-quality.md (stable schema_version: 1 contract)
  10 test files (83 cases — aggregate / receipt-name / shim / pricing /
                 rubric / receipt-write / replay / trend / regress / cli)

Files modified:
  src/cli.ts: replay no-DB bypass + engine-required dispatch
  src/core/cross-modal-eval/json-repair.ts → re-export shim
  src/core/migrate.ts: append v47 (eval_takes_quality_runs table)
  src/core/pglite-schema.ts + src/schema.sql: mirror the v47 table for
    fresh-install path. RLS toggled on the new table.
  src/core/schema-embedded.ts: regenerated via build:schema
  test/migrate.test.ts: 6 structural cases for v47

186 tests pass; typecheck clean. Replay verified working end-to-end
(reads receipt JSON file without DATABASE_URL, exits with the verdict
code, prints actionable error on missing file).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>