
feat: OpenClaw plugin — Phase 1 core search#5

Closed
garrytan wants to merge 2 commits into master from wintermute/openclaw-plugin-support

Conversation

@garrytan
Owner

@garrytan garrytan commented Apr 7, 2026

OpenClaw plugin that makes GBrain semantically searchable via native agent tools. Lives in openclaw-plugin/ alongside the main Postgres-based GBrain product.

Tools:

  • gbrain_query: semantic search with scope filtering and compiled truth awareness
  • gbrain_resolve: entity resolution via exact match, aliases, and embedding similarity

Infrastructure:

  • Smart chunking: frontmatter+summary (high weight), compiled truth body, optional timeline
  • Voyage API embeddings (1024-dim) with batch support
  • SQLite storage with brute-force cosine similarity (portable, zero deps)
  • Git-based incremental sync (polls HEAD every 30s)
  • Background watcher service via api.registerService()
  • CLI: openclaw gbrain status/reindex/query/resolve

Stack: TypeScript ESM, better-sqlite3, gray-matter, @sinclair/typebox
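
The brute-force cosine-similarity search mentioned above can be sketched in a few lines of plain TypeScript. This is an illustrative stand-in, not the plugin's actual code — `rankBySimilarity` and the `EmbeddedChunk` shape are hypothetical names, and the real implementation reads vectors out of SQLite rather than an in-memory array:

```typescript
// Brute-force cosine search over embedded chunks (illustrative sketch).
type EmbeddedChunk = { id: number; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function rankBySimilarity(query: number[], chunks: EmbeddedChunk[], topK: number) {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, topK);
}
```

At 1024 dimensions a linear scan stays fast enough for a personal knowledge base, which is what makes the zero-dependency SQLite approach viable here.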

root added 2 commits April 7, 2026 15:29
Phase 2 (Relationships + Temporal):
- gbrain_graph: traverse entity relationships with depth control
- gbrain_timeline: temporal queries via git log + timeline entry parsing
- gbrain_ingest: create/update brain pages with auto-reindex

Phase 3 (Intelligence):
- gbrain_contradictions: detect numeric/factual conflicts across sources
- gbrain_confidence: score claims by source count, recency, corroboration

Also adds edge traversal methods to store.ts and registers all 7 tools
in index.ts. 1,153 new lines, compiles clean.
garrytan added a commit that referenced this pull request Apr 10, 2026
- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@garrytan garrytan mentioned this pull request Apr 10, 2026
garrytan added a commit that referenced this pull request Apr 11, 2026
* fix: 7 bug fixes from Issue #9 and #22

- fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25)
- fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11)
- fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8)
- fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12)
- fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2)
- fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1)
- fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4)
- schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support
- schema: add access_tokens and mcp_request_log tables via migration
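
The DISTINCT ON fix for Bug #8 (best chunk per page, not all chunks) boils down to a group-by-page/keep-max reduction. A pure-TypeScript analogue, with an illustrative row shape rather than the real query:

```typescript
// Keep only the best-scoring chunk per page (DISTINCT ON analogue, sketch).
type ChunkHit = { pageId: number; chunkIndex: number; score: number };

function bestChunkPerPage(hits: ChunkHit[]): ChunkHit[] {
  const best = new Map<number, ChunkHit>();
  for (const hit of hits) {
    const current = best.get(hit.pageId);
    if (!current || hit.score > current.score) best.set(hit.pageId, hit);
  }
  // One winner per page, ranked by score.
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```

In SQL the same shape is `SELECT DISTINCT ON (page_id) ... ORDER BY page_id, score DESC`, pushing the reduction into the database instead of application code.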

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed schema.sql at build time, remove fs dependency from initSchema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 5 more bug fixes from Issue #22

- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: remote MCP server via Supabase Edge Functions

Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase
instance. One brain, accessible from Claude Desktop, Claude Code, Cowork,
Perplexity Computer, and any MCP client. Zero new infrastructure.

New files:
- supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK
- supabase/functions/gbrain-mcp/deno.json — Deno import map
- src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules)
- src/commands/auth.ts — standalone token management (create/list/revoke/test)
- scripts/deploy-remote.sh — one-script deployment
- .env.production.example — 3-value config template

Changes:
- config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope)
- schema.sql: add access_tokens + mcp_request_log tables
- package.json: add build:edge script

Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable)
Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP)
Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks)
Excluded from remote: sync_brain, file_upload (may exceed 60s timeout)

Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: per-client MCP setup guides

- docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table
- docs/mcp/CLAUDE_CODE.md — claude mcp add command
- docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!)
- docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths
- docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup
- docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO)
- docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.0)

GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Remote MCP Server section to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: make document-release mandatory in CLAUDE.md, add MCP key files

Post-ship requirements section: document-release is NOT optional. Lists every
file that must be checked on every ship. A ship without updated docs is incomplete.

Also adds remote MCP server files to Key files section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: batch upsertChunks into single statement to prevent deadlocks

The per-chunk UPSERT loop caused deadlocks under parallel workers because
each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple
workers upserting different pages could deadlock on the shared unique index.

Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement.
One round-trip, one lock acquisition. COALESCE preserves existing embeddings
when the new value is NULL.
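
The COALESCE-preserving upsert can be modeled in memory: apply all chunks in one pass, and fall back to the stored embedding when the incoming value is null. This is a sketch with illustrative types — the real fix is a single multi-row `INSERT ... ON CONFLICT` statement in Postgres:

```typescript
// In-memory analogue of the batched UPSERT with COALESCE fallback (sketch).
type Chunk = {
  pageId: number;
  chunkIndex: number;
  text: string;
  embedding: number[] | null;
};

function upsertChunks(store: Map<string, Chunk>, incoming: Chunk[]): void {
  for (const chunk of incoming) {
    const key = `${chunk.pageId}:${chunk.chunkIndex}`; // mirrors UNIQUE (page_id, chunk_index)
    const existing = store.get(key);
    store.set(key, {
      ...chunk,
      // COALESCE(new.embedding, old.embedding): never clobber with NULL.
      embedding: chunk.embedding ?? existing?.embedding ?? null,
    });
  }
}
```

The batching matters for locks: one statement acquires its row locks in one go, so parallel workers can't interleave acquisitions and deadlock.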

Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: advisory lock in initSchema() prevents deadlock on concurrent DDL

When multiple processes call initSchema() concurrently (e.g., test setup +
CLI subprocess, or parallel workers during E2E tests), the schema SQL's
DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on
different tables, causing deadlocks.

Fix: pg_advisory_lock(42) serializes all initSchema() calls within the
same database. The lock is session-scoped and released in a finally block.
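
The acquire/run/release shape of the fix can be sketched as a promise-chain lock. Note the difference in scope: `pg_advisory_lock(42)` serializes callers across processes via Postgres, while this single-process analogue (hypothetical `withSchemaLock` name) only shows the pattern of always releasing, even on error:

```typescript
// Single-process analogue of serializing initSchema() behind one lock (sketch).
let schemaLock: Promise<void> = Promise.resolve();

function withSchemaLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = schemaLock.then(fn);
  // Keep the chain alive whether fn resolves or rejects — the
  // finally-block release in the real code plays the same role.
  schemaLock = run.then(() => undefined, () => undefined);
  return run;
}
```

With every schema-touching caller funneled through one lock, the DROP TRIGGER / CREATE TRIGGER statements can no longer interleave across sessions.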

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add explicit test timeouts for CLI subprocess E2E tests

CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import)
spawn `bun run src/cli.ts` which takes several seconds to JIT compile +
connect. The Bun test framework default 5000ms per-test timeout is too
tight for CI. Added 30-60s timeouts matching each subprocess's own
timeout to prevent false failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: infinite recursion in config.ts exported getConfigDir/getConfigPath

The replace_all refactor created recursive functions: the exported
getConfigDir() called the private getConfigDir() which called itself.
Renamed exports to configDir()/configPath() to avoid shadowing.

Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls
work against a real Postgres database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@garrytan
Owner Author

Thank you for the early work on this! The architecture has evolved significantly since this was opened — v0.3.0 introduced the contract-first architecture with openclaw.plugin.json and the plugin bundle system, which supersedes the approach here. Appreciate the contribution and the thinking that went into it!

@garrytan garrytan closed this Apr 11, 2026
garrytan added a commit that referenced this pull request Apr 18, 2026
…ot exit

Codex architecture finding #5: reusing CLI entry-point functions as Minions
handler bodies is wrong. If a Minion invokes runExtract / runEmbed /
runBacklinks / runLint and the handler hits a process.exit(1), the ENTIRE
WORKER process dies — killing every other in-flight job. Handlers need
library-level APIs that throw, and the CLI stays a thin wrapper that
catches + exits.

Per-command shape:
  - runXxxCore(opts): throws on validation errors, returns structured
    result. Handler-safe.
  - runXxx(args): arg parser; calls Core; catches; process.exit(1) on
    thrown errors. CLI-safe.

Shipped:
  - runExtractCore({ mode, dir, dryRun?, jsonMode? }) → ExtractResult
  - runEmbedCore({ slug? | slugs? | all? | stale? }) → void
  - runBacklinksCore({ action, dir, dryRun? }) → BacklinksResult
  - runLintCore({ target, fix?, dryRun? }) → LintResult

sync.ts is already correct — performSync throws; runSync wraps. No change.
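
The Core/CLI split described above can be sketched like this. The `runLintCore`/`runLint` names mirror the commit; the bodies are illustrative, not the shipped implementations:

```typescript
// Core throws (handler-safe); CLI wrapper catches and exits (CLI-safe). Sketch.
type LintResult = { checked: number; fixed: number };

function runLintCore(opts: { target: string; fix?: boolean }): LintResult {
  // Handler-safe: validation errors throw, never kill the process.
  if (!opts.target) throw new Error("lint: missing target");
  return { checked: 1, fixed: opts.fix ? 1 : 0 };
}

function runLint(args: string[]): void {
  try {
    const result = runLintCore({ target: args[0], fix: args.includes("--fix") });
    console.log(`checked ${result.checked}, fixed ${result.fixed}`);
  } catch (err) {
    console.error(String(err));
    process.exit(1); // acceptable here; fatal if it ran inside a worker
  }
}
```

A Minion handler calls `runLintCore` directly and lets the worker's own catch/retry machinery handle the thrown error.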

import.ts deferred to v0.12.0 (its one process.exit fires only on a
missing dir arg; handlers always pass a dir, so worker-kill risk is
zero in practice). Noted in the plan's Out-of-scope.

Smoke verified: all four Core functions throw on invalid mode / missing
dir / not-found target instead of exiting the process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 18, 2026
…hutdown

Autopilot now dispatches each cycle as a single `autopilot-cycle` Minion
job (with idempotency_key on the cycle slot) instead of running steps
inline. A forked `gbrain jobs work` child drains the queue durably,
supervised by autopilot. The user runs ONE install step
(`gbrain autopilot --install`) and gets sync + extract + embed + backlinks
+ durable job processing, with no separate worker daemon to manage.

Mode selection:
  - minion_mode=always OR pain_triggered (default), engine=postgres →
    Minions dispatch. Spawn child, submit autopilot-cycle each interval.
  - minion_mode=off, OR engine=pglite, OR `--inline` flag → run steps
    inline in-process, same as pre-v0.11.1. PGLite has an exclusive file
    lock that blocks a second worker process, so the inline path is the
    only path that works there.

Worker supervision:
  - spawn(resolveGbrainCliPath(), ['jobs', 'work'], { stdio: 'inherit' }).
    stdio:'inherit' avoids pipe-buffer blocking (Codex architecture #2).
  - On worker exit: 10s backoff + restart. Crash counter caps at 5 →
    autopilot stops with a clear error.
  - resolveGbrainCliPath() prefers argv[1] (cli.ts / /gbrain), then
    process.execPath (compiled binary suffix check), then `which gbrain`
    (installed to $PATH). NEVER blindly uses process.execPath, which on
    source installs is the Bun runtime, not `gbrain` (Codex architecture
    #1).

Shutdown:
  - Async SIGTERM/SIGINT handler: sends SIGTERM to worker, awaits its
    exit for up to 35s (the worker's own drain is 30s; we add buffer for
    signal-delivery latency), then SIGKILL if still alive.
  - Drops the old `process.on('exit')` lock-cleanup handler — its
    callback runs synchronously and can't wait for the worker drain.
    Lock file cleanup moved inside the async shutdown.

Lock-file mtime refresh every cycle (Codex C) so a long-lived autopilot
doesn't get declared "stale" by the next cron-fired invocation after 10
minutes.

Inline fallback path calls the new Core fns (runExtractCore, runEmbedCore)
instead of the CLI wrappers. That way a bad arg from inside the loop
can't process.exit() the autopilot itself (matches Codex #5).

test/autopilot-resolve-cli.test.ts: 3 tests covering argv[1]-as-gbrain,
argv[1]-as-cli.ts, and graceful error when no path resolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 18, 2026
…LOG + version bump

scripts/fix-v0.11.0.sh — the paste-command for broken-v0.11.0 installs.
Released on the v0.11.1 tag so:
  curl -fsSL https://raw.githubusercontent.com/garrytan/gbrain/v0.11.1/scripts/fix-v0.11.0.sh | bash
always works (master branch could be renamed). 8 steps: schema apply,
smoke, mode prompt (non-TTY defaults pain_triggered), atomic write of
preferences.json (0o600), append completed.jsonl with status:"partial"
and apply_migrations_pending:true so the v0.11.1 apply-migrations run
resumes correctly (does NOT poison the permanent migration path —
Codex H2 avoidance), AGENTS.md + cron/jobs.json detection with guidance
printed as text only (never auto-edits from a curl-piped script), and a
closing line telling the user to run `gbrain autopilot --install` as the
one-stop finisher.

CLAUDE.md — new "Migration is canonical, not advisory" section pinning
the design principle. Any host-repo change (AGENTS.md, cron manifests,
launchctl units) is GBrain's responsibility via the migration; the
exception is host-specific handler registration, which goes via the
code-level plugin contract in docs/guides/plugin-handlers.md.

README.md — new sections:
  - "v0.11.0 migration didn't fire on your upgrade?" with both repair
    paths (v0.11.1 binary and pre-v0.11.1 stopgap).
  - "Skillify + check-resolvable: user-controllable auto-skill-creation"
    explaining why the user-controlled pair beats Hermes-style auto
    generation. Includes the scripts/skillify-check.ts invocation.

CHANGELOG.md — v0.11.1 entry (per CLAUDE.md voice: lead with what the
user can now do that they couldn't before; frame as benefits, not files
changed). Covers: mega-bug fix + apply-migrations + postinstall +
stopgap, autopilot-supervises-worker + single-install-step + env-aware
targets, Core fn extraction so handlers don't kill workers, skillify +
check-resolvable pair, host-agnostic plugin contract replacing
handlers.json (RCE concern), gbrain init --migrate-only, TS migration
registry + H8/H9 diff-rule fixes, CLAUDE.md directive. All Codex hard
blockers (H1, H3/H4, H5, H6, H7, H8, H9, K) + architecture issues
(#1/#2/#4/#5/#7) resolved.

package.json — version bump 0.11.0 → 0.11.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 18, 2026
* feat: add minion_jobs schema, migration v5, and executeRaw to BrainEngine

Foundation for the Minions job queue system. Adds:
- minion_jobs table (20 columns) with CHECK constraints, partial indexes,
  and RLS. Inspired by BullMQ's job model, adapted for Postgres.
- Migration v5 creates the table for existing databases.
- executeRaw<T>() method on BrainEngine interface for raw SQL access,
  needed by the Minions module for claim queries (FOR UPDATE SKIP LOCKED),
  token-fenced writes, and atomic stall detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions job queue — queue, worker, backoff, types

BullMQ-inspired Postgres-native job queue built into GBrain. No Redis.
No external dependencies. Postgres transactions replace Lua scripts.

- MinionQueue: submit, claim (FOR UPDATE SKIP LOCKED), complete/fail
  (token-fenced), atomic stall detection (CTE), delayed promotion,
  parent-child resolution, prune, stats
- MinionWorker: handler registry, lock renewal, graceful SIGTERM,
  exponential backoff with jitter, UnrecoverableError bypass
- MinionJobContext: updateProgress(), log(), isActive() for handlers
- 8-state machine: waiting/active/completed/failed/delayed/dead/
  cancelled/waiting-children

Patterns stolen from: BullMQ (lock tokens, stall detection, flows),
Sidekiq (dead set, backoff formula), Inngest (checkpoint/resume).
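
The claim step's guarantee — no two workers ever take the same job — comes from `SELECT ... FOR UPDATE SKIP LOCKED` in the real queue. In single-threaded JavaScript a synchronous check-and-set gives the equivalent behavior, which makes the semantics easy to show (shapes here are illustrative, not the shipped types):

```typescript
// In-memory stand-in for the FOR UPDATE SKIP LOCKED claim step (sketch).
type Job = { id: number; status: "waiting" | "active"; lockToken?: string };

function claimNext(jobs: Job[], workerToken: string): Job | undefined {
  const job = jobs.find((j) => j.status === "waiting"); // skip anything already locked
  if (!job) return undefined;
  job.status = "active";
  job.lockToken = workerToken; // complete/fail must later present this token (fencing)
  return job;
}
```

The lock token is what makes completion token-fenced: a stalled worker whose job was reclaimed holds a stale token and can no longer write a result.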

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 43 tests for Minions job queue

Full coverage of the Minions module against PGLite in-memory:
- Queue CRUD (9): submit, get, list, remove, cancel, retry, duplicate
- State machine (6): waiting→active→completed/failed, retry→delayed→waiting
- Backoff (4): exponential, fixed, jitter range, attempts_made=0 edge
- Stall detection (3): detect stalled, counter increment, max→dead
- Dependencies (5): parent waits, fail_parent, continue, remove_dep, orphan
- Worker lifecycle (5): register, start-without-handlers, claim+execute,
  non-Error throws, UnrecoverableError bypass
- Lock management (3): renewal, token mismatch, claim sets lock fields
- Claim mechanics (4): empty queue, priority ordering, name filtering,
  delayed promotion timing
- Cancel & retry (2): cancel active, retry dead
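
The backoff behaviors these tests cover (exponential vs. fixed, jitter within a bounded range, the attempts_made=0 edge) follow a standard formula. A hedged sketch — the exact constants and jitter shape GBrain ships are not reproduced here:

```typescript
// Exponential/fixed retry backoff with proportional jitter (illustrative).
function backoffDelayMs(
  attemptsMade: number,
  baseMs: number,
  kind: "exponential" | "fixed",
  jitterRatio = 0.1,
): number {
  // attemptsMade = 0 yields the base delay for the first retry.
  const raw = kind === "exponential" ? baseMs * 2 ** attemptsMade : baseMs;
  const jitter = raw * jitterRatio * Math.random(); // spread retries apart
  return Math.round(raw + jitter);
}
```

Jitter prevents a burst of simultaneous failures from retrying in lockstep and hammering the same resource again.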

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions CLI commands and MCP operations

Wire Minions into the GBrain CLI and MCP layer:

CLI (gbrain jobs):
  submit <name> [--params JSON] [--follow] [--dry-run]
  list [--status S] [--queue Q] [--limit N]
  get <id> — detailed view with attempt history
  cancel/retry/delete <id>
  prune [--older-than 30d]
  stats — job health dashboard
  work [--queue Q] [--concurrency N] — Postgres-only worker daemon

6 MCP operations (contract-first, auto-exposed via MCP server):
  submit_job, get_job, list_jobs, cancel_job, retry_job, get_job_progress

Built-in handlers: sync, embed, lint, import. --follow runs inline.
Worker daemon blocked on PGLite (exclusive file lock).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for Minions job queue

CLAUDE.md: added Minions files to key files, updated operation count (36),
BrainEngine method count (38), test file count (45), added jobs CLI commands.
CHANGELOG.md: added Minions entry to v0.10.0 (background jobs, retry, stall
detection, worker daemon).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions v2 — agent orchestration primitives (pause/resume, inbox, tokens, replay)

Adds the foundation for Minions as universal agent orchestration infrastructure.
GBrain's Postgres-native job queue now supports durable, observable, steerable
background agents. The OpenClaw plugin (separate repo) will consume these via
library import, not MCP, for zero-latency local integration.

## New capabilities

- **Concurrent worker** — Promise pool replaces sequential loop. Per-job
  AbortController for cooperative cancellation. Graceful shutdown waits for
  all in-flight jobs via Promise.allSettled.
- **Pause/resume** — pauseJob clears the lock and fires AbortSignal on active
  jobs. Handlers check ctx.signal.aborted and exit cleanly. resumeJob returns
  paused jobs to waiting. Catch block skips failJob when signal.aborted.
- **Inbox (separate table)** — minion_inbox table for sidechannel messages.
  sendMessage with sender validation (parent job or admin). readInbox is
  token-fenced and marks read_at atomically. Separate table avoids row bloat
  from rewriting JSONB on every send.
- **Token accounting** — tokens_input/tokens_output/tokens_cache_read columns.
  updateTokens accumulates; completeJob rolls child tokens up to parent.
  USD cost computed at read time (no cost_usd column — pricing too volatile).
- **Job replay** — replayJob clones a terminal job with optional data overrides.
  New job, fresh attempts, no parent link.
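
The pause mechanism above is cooperative: pauseJob fires an AbortSignal and a well-behaved handler checks it between units of work, exiting cleanly instead of erroring. A sketch with illustrative names (this is not the shipped MinionJobContext API):

```typescript
// Handler loop that honors cooperative cancellation via AbortSignal (sketch).
function runChunks(
  signal: AbortSignal,
  chunks: string[],
): { processed: number; aborted: boolean } {
  let processed = 0;
  for (const _chunk of chunks) {
    // Clean exit on pause/cancel — the worker's catch block then skips failJob.
    if (signal.aborted) return { processed, aborted: true };
    processed++;
  }
  return { processed, aborted: false };
}
```

Because the signal check happens at chunk boundaries, progress made before the pause is preserved and the job can resume from its checkpoint.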

## Handler contract additions

MinionJobContext now provides:
- `signal: AbortSignal` — cooperative cancellation
- `updateTokens(tokens)` — accumulate token usage
- `readInbox()` — check for sidechannel messages
- `log()` — now accepts string or TranscriptEntry

## MCP operations added

pause_job, resume_job, replay_job, send_job_message — all auto-generate CLI
commands and MCP server endpoints.

## Library exports

package.json exports map adds ./minions and ./engine-factory paths so plugins
can `import { MinionQueue } from 'gbrain/minions'` for direct library use.

## Instruction layer (the teaching)

- skills/minion-orchestrator/SKILL.md — when/how to use Minions, decision
  matrix, lifecycle management, anti-patterns
- skills/conventions/subagent-routing.md — cross-cutting rule: all background
  work goes through Minions
- RESOLVER.md — trigger entries for agent orchestration
- manifest.json — registered

## Schema migration v6

Additive: 3 token columns, paused status, minion_inbox table with unread index.
Full Postgres + PGLite support. No backfill needed.

## Tests

65 tests (was 43): pause/resume (5), inbox (6), tokens (4), replay (4),
concurrent worker context (3), plus all existing coverage.

## What's NOT in this commit

Deferred to follow-up PRs:
- LISTEN/NOTIFY subscribe (needs real Postgres E2E)
- Resource governor (depends on concurrent worker stress testing)
- Routing eval harness (needs API keys + benchmark data)
- OpenClaw plugin (separate @gbrain/openclaw-minions-plugin repo)

See docs/designs/MINIONS_AGENT_ORCHESTRATION.md for full CEO-approved design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): migration v7 — agent_parity_layer schema

Adds columns on minion_jobs (depth, max_children, timeout_ms, timeout_at,
remove_on_complete, remove_on_fail, idempotency_key) plus the new
minion_attachments table. Three partial indexes for bounded scans:
idx_minion_jobs_timeout, idx_minion_jobs_parent_status, and
uniq_minion_jobs_idempotency. Check constraints enforce non-negative depth
and positive child cap / timeout.

Additive migration — existing installs pick it up via ensureSchema on next
use. No user action required.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): extend types for v7 parity layer

Extends MinionJob with depth/max_children/timeout_ms/timeout_at/
remove_on_complete/remove_on_fail/idempotency_key. Extends MinionJobInput
with the same options plus max_spawn_depth override. Adds MinionQueueOpts
(maxSpawnDepth default 5, maxAttachmentBytes default 5 MiB). Adds
AttachmentInput/Attachment shapes and ChildDoneMessage in the InboxMessage
union. rowToMinionJob updated to pick up the new columns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): attachments validator

New module validateAttachment() gates every attachment write. Rejects empty
filenames, path traversal (.., /, \), null bytes, oversized content (5 MiB
default, per-queue override), invalid base64, and implausible content_type
headers. Returns normalized { filename, content_type, content (Buffer),
sha256, size } on success.

The DB also enforces UNIQUE (job_id, filename) as defense-in-depth for
concurrent addAttachment races — JS-only checks are not sufficient.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): queue v7 — depth, child cap, timeouts, cascade, idempotency, child_done

Wraps completeJob and failJob in engine.transaction() so parent hook
invocations (resolveParent, failParent, removeChildDependency) fold into
the same transaction as the child update. A process crash between child
and parent can't strand the parent in waiting-children anymore.

Adds v7 behaviors:
- Depth tracking. add() computes depth = parent.depth + 1 and rejects
  past maxSpawnDepth (default 5).
- Per-parent child cap. add() takes SELECT ... FOR UPDATE on the parent,
  counts non-terminal children, rejects when count >= max_children.
  NULL max_children = no cap.
- Per-job wall-clock timeout. claim() populates timeout_at when
  timeout_ms is set. New handleTimeouts() dead-letters expired rows with
  error_text='timeout exceeded'. Terminal — no retry.
- Cascade cancel. cancelJob() walks descendants via recursive CTE with
  depth-100 runaway cap. Returns the root row. Re-parented descendants
  (parent_job_id NULL) are naturally excluded.
- Idempotency. add() uses INSERT ... ON CONFLICT (idempotency_key) DO
  NOTHING RETURNING; falls back to SELECT when RETURNING is empty. Same
  key always yields the same job id.
- child_done inbox. completeJob inserts {type:'child_done', child_id,
  job_name, result} into the parent's inbox in the same transaction as
  the token rollup, guarded by EXISTS so terminal/deleted parents skip
  without FK violation. New readChildCompletions(parent_id, lock_token,
  since?) helper; token-fenced like readInbox.
- removeOnComplete / removeOnFail. Deletes the row after the parent hook
  fires, so parent policy sees consistent state.
- Attachment methods. addAttachment validates via validateAttachment
  then INSERTs; UNIQUE (job_id, filename) backs the JS dup check.
  listAttachments, getAttachment, deleteAttachment round out the API.
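
The idempotency behavior — the same key always yields the same job id — mirrors `INSERT ... ON CONFLICT (idempotency_key) DO NOTHING RETURNING` plus the SELECT fallback when RETURNING comes back empty. A Map-based sketch with hypothetical names:

```typescript
// Idempotent submit: same key, same job id (in-memory sketch).
const byKey = new Map<string, number>();
let nextId = 1;

function submitIdempotent(key: string | undefined): number {
  if (key === undefined) return nextId++; // no key: always a fresh job
  const existing = byKey.get(key); // SELECT fallback path
  if (existing !== undefined) return existing;
  const id = nextId++; // INSERT path
  byKey.set(key, id);
  return id;
}
```

This is what lets autopilot resubmit an `autopilot-cycle` job per cycle slot without ever double-enqueueing the same cycle.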

Fixes pre-existing inverted status bug: add() now puts children in
waiting/delayed (not waiting-children) and atomically flips the parent
to waiting-children in the same transaction. Tests no longer need
manual UPDATE workarounds.

Two correctness fixes:
- Sibling completion race. Under READ COMMITTED, two grandchildren
  completing concurrently each saw the other as still-active in the
  pre-commit snapshot and neither flipped the parent. Fixed by taking
  SELECT ... FOR UPDATE on the parent row at the start of completeJob
  and failJob transactions, serializing siblings on the parent lock.
- JSONB double-encode. postgres.js conn.unsafe(sql, params) auto-
  JSON-encodes parameters. Calling JSON.stringify(obj) first stored a
  JSON string literal (jsonb_typeof=string) and broke payload->>'key'
  queries silently. Removed JSON.stringify from three call sites
  (child_done inbox post, updateProgress, sendMessage). PGLite tolerated
  both forms so unit tests missed it — real-PG E2E caught it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): worker — timeout safety net + handleTimeouts tick

Worker tick now calls handleStalled() first, then handleTimeouts() — stall
requeue wins over timeout dead-letter when both could fire in the same
cycle. handleTimeouts() guards on lock_until > now() so stalled jobs take
the retryable path.

launchJob schedules a per-job setTimeout(timeout_ms) that fires ctx.signal
as a best-effort handler interrupt. The timer is always cleared in .finally
so process exit isn't delayed by a dangling timer. Handlers that respect
AbortSignal stop cleanly; handlers that ignore it still get dead-lettered
by the DB-side handleTimeouts.

Removed post-completeJob and post-failJob parent-hook calls from the worker
— those are now inside the queue method transactions. Worker becomes
simpler and crash-safer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(minions): 33 new unit tests for v7 parity layer

Covers depth cap, per-parent child cap, timeout dead-letter, cascade
cancel (including the re-parent edge case), removeOnComplete /
removeOnFail, idempotency (single + concurrent), child_done inbox
(posted in txn + survives child removeOnComplete + since cursor),
attachment validation (oversize, path traversal, null byte, duplicates,
base64), AbortSignal firing on pause mid-handler, catch-block skipping
failJob when aborted, worker in-flight bookkeeping, token-rollup guard
when parent already terminal, and setTimeout safety-net cleanup.

Existing tests updated to remove the inverted-status manual UPDATE
workarounds that the add() fix made obsolete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): Minions v7 concurrency + OpenClaw resilience coverage

minions-concurrency.test.ts spins two MinionWorker instances against the
test Postgres, submits 20 jobs, and asserts zero double-claims (every job
runs exactly once). This is the only test that actually proves FOR UPDATE
SKIP LOCKED under real concurrency — PGLite runs on a single connection
and can't exercise the race.

minions-resilience.test.ts covers the six OpenClaw daily pains:
1. Spawn storm caps enforce under concurrent submit. 2. Agent stall →
handleStalled() requeues; handleTimeouts() skips (lock_until guard).
3. Forgotten dispatches recoverable via child_done inbox. 4. Cascade
cancel stops grandchildren mid-flight. 5. Deep tree fan-in
(parent → 3 children → 2 grandchildren each) completes with the full
inbox chain. 6. Parent crash/recovery resumes from persisted state.

helpers.ts extends ALL_TABLES with minion_attachments, minion_inbox, and
minion_jobs (FK dependents first) so E2E teardown doesn't leak rows
between runs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: release v0.11.0 — Minions v7 agent orchestration primitives

Bumps VERSION / package.json to 0.11.0. Adds CHANGELOG entry covering
depth tracking, max_children, per-job timeouts, cascade cancel,
idempotency keys, child_done inbox, removeOnComplete/Fail, attachments,
migration v7, plus the two correctness fixes (sibling completion race
and JSONB double-encode).

TODOS.md captures the four v7 follow-ups: per-queue rate limiting,
repeat/cron scheduler, worker event emitter, and waitForChildren
convenience helpers.

1066 unit + 105 E2E = 1171 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): unify JSONB inserts, tighten nullish coalescing

Three non-blocker cleanups from post-ship review of v0.11.0:

- queue.ts add() and completeJob() were pre-stringifying with
  JSON.stringify while other sites pass raw objects with $n::jsonb casts.
  postgres.js double-encodes if you stringify first — it works on PGLite
  (text→JSONB auto-cast) but fails silently on real PG. Unified on raw
  object + explicit $n::jsonb cast.
- queue.ts readChildCompletions: the since clause used sent_at > $2,
  relying on PG's implicit text→TIMESTAMPTZ coercion. An explicit
  $2::timestamptz is safer and clearer.
- types.ts rowToMinionJob: parent_job_id used ||, which coerces 0 to
  null. Harmless today (SERIAL IDs start at 1) but ?? is semantically
  correct.
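A minimal sketch of the `||` vs `??` difference the last bullet describes (the row shape is illustrative, not the real type):

```typescript
// A job row whose parent_job_id happens to be 0 — illustrative only;
// SERIAL IDs start at 1 today, which is why the bug is currently harmless.
const row = { parent_job_id: 0 as number | null };

// || treats 0 as falsy, so a legitimate ID of 0 would be coerced to null.
const withOr = row.parent_job_id || null;

// ?? only falls through on null/undefined, so 0 survives intact.
const withNullish = row.parent_job_id ?? null;
```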

All 110 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): updateProgress missed $1::jsonb cast in unification

Residual from c502b7e — updateProgress was the only remaining JSONB write
without the explicit ::jsonb cast. Not broken (implicit cast works) but
breaks the convention the prior commit unified everywhere else.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc: Minions v7 skill count + jobs subcommands (26 skills)

README: bump skill count 25 → 26, add minion-orchestrator row, add
`gbrain jobs` command family block so v0.11.0's headline feature is
actually discoverable from the top-level commands reference.

CLAUDE.md: unit test count 48 → 49 (minions.test.ts expanded), skill
count 25 → 26, add minion-orchestrator to Key files + skills categorization,
expand MinionQueue one-liner to cover v7 primitives (depth/child-cap,
timeouts, idempotency, child_done inbox, removeOnComplete/Fail).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: Minions adoption UX — smoke test + migration + pain-triggered routing

Teach OpenClaw when to reach for Minions vs native subagents. Ship three
pieces so upgrading from v0.10.x actually lands for real users:

- `gbrain jobs smoke` — one-command health check that submits a `noop` job,
  runs a worker, verifies completion, and prints engine-aware guidance
  (PGLite installs get the "daemon needs Postgres, use --follow" note).
  Fails loudly if the schema is below v7 so the user knows to run `gbrain init`.

- `skills/migrations/v0.11.0.md` — post-upgrade migration file the
  auto-update agent reads. Six steps: apply schema, run smoke, ask user
  via AskUserQuestion which mode they want (always / pain_triggered / off),
  write to `~/.gbrain/preferences.json`, sanity-check handlers, mark done.
  Completeness scores on each option so the recommendation is explicit.

- `skills/conventions/subagent-routing.md` rewritten — was a "MUST use
  Minions for ALL background work" mandate, now reads preferences.json
  on every routing decision and branches on three modes. Mode B
  (pain_triggered) is the default: keep subagents until gateway drops
  state, parallel > 3, runtime > 5min, or user expresses frustration.
  Then pitch the switch in-session with a specific script.

Rename pass: "Minions v7" → "Minions" in README (JOBS block), TODOS.md
(P1 section header + depends-on), CHANGELOG.md v0.11.0 entry. v7 stays
as the internal schema version in code/migration contexts. The product
name is just Minions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc(readme): promote Minions — 6 OpenClaw pains + how each is fixed

The one-line mention in the skills table wasn't doing the work. Added a
dedicated section between "How It Works" and "Getting Data In" that leads
with the six multi-agent failures every OpenClaw user hits daily (spawn
storms, hung handlers, forgotten dispatches, unstructured debugging,
gateway crashes, runaway grandchildren) and maps each pain to the
specific Minions primitive that fixes it.

Includes the smoke test command, the adoption default (pain_triggered),
and a pointer to skills/minion-orchestrator for the full patterns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(bench): add harness for Minions vs OpenClaw subagent dispatch

Shared harness (openclawDispatch + minionsHandler) using matching
claude-haiku-4-5 calls on both sides so the delta measures queue+
dispatch overhead on top of identical LLM work. Includes
statsFromResults (p50/p95/p99) and formatStats helpers. Uses
`openclaw agent --local` embedded mode; does not test gateway
multi-agent fan-out (documented in the harness header).

* test(bench): durability under SIGKILL — Minions vs OpenClaw --local

Headline bench for the claim: when the orchestrator dies mid-dispatch,
Minions rescues via PG state + stall detection; OpenClaw --local loses
in-flight work outright.

Minions side: seed 10 active+expired-lock rows (exact state a SIGKILLed
worker leaves) then run a rescue worker. Expect 10/10 completed.
OpenClaw side: spawn 10 `openclaw agent --local` in parallel, SIGKILL
each at 500ms, count pre-kill delivered output. Expect 0/10 — no
persistence layer, nothing to recover.

Budget: ~$0 (Minions handlers sleep 10ms; OC calls die at 500ms so
partial LLM billing is negligible).

* test(bench): per-dispatch throughput — Minions vs OpenClaw --local

20 serial dispatches each side, identical claude-haiku-4-5 call with the
same trivial prompt. p50/p95/p99 reported via statsFromResults. Serial
(not parallel) so the per-dispatch cost is measured honestly and LLM
token spend stays bounded (~$0.08 total).

Minions: one queue, one worker, concurrency 1. Submit → poll to
completion before next submit. OpenClaw: N sequential
`openclaw agent --local` spawns.

* test(bench): fan-out — Minions 10-wide concurrency vs 10 parallel OC spawns

Parent dispatches 10 children, waits for all to return. Minions uses
worker concurrency=10 sharing one warm process; OpenClaw parallel
`openclaw agent --local` spawns, each boots its own runtime.

3 runs × 10 children per run. Reports ok count and wall time per run
plus summary. Honest caveat documented: does not test OC gateway
multi-agent fan-out — that needs a custom WS client and LLM-backed
parent agent. This measures what users script today.

Budget: ~$0.12 LLM spend.

* test(bench): memory — 10 in-flight subagents, single-proc vs 10-proc cost

Measures resident memory for keeping 10 subagents in flight. Minions:
one worker process, concurrency=10 with handlers that park on a
promise — sample RSS of the test process via process.memoryUsage().
OpenClaw: 10 parallel `openclaw agent --local` processes, sum their
RSS via `ps -o rss=`.

Handlers are cheap sleeps, no LLM — we want harness memory, not LLM
client state. Budget: $0.

* test(bench): fan-out — don't gate on OC success rate, report numbers

Initial run showed OC parallel `--local` at 10-wide hits 40% failure
rate (17/30 across 3 runs). That's the finding, not a test bug —
process startup stampede + LLM rate limits. Bench now prints error
samples and reports the numbers instead of gating.

Minions side still gates at 90% (30/30 observed in practice).

* doc(benchmarks): Minions vs OpenClaw --local subagent dispatch

Real numbers on four claims: durability, throughput, fan-out, memory.
Same claude-haiku-4-5 call on both sides so the delta is queue+dispatch+
process cost on top of identical LLM work.

Headline: Minions rescues 10/10 from a SIGKILLed worker in 458ms while
OpenClaw --local loses all 10; ~10× faster per dispatch (778ms p50 vs
8086ms p50); ~21× faster at 10-wide fan-out AND 100% reliable vs OC's
43% failure rate; 2 MB vs 814 MB to keep 10 subagents in flight.

Honest caveats section covers what this doesn't test (OC gateway
multi-agent, load tests, other models). Fully reproducible via
test/e2e/bench-vs-openclaw/.

* doc(readme): inject Minions vs OpenClaw bench numbers

Headline deltas now in the Minions section: 10/10 vs 0/10 on crash,
~10× faster per dispatch, ~21× faster fan-out at 10-wide with 0%
failure vs 43%, ~400× less memory. Links to the full bench doc.

Prose first said Minions "fixes all six pains." Now it shows the
numbers that prove it.

* bench: production Wintermute benchmark — Minions 753ms vs sub-agent timeout

Real deployment: 45K-page brain on Render+Supabase. Task: pull 99 tweets,
write brain page, commit, sync. Minions: 753ms, $0. Sub-agent: gateway
timeout (>10s, couldn't even spawn under production load).

Also: 19,240 tweets backfilled across 36 months in 15 min at $0.
Sub-agents would cost $1.08 and fail 40% of spawns.

* bench: tweet ingestion — Minions 719ms vs OpenClaw 12.5s (17×)

Production benchmark with runnable test code:
- test/e2e/bench-vs-openclaw/tweet-ingest.bench.ts (reusable)
- docs/benchmarks/2026-04-18-tweet-ingestion.md (publishable)

Task: pull 100 tweets from X API, write brain page, commit, sync.
Minions: 719ms mean, $0, 100% success.
OpenClaw: 12,480ms mean, $0.03/run, 60% success (gateway timeouts).
At scale: 36-month backfill, 19K tweets, 15 min, $0 vs est. $1.08.

* doc(benchmarks): Wintermute production data point for Minions vs OpenClaw

Adds a production-environment data point to the Minions README section:
one month of tweet ingest on Wintermute (Render + Supabase + 45K-page brain)
ran end-to-end in 753ms for \$0.00 via Minions, while the equivalent
sessions_spawn hit the 10s gateway timeout and produced nothing.

Full methodology + logs in docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): preferences.ts + cli-util.ts — foundations for v0.11.1

Adds two foundational modules that apply-migrations (Lane A-4), the
v0.11.0 orchestrator (Lane C-1), and the stopgap script (Lane C-4) all
depend on.

- src/core/preferences.ts: atomic-write ~/.gbrain/preferences.json
  (mktemp + rename, 0o600, forward-compatible for unknown keys) with
  validateMinionMode, loadPreferences, savePreferences. Plus
  appendCompletedMigration + loadCompletedMigrations for the
  ~/.gbrain/migrations/completed.jsonl log (tolerates malformed lines).
  Uses process.env.HOME || homedir() so $HOME overrides work in CI and
  tests; Bun's os.homedir() caches the initial value and ignores later
  mutations.
- src/core/cli-util.ts: promptLine(prompt) helper, extracted from
  src/commands/init.ts:212-224. Shared so init, apply-migrations, and
  the v0.11.0 orchestrator's mode prompt don't each reinvent it.
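The atomic-write pattern described above can be sketched as follows (function name and temp-file naming are assumptions, not the actual implementation):

```typescript
import { writeFileSync, renameSync, mkdtempSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

function atomicWriteJson(path: string, data: unknown): void {
  const tmp = `${path}.tmp-${process.pid}-${Date.now()}`;
  // Owner-only perms from the start — never world-readable, even transiently.
  writeFileSync(tmp, JSON.stringify(data, null, 2) + "\n", { mode: 0o600 });
  // rename(2) on the same filesystem is atomic: readers see the old file
  // or the new one, never a half-written preferences.json.
  renameSync(tmp, path);
}

// Demo against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "prefs-"));
const prefsPath = join(dir, "preferences.json");
atomicWriteJson(prefsPath, { minion_mode: "pain_triggered" });
const loaded = JSON.parse(readFileSync(prefsPath, "utf8"));
```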

test/preferences.test.ts: 21 unit tests covering load/save atomicity,
0o600 perms, forward-compat for unknown keys, minion_mode validation,
completed.jsonl JSONL append idempotence, auto-ts population, malformed-
line tolerance in loadCompletedMigrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(init): add --migrate-only flag (schema-only, no saveConfig)

Context: v0.11.0 migration orchestrators need a safe way to re-apply the
schema against an existing brain without risking a config flip. Today
running bare `gbrain init` with no flags defaults to PGLite and calls
saveConfig, which would silently overwrite an existing Postgres
database_url — caught by Codex in the v0.11.1 plan review as a
show-stopper data-loss bug.

The new --migrate-only path:
  - loadConfig() reads the existing config (does NOT call saveConfig)
  - errors out with a clear "run gbrain init first" if no config exists
  - connects via the already-configured engine, calls engine.initSchema(),
    disconnects
  - --json emits structured success/error payloads

Everything downstream in the v0.11.1 migration chain (apply-migrations,
the stopgap bash script, the package.json postinstall hook) will invoke
this flag rather than bare gbrain init.

test/init-migrate-only.test.ts: 4 tests covering the no-config error
path, --json error payload shape, happy-path with a PGLite fixture
(verifies config.json content is byte-identical after the call — the
real invariant), and idempotent rerun.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): TS registry replaces filesystem migration scan

Context: Codex flagged that bun build --compile produces a self-contained
binary, and the existing findMigrationsDir() in upgrade.ts:145 walks
skills/migrations/v*.md on disk — which fails on a compiled install
because the markdown files aren't bundled. The plan's fix is a TS
registry: migrations are code, imported directly, visible to both source
installs and compiled binaries.

- src/commands/migrations/types.ts: shared Migration, OrchestratorOpts,
  OrchestratorResult types.
- src/commands/migrations/index.ts: exports the migrations[] array,
  getMigration(version), and compareVersions() (semver comparator).
  The feature_pitch data that lived in the MD file frontmatter now
  lives here as a code constant on each Migration, so runPostUpgrade's
  post-upgrade pitch printer can consume it without a filesystem read.
- src/commands/migrations/v0_11_0.ts: stub orchestrator + pitch. The
  full phase implementation lands in Lane C-1; for now the stub throws
  a clear "not yet implemented" so apply-migrations --list (Lane A-4)
  can still enumerate the migration.

test/migrations-registry.test.ts: 9 tests covering ascending-semver
ordering, feature_pitch shape invariants, getMigration lookup, and
compareVersions edge cases (equal / newer / older / single-digit
across major bumps).
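A comparator with the semantics those tests pin might look like this (a sketch; the real compareVersions may differ in detail):

```typescript
// Returns -1 / 0 / 1 for a < b / a == b / a > b, comparing numerically per
// dotted segment so "0.9.0" sorts before "0.11.0" (plain string comparison
// would get single-digit-vs-double-digit segments wrong).
function compareVersions(a: string, b: string): number {
  const parse = (v: string) => v.replace(/^v/, "").split(".").map(Number);
  const [pa, pb] = [parse(a), parse(b)];
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  return 0;
}
```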

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): gbrain apply-migrations — migration runner CLI

Reads ~/.gbrain/migrations/completed.jsonl, diffs against the TS migration
registry, runs pending orchestrators. Resumes status:"partial" entries
(the stopgap bash script writes these so v0.11.1 apply-migrations can
pick up where it left off). Idempotent: rerunning when up-to-date exits 0.

Flags:
  --list                    Show applied + partial + pending + future.
  --dry-run                 Print the plan; take no action.
  --yes / --non-interactive Skip prompts (used by runPostUpgrade + postinstall).
  --mode <a|p|o>            Preset minion_mode (bypasses the Phase C TTY prompt).
  --migration vX.Y.Z        Force-run one specific version.
  --host-dir <path>         Include $PWD in host-file walk (default is
                            $HOME/.claude + $HOME/.openclaw only).
  --no-autopilot-install    Skip Phase F.

Diff rule (Codex H9): apply when no status:"complete" entry exists AND
migration.version ≤ installed VERSION. The previously proposed rule was
"version > currentVersion", which would SKIP v0.11.0 when running v0.11.1;
regression test in apply-migrations.test.ts pins the correct semantics.
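Given a simple semver comparator, the Codex H9 rule reduces to a small predicate (names here are hypothetical, not the actual code):

```typescript
type CompletedEntry = { version: string; status: "complete" | "partial" };

// Numeric per-segment semver compare: -1 / 0 / 1.
function cmp(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const d = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (d !== 0) return Math.sign(d);
  }
  return 0;
}

// Codex H9: pending when no status:"complete" entry exists AND the migration
// version is ≤ the installed VERSION. The rejected "version > current" rule
// would skip v0.11.0 when the binary is already v0.11.1.
function isPending(version: string, installed: string, log: CompletedEntry[]): boolean {
  const done = log.some((e) => e.version === version && e.status === "complete");
  return !done && cmp(version, installed) <= 0;
}
```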

Registered in src/cli.ts CLI_ONLY Set; dispatched before connectEngine so
each phase owns its own engine/subprocess lifecycle (no double-connect
when the orchestrator shells out to init --migrate-only or jobs smoke).

test/apply-migrations.test.ts: 18 unit tests covering parseArgs for every
flag, indexCompleted/statusForVersion correctness (including stopgap-then-
complete transition), and buildPlan's four buckets (applied / partial /
pending / skippedFuture) with the Codex H9 regression pinned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(upgrade): runPostUpgrade tail-calls apply-migrations; postinstall hook

Closes the v0.11.0 mega-bug: migration skills never fired on upgrade.
`runPostUpgrade` now does two things:

  1. Cosmetic: prints feature_pitch headlines for migrations newer than
     the prior binary. Uses the TS registry (Codex K) instead of walking
     skills/migrations/*.md on disk — compiled binaries see the same list
     source installs do.
  2. Mechanical: invokes apply-migrations --yes --non-interactive in the
     same process so Phase F (autopilot install) doesn't hit a subprocess
     timeout wall. Catches + surfaces errors without failing the upgrade.

Also:
  - Drops the early-return on missing upgrade-state.json (Codex H8).
    runPostUpgrade now runs apply-migrations unconditionally; it's cheap
    when nothing is pending. This repairs every broken-v0.11.0 install on
    their next upgrade attempt.
  - Bumps the `gbrain post-upgrade` subprocess timeout in runUpgrade from
    30s → 300s (Codex H7). A v0.11.0→v0.11.1 migration that has to
    schema-init + smoke + prefs + host-rewrite + launchd-install exceeds
    30s trivially.
  - Removes now-dead findMigrationsDir + extractFeaturePitch helpers and
    their filesystem-reading imports (readdirSync, resolve).
  - src/cli.ts post-upgrade dispatch now awaits the async runPostUpgrade.

apply-migrations (Lane A-4):
  - First-install guard: loadConfig() check at the top. No brain
    configured = exit silently for --yes / --non-interactive (postinstall
    stays quiet on fresh `bun add gbrain`); explicit message on --list /
    --dry-run.

package.json:
  - New `postinstall` script: gbrain --version >/dev/null 2>&1 && gbrain
    apply-migrations --yes --non-interactive 2>/dev/null || true. The
    --version sanity check guards against a half-written binary (Codex
    review criticism). || true prevents `bun update gbrain` failure
    mid-upgrade.

Manual smoke verified: fresh $HOME with no config → apply-migrations
--yes silently exits 0; --dry-run prints the one-liner "No brain
configured... Nothing to migrate."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(commands): extract library-level Core functions that throw not exit

Codex architecture finding #5: reusing CLI entry-point functions as Minions
handler bodies is wrong. If a Minion invokes runExtract / runEmbed /
runBacklinks / runLint and the handler hits a process.exit(1), the ENTIRE
WORKER process dies — killing every other in-flight job. Handlers need
library-level APIs that throw, and the CLI stays a thin wrapper that
catches + exits.

Per-command shape:
  - runXxxCore(opts): throws on validation errors, returns structured
    result. Handler-safe.
  - runXxx(args): arg parser; calls Core; catches; process.exit(1) on
    thrown errors. CLI-safe.
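In sketch form, the split looks like this (lint chosen as the example; names and shapes are illustrative, not the shipped signatures):

```typescript
interface LintResult { checked: number; errors: string[] }

// Library layer: throws on bad input, returns a structured result.
// Safe as a Minions handler body — a bad job fails one job, not the worker.
function runLintCore(opts: { target?: string }): LintResult {
  if (!opts.target) throw new Error("lint: target is required");
  return { checked: 1, errors: [] };
}

// CLI layer: thin wrapper that catches and exits. Never used as a handler.
function runLint(args: string[]): void {
  try {
    const res = runLintCore({ target: args[0] });
    console.log(`checked ${res.checked} file(s)`);
  } catch (err) {
    console.error((err as Error).message);
    process.exit(1); // exiting the process is exclusively the CLI's job
  }
}
```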

Shipped:
  - runExtractCore({ mode, dir, dryRun?, jsonMode? }) → ExtractResult
  - runEmbedCore({ slug? | slugs? | all? | stale? }) → void
  - runBacklinksCore({ action, dir, dryRun? }) → BacklinksResult
  - runLintCore({ target, fix?, dryRun? }) → LintResult

sync.ts is already correct — performSync throws; runSync wraps. No change.

import.ts deferred to v0.12.0 (its one process.exit fires only on a
missing dir arg; handlers always pass a dir, so worker-kill risk is
zero in practice). Noted in the plan's Out-of-scope.

Smoke verified: all four Core functions throw on invalid mode / missing
dir / not-found target instead of exiting the process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(jobs): Tier 1 handlers + autopilot-cycle (the killer handler)

registerBuiltinHandlers now registers every handler autopilot needs to
dispatch via Minions, plus the single autopilot-cycle handler the
autopilot loop actually submits each interval.

Existing handlers (sync, embed, lint) rewired to call library-level Core
functions directly instead of the CLI wrappers. CLI wrappers call
process.exit(1) on validation errors; if a worker claimed a badly-formed
job, the WORKER PROCESS would die — killing every in-flight job. Cores
throw, so one bad job fails one job.

New handlers:
  - extract  → runExtractCore (mode: links|timeline|all, dir)
  - backlinks → runBacklinksCore (action: check|fix, dir)
  - autopilot-cycle → THE killer handler. Runs sync → extract → embed →
    backlinks inline. Each step wrapped in try/catch; returns
    { partial: true, failed_steps: [...] } when any step fails. Does NOT
    throw on partial failure — that would trigger Minion retry, and an
    intermittent extract bug would block every future cycle. Replaces
    the 4-job parent-child DAG proposed in early plan drafts (Codex
    H3/H4: parent/child is NOT a depends_on primitive in Minions).
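The never-throw contract can be sketched as (step names are illustrative; the real handler runs sync → extract → embed → backlinks):

```typescript
type Step = { name: string; run: () => Promise<void> };

// Each step is individually try/caught; failures are collected, never
// rethrown — a throw here would trigger a Minions retry, and an
// intermittent step bug would then block every future cycle.
async function autopilotCycle(
  steps: Step[],
): Promise<{ partial: boolean; failed_steps: string[] }> {
  const failed_steps: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch {
      failed_steps.push(step.name); // record and continue with the next step
    }
  }
  return { partial: failed_steps.length > 0, failed_steps };
}
```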

import.ts handler still uses the CLI wrapper (runImport) — import's one
process.exit fires only on a missing dir arg and the handler always
passes a dir; Core extraction deferred to v0.12.0 when Tier 2 refactors
happen.

registerBuiltinHandlers promoted from private to exported for testability.

test/handlers.test.ts: 4 tests. Asserts every expected handler name
registers. Asserts autopilot-cycle against a nonexistent repo returns
{ partial: true, failed_steps: ['sync', 'extract', 'backlinks'] } — does
NOT throw. Asserts autopilot-cycle against an empty (but real) git repo
returns a result with a steps map, never throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(autopilot): Minions dispatch + worker spawn supervisor + async shutdown

Autopilot now dispatches each cycle as a single `autopilot-cycle` Minion
job (with idempotency_key on the cycle slot) instead of running steps
inline. A forked `gbrain jobs work` child drains the queue durably,
supervised by autopilot. The user runs ONE install step
(`gbrain autopilot --install`) and gets sync + extract + embed + backlinks
+ durable job processing, with no separate worker daemon to manage.

Mode selection:
  - minion_mode=always OR pain_triggered (default), engine=postgres →
    Minions dispatch. Spawn child, submit autopilot-cycle each interval.
  - minion_mode=off, OR engine=pglite, OR `--inline` flag → run steps
    inline in-process, same as pre-v0.11.1. PGLite has an exclusive file
    lock that blocks a second worker process, so the inline path is the
    only path that works there.

Worker supervision:
  - spawn(resolveGbrainCliPath(), ['jobs', 'work'], { stdio: 'inherit' }).
    stdio:'inherit' avoids pipe-buffer blocking (Codex architecture #2).
  - On worker exit: 10s backoff + restart. Crash counter caps at 5 →
    autopilot stops with a clear error.
  - resolveGbrainCliPath() prefers argv[1] (cli.ts / /gbrain), then
    process.execPath (compiled binary suffix check), then `which gbrain`
    (installed to $PATH). NEVER blindly uses process.execPath, which on
    source installs is the Bun runtime, not `gbrain` (Codex architecture
    #1).
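The fallback chain reduces to logic like this (probes are injected so the sketch is testable; the real resolveGbrainCliPath inspects argv[1], execPath, and `which` directly):

```typescript
function resolveCliPath(opts: {
  argv1?: string;
  execPath: string;
  which: () => string | null;
}): string {
  // 1. Prefer argv[1] when it looks like the gbrain entry point.
  if (opts.argv1 && /(?:^|\/)(gbrain|cli\.ts)$/.test(opts.argv1)) return opts.argv1;
  // 2. Compiled binary: execPath IS gbrain. On source installs execPath is
  //    the Bun runtime, so a suffix check is mandatory — never use it blindly.
  if (opts.execPath.endsWith("/gbrain")) return opts.execPath;
  // 3. Fall back to whatever `which gbrain` finds on $PATH.
  const found = opts.which();
  if (found) return found;
  throw new Error("could not locate the gbrain CLI");
}
```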

Shutdown:
  - Async SIGTERM/SIGINT handler: sends SIGTERM to worker, awaits its
    exit for up to 35s (the worker's own drain is 30s; we add buffer for
    signal-delivery latency), then SIGKILL if still alive.
  - Drops the old `process.on('exit')` lock-cleanup handler — its
    callback runs synchronously and can't wait for the worker drain.
    Lock file cleanup moved inside the async shutdown.
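The escalation can be sketched as (the worker handle is modeled as an exit promise; the real code holds a child process):

```typescript
type WorkerHandle = { exited: Promise<void>; kill: (signal: string) => void };

// SIGTERM first, wait up to graceMs for the worker's own drain (30s in the
// commit, plus buffer for signal-delivery latency), then escalate to
// SIGKILL only if the worker is still alive when the grace window closes.
async function shutdownWorker(worker: WorkerHandle, graceMs: number): Promise<void> {
  worker.kill("SIGTERM");
  const timedOut = Symbol("timeout");
  const winner = await Promise.race([
    worker.exited,
    new Promise<typeof timedOut>((res) => setTimeout(() => res(timedOut), graceMs)),
  ]);
  if (winner === timedOut) worker.kill("SIGKILL");
}
```

This is exactly why a synchronous `process.on('exit')` callback can't do the job: it cannot await the race above.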

Lock-file mtime refresh every cycle (Codex C) so a long-lived autopilot
doesn't get declared "stale" by the next cron-fired invocation after 10
minutes.

Inline fallback path calls the new Core fns (runExtractCore, runEmbedCore)
instead of the CLI wrappers. That way a bad arg from inside the loop
can't process.exit() the autopilot itself (matches Codex #5).

test/autopilot-resolve-cli.test.ts: 3 tests covering argv[1]-as-gbrain,
argv[1]-as-cli.ts, and graceful error when no path resolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(autopilot): env-aware install + OpenClaw bootstrap injection

Expand installDaemon from 2 targets (macOS launchd, Linux crontab) to 4:

  - macos              → launchd plist (unchanged)
  - linux-systemd      → ~/.config/systemd/user/gbrain-autopilot.service
                         with Restart=on-failure, RestartSec=30, and an
                         is-system-running probe to confirm the user bus
                         actually works (Codex architecture #7 hardened —
                         the naive /run/systemd/system existence check was
                         a false-positive magnet)
  - ephemeral-container → detects RENDER / RAILWAY_ENVIRONMENT /
                          FLY_APP_NAME / /.dockerenv. Crontab is unreliable
                          here (wiped on deploy), so we write
                          ~/.gbrain/start-autopilot.sh and tell the user
                          to source it from their agent's bootstrap
  - linux-cron         → existing crontab path (unchanged)

detectInstallTarget() + --target flag for explicit override. Also:
  - --inject-bootstrap / --no-inject control OpenClaw ensure-services.sh
    auto-injection. Default is ON when OpenClaw is detected (OPENCLAW_HOME
    env var, openclaw.json in CWD or $HOME, or an ensure-services.sh
    found). Injection adds ONE line with a `# gbrain:autopilot v0.11.0`
    marker and writes .bak.<ISO-timestamp> before touching the file.
    Idempotent — the marker check prevents double injection.
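The marker check reduces to string logic like this (marker text is from the commit; the helper and the injected command line are assumptions):

```typescript
// Marker text from the commit message.
const MARKER = "# gbrain:autopilot v0.11.0";

// Append the bootstrap line exactly once: the marker check makes the edit
// idempotent, so rerunning the installer never double-injects.
function injectBootstrapLine(content: string, line: string): string {
  if (content.includes(MARKER)) return content;
  const sep = content.endsWith("\n") || content === "" ? "" : "\n";
  return `${content}${sep}${line} ${MARKER}\n`;
}

// Hypothetical injected command — the real line comes from the installer.
const once = injectBootstrapLine("echo boot\n", "~/.gbrain/start-autopilot.sh");
const twice = injectBootstrapLine(once, "~/.gbrain/start-autopilot.sh");
```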

uninstallDaemon mirrors all four targets. A user can now run
`gbrain autopilot --uninstall` after moving hosts (macOS laptop → Linux
server) and the uninstall will find + remove every artifact.

writeWrapperScript now uses resolveGbrainCliPath() instead of blindly
baking process.execPath into the wrapper script — on source installs
that path is the Bun runtime, not gbrain (Codex architecture #1 fix
propagated to the install path too).

test/autopilot-install.test.ts: 4 tests covering detectInstallTarget's
platform + env-var branches. Deeper E2E coverage (systemd unit file
contents, ephemeral start-script contents + exec bit, OpenClaw marker
injection + .bak) lives in Task 14's E2E fixture test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): v0.11.0 orchestrator — phases A through G, full implementation

Replaces the stub from commit de027ce. The orchestrator runs all seven
phases of the v0.11.0 Minions adoption migration idempotently, resumable
from any prior status:"partial" run (the stopgap bash script writes
those).

Phases:
  A. Schema  — `gbrain init --migrate-only` (NEVER bare `gbrain init`,
               which defaults to PGLite and clobbers existing configs —
               Codex H1 show-stopper).
  B. Smoke   — `gbrain jobs smoke`. Abort loudly on non-zero.
  C. Mode    — --mode flag wins. Preserved from prefs on resume. Non-TTY
               or --yes defaults pain_triggered with explicit print.
               Interactive: numbered 1/2/3 menu via shared promptLine.
  D. Prefs   — savePreferences({minion_mode, set_at, set_in_version}).
  E. Host    — AGENTS.md marker injection + cron manifest rewrites. For
               cron entries whose skill matches a gbrain builtin
               (sync/embed/lint/import/extract/backlinks/autopilot-cycle)
               rewrites kind:agentTurn → kind:shell with a
               gbrain jobs submit command. PGLite branch keeps --follow
               (inline execution, the only path that works without a
               worker daemon); Postgres branch drops --follow + adds
               --idempotency-key ${handler}:${slot} so long cron jobs
               don't stack up (same Codex fix as the autopilot-cycle
               dispatch). For non-builtin handlers (host-specific, like
               ea-inbox-sweep, frameio-scan, x-dm-triage) emits a
               structured TODO row to
               ~/.gbrain/migrations/pending-host-work.jsonl so the host
               agent can walk through plugin-contract work per
               skills/migrations/v0.11.0.md.
  F. Install — `gbrain autopilot --install --yes`. Best-effort (failure
               doesn't abort; user can run manually).
  G. Record  — append to completed.jsonl. status:"complete" unless
               pending_host_work > 0, in which case status:"partial" +
               apply_migrations_pending: true.

Safety guards (Codex code-quality tension #3: strict-skip, no rollback):
  - Scope: $HOME/.claude + $HOME/.openclaw only by default. --host-dir
    must be explicit to include $PWD or any other path.
  - Symlink escape: SKIP if the resolved target leaves the scoped root.
  - >1 MB files: SKIP with warning.
  - Permission denied: SKIP with warning; other files continue.
  - Malformed JSON manifest: SKIP with parse error logged; continue.
  - mtime re-check right before write: bail the file if changed between
    read + write; other files continue.
  - Every edit writes a .bak.<ISO-timestamp> sibling first (second-
    precision so two same-day runs don't collide).
  - Idempotency: `_gbrain_migrated_by: "v0.11.0"` JSON property marker
    on each rewritten cron entry (JSON can't have comments — Codex G);
    AGENTS.md marker `<!-- gbrain:subagent-routing v0.11.0 -->`.
  - TODO dedupe: JSONL appends deduped by (handler, manifest_path) so
    reruns don't grow the file.
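The .bak-then-mtime-recheck guard can be sketched per file like this (a simplification; the real orchestrator also handles the symlink, size, and permission skips listed above):

```typescript
import {
  statSync, readFileSync, writeFileSync, copyFileSync,
  mkdtempSync, readdirSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write a .bak sibling first, then re-check mtime immediately before the
// write; bail on just this file if it changed between read and write.
function guardedRewrite(path: string, transform: (s: string) => string): boolean {
  const before = statSync(path).mtimeMs;
  const next = transform(readFileSync(path, "utf8"));
  const stamp = new Date().toISOString().replace(/\.\d+Z$/, "Z"); // second precision
  copyFileSync(path, `${path}.bak.${stamp}`);
  if (statSync(path).mtimeMs !== before) return false; // changed under us — skip
  writeFileSync(path, next);
  return true;
}

// Demo against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "mig-"));
const file = join(dir, "AGENTS.md");
writeFileSync(file, "hello\n");
const ok = guardedRewrite(file, (s) => s.toUpperCase());
const after = readFileSync(file, "utf8");
const bakExists = readdirSync(dir).some((n) => n.startsWith("AGENTS.md.bak."));
```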

Post-run summary: when pending_host_work > 0, prints a one-liner
pointing the user at the JSONL path + the v0.11.0 skill file. The skill
(Lane C-3 / C-4) is the host-agent instruction manual.

test/migrations-v0_11_0.test.ts: 18 tests covering:
  - AGENTS.md injection: happy path, .bak creation, idempotent rerun,
    --dry-run no-op, symlink-escape SKIP, >1MB SKIP.
  - Cron rewrite: builtin handlers rewrite to shell+gbrain jobs submit,
    non-builtins emit JSONL TODOs without touching the manifest, mixed
    manifests get both treatments in one pass, idempotent rerun, TODO
    dedupe, malformed JSON SKIP, no-entries-array SKIP, --dry-run no-op.
  - findAgentsMdFiles + findCronManifests: scoped walk to $HOME/.claude +
    $HOME/.openclaw, --host-dir opt-in for $PWD.
  - BUILTIN_HANDLERS frozen at the canonical 7 names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skill): port skillify from Wintermute, pair with check-resolvable

Skillify is the "meta skill": turn any raw feature or script into a
properly-skilled, tested, resolvable, evaled unit of agent-visible
capability. Proven in production on Wintermute; paired with gbrain's
existing `check-resolvable` it becomes a user-controllable equivalent of
Hermes' auto-skill-creation — you decide when and what, the tooling
keeps the checklist honest.

Shipped:
  - skills/skillify/SKILL.md — ported from ~/git/wintermute/workspace/
    skills/skillify/SKILL.md. Genericized:
      * /data/.openclaw/workspace → \${PROJECT_ROOT} (runtime-detected).
      * services/voice-agent/__tests__/ → test/ (detected from repo).
      * Manual `grep skills/... AGENTS.md` replaced with a reference to
        `gbrain check-resolvable`, which does reachability + MECE + DRY
        + gap detection properly instead of grep-matching a path string.
  - scripts/skillify-check.ts — ported from
    ~/git/wintermute/workspace/scripts/skillify-check.mjs. Preserves the
    --recent flag and --json output shape. Detects project root via
    package.json walkup; detects test dir (test/ → __tests__/ → tests/
    → spec/). Runs the 10-item checklist per target and exits non-zero
    if any required item is missing.
  - test/skillify-check.test.ts — 4 CLI tests: happy-path against
    publish.ts (known-skilled), --json shape + schema, --recent smoke,
    bogus-target exit code.
  - skills/RESOLVER.md — adds the trigger row ("Skillify this", "is
    this a skill?", "make this proper") → skills/skillify/SKILL.md.
  - skills/manifest.json — adds the skillify entry so the conformance
    test passes.

Why the pair:
  * Hermes auto-creates skills in the background. Fine until you don't
    know what the agent shipped — checklists decay silently.
  * gbrain ships the same capability as two user-controlled tools:
    /skillify builds the checklist, gbrain check-resolvable validates
    reachability + MECE + DRY across the whole skill tree.
  * Human keeps judgment. Tooling keeps the checklist honest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(v0.11.1): cron-via-minions convention, plugin-handlers guide, minions-fix, skill updates

New reference docs:
  - skills/conventions/cron-via-minions.md — the rewrite convention for
    cron manifests. Shows the Postgres (fire-and-forget + idempotency-
    key) vs PGLite (--follow inline) branch; explains why builtin-only
    auto-rewrite is safe + how host-specific handlers get the plugin
    contract.
  - docs/guides/plugin-handlers.md — the plugin contract for host-
    specific Minion handlers. Code-level registration via import +
    worker.register(), not a data file (Codex D: handlers.json was an
    RCE surface). Concrete TypeScript skeleton + handler contract
    (ctx.data, ctx.signal, ctx.inbox) + full migration flow from TODO
    JSONL to a rewritten cron entry.
  - docs/guides/minions-fix.md — user-facing troubleshooting for
    half-migrated v0.11.0 installs. Paste-one-liner for the stopgap,
    gbrain apply-migrations path for v0.11.1+, verification commands,
    failure-mode recipes.

Rewrites + updates:
  - skills/migrations/v0.11.0.md — body restored as the host-agent
    instruction manual. Audience is the host agent reading
    ~/.gbrain/migrations/pending-host-work.jsonl after the CLI
    orchestrator has done the mechanical phases. Walks each TODO type
    through the 10-item skillify checklist (plugin contract, ship
    bootstrap, unit tests, integration tests, LLM evals, resolver
    trigger, trigger eval, E2E smoke, brain filing, check-resolvable).
    Reverses the earlier "delete the body" decision (1B) because the
    body serves a different audience now — host-agent, not CLI
    documentation.
  - skills/cron-scheduler/SKILL.md — Phase 4 ("Register with host
    scheduler") now references cron-via-minions + plugin-handlers.
  - skills/maintain/SKILL.md — new "Fix a half-migrated install"
    section with the apply-migrations recipe.
  - skills/setup/SKILL.md — new Phase C.5 "One-step autopilot +
    Minions install (v0.11.1+)" explaining the four install targets
    + the OpenClaw auto-injection default.
  - docs/GBRAIN_SKILLPACK.md — Operations section adds the three new
    guides + the subagent-routing and cron-routing SKILLPACK notes
    (v0.11.0+).

All 167 related tests (conformance + resolver + skillify-check + v0_11_0
orchestrator) stay green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.11.1): stopgap script + CLAUDE.md directive + README + CHANGELOG + version bump

scripts/fix-v0.11.0.sh — the paste-command for broken-v0.11.0 installs.
Released on the v0.11.1 tag so:
  curl -fsSL https://raw.githubusercontent.com/garrytan/gbrain/v0.11.1/scripts/fix-v0.11.0.sh | bash
always works (master branch could be renamed). 8 steps: schema apply,
smoke, mode prompt (non-TTY defaults pain_triggered), atomic write of
preferences.json (0o600), append completed.jsonl with status:"partial"
and apply_migrations_pending:true so the v0.11.1 apply-migrations run
resumes correctly (does NOT poison the permanent migration path —
Codex H2 avoidance), AGENTS.md + cron/jobs.json detection with guidance
printed as text only (never auto-edits from a curl-piped script), and a
closing line telling the user to run `gbrain autopilot --install` as the
one-stop finisher.

CLAUDE.md — new "Migration is canonical, not advisory" section pinning
the design principle. Any host-repo change (AGENTS.md, cron manifests,
launchctl units) is GBrain's responsibility via the migration; the
exception is host-specific handler registration, which goes via the
code-level plugin contract in docs/guides/plugin-handlers.md.

README.md — new sections:
  - "v0.11.0 migration didn't fire on your upgrade?" with both repair
    paths (v0.11.1 binary and pre-v0.11.1 stopgap).
  - "Skillify + check-resolvable: user-controllable auto-skill-creation"
    explaining why the user-controlled pair beats Hermes-style auto
    generation. Includes the scripts/skillify-check.ts invocation.

CHANGELOG.md — v0.11.1 entry (per CLAUDE.md voice: lead with what the
user can now do that they couldn't before; frame as benefits, not files
changed). Covers: mega-bug fix + apply-migrations + postinstall +
stopgap, autopilot-supervises-worker + single-install-step + env-aware
targets, Core fn extraction so handlers don't kill workers, skillify +
check-resolvable pair, host-agnostic plugin contract replacing
handlers.json (RCE concern), gbrain init --migrate-only, TS migration
registry + H8/H9 diff-rule fixes, CLAUDE.md directive. All Codex hard
blockers (H1, H3/H4, H5, H6, H7, H8, H9, K) + architecture issues
(#1/#2/#4/#5/#7) resolved.

package.json — version bump 0.11.0 → 0.11.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): migration-flow E2E against live Postgres + Bun env quirk fix

Ships test/e2e/migration-flow.test.ts — the end-to-end integration test
for the v0.11.0 orchestrator. Spins up against a live Postgres (gated
on DATABASE_URL per CLAUDE.md lifecycle) and exercises four scenarios:

  - Fresh install: schema apply (Phase A via `gbrain init --migrate-only`)
    → smoke (Phase B) → mode resolution (C) → prefs (D) → host rewrite
    (E, empty fixture) → record (G). Asserts preferences.json exists with
    0o600, completed.jsonl has a v0.11.0 entry, autopilot install was
    skipped per --no-autopilot-install.
  - Idempotent rerun: second orchestrator invocation on a completed
    install doesn't blow up; mode stays stable.
  - Host rewrite mixed manifest: 4-entry cron/jobs.json with 2 gbrain-
    builtin handlers (sync, embed) + 2 non-builtin (ea-inbox-sweep,
    morning-briefing). Asserts builtins rewrite to `gbrain jobs submit`
    kind:shell, non-builtins are LEFT on kind:agentTurn, and 2 JSONL
    TODOs are emitted with correct shape. AGENTS.md gets the marker
    injected. Status is "partial" because pending-host-work > 0.
  - Resumable: stopgap writes a partial completed.jsonl row first;
    orchestrator re-runs successfully against it and appends a new
    post-orchestrator entry. 1 partial + 1 complete = 2 rows total.
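
The builtin/non-builtin split that the mixed-manifest scenario asserts can be sketched like this (entry shape, the builtin set, and the emitted-TODO format are illustrative, not the orchestrator's real types):

```typescript
// Sketch of the rewrite rule the E2E pins: gbrain-builtin handlers are
// rewritten to `gbrain jobs submit` shell jobs; everything else stays
// on kind:agentTurn and gets a TODO emitted for the host agent.
type CronEntry = { handler: string; kind: "agentTurn" | "shell"; command?: string };

const BUILTINS = new Set(["sync", "embed"]); // assumed builtin set

export function rewriteManifest(entries: CronEntry[]): { entries: CronEntry[]; todos: string[] } {
  const todos: string[] = [];
  const rewritten = entries.map((e) => {
    if (!BUILTINS.has(e.handler)) {
      todos.push(e.handler); // host-specific: left untouched, TODO emitted
      return e;
    }
    return { ...e, kind: "shell" as const, command: `gbrain jobs submit ${e.handler}` };
  });
  return { entries: rewritten, todos };
}
```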

Critical fix surfaced by the E2E: src/commands/migrations/v0_11_0.ts's
three execSync calls (gbrain init --migrate-only, gbrain jobs smoke,
gbrain autopilot --install) now explicitly pass `env: process.env`.
Bun's execSync default does NOT propagate post-start `process.env.PATH`
mutations to subprocesses — only the initial PATH snapshot. Without the
explicit env, any user-side env tweak (e.g. setting GBRAIN_DATABASE_URL
in a script before calling the orchestrator) would be invisible to the
orchestrator's subprocesses. This is also the reason the E2E needs a
PATH shim installed at module-load time to expose the `gbrain` command.
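
A minimal illustration of the fix, assuming a hypothetical wrapper around one of the orchestrator's subprocess calls (the command and names are examples, not the real call sites):

```typescript
// Sketch: pass the *current* process.env explicitly so mutations made
// after process start (PATH shims, GBRAIN_DATABASE_URL set by a
// wrapper script) are visible to the subprocess. Without `env`, Bun's
// execSync uses the startup snapshot, per the commit above.
import { execSync } from "node:child_process";

export function runStep(cmd: string): string {
  return execSync(cmd, { env: process.env, encoding: "utf8" });
}
```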

test/init-migrate-only.test.ts: subprocess env now strips DATABASE_URL
and GBRAIN_DATABASE_URL. The "no config" error-path tests need
loadConfig() to return null, which it won't if the env-var fallback at
src/core/config.ts:30 fires. Before this fix, running the unit tests
with DATABASE_URL set (e.g. during an E2E run) caused false failures
because `gbrain init --migrate-only` saw the env var and succeeded.

Full test totals with live Postgres: 1265 pass, 0 fail, 3497 expect
calls, 67 files, ~95s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump VERSION file to 0.11.1

Commit 5c4cf1d bumped package.json version to 0.11.1 but missed the
root VERSION file. src/version.ts reads from package.json so
`gbrain --version` prints 0.11.1 correctly, but any tool or script
that reads the VERSION file directly (like /ship's idempotency check)
saw the stale 0.11.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(v0.11.1): doctor self-heal check + skillpack-check command for cron health reports

Closes the discoverability hole from the v0.11.0 mega-bug: once a user is
on v0.11.1 (or later), every `gbrain doctor` invocation immediately
surfaces a half-migrated state, and `gbrain skillpack-check` gives host
agents (Wintermute's morning-briefing, any OpenClaw cron) a single
exit-coded JSON pipe to check from their own skills.

gbrain doctor — two new checks:
  1. Filesystem-only (fires on every `doctor` invocation, even --fast):
     if `~/.gbrain/migrations/completed.jsonl` has any status:"partial"
     entry with no matching status:"complete" for the same version, print
     `MINIONS HALF-INSTALLED (partial migration: vX.Y.Z). Run: gbrain
     apply-migrations --yes`. Typical cause is the stopgap wrote a
     partial record but nobody ran `apply-migrations` afterward.
  2. DB-path: if schema version is v7+ (Minions present) AND
     `~/.gbrain/preferences.json` is missing, print the same banner.
     Catches installs that never ran the stopgap or apply-migrations at
     all — the classic v0.11.0 "upgrade landed, migration never fired"
     state.

Both checks report status:"fail" so doctor exits non-zero when either fires.
Test `test/doctor-minions-check.test.ts` pins the five branches
(partial present → FAIL, partial+complete → quiet, no-jsonl → quiet,
multiple versions named correctly, human-readable banner contains the
exact "MINIONS HALF-INSTALLED" phrase Wintermute's cron can grep for).
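
The filesystem-only check reduces to a set comparison over completed.jsonl; a sketch under an assumed record shape (`version` and `status` fields, one JSON object per line; the real file may carry more fields):

```typescript
// Sketch of the half-migration predicate: a version is flagged when a
// status:"partial" record has no matching status:"complete" record.
type MigrationRecord = { version: string; status: string };

export function halfMigratedVersions(jsonl: string): string[] {
  const partial = new Set<string>();
  const complete = new Set<string>();
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue;
    let rec: MigrationRecord | null = null;
    try { rec = JSON.parse(line); } catch { /* skip malformed lines */ }
    if (!rec) continue;
    if (rec.status === "partial") partial.add(rec.version);
    if (rec.status === "complete") complete.add(rec.version);
  }
  return [...partial].filter((v) => !complete.has(v));
}
```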

gbrain skillpack-check — new command + skill:
  - `src/commands/skillpack-check.ts` wraps `doctor --fast --json` +
    `apply-migrations --list` into one JSON report with `{healthy,
    summary, actions[], doctor, migrations}`. Exit 0 on healthy, 1 on
    action-needed, 2 on determine-failure. `--quiet` flag for cron
    pipes that want exit-code-only behavior.
  - `actions[]` is the remediation list. Doctor messages of the form
    `... Run: <cmd>` get their command extracted (regex fixed to match
    the full remainder of the line, not just the first word). Pending
    or partial migrations push `gbrain apply-migrations --yes` to the
    front of actions[].
  - `gbrainSpawn()` helper resolves the gbrain invocation correctly on
    compiled binary installs (`argv[1] = /usr/local/bin/gbrain`) AND
    source installs (`argv[1] = src/cli.ts`, prefix with `bun run`).
    Same Codex #1 fix pattern as autopilot's resolveGbrainCliPath.
  - `skills/skillpack-check/SKILL.md` teaches agents when to run it,
    what to do with the output, and anti-patterns (don't run without
    --quiet in a cron that emails; don't ignore exit 2).
  - Registered in skills/RESOLVER.md and skills/manifest.json.
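
The command-extraction rule can be illustrated with the fixed full-remainder match (message format assumed from the doctor banner above; this is a sketch, not the command's actual source):

```typescript
// Sketch of the actions[] extraction: take everything after "Run: " to
// the end of the line, not just the first word (the bug the commit
// describes fixing).
export function extractAction(message: string): string | null {
  const m = message.match(/Run: (.+)$/);
  return m ? m[1].trim() : null;
}
```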

Test `test/skillpack-check.test.ts` (5 tests) covers healthy fresh
install, half-migrated exit-1 with apply-migrations in actions[],
--quiet suppresses stdout in both states, --help prints usage, summary
includes top action when multiple are present.

1192 unit tests pass (+15 new). The 38 failing tests are all
DATABASE_URL E2Es — same pre-existing pattern, unchanged by this
commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(v0.11.1): reframe README + minions-fix — v0.11.0 was never released

v0.11.0 was cut but never released publicly. v0.11.1 is the first
public Minions ship, and fixes the upgrade-migration mega-bug so it
self-heals on every future `gbrain upgrade` + `bun update gbrain`.
The README was wrongly framing the fix as a retrospective for v0.11.0
users — none exist, so remove it.

README changes:
  - Delete the "v0.11.0 migration didn't fire on your upgrade?" section.
    Replace with "Health check and self-heal": the `gbrain doctor`,
    `gbrain skillpack-check --quiet`, and `gbrain skillpack-check | jq`
    recipes that ship in v0.11.1. Still links to docs/guides/minions-fix.md
    for deeper troubleshooting.
  - Promote the production benchmark to top billing. The previous section
    led with the lab benchmark (same LLM, localhost) and buried the
    production data point as a single follow-up sentence. Real deployment
    numbers are the stronger signal:
      * 753ms vs >10s gateway timeout (sub-agent couldn't even spawn)
      * $0.00 vs ~$0.03 per run
      * 100% vs 0% success rate under 19-cron production load
      * 36-month tweet backfill: 19,240 tweets, ~15 min, $0.00
    Lab numbers stay (separate table, labeled "controlled environment")
    so readers can see both layers.
  - Add the "The routing rule" closer: Deterministic → Minions, Judgment
    → Sub-agents. This is the clearest framing in the production
    benchmark doc and belongs in the README so readers leave with the
    right mental model. `minion_mode: pain_triggered` automates it.

docs/guides/minions-fix.md rewrite:
  - Reframe as: v0.11.0 never released, v0.11.1 is the first ship,
    `gbrain apply-migrations --yes` is canonical. Stopgap stays
    documented for pre-v0.11.1 branch builds (e.g. Wintermute's
    minions-jobs checkout before v0.11.1 tags).
  - Add the detection + verification commands (doctor + skillpack-check)
    at the top.
  - Cross-reference skills/skillpack-check/SKILL.md as the agent-facing
    health-check pattern.

Zero lingering "v0.11.0 released" references in README or minions-fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(doctor): remove "schema v7+ no prefs → FAIL" check (too aggressive)

CI failure in Tier 1 Mechanical E2E:
  (fail) E2E: Doctor Command > gbrain doctor exits 0 on healthy DB

Root cause: the doctor half-migration detection added two checks. The
second check (`schema v7+ AND ~/.gbrain/preferences.json missing →
minions_config FAIL`) was too aggressive. It treated a valid fresh-
install state as broken.

`gbrain init` against Postgres applies schema v7 but doesn't write
preferences.json — that's the migration orchestrator's Phase D, which
only runs via `apply-migrations`. Between `init` finishing and the user
running `apply-migrations`, the install is legitimately in a
"schema-applied, no prefs" state. Doctor was exiting 1 on this valid
state, breaking the pre-existing CI test that runs init and then
doctor against a healthy DB.

Fix: drop the check. The filesystem check (step 3 — partial-completed
without a matching complete) is sufficient signal for genuine half-
migration. Added a regression test pinning the exact CI scenario: no
completed.jsonl present, no preferences.json, doctor must not fail any
minions_* check.

Also removes the now-unused `preferencesPaths` import.

Verified against live Postgres: CI-equivalent `gbrain doctor` + `gbrain
doctor --json` both pass. Full suite: 1281/1281 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(readme): Minions section — lead with the story, compress the rest

The previous section opened with "six daily pains" as a numbered list
before the hook, buried the production numbers halfway down, and had
a table explaining how each pain gets fixed. Fine for a spec doc;
wrong for a README that needs to land the impact fast.

Rewrite:
  - Lead with "your sub-agents won't drop work anymore" — the reason
    a reader is here.
  - Production numbers promoted, framed as a story: "Here's my
    personal OpenClaw deployment: one Render container, Supabase
    Postgres holding a 45,000-page brain, 19 cron jobs firing on
    schedule, the X Enterprise API on the wire..." Gives the reader
    the setup before the punchline.
  - The routing rule (deterministic → Minions, judgment → sub-agents)
    survives unchanged. It's the clearest framing in the whole section.
  - Lose the "how each pain gets fixed" table. Compress the six pains
    + their fixes into one paragraph that names the primitives by
    name (max_children, timeout_ms, child_done inbox, cascade cancel,
    idempotency keys, attachment validation). Readers who want depth
    click through to skills/minion-orchestrator/SKILL.md.
  - Close with "not incrementally better — categorically different"
    and the three headline numbers.
  - Drop the separate Lab Numbers table; the production numbers are
    stronger and the lab data is one click away via the link.

Lines: 75 → 42. Same signal, less scroll.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc: scrub X Enterprise API + @garrytan references from user-facing docs

User feedback: shouldn't name the specific enterprise-tier API product
or the account in the README or benchmark docs. Genericize:

  - "X Enterprise API on the wire" → drop entirely; the 19-cron load
    story carries the setup without naming the vendor
  - "X Enterprise API ($50K/mo firehose)" → "external API"
  - "@garrytan tweets" → "my social posts"
  - "Pull ~100 @garrytan tweets" → "Pull ~100 of my social posts"
  - "X Enterprise API (full-archive)" env var comment → "external API
    bearer token"

Scope:
  - README.md — the Minions production story line + scaling callout
  - docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md
  - docs/benchmarks/2026-04-18-tweet-ingestion.md

Plain "X API" references in the tweet-ingestion methodology stay —
those describe which public HTTP endpoint was called, not the
enterprise-tier product. Benchmark doc filenames (tweet-ingestion.md)
stay to preserve inbound links; content is genericized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doc(readme): Skillify section — match Minions energy, land the category shift

The previous section was competent but undersold what skillify actually
is. Rewrite matches the Minions section's shape: lead with the hook,
tell the story, land the punchline.

Key changes:
  - Title: "your skills tree stops being a black box." Names the thing
    skillify actually solves.
  - Open with the problem: Hermes auto-creates skills as a background
    behavior. Six months later you have an opaque pile nobody's read
    or tested. Make the liability concrete.
  - Promote the 10 items by name (SKILL.md + script + unit tests +
    integration tests + LLM evals + resolver trigger + trigger eval +
    E2E + brain filing + check-resolvable audit). Showing the list
    makes the scope of the unlock visible.
  - New subsection "Why this is the right answer for OpenClaw" names
    the debugging-the-black-box pain directly. Skillify makes the tree
    legible: when something breaks, you know which layer (contract,
    test, eval, trigger, or route) to inspect. When anything goes
    stale, check-resolvable flags it.
  - Close with "compounding quality instead of compounding entropy" +
    "not a nice-to-have. It's the piece that makes the skills tree
    survive six months."
  - Expand the code block to include `gbrain check-resolvable` (the
    other half of the pair) so readers see the whole workflow.

Length goes from 17 to 34 lines — still shorter than Minions, still
one section. Worth the space because this is a category shift for
how agent skills get built, not a feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: root <root@localhost>
garrytan added a commit that referenced this pull request Apr 24, 2026
… exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)
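
The PID-lock pattern from the first bullet, sketched standalone (the path handling and boolean return are assumptions; the real supervisor also distinguishes LOCK_HELD vs PID_UNWRITABLE exit codes):

```typescript
// Sketch: openSync with 'wx' (O_CREAT|O_EXCL) fails with EEXIST if the
// file already exists, so two supervisors cannot both win the race.
// On EEXIST, probe the recorded PID with signal 0; reclaim if stale.
import { openSync, writeSync, closeSync, readFileSync, unlinkSync } from "node:fs";

export function acquirePidLock(pidFile: string, pid = process.pid): boolean {
  try {
    const fd = openSync(pidFile, "wx"); // atomic create-or-fail
    writeSync(fd, String(pid));
    closeSync(fd);
    return true;
  } catch (err: any) {
    if (err.code !== "EEXIST") throw err;
    const holder = Number(readFileSync(pidFile, "utf8"));
    try {
      process.kill(holder, 0); // signal 0 = liveness probe only
      return false; // holder is alive; lock stays held
    } catch {
      unlinkSync(pidFile); // stale holder: reclaim and retry once
      return acquirePidLock(pidFile, pid);
    }
  }
}
```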

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)
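
For the backoff nit, a sketch of the schedule the commits describe, where crashCount = 1 yields a clean 1s delay and the old crashCount = 0 reset would have produced the accidental 500ms (formula assumed from the described behavior, not copied from supervisor.ts):

```typescript
// Sketch of the 1s-doubling-to-60s-cap backoff schedule.
// Note: crashCount = 0 would give 2 ** -1 = 0.5, i.e. the 500ms
// accident the refinement removes by resetting to 1 instead.
export function calculateBackoffMs(crashCount: number, capMs = 60_000): number {
  const base = 1_000 * 2 ** (crashCount - 1); // 1s, 2s, 4s, ...
  return Math.min(base, capMs);
}
```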

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 24, 2026
…nager (#364)

* feat: add `gbrain jobs supervisor` — self-healing worker process manager

Adds a first-class supervisor command that:
- Spawns `gbrain jobs work` as a child process
- Restarts on crash with exponential backoff (1s→60s cap)
- Resets crash counter after 5min of stable operation
- PID file locking prevents duplicate supervisors
- Periodic health checks (stalled jobs, completion gaps)
- Graceful shutdown (SIGTERM→35s→SIGKILL)

Usage:
  gbrain jobs supervisor --concurrency 4

Replaces ad-hoc nohup patterns in bootstrap scripts.
The autopilot command's internal supervisor can be migrated
to use this in a follow-up.

Tests: 7 pass (backoff calc, PID management, crash tracking)

* supervisor: atomic PID lock, queue-scoped health, env safety, unified exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: supervisor as canonical worker deployment pattern

Lane B of PR #364 review fixes. Reframes docs/guides/minions-deployment.md
around `gbrain jobs supervisor` as the default answer (blocker 7), deletes
the 68-line legacy bash watchdog (F10), and updates README + deployment
snippets to match.

docs/guides/minions-deployment.md:
- New 'Worker supervision' section at the top with the canonical 3-command
  agent pattern (start --detach / status --json / stop) and a documented
  exit-code table (0 clean, 1 max-crashes, 2 lock-held, 3 PID-unwritable).
- 'Which supervisor when?' decision table: container = supervisor as
  PID 1, Linux VM = systemd-over-supervisor, dev laptop = bare terminal.
- New 'Agent usage' section for OpenClaw / Hermes / Cursor / Codex — the
  3-turn discover-start-maintain workflow that replaces shell archaeology
  with machine-parseable JSON events + an audit file at
  ~/.gbrain/audit/supervisor-YYYY-Www.jsonl.
- Demoted the 'Option 1: watchdog cron' path entirely; replaced with a
  straightforward upgrade migration block (stop script, remove cron line,
  start supervisor, verify via doctor).
- Preconditions now check Postgres connectivity directly (supervisor is
  Postgres-only; the CLI rejects PGLite with a clear error).

Snippets:
- systemd.service: ExecStart now invokes `gbrain jobs supervisor` instead
  of raw `gbrain jobs work`. Two-layer supervision (systemd → supervisor
  → worker) buys automatic restart on reboot plus fast crash recovery.
  ReadWritePaths expanded to cover $HOME/.gbrain (supervisor PID + audit).
- Procfile + fly.toml.partial: same change — platform restarts the
  container on host events, supervisor restarts the worker on crashes.
- minion-watchdog.sh: deleted (git history retains it for anyone in an
  exotic deployment). Supervisor subsumes every capability it had plus
  atomic PID locking, structured audit events, queue-scoped health
  checks, and graceful drain on SIGTERM.

README.md:
- Added a paragraph under the Minions section naming `gbrain jobs
  supervisor` as canonical, noting the --detach / status / stop surface
  and the audit file path, with a link to the full deployment guide.
  Kept `gbrain jobs work` documented for direct raw invocation but
  flagged 'prefer supervisor' for any long-running use.

The supervisor `--help` body itself (3 examples + exit-code table in
src/commands/jobs.ts) landed with Lane A — this lane finishes the
discoverability story by making the supervisor findable via doc grep,
README landing, and deployment-guide landing paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* supervisor: daemon-manager subcommands + JSONL audit writer

Lane C of PR #364 review fixes. Adds the daemon-manager CLI surface so
agents can drive `gbrain jobs supervisor` in 3 turns instead of 10, and
the audit writer that makes lifecycle events inspectable across process
restarts. (Blocker 8, closes DX Fix A/B/C.)

New: src/core/minions/handlers/supervisor-audit.ts
  - writeSupervisorEvent(emission, supervisorPid) appends JSONL to
    `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`.
    ISO-week rotation via a `computeSupervisorAuditFilename()` helper
    that mirrors `shell-audit.ts` exactly (year-boundary ISO week math,
    Thursday anchor, etc.).
  - readSupervisorEvents({sinceMs}) returns parsed events from the
    current week's file, oldest-first, for Lane D's doctor check.
    Malformed lines are skipped silently (disk-full truncation is
    already best-effort at write time).
  - Reuses `resolveAuditDir()` from shell-audit.ts so the
    `GBRAIN_AUDIT_DIR` env var override works identically across all
    gbrain audit trails.
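
The ISO-week rotation rule (Thursday anchor, year-boundary handling) follows the standard ISO-8601 week definition; a sketch that is an assumption about, not a copy of, `computeSupervisorAuditFilename()`:

```typescript
// Sketch: ISO-8601 week number via the Thursday rule. The Thursday of
// a given week determines which year owns that week, which is what
// makes the year-boundary math come out right.
export function isoWeekFilename(d: Date): string {
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const day = t.getUTCDay() || 7; // Sunday = 7
  t.setUTCDate(t.getUTCDate() + 4 - day); // shift to this week's Thursday
  const yearStart = new Date(Date.UTC(t.getUTCFullYear(), 0, 1));
  const week = Math.ceil(((t.getTime() - yearStart.getTime()) / 86_400_000 + 1) / 7);
  return `supervisor-${t.getUTCFullYear()}-W${String(week).padStart(2, "0")}.jsonl`;
}
```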

src/commands/jobs.ts: supervisor subcommand dispatcher
  - `gbrain jobs supervisor [start] [--detach] [--json] ...` — default
    subcommand. Without --detach, runs foreground as before. With
    --detach, forks a background child (inheriting stderr so the caller
    can still tail JSONL events), writes a stdout payload:
      {"event":"started","supervisor_pid":N,"pid_file":"...","detached":true}
    and exits 0. Stdin/stdout on the detached child are /dev/null so
    the parent shell isn't held open.
  - `gbrain jobs supervisor status [--json]` — reads the PID file,
    checks liveness via `kill -0`, then reads the last 24h from the
    supervisor audit file to compute crashes_24h / last_start /
    max_crashes_exceeded. Exits 0 if running, 1 if not. JSON output
    is machine-parseable; human output is a 5-line ASCII report.
  - `gbrain jobs supervisor stop [--json]` — reads PID, sends SIGTERM,
    polls `kill -0` every 250ms for up to 40s (supervisor's own 35s
    worker-drain + 5s slack). Reports outcome: drained / timeout_40s
    / pid_file_missing / pid_file_corrupt / process_gone. Exit 0 on
    clean stop.
  - `--json` flag is already plumbed through to the supervisor opts
    from Lane A — this lane adds the onEvent audit-writer callback
    so every supervisor emission (started, worker_spawned,
    worker_exited, worker_spawn_failed, backoff, health_warn,
    health_error, max_crashes_exceeded, shutting_down, stopped) lands
    in the JSONL file with the supervisor's PID attached.

--help body updated:
  - Three separate usage lines (start / status / stop).
  - SUBCOMMANDS block with one-line summaries each.
  - EXIT CODES block (unchanged from Lane A, moved under SUBCOMMANDS).
  - EXAMPLES block updated with status --json + stop + --detach forms.

Tests: existing 127 supervisor + minions tests continue to pass.
Integration tests for the new subcommands + audit writer land with
Lane E.

Follow-up (Lane D): `gbrain doctor` will read readSupervisorEvents()
from this module to surface a `supervisor` health check alongside its
existing checks (DB connectivity, schema version, queue health).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doctor: add supervisor health check

Lane D of PR #364 review fixes. Closes the observability loop: now that
Lane C writes supervisor lifecycle events to
`${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`,
`gbrain doctor` surfaces a `supervisor` check alongside its existing
health indicators.

Implementation (src/commands/doctor.ts, filesystem-only block 3b-bis):
- Resolves DEFAULT_PID_FILE via the same three-tier logic as the start
  path (--pid-file > GBRAIN_SUPERVISOR_PID_FILE > ~/.gbrain/supervisor.pid).
- Reads the PID file + `kill -0 <pid>` for liveness.
- Calls readSupervisorEvents({sinceMs: 24h}) from the audit module to
  derive last_start / crashes_24h / max_crashes_exceeded.
- Suppresses the check entirely when the user has never invoked the
  supervisor (no PID file AND no audit events) — avoids noise on
  installs that don't use the feature.

Status thresholds:
  fail   max_crashes_exceeded event seen in last 24h
         (supervisor gave up; operator needs to restart or triage)
  warn   supervisor not running but audit shows prior use
         (unexpected stop — likely crash or manual kill)
  warn   running but > 3 crashes in last 24h
         (supervisor recovering but worker is unstable)
  ok     running + ≤ 3 crashes + no max_crashes event
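The threshold table reduces to a small pure function; a hedged sketch (type
and field names here are illustrative, not doctor.ts's actual shape):

```typescript
// Illustrative reduction of the status thresholds; names are hypothetical.
type CheckStatus = "ok" | "warn" | "fail";

function supervisorCheckStatus(s: {
  running: boolean;
  crashes24h: number;
  maxCrashesExceeded: boolean; // max_crashes_exceeded event in last 24h
  priorUse: boolean;           // audit shows the supervisor ran before
}): CheckStatus {
  if (s.maxCrashesExceeded) return "fail";          // supervisor gave up
  if (!s.running && s.priorUse) return "warn";      // unexpected stop
  if (s.running && s.crashes24h > 3) return "warn"; // recovering but unstable
  return "ok";
}
```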

All failure paths emit a paste-ready recovery command. Read/import
errors are swallowed (best-effort like the other doctor checks).

Tests: all 127 supervisor + minions tests still green; 13 existing
doctor tests unaffected.

F3 done. All four lanes A/B/C/D are now committed; Lane E (integration
tests) and Lane F (/ship v0.20.2) remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: 4 critical integration tests for supervisor lifecycle

Lane E of PR #364 review fixes (blocker 10). Fills the ~15% coverage
gap flagged in the eng review by actually exercising the code paths
that will break in production — crash-restart loop, max-crashes exit,
SIGTERM-during-backoff, env-var inheritance — via real spawn() calls
against fake shell-script workers. No mocks: real fork, real signals,
real env propagation, real audit file writes.

test/fixtures/supervisor-runner.ts (new, 55 lines):
  A standalone bun script that constructs a MinionSupervisor from env
  vars (SUP_PID_FILE / SUP_CLI_PATH / SUP_MAX_CRASHES / SUP_BACKOFF_FLOOR_MS
  / SUP_HEALTH_INTERVAL_MS / SUP_ALLOW_SHELL_JOBS / SUP_AUDIT_DIR) and
  calls start(). Mock engine returns empty rows for executeRaw (health
  check path still exercised without Postgres). Tests spawn this as a
  subprocess because MinionSupervisor.start() calls process.exit() on
  shutdown — can't run it in the test runner's own process.

test/supervisor.test.ts (existing; 91 → 300 lines):
  - Added IntegrationHarness helper: creates a unique tmpdir per test,
    a fake worker shell script, a PID-file path, and an audit-dir path;
    cleanup runs in finally.
  - spawnSupervisor() forks bun on the runner with env vars set.
  - readAudit() reads the supervisor-YYYY-Www.jsonl file via the
    existing readSupervisorEvents() helper (Lane C), threading
    GBRAIN_AUDIT_DIR through so tests don't collide on ~/.gbrain.
  - waitFor(pred, timeoutMs) polls helper for event-driven tests.
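The waitFor helper is the standard poll-until-predicate shape; a sketch of
what such a helper might look like (the suite's actual helper may differ in
detail):

```typescript
// Poll a predicate until it returns true or the deadline passes.
// Illustrative sketch; interval and deadline handling may differ in the
// real test helper.
async function waitFor(
  pred: () => boolean,
  timeoutMs: number,
  intervalMs = 25,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (pred()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return pred(); // one final check at the deadline
}
```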

Four integration tests (with _backoffFloorMs=5 for <1s suite runs):

  1. "respawns the worker after a crash and eventually exits with
     max-crashes code=1"
     Worker always `exit 1`. maxCrashes=3. Asserts: exit code 1, PID
     file cleaned up, audit contains started + 3x worker_spawned +
     3x worker_exited + max_crashes_exceeded + shutting_down + stopped,
     and the stopped event carries {reason:'max_crashes', exit_code:1}.
     Locks in blockers 1 (PID lock), 2+3+6 (health SQL doesn't 500),
     5 (unified shutdown emits right events), F8 (spawn errors counted).

  2. "receives SIGTERM while sleeping between crashes and exits 0 cleanly"
     Worker always `exit 1`, backoff floor 800ms to catch the sleep.
     Asserts: SIGTERM during backoff → exit code 0 (not 1) in <5s,
     no signal kill (process.exit via shutdown), audit contains
     shutting_down {reason:'SIGTERM'} + stopped, PID file cleaned up.
     Locks in eng Issue 1 (unified exit path), eng Issue 3 (signal
     handlers don't accumulate across shutdowns).

  3. "strips inherited GBRAIN_ALLOW_SHELL_JOBS when allowShellJobs=false,
     even if parent has it set"  ⚠ CRITICAL regression test
     Parent env has GBRAIN_ALLOW_SHELL_JOBS=1. SUP_ALLOW_SHELL_JOBS=0.
     Worker writes $GBRAIN_ALLOW_SHELL_JOBS (or 'UNSET' if absent) to
     an OUT_FILE. Asserts child sees 'UNSET'. Locks in codex #9 + eng
     #8: the `else delete env.GBRAIN_ALLOW_SHELL_JOBS` branch from
     Lane A is load-bearing for the supervisor's security posture;
     this test prevents a future refactor silently re-opening the
     inheritance hole.

  4. "DOES pass GBRAIN_ALLOW_SHELL_JOBS to child when allowShellJobs=true"
     Positive-path companion to #3. SUP_ALLOW_SHELL_JOBS=1 → worker
     sees '1'. Confirms the else-branch doesn't over-strip and that
     operators who explicitly opt in still get shell-exec enabled.

Plus two audit-format unit tests:
  - computeSupervisorAuditFilename format (regex match)
  - Year-boundary ISO week: 2027-01-01 → supervisor-2026-W53.jsonl
    (matches the shell-audit.ts pattern exactly)

Before: 7 tests covering backoff math + PID helpers (~15% behavioral
coverage per eng review).
After: 13 tests across all critical lifecycle paths (crash-restart,
max-crashes, SIGTERM, env-inheritance, audit rotation).

All 146 tests in supervisor + minions + doctor suites green in ~8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.20.2)

Lane F of PR #364 review fixes. Closes the multi-lane plan with release
hygiene: VERSION bump 0.19.0 → 0.20.2, package.json sync, CHANGELOG entry
in GStack voice with release summary + "numbers that matter" table +
"To take advantage of v0.20.2" migration block + itemized changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: escape template-literal interpolation in supervisor --help

The --help body in src/commands/jobs.ts is one big backtick template
literal. The supervisor subcommand description I added in Lane B used
both `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}` (parsed as a template
interpolation into an undefined variable) and inline `code` backticks
(parsed as nested template literals). CI caught it with ~200 tsc parse
errors across the file.

Fix:
- Escape `${...}` → `\${...}` so the audit-file path renders literally.
- Replace prose inline-code backticks with plain single-quote fences
  (`gbrain jobs work` → 'gbrain jobs work', `~/.gbrain/supervisor.pid`
  → ~/.gbrain/supervisor.pid). `--help` output is human prose; the
  single-quote form reads cleanly in a terminal without needing to
  smuggle nested backticks through a template literal.
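For illustration, the escaped form renders the shell-style default literally
instead of interpolating (the help text shown is a representative line, not
the file's exact contents):

```typescript
// Inside a backtick template literal, \${...} renders literally instead
// of being parsed as an interpolation of an undefined variable.
const helpLine = `Audit file: \${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`;
```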

`bunx tsc --noEmit` is clean. 146 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt after Lane B doc rewrite

CI drift guard caught that `llms-full.txt` didn't match the current
generator output. Root cause: the Lane B rewrite of
`docs/guides/minions-deployment.md` (supervisor as canonical, watchdog
deleted) changed content that gets inlined into `llms-full.txt`, but I
didn't run `bun run build:llms` to regenerate.

`bun test test/build-llms.test.ts` now clean (7/7 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 26, 2026
Bumped 0.22.0 → 0.26.0 to slot above master's v0.21 chain with headroom
for v0.23/0.24/0.25 to ship from master between now and merge.

Security fixes (all from CSO finding writeups):

#1 cookie-parser middleware — admin dashboard auth was silently broken.
   Express 5 has no built-in cookie parsing; req.cookies was always
   undefined, so /admin/login set the cookie but every subsequent admin
   API call returned 401. Added cookie-parser@^1.4.7 + @types/cookie-parser
   as direct + dev deps. app.use(cookieParser()) wired before CORS.

#2 + #3 TOCTOU races — exchangeAuthorizationCode and exchangeRefreshToken
   used SELECT-then-DELETE, letting concurrent requests with the same
   code/refresh both pass the SELECT before either ran DELETE, both
   issuing token pairs. Switched to atomic DELETE...RETURNING. RFC 6749
   §10.5 (codes) + §10.4 (refresh detection) violations closed. Added
   regression tests that fire 10 concurrent exchanges and assert exactly
   one wins — both pass.
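   The atomic pattern, sketched with hypothetical table and column names
   (the real schema may differ):

```sql
-- Racy: two concurrent requests can both SELECT before either DELETEs.
--   SELECT ... FROM oauth_codes WHERE code = $1;
--   DELETE FROM oauth_codes WHERE code = $1;

-- Atomic: exactly one request gets the row back; the loser sees zero
-- rows and returns invalid_grant.
DELETE FROM oauth_codes
 WHERE code = $1
RETURNING client_id, redirect_uri, scope;
```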

#5 pgArray escape + DCR redirect_uri validation — pgArray() did
   `arr.join(',')` with no escaping, so an element containing a comma
   would be parsed by Postgres as TWO array elements. With --enable-dcr
   on, this could smuggle a second redirect_uri into a registered client
   and steal auth codes. Now every element is double-quoted with `"` and
   `\` escaped. Added validateRedirectUri() per RFC 6749 §3.1.2.1:
   redirect_uris must be https:// or loopback (localhost / 127.0.0.1).
   Wired into the DCR registerClient path; CLI registration trusts the
   operator and bypasses. Regression test confirms a comma-in-URI element
   round-trips as 1 element, not 2.
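   A sketch of the escaping shape (the function in oauth-provider.ts may
   differ in detail):

```typescript
// Build a Postgres array literal with every element double-quoted and
// `\` / `"` escaped, so a comma inside an element can't split it in two.
// Sketch only; the real pgArray() may differ.
function pgArray(elements: string[]): string {
  const quoted = elements.map(
    (e) => `"${e.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`,
  );
  return `{${quoted.join(",")}}`;
}
```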

#6 --public-url flag — issuerUrl was hardcoded to http://localhost:{port}.
   Behind reverse proxies / ngrok / production deploys, the issuer claim
   in tokens wouldn't match the discovery URL clients hit (RFC 8414 §3.3).
   New --public-url URL flag on `gbrain serve --http`, propagates through
   serve.ts → serve-http.ts → ServeHttpOptions.publicUrl → issuerUrl.
   Startup banner surfaces the configured issuer.

Findings #4 (admin requests filter dead code), #7 (admin register-client
hardcoded grant_types), #8 (legacy token grandfathering posture) are
documentation / minor functional fixes and are deferred per user direction.

Tests: oauth.test.ts now 34 cases (was 27). 7 new:
- single-use TOCTOU regression (10 concurrent code exchanges)
- single-use TOCTOU regression (10 concurrent refresh exchanges)
- redirect_uri http://localhost passes
- redirect_uri https://example.com passes
- redirect_uri http://example.com (non-loopback plaintext) rejected
- redirect_uri non-URL rejected
- redirect_uri with embedded comma stored as single element

Files:
- VERSION, package.json: 0.22.0 → 0.26.0
- CHANGELOG.md: heading + table + "To take advantage" + "pre-v0.22" → v0.26;
  new "Security hardening (post-/cso pass)" subsection at top of itemized
  changes; CLI flag list updated for --public-url.
- src/core/oauth-provider.ts: pgArray escape, validateRedirectUri,
  registerClient enforces validation, DELETE...RETURNING in
  exchangeAuthorizationCode + exchangeRefreshToken.
- src/commands/serve-http.ts: cookie-parser import + wire-up,
  publicUrl option, issuerUrl honors it, startup banner shows issuer.
- src/commands/serve.ts: parses --public-url and threads through.
- src/cli.ts: help text adds --public-url URL flag.
- test/oauth.test.ts: +7 regression tests (now 34 total).
- llms-full.txt: regenerated.

Typecheck clean. 34 oauth + 14 cli tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 28, 2026
Issue #5 + D6 of the eng review: tier matching used slug.startsWith(dir),
which falsely matches 'media/xerox/foo' against 'media/x' if a user wrote
the directory without a trailing slash.

The new matcher requires the configured directory to end with `/` and
treats it as a canonical path-segment ancestor:

  media/x/   matches  media/x/tweet-1       ✓
  media/x/   doesn't  media/xerox/foo       ✗
  media/x    refused  media/x/tweet-1       (matcher requires trailing /)

Non-canonical input (no trailing slash) is refused outright. Step 7's
auto-normalizing validator converts user-written 'media/x' → 'media/x/'
on load, so the matcher never sees non-canonical input from real configs.
The behavior tested here is the strict matcher's contract.

Regression test pins the media/xerox collision case explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@momoiicom momoiicom mentioned this pull request Apr 28, 2026
1 task
garrytan added a commit that referenced this pull request Apr 30, 2026

* feat: storage tiering — git-tracked vs supabase-only directories

Brain repos are scaling to 200K+ files. Bulk data (tweets, articles, transcripts)
bloats git repos and slows operations. A new storage config in gbrain.yml lets
users declare git-tracked and supabase-only directories.

Changes:
- New config: storage.git_tracked and storage.supabase_only in gbrain.yml
- gbrain sync auto-manages .gitignore for supabase-only paths
- gbrain export --restore-only restores missing supabase-only files from DB
- New gbrain storage status command shows tier breakdown
- Config validation warns on conflicts
- 8 tests passing, full docs at docs/storage-tiering.md

Backward compatible — systems without gbrain.yml work unchanged.

* feat: add getDefaultSourcePath() typed accessor (step 1/15)

Single source of truth for "what brain repo are we operating against?"
Replaces ad-hoc raw SQL in storage.ts:38 (Issue #3 of eng review). Used by
both gbrain storage status and gbrain export --restore-only.

Returns null on miss, throws on DB error. Composes with the existing
resolveSourceId chain so it honors --source flag / GBRAIN_SOURCE env /
.gbrain-source dotfile / longest-prefix CWD match / brain-level default.

4 new test cases covering happy path, missing local_path, DB error
propagation, and CWD-prefix resolution priority.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: replace gray-matter with dedicated YAML parser (step 2/15)

The original storage-config.ts called gray-matter on a delimiter-less YAML
file. Gray-matter only parses YAML inside `---` frontmatter blocks; without
delimiters, it returns `{data: {}}`. Result: loadStorageConfig() always
returned null, the entire feature was a silent no-op for every user.

Original eng review's P0 confidence-9 finding (Issue #1).

Replaces gray-matter with a small dedicated parser for the gbrain.yml shape
(top-level `storage:` section, two array-valued nested keys). Yaml-lite was
considered first, but its flat key:value design doesn't handle nested
arrays. The dedicated parser is ~50 lines and trades expressiveness for
zero-dep, predictable parsing of a file format we control.
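As a hedged sketch of what a parser for this shape can look like (not the
actual storage-config.ts implementation), assuming a single top-level
`storage:` section whose children are keys holding `- item` lists:

```typescript
// Minimal parser for the gbrain.yml subset. Illustrative sketch only —
// the real parser in storage-config.ts may differ.
function parseStorageSection(text: string): Record<string, string[]> | null {
  const out: Record<string, string[]> = {};
  let sawStorage = false;
  let inStorage = false;
  let currentKey: string | null = null;
  for (const raw of text.split("\n")) {
    const line = raw.replace(/#.*$/, "").trimEnd(); // strip comments
    if (line.trim() === "") continue;
    if (!line.startsWith(" ")) {
      inStorage = line === "storage:"; // entering/leaving the section
      if (inStorage) sawStorage = true;
      currentKey = null;
      continue;
    }
    if (!inStorage) continue;
    const item = line.trim();
    if (item.startsWith("- ") && currentKey) {
      out[currentKey].push(item.slice(2).trim().replace(/^['"]|['"]$/g, ""));
    } else if (item.endsWith(":")) {
      currentKey = item.slice(0, -1);
      out[currentKey] = [];
    }
  }
  return sawStorage ? out : null; // null = "no storage section" (silent)
}
```

Returning null for "no storage section at all" is what lets the caller keep
the absent-config path silent while still warning on present-but-empty.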

Adds the Issue #1B sanity warning (locked B): when gbrain.yml exists but
has no storage section (or empty arrays), warn once-per-process so the
user sees their config didn't take. The single test that would have caught
the original P0 — write a real gbrain.yml, call loadStorageConfig, assert
non-null — now exists.

Also tightens loadStorageConfig per D36: distinguishes "absent" (silent
null) from "unreadable" (throws). The previous code silently swallowed
read errors, hiding broken installs.

8 new test cases: real-disk happy path, comments + blank lines, quoted
values, missing storage section warning, empty section warning,
once-per-process warning suppression, unreadable file behavior, and the
existing helper tests (validation, tier matching, edge cases) all still
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: rename storage keys to db_tracked/db_only (step 3/15)

The vendor-specific names "supabase_only" and "git_tracked" hardcoded a
backend (Supabase) into the config schema. gbrain ships two engines —
PGLite and Postgres-via-Supabase. The canonical distinction is "lives in
the brain DB only" vs "lives in the brain DB and on disk under git." Both
work on either engine.

Renamed throughout (Issue #4 of eng review):
  git_tracked    → db_tracked
  supabase_only  → db_only
  isGitTracked() → isDbTracked()
  isSupabaseOnly() → isDbOnly()
  StorageTier 'git_tracked'/'supabase_only' → 'db_tracked'/'db_only'

Backward compatibility (D3 lock):
  loadStorageConfig accepts both shapes. Loader resolution order per the
  eng-review pass-2 finding: parse YAML → if canonical keys present use
  them, else if deprecated keys present map to canonical AND emit
  once-per-process deprecation warning → THEN run validation.
  Validation always sees the canonical shape so error messages reference
  db_tracked/db_only regardless of which keys the user wrote.

  The deprecation warning suggests `gbrain doctor --fix` for an automated
  rename (D72 — fix path lands in step 7).

  When both shapes coexist in one file, canonical wins and a stronger
  warning fires ("deprecated keys ignored — remove them").

Aliases isGitTracked/isSupabaseOnly kept for now to avoid churning the
sync.ts / export.ts / storage.ts call sites in this commit; they'll be
removed in a follow-up step. Storage.ts's tier-bucket initializers and
output strings updated. ASCII output replaces unicode box-drawing per D10.

gbrain.yml example file updated to canonical keys with explanatory
comments.

2 new test cases: deprecated-key fallback (asserts both shapes load
correctly with warning), canonical-wins-over-deprecated (asserts the
"both shapes coexist" path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add slugPrefix to PageFilters with engine-side filter (step 4/15)

Issue #13 of the eng review: storage.ts and export.ts loaded every page
in the brain (limit: 1_000_000) to check tier membership. On the 200K-page
brains this feature targets, that's the wall-clock and memory landmine
the feature exists to fix.

Adds an optional `slugPrefix` field to PageFilters. Both engines implement
it as `WHERE slug LIKE prefix || '%' ESCAPE '\'`, with literal escaping of
LIKE metacharacters (%, _, \) so user-supplied prefixes like `media/x/`
are treated as exact string prefixes.

Performance: the (source_id, slug) UNIQUE constraint on the pages table
gives both engines a btree index that supports LIKE-prefix range scans.
An EXPLAIN on Postgres confirms the index range scan rather than a seq
scan. PGLite has the same index shape via pglite-schema.ts.
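The escaping step can be sketched as follows (illustrative; the engines'
actual helper may differ):

```typescript
// Escape LIKE metacharacters so a user-supplied slug prefix is treated
// as a literal string prefix. Pairs with:
//   WHERE slug LIKE <escaped> || '%' ESCAPE '\'
// Illustrative sketch; the engine implementations may differ in detail.
function escapeLikePrefix(prefix: string): string {
  return prefix.replace(/([\\%_])/g, "\\$1");
}
```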

Consumers updated:
  - export.ts: --slug-prefix flag now goes engine-side (no in-memory
    .filter(...)). The --restore-only path queries each db_only directory
    with slugPrefix in a loop instead of one full-table scan, with seen-set
    deduplication and disk-existence check inline.
  - storage.ts: keeps the full-scan path because storage-status needs the
    "unspecified" bucket count, which can't be computed without enumerating
    every page. Comment notes that step 5 (single-walk filesystem scan)
    will reduce per-page disk syscall cost.

2 new test cases on PGLiteEngine: slugPrefix happy path (3 tier dirs,
asserts only matching slugs return) and metacharacter escape regression
(asserts safe/ doesn't match unrelated slugs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf: single-walk filesystem scan via walkBrainRepo() (step 5/15)

Issue #14 of the eng review: storage.ts called existsSync + statSync
per-page in a synchronous loop. On a 200K-page brain that's 400K syscalls
serialized. Wall-clock landmine.

Adds src/core/disk-walk.ts with walkBrainRepo(repoPath) — one recursive
readdirSync walk, builds a Map<slug, {size, mtimeMs}>. Storage.ts looks
up each DB page in the map (O(1)) instead of stat-checking on demand.
Slug derivation matches the pages-table convention: people/alice.md on
disk becomes people/alice as the map key.

Skipped during walk:
  - dot-directories (.git, .gbrain, .vscode, etc) — not part of the brain
    namespace
  - node_modules — guards against accidentally walking into imported repos
  - non-.md files (sidecar JSON, binaries) — tracked by the brain through
    the files table, not by slug

Reusable: future commands (gbrain doctor's storage_tiering check, the
optional autopilot tier-fix path) get the same walk for free.
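A self-contained sketch of the walk, assuming Node's fs API (the real
disk-walk.ts may differ in detail):

```typescript
import { readdirSync, statSync } from "node:fs";
import { join, relative, sep } from "node:path";

type DiskEntry = { size: number; mtimeMs: number };

// One recursive walk building Map<slug, {size, mtimeMs}> — sketch of
// the disk-walk.ts shape; the real implementation may differ.
function walkBrainRepo(repoPath: string): Map<string, DiskEntry> {
  const out = new Map<string, DiskEntry>();
  const visit = (dir: string): void => {
    for (const name of readdirSync(dir)) {
      if (name.startsWith(".") || name === "node_modules") continue; // skip dot-dirs, vendored deps
      const full = join(dir, name);
      const st = statSync(full);
      if (st.isDirectory()) {
        visit(full);
      } else if (name.endsWith(".md")) {
        // people/alice.md on disk -> map key people/alice
        const slug = relative(repoPath, full).split(sep).join("/").slice(0, -3);
        out.set(slug, { size: st.size, mtimeMs: st.mtimeMs });
      }
    }
  };
  visit(repoPath);
  return out;
}
```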

9 new test cases: empty dir, nonexistent dir, top-level files, nested
dirs, dot-dir skipping, node_modules skipping, non-.md filtering, size
capture, mtimeMs capture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: path-segment matching for tier directories (step 6/15)

Issue #5 + D6 of the eng review: tier matching used slug.startsWith(dir),
which falsely matches 'media/xerox/foo' against 'media/x' if a user wrote
the directory without a trailing slash.

The new matcher requires the configured directory to end with `/` and
treats it as a canonical path-segment ancestor:

  media/x/   matches  media/x/tweet-1       ✓
  media/x/   doesn't  media/xerox/foo       ✗
  media/x    refused  media/x/tweet-1       (matcher requires trailing /)

Non-canonical input (no trailing slash) is refused outright. Step 7's
auto-normalizing validator converts user-written 'media/x' → 'media/x/'
on load, so the matcher never sees non-canonical input from real configs.
The behavior tested here is the strict matcher's contract.

Regression test pins the media/xerox collision case explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: auto-normalize trailing-slash, throw on tier overlap (step 7/15)

D7+D8 of the eng review: validation was warnings-only. Users miss warnings.
Now:

  - Cosmetic: missing trailing slash auto-corrected, one-time info note
    showing what changed ("normalized 2 storage paths: 'people' →
    'people/', 'media/x' → 'media/x/'"). Once-per-process to keep noise low.

  - Semantic: same directory in both tiers throws StorageConfigError.
    Ambiguous routing — does media/ win as db_tracked or db_only? — is a
    real bug the user must fix. Caller propagates to the CLI for a clean
    exit-1 with actionable message.
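The two behaviors reduce to a small normalize-then-check pass; a hedged
sketch with illustrative names (the real validator also carries the
warnings list and once-per-process notes):

```typescript
// Illustrative sketch of the normalize + overlap check.
class StorageConfigError extends Error {}

function normalizeTiers(
  dbTracked: string[],
  dbOnly: string[],
): { dbTracked: string[]; dbOnly: string[] } {
  const norm = (d: string) => (d.endsWith("/") ? d : `${d}/`); // cosmetic fix
  const tracked = dbTracked.map(norm);
  const only = dbOnly.map(norm);
  const overlap = tracked.filter((d) => only.includes(d));
  if (overlap.length > 0) {
    // Semantic error: ambiguous routing the user must resolve.
    throw new StorageConfigError(`directory in both tiers: ${overlap.join(", ")}`);
  }
  return { dbTracked: tracked, dbOnly: only };
}
```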

loadStorageConfig now applies normalize+validate after merging deprecated
keys, so the path-segment matcher (step 6) only ever sees canonical
trailing-slash directories.

The pure validateStorageConfig kept for callers who want the warnings list
without the auto-fix side effects (gbrain doctor's reporting path).

2 new test cases: auto-normalize round-trip with warning text assertion,
overlap throws StorageConfigError.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: wire manageGitignore into runSync, only on success (step 8/15)

Issue #2 of the eng review: manageGitignore was defined and never
invoked. Docs claimed "auto-managed by gbrain" — false. Users hit a
.gitignore that never updated and committed db_only directories anyway.

Wire-up: runSync now calls manageGitignore after each successful
performSync return, in both watch and one-shot modes.

Eng review pass-2 finding #1: skip on dry_run AND blocked_by_failures
status. A sync that aborted partway has stale state; mutating .gitignore
based on a partially-loaded config invites drift. Failure-skip test
added (uses .gitignore-as-a-directory to simulate write failure;
asserts warning fired and disk wasn't corrupted).

Hardened manageGitignore itself with three additional behaviors:

  - GBRAIN_NO_GITIGNORE=1 escape hatch (D23) for shared-repo setups
    where a maintainer wants gbrain to leave .gitignore alone.

  - Submodule detection (D49). When repoPath/.git is a regular file
    (gitdir: ... pointer), the repo is a git submodule. Submodule
    .gitignore changes don't survive parent submodule updates, so we
    skip with an actionable warning ("add db_only directories to your
    parent repo's .gitignore manually").

  - Graceful failure (D9). Read errors, write errors, and
    StorageConfigError (overlap from step 7) all log a warning and
    return — sync's primary job (moving data) shouldn't die because of
    a side-effect on .gitignore.
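The submodule check itself is small; a sketch assuming Node's fs API (the
real detection in sync may differ):

```typescript
import { statSync } from "node:fs";
import { join } from "node:path";

// A git submodule's .git is a regular file holding a "gitdir: ..."
// pointer, while a normal repo's .git is a directory. Illustrative
// sketch of the check.
function isSubmodule(repoPath: string): boolean {
  try {
    return statSync(join(repoPath, ".git")).isFile();
  } catch {
    return false; // no .git at all: not a repo, nothing to manage
  }
}
```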

manageGitignore is now exported (previously private) so the
storage-sync test file can hit it directly without spinning up sync.

9 new test cases: no-op without gbrain.yml, no-op with empty db_only,
happy-path append, idempotency (run twice, single entry), preservation
of user-written rules, GBRAIN_NO_GITIGNORE skip, submodule skip,
.git-directory normal path, write-failure graceful warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: D5 resolution chain for --restore-only and storage status (step 9/15)

D5 of the eng review: gbrain export --restore-only without --repo
silently fell through to the regular export path, dumping every page in
the database to the wrong directory. Hard regression risk.

Now exits 1 with an actionable message when --restore-only has no
--repo AND no configured default source. Resolution order:
  1. Explicit --repo flag
  2. Typed sources.getDefault() (reuses step 1's accessor)
  3. Hard error — never fall through to cwd
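The chain above, as a sketch (names illustrative; the real code threads
through sources.getDefault() and CLI error handling):

```typescript
// Illustrative shape of the --restore-only repo resolution.
function resolveRestoreRepo(
  repoFlag: string | undefined,
  defaultSourcePath: string | null,
): string {
  if (repoFlag) return repoFlag;                   // 1. explicit --repo
  if (defaultSourcePath) return defaultSourcePath; // 2. configured default source
  // 3. hard error — never fall through to cwd
  throw new Error("--restore-only needs --repo or a configured default source");
}
```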

storage.ts:38 also bypassed BrainEngine with raw SQL and a bare
try/catch (Issue #3 + Issue #9). Replaced with the same typed
getDefaultSourcePath() — single source of truth, errors propagate
cleanly to the user, no silent cwd fallback.

Regular export (no --restore-only) keeps its current behavior per D26:
exports include everything, --repo is optional.

4 new test cases on PGLite in-memory:
  - hard-errors with no --repo + no default
  - explicit --repo wins
  - falls back to sources default local_path
  - non-restore export does not require --repo

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: split storage.ts into pure data + JSON + human formatters (step 10/15)

Issue #10 of the eng review: getStorageStatus and runStorageStatus mixed
data gathering, JSON serialization, and human-readable output in one
function. Hard to test, hard to reuse, mismatched the orphans.ts pattern
that CLAUDE.md cites as the precedent.

Now three pure functions + a thin dispatcher:

  getStorageStatus(engine, repoPath) — async, returns StorageStatusResult.
    Side effects: engine.listPages + one walkBrainRepo (Issue #14).
    Exported so MCP exposure (D14) and gbrain doctor (D13) can consume the
    same data without re-running the loop.

  formatStorageStatusJson(result) — pure, returns indented JSON. Stable
    contract on the StorageStatusResult shape, suitable for orchestrators.

  formatStorageStatusHuman(result) — pure, returns ASCII text (D10 — no
    unicode box-drawing). Composable into other commands later.

  runStorageStatus(engine, args) — thin dispatcher: parses --repo /
    --json, calls getStorageStatus, picks a formatter, prints.

8 new test cases on the formatters: JSON parse round-trip, null-config
fallback, missing-files capped at 10 with rollup, ASCII-only assertion
(D10 regression guard), warnings inline, configuration listing,
disk-usage block omitted when zero bytes.

The StorageStatusResult interface is now exported as a public type, so
gbrain doctor's storage_tiering check can build its own findings from
the same shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* types: distinct PageCountsByTier and DiskUsageByTier (step 11/15)

Issue #11 of the eng review: pagesByTier (page counts) and
diskUsageByTier (byte totals) shared the same structural type
(Record<StorageTier, number>). Both are tier-keyed numeric maps but
carry semantically different units. A future bug that swaps them at a
call site (e.g., displaying disk bytes where the count belongs) wouldn't
trip the compiler.

Replaced with distinct nominal types via a brand field. Structurally
identical at runtime (no overhead) but compile-time disjoint —
TypeScript catches accidental cross-assignment.

  PageCountsByTier   { db_tracked, db_only, unspecified } : numbers (count)
  DiskUsageByTier    { db_tracked, db_only, unspecified } : numbers (bytes)
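One common way to get nominal disjointness in TypeScript is a phantom brand
field; a sketch (the repo's exact brand spelling may differ):

```typescript
// Phantom-brand pattern: structurally identical at runtime, disjoint at
// compile time. The repo's exact brand spelling may differ.
type Brand<T, B extends string> = T & { readonly __brand?: B };
type TierMap = { db_tracked: number; db_only: number; unspecified: number };

type PageCountsByTier = Brand<TierMap, "page-counts">;
type DiskUsageByTier = Brand<TierMap, "disk-bytes">;

const counts: PageCountsByTier = { db_tracked: 3, db_only: 2, unspecified: 1 };
const bytes: DiskUsageByTier = { db_tracked: 4096, db_only: 1_048_576, unspecified: 0 };

// const mixed: DiskUsageByTier = counts; // compile error: brands don't match
```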

Both initialized in getStorageStatus, both threaded into
StorageStatusResult, both consumed by formatStorageStatusHuman /
formatStorageStatusJson without further changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: PGLite soft-warn + full lifecycle test (step 12/15)

D4: storage tiering on PGLite is a partial feature. The "DB" the pages
live in IS the local file gbrain uses for everything else, so "db_only"
has no real offload effect. The .gitignore management still helps
(keeps bulk content out of git history), so we warn and proceed —
not refuse.

Two warning sites (once-per-process each via module-local flags):
  - storage status: warns at runStorageStatus entry
  - sync: warns inside manageGitignore when engineKind='pglite' and
    config has db_only entries

Both phrased actionably ("To get full tiering, migrate to Postgres
with `gbrain migrate --to supabase`").

manageGitignore signature now takes an optional `engineKind` param.
runSync passes engine.kind. Stand-alone callers (tests, future
gbrain doctor --fix path) can omit it.

New test: test/storage-pglite.test.ts — D8 + D4 lifecycle. 6 cases:
engine.kind assertion, getStorageStatus loading gbrain.yml + reporting
tier counts, manageGitignore PGLite-warn (once per process), Postgres
no-warn, slugPrefix on PGLite, end-to-end (config + putPage + status
+ gitignore).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: add trailing-newline CI guard (step 14/15)

Issue #7 of the eng review: all four new files in the original
storage-tiering branch lacked POSIX trailing newlines. Linters complain,
git diffs phantom-flag every future edit. We've been adding newlines as
each file landed; this commit catches the regression class.

scripts/check-trailing-newline.sh:
  - sibling to check-jsonb-pattern.sh / check-progress-to-stdout.sh per
    CLAUDE.md's CI guard pattern
  - portable to bash 3.2 (macOS default; no mapfile, no associative arrays)
  - covers src/**, test/**, gbrain.yml, top-level *.md
  - reports each missing file by path and exits 1

Wired into `bun run test` between progress-to-stdout and typecheck.
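The core byte check is the interesting part; a bash-3.2-portable sketch
(the actual script adds file discovery and per-path reporting):

```shell
#!/usr/bin/env bash
# Core predicate of a trailing-newline guard, portable to bash 3.2
# (no mapfile, no associative arrays). Illustrative sketch; the real
# check-trailing-newline.sh handles file lists and exit-1 reporting.
has_trailing_newline() {
  # Empty files pass; otherwise the last byte must be 0x0a.
  [ ! -s "$1" ] || [ "$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')" = "0a" ]
}
```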

Also fixed docs/storage-tiering.md (pre-existing missing newline from
the original branch — caught by the new guard on first run).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: v0.23.0 — VERSION, CHANGELOG, README, CLAUDE.md, storage-tiering.md (step 15/15)

VERSION → 0.23.0 (minor bump for new feature surface).

CHANGELOG entry in Garry voice with the canonical format:
  - Two-line bold headline ("Storage tiering, finally working...")
  - Lead paragraph naming what was broken before and what users get now
  - "Numbers that matter" before/after table for the 6 things that
    actually changed
  - "What this means for your brain" closer
  - "To take advantage of v0.23.0" self-repair block (per CLAUDE.md
    convention) — 6 numbered steps users can follow
  - Itemized changes split into critical fixes / new+renamed surface /
    architecture cleanup / tests + CI guards

CLAUDE.md "Key files" gains four new entries: storage-config.ts,
disk-walk.ts, the v0.23.0 storage.ts shape, and gbrain.yml itself.

README.md gains a new "Storage tiering" section between Skillify and
Getting Data In with the canonical example + commands + link to the
full guide.

docs/storage-tiering.md rewritten end-to-end with canonical key names
(db_tracked / db_only), v0.23.0 hardening details (idempotency,
submodule detection, GBRAIN_NO_GITIGNORE, dry-run gating), the
resolution chain for --restore-only, the auto-normalize +
throw-on-overlap validator, and the PGLite engine note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: e2e Postgres lifecycle for storage tiering (step 16/16)

Per the v0.23.0 plan: full lifecycle E2E against real Postgres.

  - engine.kind === 'postgres' assertion
  - Full lifecycle: write 4 pages (1 db_tracked, 2 db_only, 1 unspecified)
    → getStorageStatus reports correct tier counts → human formatter
    renders → manageGitignore writes managed block → idempotency check
    → getDefaultSourcePath() resolves the configured local_path.
  - Container restart simulation: 2 db_only pages in DB, files missing
    on disk → status.missingFiles.length === 2 → slugPrefix engine
    filter on Postgres returns exactly the tier slugs.
  - slugPrefix index-based range scan regression: 50 media/x/* + 50
    people/p-* pages → slugPrefix='media/x/' returns exactly 50.
  - getDefaultSourcePath returns null when default source has no
    local_path (the hard-error path that replaces the original silent
    cwd fallback).
  - manageGitignore on Postgres engine does NOT emit the PGLite
    soft-warn (cross-engine assertion).

Skips gracefully when DATABASE_URL is unset, per CLAUDE.md E2E pattern.
Run via: DATABASE_URL=... bun test test/e2e/storage-tiering.test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: rebump version 0.23.0 → 0.22.9

Reverts the minor bump back to a patch-style version on the v0.22 line.
Storage tiering ships within the v0.22.x train alongside the recent
fix waves. Updates VERSION, package.json, CHANGELOG header + body refs,
CLAUDE.md Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* chore: bump version 0.22.9 → 0.22.11

Sibling workspaces claimed v0.22.10 in the queue. This branch advances
to v0.22.11 to keep the version monotonic on master.

Updates VERSION, package.json, CHANGELOG header + body refs, CLAUDE.md
Key files annotations, README.md section heading, and the
docs/storage-tiering.md backward-compat note.

* fix: address Codex pre-landing review findings (4 fixes)

Codex found 4 real issues during pre-landing review of v0.22.11 diff:

[P0] export --restore-only fell through to full export when
storageConfig was null (no gbrain.yml present). On older or
misconfigured brains, the recovery command would silently dump the
entire database. src/commands/export.ts now refuses with an actionable
error before any page query fires — matches the D5 lock spirit
("never silently fall through").
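
The shape of that refusal, as a hedged sketch (the helper name is hypothetical; the actual check lives in src/commands/export.ts):

```typescript
// Illustrative only: refuse --restore-only when no storage-tiering config
// exists, instead of silently falling through to a full database export.
function assertRestoreOnlyAllowed(storageConfig: object | null): void {
  if (storageConfig === null) {
    throw new Error(
      "export --restore-only requires a storage-tiering config (gbrain.yml); " +
        "refusing to fall through to a full export",
    );
  }
}
```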

[P1] manageGitignore wire-up only fired when --repo was passed
explicitly. performSync resolves the repo from sync.repo_path or
sources.local_path, so the common `gbrain sync` path (after
setup, no flag) never updated .gitignore. src/commands/sync.ts now
uses the same source-resolver chain as the rest of /ship: opts.repoPath
→ getDefaultSourcePath → null. Fires in both watch and one-shot modes.

[P2] getDefaultSourcePath only consulted sources.local_path, missing
the legacy global sync.repo_path config key that pre-v0.18 brains use.
Added a fallback to engine.getConfig('sync.repo_path') when the
sources row has NULL local_path. Pre-v0.18 brains now work without
forcing a `gbrain sources add . --path .` migration.
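
The described fallback chain, sketched against an assumed engine shape (not the real BrainEngine interface):

```typescript
// Sketch: prefer the sources row's local_path, fall back to the legacy
// global sync.repo_path config key that pre-v0.18 brains use.
interface EngineLike {
  getDefaultSource(): Promise<{ localPath: string | null } | null>;
  getConfig(key: string): Promise<string | null>;
}

async function getDefaultSourcePathSketch(engine: EngineLike): Promise<string | null> {
  const src = await engine.getDefaultSource();
  if (src?.localPath) return src.localPath;
  // Pre-v0.18 brains: legacy global key instead of a sources-row path.
  return engine.getConfig("sync.repo_path");
}
```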

[P2] sync --all multi-source loop never called manageGitignore even
though src.local_path was already known. Each source now gets its own
gitignore update on successful sync.

Tests:
  - test/storage-export.test.ts: replaced the old "falls through to
    full export" test with one that asserts the new refusal path
    (storage-tiering config required for --restore-only).
  - test/source-resolver.test.ts: added a fallback test exercising the
    legacy sync.repo_path code path for pre-v0.18 brains.
  - All 78 storage-tiering tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms.txt + llms-full.txt for v0.22.11

Per CLAUDE.md: "Run `bun run build:llms` after adding a new doc."
The README's new Storage tiering section + the rewritten
docs/storage-tiering.md changed the inlined bundle. test/build-llms.test.ts
catches the drift and was failing on master pre-regen.

* fix: typecheck error in disk-walk.ts (CI #73350475897)

tsc --noEmit failed in CI because ReturnType<typeof readdirSync> with
withFileTypes:true picks an overload union that includes
Dirent<Buffer<ArrayBufferLike>>. Strict tsc treats entry.name as Buffer,
so .startsWith / .endsWith / string comparisons all blew up.

Annotate the variable as Dirent[] (string-based) and cast through unknown,
matching the pattern sync.ts already uses for its own filesystem walk.
Same runtime behavior; clean typecheck.
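
The shape of the fix, as a sketch (the path and filter are illustrative, not the disk-walk.ts code):

```typescript
// Force the string-based Dirent overload so entry.name is a string,
// not Buffer, under strict tsc. Cast through unknown as described.
import { readdirSync, type Dirent } from "node:fs";

const entries = readdirSync(".", { withFileTypes: true }) as unknown as Dirent[];
const visible = entries.filter((e) => !e.name.startsWith(".")).map((e) => e.name);
```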

Tests still 9/9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 30, 2026
… (v0.23.0) (#462)

* feat: dream_verdicts schema + engine methods

Adds the v25 schema migration creating the dream_verdicts table
(file_path, content_hash, worth_processing, reasons, judged_at;
PRIMARY KEY (file_path, content_hash); RLS-enabled when running as
a BYPASSRLS role).

Distinct from raw_data (which is page-scoped) — transcripts being
judged for synthesis aren't pages. The (file_path, content_hash)
key means edited transcripts re-judge automatically.

BrainEngine gains:
- DreamVerdict + DreamVerdictInput types
- getDreamVerdict(filePath, contentHash) → DreamVerdict | null
- putDreamVerdict(filePath, contentHash, verdict) — ON CONFLICT upsert

Both engines implement (postgres-engine.ts, pglite-engine.ts).

This commit alone is functionally inert — nothing reads/writes the
table yet. The synthesize phase (later commit) is the consumer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: trusted-workspace allow-list for subagent put_page

Adds OperationContext.allowedSlugPrefixes — when set, put_page
enforces slug membership in the allow-list instead of the legacy
wiki/agents/<id>/... namespace. The trust signal is the SUBMITTER
(PROTECTED_JOB_NAMES gates subagent submission so MCP can't reach
this field), not the runtime ctx.remote flag — every subagent tool
call has remote=true for auto-link safety, so basing trust on
remote is incoherent.

matchesSlugAllowList(slug, prefixes) helper supports glob suffix
'/*' (recursive — wiki/originals/* matches wiki/originals/ideas/foo/bar)
and exact match for unsuffixed entries.

put_page check shape:
  if (viaSubagent && allowedSlugPrefixes set) → allow-list check
  else if (viaSubagent) → existing namespace check (regression guard)
  else → no check (regular CLI)
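
The allow-list matcher described above can be sketched in a few lines (the real helper may differ in detail):

```typescript
// '/*' entries match recursively under the prefix; unsuffixed entries
// must match the slug exactly.
function matchesSlugAllowListSketch(slug: string, prefixes: string[]): boolean {
  return prefixes.some((p) =>
    p.endsWith("/*") ? slug.startsWith(p.slice(0, -1)) : slug === p,
  );
}
```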

Auto-link is re-enabled for the trusted-workspace path so the cycle's
extract phase doesn't have to recompute every edge after synthesize
writes. Untrusted remote writes still skip auto-link as before.

SubagentHandlerData.allowed_slug_prefixes is the wire field; the
synthesize/patterns phases (later commit) populate it from a single
source of truth in skills/_brain-filing-rules.json's
dream_synthesize_paths.globs array. The model's tool schema description
mirrors the allow-list so it writes correct slugs on the first try.

IRON RULE security tests:
- test/operations-allow-list.test.ts: allow-list ALLOW/REJECT, glob
  semantics, regression guard for the v0.15 namespace fallback when
  allow-list is unset, FAIL-CLOSED when subagentId is missing.
- test/e2e/dream-allow-list-pglite.test.ts: end-to-end on PGLite,
  poisoned-transcript style write outside allow-list → REJECTED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: cycle scaffolding — 8-phase order + transcript discovery

Extends ALL_PHASES from 6 → 8: synthesize between sync and extract,
patterns between extract and embed. Codex finding #7: patterns MUST
run after extract because subagent put_page sets ctx.remote=true and
skips auto-link/timeline by default — extract is the canonical edge
materialization step. Without that ordering, patterns reads stale
graph state.

Final order:
  lint → backlinks → sync → synthesize → extract → patterns → embed → orphans

CycleOpts gains:
- yieldDuringPhase callback — generic in-phase keepalive for long
  waits (synthesize fan-out, patterns roll-up). Renews cycle-lock TTL
  + worker job lock. Mirrors yieldBetweenPhases shape.
- synthInputFile / synthDate / synthFrom / synthTo — forwarded to
  runPhaseSynthesize for the CLI's --input/--date/--from/--to flags.

CycleReport.totals additively grows (no schema_version bump):
  transcripts_processed, synth_pages_written, patterns_written.

src/core/cycle/transcript-discovery.ts is a pure filesystem walk:
- .txt files only, sorted by path for determinism
- date-prefixed basename filter (--date / --from / --to)
- min_chars filter (default 2000)
- exclude_patterns auto-wraps bare words as \b<word>\b regex (Q-3),
  power users may pass full regex with anchors
- compileExcludePatterns is exported for unit tests
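
The Q-3 wrapping behavior, sketched. The bare-word heuristic here is an assumption; only the \b-wrapping rule comes from the commit text:

```typescript
// Bare words become \bword\b regexes; anything containing regex
// metacharacters (e.g. anchors) passes through for power users.
function compileExcludePatternsSketch(patterns: string[]): RegExp[] {
  return patterns.map((p) =>
    /^[A-Za-z0-9_]+$/.test(p) ? new RegExp(`\\b${p}\\b`, "i") : new RegExp(p, "i"),
  );
}
```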

Phase implementations land in the next commit; this one only adds
the dispatcher slots so commit-by-commit bisect doesn't crash on
import-not-found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: synthesize + patterns phases — gbrain dream actually dreams

Synthesize phase (src/core/cycle/synthesize.ts) reads conversation
transcripts from dream.synthesize.session_corpus_dir and writes
brain-native pages: reflections to wiki/personal/reflections/...,
originals to wiki/originals/ideas/..., timeline entries on existing
people pages.

Pipeline:
  1. discoverTranscripts (filesystem walk + filters)
  2. cooldown check via dream.synthesize.last_completion_ts config
     (default 12h; bypassed by --input/--date/--from/--to)
  3. cheap Haiku verdict per transcript, cached in dream_verdicts
     table keyed by (file_path, content_hash) — backfill re-runs
     skip already-judged transcripts at zero cost
  4. fan-out: one Sonnet subagent per worth-processing transcript
     dispatched with allowed_slug_prefixes (read from
     skills/_brain-filing-rules.json's dream_synthesize_paths.globs)
     and idempotency_key dream:synth:<file_path>:<content_hash>
  5. wait via waitForCompletion; yieldDuringPhase ticks every child
     terminal so the cycle-lock TTL refreshes on long backfills
  6. collect slugs from subagent_tool_executions for each child
     (codex finding #2: NOT pages.updated_at, which would pick up
     unrelated writes)
  7. orchestrator dual-write — query each new page from DB,
     reverse-render via serializeMarkdown, write file to brain_dir.
     Subagent never gets fs-write access.
  8. deterministic summary index page at dream-cycle-summaries/<date>
     (codex finding #4: slug shape is regex-compatible — no
     underscores, no .md extension)
  9. write completion timestamp ONLY on successful runs
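
The per-transcript idempotency key from step 4 can be sketched as follows; the key shape is from this message, but the hash algorithm is an assumption:

```typescript
// One key per (file, content) pair: an edited transcript hashes
// differently and so re-dispatches, while re-runs of the same content
// dedupe against the earlier subagent.
import { createHash } from "node:crypto";

function synthIdempotencyKey(filePath: string, content: string): string {
  const contentHash = createHash("sha256").update(content).digest("hex");
  return `dream:synth:${filePath}:${contentHash}`;
}
```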

Patterns phase (src/core/cycle/patterns.ts) runs after extract so
the graph state is fresh. Single Sonnet subagent gathers reflections
within dream.patterns.lookback_days (default 30); names a pattern
only when ≥dream.patterns.min_evidence (default 3) reflections
support it. Same allow-list path as synthesize.

CLI flags on `gbrain dream` (src/commands/dream.ts):
  --input <file>      ad-hoc transcript synthesis (implies
                      --phase synthesize; bypasses cooldown)
  --date YYYY-MM-DD   restrict synthesize to one date
  --from <d> --to <d> backfill range
  --dry-run           runs Haiku verdict (cached), skips Sonnet
                      synthesis. NOT zero LLM calls (codex #8).

Conflict detection: --input + --date/--from/--to exits 2.
ISO 8601 date format validated; range start > end exits 2.
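
Those validation rules, sketched (option and function names hypothetical; the CLI exits 2 when a message comes back):

```typescript
// Returns an error string on invalid flag combinations, null when valid.
function validateDreamFlags(opts: {
  input?: string; date?: string; from?: string; to?: string;
}): string | null {
  const ISO = /^\d{4}-\d{2}-\d{2}$/;
  if (opts.input && (opts.date || opts.from || opts.to)) {
    return "--input conflicts with --date/--from/--to";
  }
  for (const d of [opts.date, opts.from, opts.to]) {
    if (d && !ISO.test(d)) return `invalid date (expected YYYY-MM-DD): ${d}`;
  }
  // ISO dates compare correctly as strings.
  if (opts.from && opts.to && opts.from > opts.to) {
    return "range start is after range end";
  }
  return null;
}
```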

Auto-commit / push deferred to v1.1 (codex finding #5). v1 writes
files to brain_dir; user or autopilot handles git.

Tests:
- test/cycle-patterns.test.ts: structural assertions on the patterns
  phase (queue + waitForCompletion wired, allow-list threading,
  subagent_tool_executions provenance, no raw_data dependency).
- test/dream-cli-flags.test.ts: argv parsing, conflict detection,
  ISO date validation, --input implies --phase synthesize, dry-run
  semantics doc string.
- test/e2e/dream-synthesize-pglite.test.ts: 8 cases on PGLite
  in-memory exercising not_configured, empty corpus, no API key
  skip path, dry-run, cooldown active vs --input bypass, and the
  dream_verdicts cache hit path. Per-test rig isolation (each
  test creates and tears down its own engine) avoids
  cross-test PGLite WASM contention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: dream cycle v0.27.0 — skills, CLAUDE.md, migration, changelog

- skills/maintain/SKILL.md: synthesize + patterns phases documented
  with quality bar (Iron Law for synthesis), trust boundary, idempotency,
  cooldown semantics, CLI invocation patterns. New triggers added so
  "process today's session" / "synthesize my conversations" route here.
- skills/RESOLVER.md: dream cycle triggers route to maintain.
- skills/_brain-filing-rules.md: directory table for the five output
  types (reflections, originals, patterns, people enrichment, cycle
  summary) with slug shape per row; Iron Law repeated.
- skills/migrations/v0.27.0.md: agent-readable migration narrative.
  Schema migration v25 runs automatically on `gbrain apply-migrations`;
  synthesize ships disabled by default — opt-in via
  dream.synthesize.session_corpus_dir + dream.synthesize.enabled.
- CLAUDE.md: file inventory updated with new files (cycle/synthesize.ts,
  cycle/patterns.ts, cycle/transcript-discovery.ts), the 8-phase
  ordering, the trusted-workspace allow-list trust model, and the v25
  schema migration line in the migrate.ts entry.
- VERSION: 0.20.4 → 0.27.0
- CHANGELOG.md: v0.27.0 release-summary section per CLAUDE.md voice
  rules (numbers that matter table, what-this-means closer, "to take
  advantage of" block), followed by the itemized changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: add patterns E2E + 8-phase cycle E2E + bump synth-cooldown timeouts

Two new E2E test files on PGLite (no DATABASE_URL or API key required):

- test/e2e/dream-patterns-pglite.test.ts (6 cases) — exercises
  runPhasePatterns skip paths against a real engine: disabled,
  default-enabled-but-insufficient-evidence, no-API-key, dry-run.
  Sibling of dream-synthesize-pglite.test.ts; same per-test rig
  pattern for engine isolation.

- test/e2e/dream-cycle-eight-phase-pglite.test.ts (5 cases) —
  end-to-end runCycle with the v0.27 8-phase order. Asserts:
  ALL_PHASES is the documented 8 phases in the right sequence,
  the dry-run report's phases array preserves that order,
  CycleReport.totals carries the new transcripts_processed /
  synth_pages_written / patterns_written fields, --phase synthesize
  and --phase patterns each run only that phase, and synthInputFile
  is plumbed correctly through runCycle to runPhaseSynthesize.

Bump per-test timeout to 30s on the two synthesize-cooldown E2E
tests that create two PGLite engines back-to-back. Default Bun 5s
budget is tight under sustained suite pressure (PGLite WASM init
costs ~1-2s per engine on macOS); each test passes alone but flakes
in the full E2E suite. The third arg `30_000` is Bun's standard
test-timeout knob.

Full E2E suite (test/e2e/) now: 86 pass / 0 fail / 258 skip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: ship-prep — typecheck fixes, llms.txt regen, 8-phase test update

- src/core/cycle/synthesize.ts + patterns.ts: PageType 'default' → 'note'
  (TS strict typecheck rejected 'default'; 'note' is a valid PageType
  for orchestrator-written summary index pages and reverse-render fallback).
- src/core/pglite-engine.ts: re-import DreamVerdict + DreamVerdictInput
  types after the master merge dropped them from the import line.
- test/e2e/dream-allow-list-pglite.test.ts: ToolCtx now requires
  remote: true literal; thread it through every put_page tool call.
- test/e2e/dream-patterns-pglite.test.ts: PageType 'default' → 'note'
  in the seedReflections helper.
- test/core/cycle.test.ts: bump expected hook-call count + phase count
  6 → 8 to match v0.27 ALL_PHASES extension.
- llms-full.txt: regenerate against the updated CHANGELOG + CLAUDE.md
  so the committed snapshot matches what the generator now produces.

Full bun test suite: 2793 pass / 0 fail / 258 skip (3051 tests, 177 files).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update README + INSTALL_FOR_AGENTS for v0.27.0 dream cycle

README: maintain skill row mentions synthesize/patterns; gbrain dream
command-reference block describes the 8-phase pipeline and the new
--input/--date/--from/--to flags.

INSTALL_FOR_AGENTS: dream cycle bullet calls out v0.27 conversation
synthesis + cross-session pattern detection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: renumber v0.27.0 → v0.23.0

Master is at v0.22.5; v0.23.0 is the next natural slot for the dream-cycle
synthesize + patterns release. Bulk rename across VERSION, package.json,
CHANGELOG, migration file, source comments, skills, and llms.txt bundles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): bump cycle.test.ts phase count 6 → 8

The dry-run full-cycle test asserted 6 phases. v0.23 added synthesize
and patterns, bringing the total to 8. The unit-side equivalent
(test/core/cycle.test.ts) was already updated; this catches the
E2E sibling that surfaced after the latest master merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 1, 2026
…rdict model tests (#527)

* v0.23.1 fix: dream self-consumption guard + configurable verdict model

Built-in isDreamOutput() guard in transcript-discovery.ts auto-skips
any transcript whose first 2000 chars contain dream output slug prefixes
(wiki/personal/reflections/, wiki/originals/ideas/, wiki/personal/patterns/,
dream-cycle-summaries/). Prevents infinite recursion if dream output is
ever fed back into the corpus.

judgeSignificance() now accepts a verdictModel parameter, loaded from
dream.synthesize.verdict_model config key. Default: claude-haiku-4-5.

3 new test cases covering the guard.

* feat(dream): replace content-prefix guard with orchestrator-stamped marker

The v0.23.1 prefix-string guard had two flaws caught by codex review.
serializeMarkdown does not embed the page slug into body content, so
the heuristic could miss real dream output. And real conversation
transcripts often cite brain slugs ("earlier I wrote about
wiki/personal/reflections/identity..."), so the heuristic dropped
legitimate transcripts silently.

Swap content inference for explicit identity. renderPageToMarkdown and
writeSummaryPage now stamp `dream_generated: true` + `dream_cycle_date`
into frontmatter at render time. Guard checks for the marker via
DREAM_OUTPUT_MARKER_RE (anchored at frontmatter open, BOM/CRLF
tolerant, scans first 2000 chars, word boundary on `true`). Cannot
drift, cannot false-positive on user text, cannot miss real output.
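
A hedged reconstruction of that guard from this description (the repo's actual DREAM_OUTPUT_MARKER_RE may differ):

```typescript
// Anchored at the frontmatter open, BOM/CRLF tolerant, case-insensitive,
// word boundary after `true`, scans only the first 2000 chars.
const DREAM_OUTPUT_MARKER_RE =
  /^\uFEFF?---\r?\n(?:[^\n]*\r?\n)*?dream_generated:\s*true\b/i;

function isDreamOutputSketch(text: string): boolean {
  return DREAM_OUTPUT_MARKER_RE.test(text.slice(0, 2000));
}
```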

Tests built from a real Page → renderPageToMarkdown → isDreamOutput
round-trip (codex finding #5 — synthetic strings don't prove the
guard catches what synthesize actually produces). Coverage: regression
fixture, false-positive prevention on user transcripts citing slugs,
CRLF+BOM, whitespace/case variants, anchor-at-byte-0, perf bound,
bypass plumbing, dream_generatedfoo word-boundary check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dream): --unsafe-bypass-dream-guard CLI flag

Explicit opt-in to disable the synthesize self-consumption guard. The
flag is intentionally NOT tied to --input — codex review caught that
implicit bypass is a footgun: any caller could synthesize a dream-
generated page directly via --input, get a cached positive verdict,
and silently re-trigger the loop bug.

Plumbing: dream.ts CLI parses the flag → DreamArgs.bypassDreamGuard →
runCycle({ synthBypassDreamGuard }) → SynthesizePhaseOpts.bypassDreamGuard
→ discoverTranscripts({ bypassGuard }) and readSingleTranscript.
Loud stderr warning at phase entry when set so the cost is visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.23.2 chore: bump version + CHANGELOG for corrected guard architecture

Replaces the v0.23.1 release notes with the v0.23.2 voice describing
the orchestrator-stamped marker approach and the --unsafe-bypass-dream-guard
flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync project docs for v0.23.2 marker-based guard

Update CLAUDE.md Key Files entries for src/core/cycle/synthesize.ts,
src/core/cycle/transcript-discovery.ts, and src/commands/dream.ts to
reflect the v0.23.2 dream_generated frontmatter marker that replaces the
v0.23.1 content-prefix self-consumption guard, plus the new
--unsafe-bypass-dream-guard CLI flag.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: regenerate llms-full.txt for v0.23.2 CLAUDE.md updates

CI's `build-llms generator > committed match generator output` guard
caught drift after the v0.23.2 doc-sync (commit 507edb1) updated three
Key Files entries in CLAUDE.md without re-running `bun run build:llms`.

The llms.txt index didn't drift (no new doc URLs); only the inlined
llms-full.txt bundle needed refreshing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): round-trip dream-recursion coverage for v0.23.2 marker guard

Three new PGLite E2E cases exercise the actual production loop scenario
end-to-end. Unit tests covered the bug class at the function-pair level
(renderPageToMarkdown → readSingleTranscript). These cover it at the
phase level: runPhaseSynthesize with a real engine, real putPage, real
renderPageToMarkdown, real corpus-dir discovery.

1. Leaked dream output is skipped on next synthesize run. The reflection
   page is inserted, reverse-rendered (which stamps `dream_generated:
   true`), dropped into the corpus dir as .txt, and the next phase run
   reports "no transcripts to process" with a stderr skip log. Verdict
   cache stays untouched so a future legit edit isn't shadowed by a
   stale cached "false".

2. bypassDreamGuard=true at phase entry re-enables ingestion. Same
   marked file gets discovered through the loud-warning path. Proves
   --unsafe-bypass-dream-guard plumbing reaches discoverTranscripts at
   phase scope.

3. Mixed corpus (leaked dream output + real conversation transcript)
   discovers exactly the real one. Pins codex finding #1's headline
   false-positive case: a transcript citing wiki/personal/reflections/
   in body must NOT be skipped.

Stderr capture via process.stderr.write spy with try/finally restore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): use valid PageType 'note' in round-trip E2E fixtures

CI typecheck caught three TS2322 violations in the round-trip E2E
fixtures: 'reflection' is not a member of PageType. Reflections are
filed as 'note' in production (renderPageToMarkdown falls back to 'note'
for unknown types).

No behavior change — the guard test still exercises the same
serializeMarkdown → discoverTranscripts loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(claude): require `bun run typecheck` before push

The pre-ship section listed `bun test` as the unit-test path but didn't
flag the trap: `bun test` (the bun runner) does NOT run TypeScript type
checking. Only `bun run test` (the npm script) does, because it chains
`bun run typecheck` + the four shell pre-checks before the runner.

CI on PR #527 caught a `'reflection'` literal that `PageType` doesn't
admit (PageType is a closed union). The runtime E2E and `bun test`
both passed locally because the runner doesn't gate on TS. The
separate typecheck stage in CI rejected it.

New rule: run `bun run typecheck` (or `bun run test`, which wraps it,
or `bun run ci:local` for the full gate) before pushing. The runner-
alone path is for hot-loop test iteration only.

Also regenerated llms-full.txt for the CLAUDE.md update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 3, 2026
…oard (#358)

* feat: OAuth 2.1 schema tables + shared token utilities

Add oauth_clients, oauth_tokens, oauth_codes tables to both PGLite and
Postgres schemas. Migration v5 creates tables for existing databases.
PGLite now includes auth infrastructure (access_tokens, mcp_request_log,
OAuth tables) because `serve --http` makes it network-accessible.

Extract hashToken() and generateToken() to src/core/utils.ts for DRY
reuse across auth.ts and oauth-provider.ts.
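
A sketch of those two helpers as described ("SHA-256 hashed before storage"); byte length and encodings are assumptions:

```typescript
import { createHash, randomBytes } from "node:crypto";

// Fresh opaque token handed to the client once.
function generateToken(bytes = 32): string {
  return randomBytes(bytes).toString("base64url");
}

// Only this digest is ever written to oauth_tokens / access_tokens.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}
```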

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GBrainOAuthProvider — MCP SDK OAuthServerProvider implementation

Implements OAuthServerProvider backed by raw SQL (PGLite or Postgres).
Supports client credentials, authorization code with PKCE, token refresh
with rotation, revocation, and legacy access_tokens fallback.

Key decisions from eng review:
- Uses raw SQL connection, not BrainEngine (OAuth is infrastructure)
- All tokens/secrets SHA-256 hashed before storage
- Legacy tokens grandfathered as read+write+admin
- sweepExpiredTokens() wrapped in try/catch (non-blocking startup)
- Client credentials: no refresh token per RFC 6749 4.4.3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: scope + localOnly annotations on all 30 operations

Add AuthInfo, scope ('read'|'write'|'admin'), and localOnly fields to
Operation interface. Per-operation audit:
- 14 read ops, 9 write ops, 2 admin ops, 4 admin+localOnly ops
- sync_brain, file_upload, file_list, file_url: admin + localOnly
- Scope enforcement happens in serve-http.ts before handler dispatch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: HTTP MCP server with OAuth 2.1 + 27 OAuth tests

gbrain serve --http starts Express 5 server with:
- MCP SDK mcpAuthRouter (authorize, token, register, revoke endpoints)
- Custom client_credentials handler (SDK doesn't support CC grant)
- Bearer auth + scope enforcement on /mcp tool calls
- Admin dashboard auth via HTTP-only cookie + bootstrap token
- SSE live activity feed at /admin/events
- DCR default OFF (--enable-dcr to enable)
- Rate limiting on /token (50/15min)
- localOnly operations excluded from HTTP

CLI: gbrain serve --http [--port 3131] [--token-ttl 3600] [--enable-dcr]

Dependencies: express@5.2.1, express-rate-limit@7.5.1, cors@2.8.6
SDK pinned to exact 1.29.0 (was ^1.0.0)

27 new tests covering OAuth provider, scope enforcement, auth code flow,
refresh rotation, token revocation, legacy fallback, and sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: React admin dashboard — 7 screens, dark theme, Krug-designed

Admin SPA at /admin with client-side routing (#login, #dashboard,
#agents, #log). Built with Vite + React, served from admin/dist/.

Screens:
- Login: one field, one button, zero happy talk
- Dashboard: metrics bar, SSE live activity feed, token health panel
- Agents: table with scopes/badges, + Register Agent button
- Register: modal form (name, scopes), 3 mindless choices
- Credentials: full-screen modal, copy buttons, download JSON, warning
- Request Log: paginated table (50/page), time-relative timestamps
- Agent Detail: slide-out drawer, config export tabs (Perplexity/Claude/JSON)

Design tokens: #0a0a0f bg, Inter + JetBrains Mono, 4-32px spacing.
Build: bun run build:admin (Vite, 65KB gzipped).
Admin API: /admin/api/register-client endpoint for dashboard registration.
SPA serving: Express static + index.html fallback for client-side routing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add admin SPA lockfile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.0.0.0)

Milestone release: multi-agent GBrain with OAuth 2.1, HTTP server,
and React admin dashboard. See CHANGELOG.md for details.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: update project documentation for v1.0.0.0

Sync README, CLAUDE.md, and docs/mcp/ with the OAuth 2.1 + HTTP server
+ admin dashboard surface that shipped in v1.0.0.0.

- README.md: new "Remote MCP with OAuth 2.1" section covering
  gbrain serve --http, admin dashboard, scoped operations, legacy
  bearer fallback; add serve --http + auth notes to the commands
  reference.
- CLAUDE.md: add src/commands/serve-http.ts, src/core/oauth-provider.ts,
  admin/ directory as key files; document scope + localOnly additions
  to Operation contract; add oauth.test.ts (27 cases) to the test list;
  add v1.0.0 key-commands section clarifying that OAuth client
  registration is via the /admin dashboard or SDK (no CLI subcommand).
- docs/mcp/DEPLOY.md: promote --http as the recommended remote path,
  add OAuth 2.1 Setup section, list ChatGPT in supported clients,
  remove the "not yet implemented" footer.
- docs/mcp/CHATGPT.md (new): unblocks the P0 TODO. Full ChatGPT
  connector setup via OAuth 2.1 + PKCE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: wire gbrain auth subcommand with OAuth register-client

Previously auth.ts was a standalone script invoked via
`bun run src/commands/auth.ts`. CHANGELOG and README documented
`gbrain auth ...` commands that didn't actually work.

- Export `runAuth(args)` from auth.ts (keeps standalone entry intact
  via `import.meta.url === file://${process.argv[1]}` check)
- Add `auth` to CLI_ONLY + dispatch in handleCliOnly
- New subcommand `gbrain auth register-client <name> [--grant-types]
  [--scopes]` wraps GBrainOAuthProvider.registerClientManual
- Lazy DB check: only subcommands that need DATABASE_URL error out

Now the documented CLI flow works end to end:
  gbrain auth register-client perplexity --grant-types client_credentials --scopes "read write"
  gbrain serve --http --port 3131

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: reflect wired gbrain auth register-client CLI

After /ship, the doc subagent wrote docs assuming `gbrain auth
register-client` did not exist (it said so explicitly in CLAUDE.md:184).
A follow-up commit (c4a86ce) wired it into src/cli.ts + src/commands/auth.ts.
These docs were now contradicting reality.

- CLAUDE.md: removed "There is no gbrain auth register-client CLI
  subcommand" claim, documented the three registration paths
  (CLI / dashboard / SDK).
- README.md: replaced `bun run src/commands/auth.ts` hint with
  `gbrain auth create|list|revoke|test` and `gbrain auth register-client`.
- docs/mcp/DEPLOY.md: added CLI registration example above the
  programmatic example.
- TODOS.md: moved "ChatGPT MCP support (OAuth 2.1)" P0 item to
  Completed with v1.0.0.0 completion note. Closes the P0 that had been
  blocking the "every AI client" promise since v0.6.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: enable RLS on OAuth tables + loosen v24-exact test assertion

CI Tier 1 (Mechanical) was failing on 4 E2E tests after the v0.18.1 RLS
hardening landed on master (PR #343). Our v25 oauth_infrastructure migration
adds 3 new public tables (oauth_clients, oauth_tokens, oauth_codes) but
didn't enable RLS, so gbrain doctor's new check flagged them and the
"RLS on every public table" assertion failed.

Fixes:
- src/schema.sql: ALTER TABLE ... ENABLE ROW LEVEL SECURITY for the 3 OAuth
  tables inside the existing BYPASSRLS-gated DO block (fresh installs).
- src/core/migrate.ts v25: append a BYPASSRLS-gated DO block after the OAuth
  CREATE TABLE statements (existing installs on upgrade). Mirrors the v24
  rls_backfill gating pattern — RAISE WARNING if the current role lacks
  BYPASSRLS, so migrations don't silently lock the operator out.
- src/core/schema-embedded.ts: regenerated via `bun run build:schema`.
- test/e2e/mechanical.test.ts: one unrelated v24 test asserted the post-
  migration version equals exactly '24'. That breaks when any later
  migration exists (like our v25). Relaxed to `>= 24` since the test's
  intent is "v24 didn't abort the chain", not "v24 is the final version".

Verified locally: 78/78 E2E tests pass against real Postgres 16 + pgvector.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: regenerate llms-full.txt for v1.0.0 docs

CI test/build-llms.test.ts > committed llms.txt + llms-full.txt match
current generator output failed. The committed llms-full.txt was built
before the v1.0.0 doc updates landed (OAuth 2.1 README section, new
docs/mcp/CHATGPT.md, CLAUDE.md serve-http references, etc.), so the
regen-drift guard flagged it.

Ran `bun run build:llms`. llms.txt is unchanged (skinny index still
matches); llms-full.txt picks up 166 net-new lines of bundled content.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* connected-gbrains PR 0 — minimal runtime (mounts, registry, aggregated RESOLVER) (#372)

* feat(mounts): connected-gbrains PR 0 foundation — registry + resolver + CLI

Lays the foundation for connected gbrains (v0.19.0) per the approved plan.
This is PR 0 — minimal runtime for direct-transport, path-mounted brains.

What this slice ships:
- src/core/brain-registry.ts — keyed BrainRegistry with lazy engine init,
  schema-validated mounts.json loader, DuplicateMountPathError (load-bearing
  identity check per Codex finding #9 correction), UnknownBrainError with
  actionable available-id list. Pure: no AsyncLocalStorage, no singleton
  mutation. ~280 LOC.

- src/core/brain-resolver.ts — 6-tier brain-id resolution mirroring
  v0.18.0's source-resolver.ts so agents learn ONE mental model:
    1. --brain <id>     2. GBRAIN_BRAIN_ID env      3. .gbrain-mount dotfile
    4. longest-path match over registered mounts    5. (reserved v2 default)
    6. 'host' fallback
  Orthogonal to --source: --brain picks which DB, --source picks the repo
  within that DB. Corruption-resistant: mounts.json load failures fall
  through to 'host' instead of breaking every CLI invocation.
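A sketch of the 6-tier priority order (assumed input shapes, not the real brain-resolver.ts API): explicit flag > env > dotfile > longest mount-path prefix > reserved > 'host':

```typescript
interface ResolveInput {
  explicit?: string;              // --brain <id>
  env?: string;                   // GBRAIN_BRAIN_ID
  dotfile?: string;               // contents of a .gbrain-mount file, if found
  cwd: string;
  mounts: Record<string, string>; // brainId -> mounted path
}

export function resolveBrainId(input: ResolveInput): string {
  if (input.explicit) return input.explicit;       // tier 1
  if (input.env) return input.env;                 // tier 2
  if (input.dotfile) return input.dotfile.trim();  // tier 3
  // Tier 4: longest-path prefix match over registered mounts. Appending a
  // separator guards against sibling-path false positives (/work/app vs /work/app2).
  let best: { id: string; len: number } | undefined;
  for (const [id, path] of Object.entries(input.mounts)) {
    const prefix = path.endsWith("/") ? path : path + "/";
    if ((input.cwd + "/").startsWith(prefix) && (!best || prefix.length > best.len)) {
      best = { id, len: prefix.length };
    }
  }
  if (best) return best.id;
  // Tier 5 reserved for a configurable default in v2.
  return "host";                                   // tier 6
}
```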

- src/commands/mounts.ts — `gbrain mounts add|list|remove` (direct transport
  only). Validates on add (path exists on disk, id regex, no dupes). WARNS
  but does not block on same db_url/db_path across ids (teams may
  legitimately alias a remote brain). Password redaction in list output.
  Atomic write via temp+rename. 0600 perms. PR 1 adds pin/sync/enable;
  PR 2 adds --mcp-url + OAuth.

- src/cli.ts — wires `gbrain mounts` into handleCliOnly (no DB required
  for the config-only subcommands).

- test/brain-registry.test.ts (28 cases): schema validation across every
  malformed-input branch, ALS-free resolution, duplicate id + path detection,
  disabled-mount exclusion, UnknownBrainError context.

- test/brain-resolver.test.ts (22 cases): priority order (explicit > env >
  dotfile > path-prefix > fallback), dotfile walk-up, malformed dotfile
  recovery, longest-prefix match, sibling-path false-positive guard,
  loader-failure defense.

- test/mounts-cli.test.ts (17 cases): parseAddArgs surface, redactUrl,
  atomic write, add/list/remove roundtrip via temp HOME.

67 new tests, all green. Typecheck clean. Depends on mcp-key-mgmt (base
branch) for the OAuth/scope annotations that PR 2 will leverage.

Next in this branch: PR 0 still needs (a) the deep host-brain-bias audit
(postgres-engine internal singleton fallback + a few operations.ts
callers), (b) OperationContext threading to make ctx.brainId populated at
dispatch, (c) composeResolvers + composeManifests, (d) aggregated
~/.gbrain/mounts-cache/ for host-agent runtime ownership.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mounts): brains-and-sources mental model + agent routing convention

Two orthogonal axes organize GBrain knowledge. Users AND agents need to
understand both, or queries misroute silently.

  --brain  → WHICH DATABASE    (host + mounts)
  --source → WHICH REPO IN DB  (v0.18.0 sources: wiki, gstack, ...)

Both axes use the same 6-tier resolution (explicit > env > dotfile >
path-prefix > default > fallback), so learning one teaches both.

Ships:

- docs/architecture/brains-and-sources.md — canonical mental model doc.
  Covers four topologies with ASCII diagrams:
    1. Single-person developer (one brain, one source)
    2. Personal brain with multiple repos (one brain, N sources)
    3. Personal + one team brain mount (2 brains)
    4. Senior user with multiple team memberships (N mounted team brains
       alongside personal) — the CEO-class topology
  Explicit "when to move each axis" decision table. Generic example names
  throughout per the project's privacy rule.

- skills/conventions/brain-routing.md — agent-facing decision table.
  Rules for when to switch brain (team-owned question, explicit name,
  data owner changes) vs switch source (working in a repo, topic scoped
  to one repo). Cross-brain federation is latent-space only in v0.19 —
  the agent fans out; the DB never does. Anti-patterns listed: silent
  brain jumps, writing to host when data is team-owned, missing brain
  prefix in citations, ignoring .gbrain-mount dotfiles.

- CLAUDE.md — adds "Two organizational axes (read this first)" section
  at the top pointing at both new docs.

- AGENTS.md — adds brains-and-sources.md + brain-routing.md to the
  "read this order" (positions 3 and 4, before RESOLVER.md).

- skills/RESOLVER.md — adds brain-routing.md to the Conventions section
  so it appears alongside quality.md, brain-first.md, subagent-routing.md.

No code changes. Pre-existing check-resolvable warnings unchanged (2
warnings on base unrelated to this work). 67 PR-0 tests still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): thread brainId through OperationContext + subagent chain

PR 0 plumbing for connected gbrains. Adds an optional brainId field that
identifies which database an operation targets and ensures subagents
inherit the parent job's brain instead of process-wide defaults. No
dispatch-path changes in this commit — that is PR 1 (registry wiring at
MCP + CLI entry points). The fields exist so callers can set them now
and downstream code respects them.

Changes:

- src/core/operations.ts: OperationContext grows `brainId?: string`.
  Optional for back-compat. 'host' is the implicit default when absent.
  Orthogonal to v0.18.0's source_id (source = which repo within the
  brain, brain = which database). See docs/architecture/brains-and-sources.md.

- src/core/minions/types.ts: SubagentHandlerData gains `brain_id?: string`.
  Parent jobs set this when submitting a child subagent to lock the
  child into a specific brain. Omitted = host (unchanged behavior).

- src/core/minions/handlers/subagent.ts: buildBrainTools call site
  reads data.brain_id and passes it through. Child subagents spawned
  from this handler will see the same brainId unless they override in
  their own data.

- src/core/minions/tools/brain-allowlist.ts: BuildBrainToolsOpts +
  OpContextDeps grow brainId; buildOpContext stamps it on every
  OperationContext the subagent builds for tool calls. Addresses Codex
  finding #6 (brain-allowlist hardwired parent config without brain
  awareness, so switching brain only in subagent.ts was not enough).

Tests: 166 affected tests green (subagent suite + minions + brain
registry + resolver). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): composeResolvers + composeManifests + aggregated cache

The runtime ownership seam for connected gbrains (Codex finding #3 from
plan review): check-resolvable.ts VALIDATES RESOLVER.md; it does not
DISPATCH skills. Host agents (Wintermute/OpenClaw/Claude Code) read
skills/RESOLVER.md directly to route user requests. Without an aggregated
resolver, mounted team brains cannot contribute skills to the host
agent's routing table.

This commit adds the aggregation:

- src/core/mounts-cache.ts (NEW): pure composeResolvers + composeManifests
  functions plus filesystem writers for ~/.gbrain/mounts-cache/. The
  aggregated files carry every host skill plus every mount skill,
  namespace-prefixed (e.g. `yc-media::ingest`). Host skills always beat
  a same-named mount skill (locked decision 1); bare-name collisions
  between two mounts surface as structured ambiguity info so doctor can
  warn (PR 1).

  Also addresses Codex finding #8: manifests compose alongside the
  resolver, else doctor conformance breaks on remote skills.

- src/commands/mounts.ts: refreshMountsCache() called on `mounts add`
  and `mounts remove` (the latter clearing the cache entirely when the
  last mount goes away). Uses findRepoRoot() to locate the host skills
  dir; skips with a stderr note when run outside a gbrain repo so the
  user isn't confused by a "cache not refreshed" error in the wrong
  cwd.

- test/mounts-cache.test.ts (NEW): 23 unit tests covering empty world,
  host-only, single mount, two-mount ambiguity, host-shadows-mount,
  disabled mount excluded, missing RESOLVER.md is a no-op, manifest
  composition with same-name collision, render shape, atomic rewrite,
  clear on missing dir.

Output format for ~/.gbrain/mounts-cache/RESOLVER.md adds a Brain column
so host agents can see which brain each trigger routes to at a glance,
plus Shadows and Ambiguous sections when those conditions exist.

Tests: 90 PR 0 tests green (brain-registry + resolver + mounts-cache +
mounts-cli). Full suite regression pending in task 11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): force instance-level pool for mount brains + CI guard

Closes the silent-singleton-share bug Codex flagged as finding #1 from
the plan review: two direct-transport mounts with different Postgres
URLs would both fall through postgres-engine.ts's `get sql()` getter to
db.getConnection() and quietly share whichever singleton connected
first. Your yc-media writes end up in garrys-list or vice versa. No
error at the call site — just wrong data.

The fix:

- src/core/brain-registry.ts: initMountBrain now passes poolSize when
  calling engine.connect(). That forces postgres-engine.ts:33-60 down
  the instance-level path (setting this._sql) instead of the module
  singleton path (calling db.connect). Hard-coded 5 for PR 0 — per-mount
  override is PR 1. PGLite ignores poolSize (no pool concept), so this
  is Postgres-specific.

  Host brain still uses the singleton path via initHostBrain (unchanged).
  That is fine for PR 0: the singleton is "the host's one connection"
  by definition. PR 1 removes the singleton entirely once every CLI
  command is engine-injectable.

- scripts/check-no-legacy-getconnection.sh (NEW): CI grep guard against
  new db.getConnection() / db.connect() calls landing in src/core/ or
  src/commands/ (the multi-brain dispatch surface). Has an explicit
  ALLOWED list grandfathering today's legitimate callers, each marked
  "PR 1 refactors" so the list shrinks over time. Skips comment lines
  so the grep doesn't trip on doc references to the old pattern.

- package.json: scripts.test chains the new guard after the existing
  check-jsonb-pattern + check-progress-to-stdout guards. `bun run test`
  now fails the build on singleton regression.

Tests: 295 affected pass (registry, resolver, mounts-cache, mounts-cli,
minions, pglite-engine). Typecheck clean. CI guard reports "ok: no new
singleton callers" on current tree.

Left for PR 1: remove the singleton fallback in postgres-engine.ts's
`get sql()` entirely; refactor src/commands/doctor.ts, files.ts,
repair-jsonb.ts, serve-http.ts, init.ts, and the 3 localOnly ops in
operations.ts (file_list, file_upload, file_url) to accept ctx.engine
explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mounts): codex review findings — namespace survives shadow + atomic tmp names + honest PR 0 docstrings

Codex outside-voice review on PR #372 found 5 issues. Real bugs fixed, overclaims
rewritten. Details:

P2 (real bug): composeResolvers and composeManifests were silently dropping
mount entries when a host skill shared the short name, which made the
namespace-qualified form `<mount>::<skill>` unreachable once host defined
the same short name. That defeated the entire namespace-disambiguation
model — if host had `ingest`, no mount could ship an `ingest` skill even
with explicit `yc-media::ingest`. Fix: always keep namespace-qualified
mount entries in the composed output. Shadow tracking moves to metadata
(`shadows[]`) that doctor can warn on, but never drops routing.

  Before:  host ingest + yc-media ingest → only 1 entry (host), yc-media::ingest unreachable
  After:   host ingest + yc-media ingest → 2 entries: bare `ingest` = host, `yc-media::ingest` = mount
  Verified live: gbrain mounts add of a mount with `ingest` now shows
  `team-demo::ingest` alongside host `ingest` in the aggregated manifest.
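A simplified sketch of the corrected compose semantics (assumed data shapes; the real code also distinguishes mount-vs-mount ambiguity from host shadowing): the bare name routes to host, but the namespace-qualified mount entry always survives, with shadowing recorded as metadata rather than dropped routing:

```typescript
interface Skill { name: string }
interface Composed {
  entries: Map<string, string>; // routing key -> owning brain id
  shadows: string[];            // mount skills shadowed on the bare name
}

export function composeResolvers(
  host: Skill[],
  mounts: Record<string, Skill[]>,
): Composed {
  const entries = new Map<string, string>();
  const shadows: string[] = [];
  for (const s of host) entries.set(s.name, "host");
  for (const [mountId, skills] of Object.entries(mounts)) {
    for (const s of skills) {
      // Always keep the namespace-qualified form reachable.
      entries.set(`${mountId}::${s.name}`, mountId);
      if (entries.has(s.name)) {
        shadows.push(`${mountId}::${s.name}`); // bare name already taken; host wins
      } else {
        entries.set(s.name, mountId); // unambiguous bare name routes to the mount
      }
    }
  }
  return { entries, shadows };
}
```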

P1 (real bug): writeMountsFile + writeMountsCache used fixed `.tmp`
filenames. Two concurrent `gbrain mounts add` invocations (e.g. from
parallel terminals or CI) would clobber each other's temp file and
one writer's update would be lost. Fix: tmp filenames include
`process.pid + random suffix` so every writer has its own scratch file.
The atomic rename is self-contained per-writer. (Full lock + read-modify-
write safety deferred to PR 1 under `gbrain mounts sync --lock`.)
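The per-writer atomic write can be sketched like so (illustrative helper, not the exact mounts.ts source): each writer gets a unique scratch file from pid + random suffix, and rename() is atomic on POSIX filesystems, so readers see the old or new file, never a partial one:

```typescript
import { writeFileSync, renameSync } from "node:fs";
import { randomBytes } from "node:crypto";

export function atomicWriteFile(path: string, contents: string): void {
  // Unique per writer: concurrent invocations never clobber each other's temp file.
  const tmp = `${path}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
  writeFileSync(tmp, contents, { mode: 0o600 });
  renameSync(tmp, path); // atomic replace on the same filesystem
}
```

Note this only makes each write atomic; two concurrent read-modify-write cycles can still lose one writer's logical update, which is the locking work deferred to PR 1.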

P1 (honesty): `SubagentHandlerData.brain_id` +
`BuildBrainToolsOpts.brainId` docstrings claimed child jobs inherit the
parent's brain and brain tools target the resolved brain. True for the
`ctx.brainId` field only — `ctx.engine` is still the worker's base
engine at dispatch time because `buildOpContext` doesn't yet do the
registry lookup, and `gbrain agent run` doesn't yet accept `--brain` to
populate the field on submission. Rewrote both docstrings to state the
PR 0 behavior explicitly (field plumbed, engine routing is PR 1) so
nobody reads the code thinking multi-brain subagents already work.

Also cleaned up two `require('fs')` runtime imports left over from the
initial PR — swapped for ESM named imports (renameSync). Pre-existing
style issue surfaced by the self-review pass.

Tests: 90 PR-0 tests pass. Updated two shadow-related test cases to
assert the corrected semantics (both entries survive, host wins bare
name, namespace form routes to mount).

Not fixed in this commit (documented as known PR 0 limitations):
- `file_list` / `file_upload` / `file_url` in operations.ts still hit the
  singleton (localOnly + admin, never reachable from HTTP MCP — safe in
  practice, refactor in PR 1 alongside command-level cleanups).
- writeMountsCache's two-file swap (RESOLVER.md + manifest.json) is not
  atomic across files; readers can briefly observe mismatched pairs.
  Acceptable because the cache is recomputable at any time from
  mounts.json. Generation-directory swap is PR 1 work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): bump hook timeouts for 21-migration PGLite init under full-suite load

Root cause of 19 pre-existing full-suite flakes (CHANGELOG v0.18.0 noted
"17 pre-existing master timeouts"): every PGLite test does

  beforeAll/beforeEach(async () => {
    engine = new PGLiteEngine();
    await engine.connect({});
    await engine.initSchema();  // runs 21 migrations through v0.18.2
  });

In isolation this takes ~5s. Under full-suite contention (128 files,
process-shared FS and CPU) it exceeds bun's default 5000ms hook timeout,
beforeEach times out, engine stays undefined, then afterEach crashes
with `TypeError: undefined is not an object (evaluating 'engine.disconnect')`.
That single hook failure reports as the whole test "failing" even though
the test body never executed, which is why the failure count sometimes
looked inflated compared to the number of genuinely-broken tests.

Fix applied across 7 test files:

- Raise setup hook timeout to 30_000 (6x the default) — gives migration
  init enough headroom even under worst-case load without masking real
  regressions in a post-migration test.
- Raise teardown hook timeout to 15_000 — engine.disconnect() is usually
  fast but can stall when PGLite's WASM runtime is still completing a
  migration at shutdown.
- Add `if (engine) await engine.disconnect()` guard so afterEach doesn't
  double-fault when beforeEach already failed. This was the source of
  the opaque "(unnamed)" failures — they were disconnect crashes,
  not test-body failures.
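The null-guard can be sketched framework-free (assumed Engine shape; in the real files this lives in bun:test beforeEach/afterEach hooks with the raised timeouts): if setup timed out, `engine` is still undefined, and an unguarded disconnect turns one hook timeout into a second, opaque TypeError:

```typescript
interface Engine { disconnect(): Promise<void> }
let engine: Engine | undefined;

export async function teardown(): Promise<void> {
  // Guard: setup may have timed out before assigning engine.
  if (engine) await engine.disconnect();
  engine = undefined;
}
```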

Files:
  test/dream.test.ts                (5 beforeEach + 5 afterEach blocks)
  test/orphans.test.ts              (1 pair)
  test/brain-allowlist.test.ts      (1 pair)
  test/oauth.test.ts                (1 pair)
  test/extract-db.test.ts           (1 pair)
  test/multi-source-integration.test.ts (1 pair)
  test/core/cycle.test.ts           (1 pair)

Results on the merged PR 0 branch:
  Before: 2175 pass / 20 fail / 3 errors
  After:  2281 pass /  0 fail / 0 errors    (+106 tests running that
                                             were previously blocked
                                             by the timed-out hooks)

No changes to production code. No test assertions changed. Just
timeout-bump + null-guard discipline that should have been in these
hooks from the start. The real longer-term fix is reusing an engine
across tests where possible (brain-allowlist.test.ts already does this
via beforeAll+DELETE-pages pattern), but that's per-file structural
work — out of scope for this cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt for brains-and-sources + brain-routing docs

The test/build-llms.test.ts test validates that the committed llms.txt
and llms-full.txt match the current generator output. PR 0 added
docs/architecture/brains-and-sources.md content paths and updated
CLAUDE.md + skills/RESOLVER.md in earlier commits, but the generated
bundle file wasn't regenerated alongside. This caused one of the 20
fails we chased down today — a straight content mismatch, not a runtime
bug. Running `bun run build:llms` picks up the new section content so
the bundle matches the sources again.

No functional change. Only the compiled doc bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Bump version 1.0.0.0 → 0.22.0

OAuth + admin dashboard is meaningful but doesn't quite warrant the
major-version reset to 1.0. Renumber as v0.22.0, slotting cleanly above
master's v0.21.0 (Cathedral II).

Touched:
- VERSION, package.json: 1.0.0.0 → 0.22.0
- CHANGELOG.md: heading + "BEFORE/AFTER v1.0" table + "To take advantage"
  + "pre-v1.0" all renamed. Narrative voice unchanged otherwise.
- TODOS.md: ChatGPT MCP completion stamp updated to v0.22.0 (2026-04-25).
- CLAUDE.md, README.md, docs/mcp/{DEPLOY,CHATGPT}.md, src/schema.sql,
  src/core/schema-embedded.ts: every reader-facing v1.0.0 reference
  rewritten to v0.22.0 / pre-v0.22 in the same place.
- llms-full.txt: regenerated to match.

Slug-test occurrences of "v1.0.0" (`test/slug-validation.test.ts`,
`test/file-upload-security.test.ts`) and the `HOMEBREW_FOR_PERSONAL_AI`
roadmap reference to a future v1.0 vision left intact — those are
unrelated to this branch's release version.

Typecheck clean. cli + oauth + slug + file-upload tests pass (106 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.26.0 fix: 4 security findings from /cso pass + version bump

Bumped 0.22.0 → 0.26.0 to slot above master's v0.21 chain with headroom
for v0.23/0.24/0.25 to ship from master between now and merge.

Security fixes (all from CSO finding writeups):

#1 cookie-parser middleware — admin dashboard auth was silently broken.
   Express 5 has no built-in cookie parsing; req.cookies was always
   undefined, so /admin/login set the cookie but every subsequent admin
   API call returned 401. Added cookie-parser@^1.4.7 + @types/cookie-parser
   as direct + dev deps. app.use(cookieParser()) wired before CORS.

#2 + #3 TOCTOU races — exchangeAuthorizationCode and exchangeRefreshToken
   used SELECT-then-DELETE, so concurrent requests with the same
   code/refresh token could all pass the SELECT before any ran the DELETE,
   and each would be issued a token pair. Switched to atomic
   DELETE...RETURNING. RFC 6749
   §10.5 (codes) + §10.4 (refresh detection) violations closed. Added
   regression tests that fire 10 concurrent exchanges and assert exactly
   one wins — both pass.
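   An in-memory sketch of the atomic-claim shape (a Map stands in for the
   oauth_codes table; illustrative names, not the oauth-provider.ts source).
   Each synchronous section runs atomically per event-loop turn, the way
   DELETE...RETURNING runs atomically in Postgres, so of N concurrent
   exchanges exactly one wins:

```typescript
const codes = new Map<string, { clientId: string }>();

export function issueCode(code: string, clientId: string): void {
  codes.set(code, { clientId });
}

// Atomic claim: read and delete in one synchronous step — the single-use
// guarantee, never a SELECT in one step and a DELETE in another.
export function exchangeCode(code: string): { clientId: string } | null {
  const row = codes.get(code);
  if (row === undefined) return null;
  codes.delete(code);
  return row; // exactly one caller per code gets a non-null result
}
```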

#5 pgArray escape + DCR redirect_uri validation — pgArray() did
   `arr.join(',')` with no escaping, so an element containing a comma
   would be parsed by Postgres as TWO array elements. With --enable-dcr
   on, this could smuggle a second redirect_uri into a registered client
   and steal auth codes. Now every element is double-quoted with `"` and
   `\` escaped. Added validateRedirectUri() per RFC 6749 §3.1.2.1:
   redirect_uris must be https:// or loopback (localhost / 127.0.0.1).
   Wired into the DCR registerClient path; CLI registration trusts the
   operator and bypasses. Regression test confirms a comma-in-URI element
   round-trips as 1 element, not 2.
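   Sketches of both fixes (illustrative reconstructions, not the exact
   oauth-provider.ts source): Postgres array literals treat a bare comma as
   an element separator, so every element gets double-quoted with `"` and
   `\` escaped; and redirect_uris must be https:// or plaintext loopback:

```typescript
export function pgArray(arr: string[]): string {
  // Quote every element; escape backslashes first, then embedded quotes.
  const quoted = arr.map(
    (el) => `"${el.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`,
  );
  return `{${quoted.join(",")}}`;
}

export function validateRedirectUri(uri: string): boolean {
  let u: URL;
  try { u = new URL(uri); } catch { return false; }
  if (u.protocol === "https:") return true;
  // Plaintext http only for loopback, per RFC 6749 §3.1.2.1 guidance.
  return u.protocol === "http:" &&
    (u.hostname === "localhost" || u.hostname === "127.0.0.1");
}
```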

#6 --public-url flag — issuerUrl was hardcoded to http://localhost:{port}.
   Behind reverse proxies / ngrok / production deploys, the issuer claim
   in tokens wouldn't match the discovery URL clients hit (RFC 8414 §3.3).
   New --public-url URL flag on `gbrain serve --http`, propagates through
   serve.ts → serve-http.ts → ServeHttpOptions.publicUrl → issuerUrl.
   Startup banner surfaces the configured issuer.

Findings #4 (admin requests filter dead code), #7 (admin register-client
hardcoded grant_types), #8 (legacy token grandfathering posture) are
documentation / minor functional fixes and are deferred per user direction.

Tests: oauth.test.ts now 34 cases (was 27). 7 new:
- single-use TOCTOU regression (10 concurrent code exchanges)
- single-use TOCTOU regression (10 concurrent refresh exchanges)
- redirect_uri http://localhost passes
- redirect_uri https://example.com passes
- redirect_uri http://example.com (non-loopback plaintext) rejected
- redirect_uri non-URL rejected
- redirect_uri with embedded comma stored as single element

Files:
- VERSION, package.json: 0.22.0 → 0.26.0
- CHANGELOG.md: heading + table + "To take advantage" + "pre-v0.22" → v0.26;
  new "Security hardening (post-/cso pass)" subsection at top of itemized
  changes; CLI flag list updated for --public-url.
- src/core/oauth-provider.ts: pgArray escape, validateRedirectUri,
  registerClient enforces validation, DELETE...RETURNING in
  exchangeAuthorizationCode + exchangeRefreshToken.
- src/commands/serve-http.ts: cookie-parser import + wire-up,
  publicUrl option, issuerUrl honors it, startup banner shows issuer.
- src/commands/serve.ts: parses --public-url and threads through.
- src/cli.ts: help text adds --public-url URL flag.
- test/oauth.test.ts: +7 regression tests (now 34 total).
- llms-full.txt: regenerated.

Typecheck clean. 34 oauth + 14 cli tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 6, 2026
…RY-consolidate supervisor + autopilot

Pulls the duplicated tini detection + (cmd, args) composition out of
src/core/minions/supervisor.ts and src/commands/autopilot.ts into a single
src/core/minions/spawn-helpers.ts module that both consume.

Side effects:
- Autopilot now resolves tini ONCE at startup instead of shelling out via
  execSync('which tini') on every worker respawn (every restart-after-crash
  path lost ~1ms + a fork to /usr/bin/which).
- detectTini() passes env: process.env explicitly to execFileSync. Bun
  snapshots env at startup; without this, runtime PATH mutations (in tests
  via withEnv, or in any prod code that ever changes PATH) are invisible
  to `which`. Tiny correctness fix that also makes the test work.
- MinionSupervisor gains an `isTiniDetected` read-only accessor so
  test/supervisor-tini.test.ts can assert the constructor wired tini
  correctly without exposing the resolved path or needing to spawn the
  full lifecycle. The existing worker_spawned event payload still carries
  {tini: true} for runtime observability (per codex review garrytan#5).

Test coverage:
- test/spawn-helpers.test.ts: pure function tests for both helpers
  (with-tini / without-tini / empty-args / detectTini smoke)
- test/supervisor-tini.test.ts: constructor wiring with PATH stripped
  vs. PATH containing a fake-tini script in a tmpdir

Both files are *.test.ts (parallel-safe) and pass scripts/check-test-isolation.sh
without new allow-list entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 6, 2026

* fix: zombie process accumulation + health endpoint timeout

Three fixes for cascading failure mode in long-running deployments:

1. cli.ts: Install SIGCHLD handler to reap zombie children. Bun (like Node)
   only auto-reaps when a handler is registered. Without this, child processes
   spawned by the worker (embed batches, shell jobs, sub-agents) become zombies
   when they exit, accumulating in the PID table.

2. serve-http.ts: Add 5s timeout to /health endpoint's getStats() call.
   When the DB connection pool is saturated (e.g., from zombie processes
   holding phantom connections), getStats() hangs indefinitely, making the
   server appear dead to health checks even though it's running.

3. worker.ts: Call engine.disconnect() in the finally block after draining
   in-flight jobs. Releases PgBouncer connection slots immediately on shutdown
   rather than waiting for TCP keepalive expiry.

4. supervisor.ts + autopilot.ts: Auto-detect tini on PATH and wrap the
   spawned worker with it. Belt-and-suspenders with the SIGCHLD handler —
   tini catches children spawned by native addons that bypass the JS event
   loop. Zero-config: works when tini is installed, silently skips when not.

* refactor(zombie-reap): extract idempotent SIGCHLD installer module

Extract the inline SIGCHLD handler from cli.ts into a small dedicated
module so it's testable directly without importing cli.ts (which invokes
main() at module load — incompatible with bun:test imports).

The new installSigchldHandler() uses a named module-level handler +
includes() check to dedupe across hot-import scenarios. EventEmitter does
NOT dedupe listeners by reference, so without this guard a re-import of
zombie-reap.ts would accumulate handlers.

_uninstallSigchldHandlerForTests() is the test-only escape hatch so
test/zombie-reap.test.ts's afterAll can prevent cross-file listener
accumulation in the parallel shard process — codex review #6 noted that
mutating global process signal listeners in parallel pools is a leak class
the isolation lint doesn't protect against.
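The installer's dedupe shape can be sketched as follows (illustrative body; registering any SIGCHLD handler is what enables child reaping). EventEmitter does not dedupe listeners by reference, so a named module-level handler plus a listeners() check keeps re-imports from stacking handlers:

```typescript
const sigchldHandler = (): void => {
  // Child-exit notifications land here; registration alone enables reaping.
};

export function installSigchldHandler(): void {
  if (process.listeners("SIGCHLD").includes(sigchldHandler)) return; // already installed
  process.on("SIGCHLD", sigchldHandler);
}

// Test-only escape hatch so parallel test shards don't accumulate listeners.
export function _uninstallSigchldHandlerForTests(): void {
  process.removeListener("SIGCHLD", sigchldHandler);
}
```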

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(spawn-helpers): extract detectTini + buildSpawnInvocation; DRY-consolidate supervisor + autopilot

Pulls the duplicated tini detection + (cmd, args) composition out of
src/core/minions/supervisor.ts and src/commands/autopilot.ts into a single
src/core/minions/spawn-helpers.ts module that both consume.

Side effects:
- Autopilot now resolves tini ONCE at startup instead of shelling out via
  execSync('which tini') on every worker respawn (every restart-after-crash
  path lost ~1ms + a fork to /usr/bin/which).
- detectTini() passes env: process.env explicitly to execFileSync. Bun
  snapshots env at startup; without this, runtime PATH mutations (in tests
  via withEnv, or in any prod code that ever changes PATH) are invisible
  to `which`. Tiny correctness fix that also makes the test work.
- MinionSupervisor gains an `isTiniDetected` read-only accessor so
  test/supervisor-tini.test.ts can assert the constructor wired tini
  correctly without exposing the resolved path or needing to spawn the
  full lifecycle. The existing worker_spawned event payload still carries
  {tini: true} for runtime observability (per codex review #5).
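A sketch of the two helpers (assumed signatures, not the exact spawn-helpers.ts source) — detectTini passes env explicitly so `which` sees runtime PATH mutations, and buildSpawnInvocation prepends tini when available:

```typescript
import { execFileSync } from "node:child_process";

export function detectTini(): string | undefined {
  try {
    // env passed explicitly: Bun snapshots process.env at startup, so
    // runtime PATH mutations are invisible to `which` without this.
    const out = execFileSync("which", ["tini"], {
      env: process.env,
      encoding: "utf8",
    });
    const path = out.trim();
    return path.length > 0 ? path : undefined;
  } catch {
    return undefined; // which exits non-zero when tini is absent
  }
}

export function buildSpawnInvocation(
  tiniPath: string | undefined,
  cmd: string,
  args: string[],
): { cmd: string; args: string[] } {
  return tiniPath
    ? { cmd: tiniPath, args: ["--", cmd, ...args] } // tini -- <cmd> <args>
    : { cmd, args };                                // no tini: spawn directly
}
```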

Test coverage:
- test/spawn-helpers.test.ts: pure function tests for both helpers
  (with-tini / without-tini / empty-args / detectTini smoke)
- test/supervisor-tini.test.ts: constructor wiring with PATH stripped
  vs. PATH containing a fake-tini script in a tmpdir

Both files are *.test.ts (parallel-safe) and pass scripts/check-test-isolation.sh
without new allow-list entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(serve-http): extract probeHealth() + drop /health timeout 5s -> 3s

Three changes folded into one commit because they touch the same route
handler and would conflict if split:

1. Extract probeHealth(engine, engineName, version, timeoutMs) as a pure
   exported function. Route handler becomes one branchless line:
     res.status(result.status).json(result.body)
   This makes the timeout / db-error / happy paths unit-testable directly
   without an Express test client and without a hardcoded 5000 literal
   inside the route closure.

2. Export HEALTH_TIMEOUT_MS = 3000 (was inline 5000). Fly.io default
   health-check timeout is 5s; at 5s exact, the orchestrator may record
   a request as a timeout instead of getting the 503 (race). 3s gives
   2s of headroom for TCP, response framing, and clock skew. The
   DB-pool-saturation signal still surfaces; we just stop racing the
   orchestrator deadline.

3. The route handler shape change (4 try/catch lines -> 1 wrapper line)
   keeps response semantics identical for all three paths.
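A sketch of the extracted probe (assumed shapes, not the exact serve-http.ts signature): getStats() races a timeout so a saturated DB pool yields a fast 503 instead of a hung health check:

```typescript
export const HEALTH_TIMEOUT_MS = 3000;

interface HealthResult { status: number; body: Record<string, unknown> }

export async function probeHealth(
  getStats: () => Promise<unknown>,
  timeoutMs = HEALTH_TIMEOUT_MS,
): Promise<HealthResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("health probe timed out")), timeoutMs);
  });
  try {
    const stats = await Promise.race([getStats(), timeout]);
    return { status: 200, body: { ok: true, stats } };
  } catch (e) {
    // Timeout and DB error take the same 503 path.
    return { status: 503, body: { ok: false, error: String(e) } };
  } finally {
    if (timer) clearTimeout(timer);
  }
}
```

The route handler then reduces to `res.status(result.status).json(result.body)`, keeping the timeout, db-error, and happy paths unit-testable without an Express client.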

Test coverage:
- test/serve-http-health.test.ts: 4 cases (happy / timeout / db-error /
  exported constant). Calls probeHealth directly with mock engines whose
  getStats() resolves / rejects / hangs forever. Wall-clock per test
  bounded by passing timeoutMs: 100.
- Existing test/e2e/serve-http-oauth.test.ts /health happy-path case
  still covers the Express wiring (one-line route handler is identical
  Express plumbing for 200 and 503).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(worker): log engine.disconnect errors during shutdown instead of swallowing

Replace bare `try { await this.engine.disconnect(); } catch {}` with
`catch (e) { console.error('[worker] disconnect failed during shutdown:', e); }`.

Why: shutdown is best-effort, but the original silent catch was exactly
the bug class the v0.26.9 D14 direction (isUndefinedColumnError swap-in
on oauth-provider.ts) was created to surface. If a future regression
breaks pool teardown so disconnect rejects, we'll never know without an
audit log line. A one-line diff to the catch, no behavior change for
the happy path.

Test coverage in test/worker-shutdown-disconnect.test.ts:
- Happy path: disconnect spy called once during shutdown (intercept-only,
  not call-through, so the shared engine stays connected for the next
  test in the file).
- Error path: disconnect throws, error is logged with the
  \`[worker] disconnect failed during shutdown:\` prefix and the bare
  Error as second arg, and start() still resolves (no rethrow).

Spy via spyOn() on the engine instance — object-level, not module-level,
so R2 of scripts/check-test-isolation.sh (which forbids module-level mocks
in non-serial unit tests) is satisfied.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): real-binary zombie reaping reproduction (DATABASE_URL-gated)

Spawns the gbrain CLI as \`bun run src/cli.ts jobs work --concurrency 1\`
against a real Postgres with GBRAIN_ALLOW_SHELL_JOBS=1, submits a shell
job from the CLI side (remote: false, bypasses the v0.26.9 RCE gate),
captures the worker's shell child PID from the job result, sleeps 300ms,
then \`ps -o stat= -p <pid>\` to assert the process is NOT lingering as a
zombie (Z state).

Why this shape:
- \`gbrain serve --http\` was the original plan but doesn't start a worker
  (only the MCP server) AND submit_job over MCP carries remote: true,
  which rejects shell at operations.ts:1391 (the v0.26.9 RCE-fix gate).
  jobs work + CLI-side submit is the only architecture that boots through
  cli.ts (so installSigchldHandler() actually runs) and lets a shell job
  execute.
- \`shell\` requires absolute cwd (shell.ts:53). Payload includes cwd: '/tmp'.
- ps check is run while the worker is STILL ALIVE (no PID-recycle race —
  worker holds the process tree, so the captured PID is meaningful).

Negative control (manual, NOT in CI, documented in test header):
  Comment out installSigchldHandler() in src/cli.ts -> rebuild -> re-run
  -> expect stat=Z. Re-enable -> expect stat empty (process gone, reaped).
  Demonstrates the test catches the regression class without paying CI
  cost for a separate broken-build target.

Skips:
- DATABASE_URL not set (matches existing E2E pattern in helpers.ts)
- Windows (POSIX-only; tini and SIGCHLD don't exist there)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(postgres-engine): make disconnect() idempotent so it doesn't clobber the module-level singleton

PostgresEngine.disconnect() was non-idempotent: after the first call ended
\`_sql\` and set it to null, a second call fell through to the \`else\` branch
that calls db.disconnect() — which clears the GLOBAL module-level
connection used by helpers.ts, the CLI main path, and every test that
hadn't opted into a private pool.

This bit minions-shell.test.ts and the entire downstream E2E suite when
commit 671ef09 (in this branch) added engine.disconnect() to
MinionWorker.start()'s finally block. Tests that did:

  await worker.start();          // worker disconnects (was the new behavior)
  await engine.disconnect();     // test cleanup; pre-fix fell through
                                 // to db.disconnect() and killed
                                 // the global connection

…would silently kill the helpers.ts singleton, and the next test in the
file would fail in its beforeEach with "No database connection".

Fix: track \`_connectionStyle\` ('instance' | 'module' | null) on the engine
and only call db.disconnect() when this engine actually owns the global.
After ending an instance-pool, _connectionStyle stays 'instance' so a
second disconnect() is a no-op rather than a side-effect.

Test coverage: test/e2e/postgres-engine-disconnect-idempotency.test.ts
pins both contracts:
  - instance-pool engine: second disconnect MUST NOT clobber the module
    singleton (the bug above).
  - module-singleton engine: second disconnect is a no-op (resolves
    cleanly, no throw).

Required for: minions-shell.test.ts to keep passing alongside the worker
changes on this branch. Discovered during E2E sweep after the unit-test
green light. Commit 7 in this branch then walks back the worker-side
disconnect entirely (engine ownership belongs to the CLI handler) but
this idempotency fix stays in place as a defense-in-depth guard against
any future code calling disconnect twice on the same engine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: move engine.disconnect() from worker.start() to gbrain jobs work CLI handler (engine ownership)

Commit 671ef09 (the original fix in this branch) put
\`await this.engine.disconnect()\` inside MinionWorker.start()'s finally
block to free PgBouncer pool slots immediately on shutdown. That was the
right intent on the wrong layer: the worker doesn't own the engine, the
CLI handler that creates the engine does.

The mismatched ownership broke every test that shares a single engine
across multiple worker.start() / worker.stop() cycles:

  - test/e2e/minions-shell-pglite.test.ts → shared PGLite engine, second
    test failed with "PGLite not connected"
  - test/e2e/worker-abort-recovery.test.ts → 3 tests, same shape
  - test/e2e/minions-shell.test.ts → 3 Postgres tests broken by the
    second-disconnect-clobbers-global-singleton symptom (commit 6 of
    this branch fixed the underlying engine non-idempotency, but the
    worker-disconnect call was still wrong on its own)

Fix:
  - worker.ts: remove the engine.disconnect() call. Add a comment
    documenting WHY the worker doesn't disconnect (ownership invariant)
    so a future contributor doesn't put it back.
  - src/commands/jobs.ts case 'work': wrap worker.start() in a
    try/finally that calls engine.disconnect() on shutdown. The CLI
    created the engine (line 631 area), so the CLI disposes of it.
    Disconnect failure logs to stderr with the
    "[gbrain jobs work] engine disconnect failed during shutdown:" prefix
    rather than the bare \`catch {}\` of earlier waves — matches the
    v0.26.9 D14 direction of preferring loud-but-best-effort over silent.
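
The ownership pattern described above can be sketched as follows; the handler shape and `createEngine` signature here are illustrative stand-ins, not the actual jobs.ts code:

```typescript
// Sketch of the "creator disposes" invariant: the CLI handler that creates
// the engine is the one that disconnects it; the worker only borrows it.
async function runJobsWork(
  createEngine: () => Promise<{ disconnect(): Promise<void> }>,
  startWorker: (engine: unknown) => Promise<void>,
): Promise<void> {
  const engine = await createEngine(); // CLI owns the engine
  try {
    await startWorker(engine);         // worker MUST NOT disconnect it
  } finally {
    try {
      await engine.disconnect();       // frees pool slots on shutdown
    } catch (e) {
      // loud-but-best-effort rather than a bare catch {}
      console.error("[gbrain jobs work] engine disconnect failed during shutdown:", e);
    }
  }
}
```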

Test:
  - test/worker-shutdown-disconnect.test.ts now pins the inverse
    invariant: worker.start() MUST NOT call engine.disconnect(), and
    the engine MUST remain queryable after start() returns. Two tests,
    instance-level spy, parallel-safe (no module mocking).

End state: gbrain jobs work in production still frees pool slots
immediately on shutdown (intent of 671ef09 preserved), tests that share
an engine don't break (regression class fixed), and the engine ownership
invariant is now codified in code AND in the test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: clearTimeout in probeHealth race + platform guard SIGCHLD on Windows

Two adversarial-review auto-fixes from /ship's pre-landing review pass.
Both reviewers (Claude adversarial subagent + Codex adversarial) flagged
the timer leak independently; Codex additionally caught the Windows
crash risk.

1. probeHealth race timer leak (serve-http.ts):
   `Promise.race([getStats(), setTimeout(...)])` doesn't cancel the loser.
   Without `clearTimeout`, every fast /health request leaves a 3s pending
   timer in the event loop until it fires. Under sustained probe rates
   (Fly.io polls every ~10s, orchestrator load balancers can be much
   tighter), this builds a rolling backlog of timers and avoidable event
   loop wakeups in the hottest endpoint. Capture the timer handle, clear
   it in a `finally` block. No-op when the timer already fired.

2. SIGCHLD platform guard (zombie-reap.ts):
   SIGCHLD is POSIX-only. On Windows, `process.on('SIGCHLD', ...)` throws
   ENOTSUP because Windows doesn't have signals. Bun behaves the same.
   Without this guard, any future Windows port of a gbrain CLI tool
   would crash at boot before main() even runs. The zombie-reaping fix
   is itself POSIX-only (tini, ps, /proc), so the guard is consistent
   with the platform's capability set.
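
The platform guard can be sketched like this; the function name and the `proc` parameter are illustrative (the real zombie-reap.ts registers on the actual `process` and does the reaping work inside the handler):

```typescript
// Hedged sketch of the SIGCHLD platform guard. SIGCHLD is POSIX-only, so the
// handler is installed only off-Windows; returns whether it was installed.
export function installSigchldHandlerSketch(
  proc: { platform: string; on(sig: string, cb: () => void): unknown },
): boolean {
  if (proc.platform === "win32") return false; // registering would throw on Windows
  proc.on("SIGCHLD", () => {
    // real module reaps exited children here
  });
  return true;
}
```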

NOT in this commit (intentionally out of scope):
- Cancelling engine.getStats() when /health times out. Both reviewers
  noted this would need AbortController support in the engine layer
  which doesn't exist yet. The 503 timeout already improves on master's
  hang behavior; full cancellation is a follow-up.
- Switching /health to a lighter probe (SELECT 1 instead of count(*)
  across 6 tables). Pre-existing behavior; refactoring the probe shape
  is wider blast radius than this branch's zombie-reaping scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.28.1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md for v0.28.1 zombie reaping + health + engine ownership

Add v0.28.1 file annotations covering:
- src/core/zombie-reap.ts (new) — Layer 1 SIGCHLD reaper module
- src/core/minions/spawn-helpers.ts (new) — pure detectTini + buildSpawnInvocation helpers
- src/core/minions/worker.ts — engine-ownership invariant (no engine.disconnect)
- src/core/minions/supervisor.ts — consumes spawn-helpers, exposes isTiniDetected
- src/commands/serve-http.ts — probeHealth() + HEALTH_TIMEOUT_MS = 3000
- src/commands/jobs.ts — case 'work' owns engine lifecycle via try/finally
- src/commands/autopilot.ts — resolves tini once at startup
- src/core/postgres-engine.ts — disconnect() is idempotent via _connectionStyle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Garry Tan <garrytan@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 7, 2026
#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)
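
The hyperbolic decay above can be pinned with a few lines; this is just the math (applyRecencyBoost's real signature lives in the search pipeline), with the two strengths expressed as the 30-day and 7-day halflives from the commit:

```typescript
// Decay factor: 1.0 for a brand-new page, exactly 0.5 at one halflife,
// approaching 0 hyperbolically (never a hard cutoff).
function hyperbolicDecay(daysOld: number, halflifeDays: number): number {
  return halflifeDays / (halflifeDays + daysOld);
}

const moderate = (d: number) => hyperbolicDecay(d, 30);  // moderate strength
const aggressive = (d: number) => hyperbolicDecay(d, 7); // aggressive strength
```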

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (planner can't use it for the
negative side). The expression index ((COALESCE(effective_date, updated_at)))
is what actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally a NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
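
A condensed sketch of the chain, assuming the shapes above (the real computeEffectiveDate and parseDateLoose signatures, and the exact frontmatter plumbing, are not shown in this commit):

```typescript
// Illustrative precedence walk; source labels match the commit text.
type FM = { event_date?: string; date?: string; published?: string };

function parseLoose(v: string | undefined, now: Date): Date | null {
  if (!v) return null;
  const d = new Date(v);
  if (isNaN(d.getTime())) return null;
  // range validation: [1990-01-01, NOW + 1 year]; out-of-range falls through
  const min = new Date("1990-01-01");
  const max = new Date(now.getTime() + 365 * 24 * 3600 * 1000);
  return d >= min && d <= max ? d : null;
}

function computeEffectiveDateSketch(
  fm: FM, filename: string | null, slug: string,
  updatedAt: Date | null, createdAt: Date, now = new Date(),
): { date: Date; source: string } {
  const fnDate = parseLoose(filename?.match(/^(\d{4}-\d{2}-\d{2})/)?.[1], now);
  const chain: Array<[string, Date | null]> = [
    ["event_date", parseLoose(fm.event_date, now)],
    ["date", parseLoose(fm.date, now)],
    ["published", parseLoose(fm.published, now)],
    ["filename", fnDate],
  ];
  // per-prefix override: filename-date jumps to position 1 for daily/, meetings/
  if (slug.startsWith("daily/") || slug.startsWith("meetings/")) chain.unshift(["filename", fnDate]);
  for (const [source, d] of chain) if (d) return { date: d, source };
  if (updatedAt) return { date: updatedAt, source: "fallback" };
  return { date: createdAt, source: "fallback" };
}
```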

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.
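
The keyset walk can be modeled in memory as below. This is a pure simulation of the pagination and checkpoint contract, not the real backfill-effective-date.ts (which reads Postgres and persists `backfill.effective_date.last_id` in the config table):

```typescript
// Resumable keyset pagination over an in-memory table.
type Row = { id: number; slug: string };

function backfillWalk(
  rows: Row[], batchSize: number,
  checkpoint: { lastId: number },
  apply: (row: Row) => void,
): void {
  for (;;) {
    // equivalent of: WHERE id > last_id ORDER BY id LIMIT batchSize
    const batch = rows
      .filter((r) => r.id > checkpoint.lastId)
      .sort((a, b) => a.id - b.id)
      .slice(0, batchSize);
    if (batch.length === 0) return;
    for (const row of batch) apply(row);
    checkpoint.lastId = batch[batch.length - 1].id; // persisted per batch in the real thing
  }
}
```

A killed process re-run with the stored checkpoint skips already-processed rows, which is the resumability claim above.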

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.
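
A sketch of the loud parser (the error class name comes from the commit; everything else here is an assumption about its shape):

```typescript
// Parses GBRAIN_RECENCY_DECAY: comma-separated prefix:halflifeDays:coefficient
// triples. Throws on any malformed triple instead of silently falling back.
class RecencyDecayParseError extends Error {}

type DecayEntry = { halflifeDays: number; coefficient: number };

function parseRecencyDecayEnvSketch(raw: string): Map<string, DecayEntry> {
  const map = new Map<string, DecayEntry>();
  for (const triple of raw.split(",")) {
    const parts = triple.split(":");
    if (parts.length !== 3) throw new RecencyDecayParseError(`bad triple: ${triple}`);
    const [prefix, h, c] = parts;
    const halflifeDays = Number(h);
    const coefficient = Number(c);
    if (!prefix || !Number.isFinite(halflifeDays) || !Number.isFinite(coefficient)) {
      throw new RecencyDecayParseError(`bad triple: ${triple}`); // refuse LOUD
    }
    map.set(prefix, { halflifeDays, coefficient });
  }
  return map;
}
```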

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.
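
A toy version of the CASE emitter, showing the longest-prefix-first ordering and the evergreen short-circuit; the real builder in sql-ranking.ts takes the typed NowExpr and escapes fixed timestamps, which this sketch omits:

```typescript
// Emits a SQL CASE expression from a decay map; illustrative only.
type Decay = { halflifeDays: number; coefficient: number };

function buildRecencySqlSketch(decayMap: Map<string, Decay>, fallback: Decay, dateExpr: string): string {
  const branches: string[] = [];
  // longest prefix first so 'media/x/' matches before 'media/'
  const prefixes = [...decayMap.keys()].sort((a, b) => b.length - a.length);
  for (const prefix of prefixes) {
    const { halflifeDays: h, coefficient: c } = decayMap.get(prefix)!;
    const body = h === 0 || c === 0
      ? "0" // evergreen short-circuit: literal zero, no date math
      : `${c} * ${h}.0 / (${h}.0 + EXTRACT(EPOCH FROM (NOW() - ${dateExpr}))/86400.0)`;
    branches.push(`WHEN p.slug LIKE '${prefix}%' THEN ${body}`);
  }
  const fb = `${fallback.coefficient} * ${fallback.halflifeDays}.0 / (${fallback.halflifeDays}.0 + EXTRACT(EPOCH FROM (NOW() - ${dateExpr}))/86400.0)`;
  return `(CASE ${branches.join(" ")} ELSE ${fb} END)`;
}
```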

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE longest-
prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
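
A heavily reduced sketch of one axis of the classifier; the real query-intent.ts pattern lists are much larger (and handle gaps like "catch me up" with up to 15 chars between), so only patterns quoted in this commit appear here:

```typescript
// Minimal suggestedRecency resolution: strong-temporal beats everything,
// recency patterns next, default is off (pure opt-in).
type Axis = "off" | "on" | "strong";

function suggestRecencySketch(q: string): Axis {
  const s = q.toLowerCase();
  if (/\b(today|right now|this morning|just now)\b/.test(s)) return "strong";
  if (/\b(latest|recent|this week)\b/.test(s) || /catch\s?(me\s)?up/.test(s)) return "on";
  return "off"; // canonical queries and everything else default off
}
```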

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + salienceScore) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.
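
The two stages can be sketched as pure functions over a result list; the `Hit` shape and map-keying here are simplified stand-ins for the composite-keyed maps described below:

```typescript
// Orthogonal post-fusion boosts: salience has no time term, recency has no
// mattering term. Both multiply the fused score in place.
type Hit = { key: string; score: number };

function applySalienceBoostSketch(hits: Hit[], salience: Map<string, number>, strength: "on" | "strong"): void {
  const k = strength === "strong" ? 0.3 : 0.15;
  for (const h of hits) {
    const s = salience.get(h.key) ?? 0;
    h.score *= 1 + k * Math.log(1 + s); // s = 0 is a no-op since log(1) = 0
  }
}

function applyRecencyBoostSketch(
  hits: Hit[], ageDays: Map<string, number>, strength: "on" | "strong",
  decay: { halflifeDays: number; coefficient: number },
): void {
  if (decay.halflifeDays === 0) return; // evergreen: factor 1.0, skip entirely
  const strengthMul = strength === "strong" ? 1.5 : 1.0;
  for (const h of hits) {
    const d = ageDays.get(h.key) ?? 0;
    h.score *= 1 + strengthMul * decay.coefficient * decay.halflifeDays / (decay.halflifeDays + d);
  }
}
```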

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).

Deleted: src/core/search/recency.ts (PR #618's, replaced) +
test/recency-boost.test.ts (its scope is replaced by query-intent.test.ts +
future post-fusion tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date  — page fell back to updated_at even though
                           frontmatter has parseable event_date / date /
                           published. The "wrong but populated" residual
                           that earlier review iterations missed.
  future_dated            — effective_date > NOW() + 1 year (corrupt
                            or typo'd century).
  pre_1990                — effective_date < 1990-01-01 (epoch math gone
                            wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (column doesn't
exist → 42703) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 8, 2026
…t without being asked (#592)

* v0.29 foundation: emotional_weight column + formula + anomaly stats

Migration v34 adds pages.emotional_weight REAL DEFAULT 0.0 (column-only,
no index — salience query orders by computed score, not raw weight).
Embedded DDL (schema.sql + pglite-schema.ts + schema-embedded.ts)
mirrors the column so fresh installs don't need migration replay.

types.ts gains: PageFilters.sort enum + PAGE_SORT_SQL whitelist (engines
hardcoded ORDER BY updated_at DESC; threading lands in the next commit);
SalienceOpts/SalienceResult, AnomaliesOpts/AnomalyResult,
EmotionalWeightInputRow/EmotionalWeightWriteRow contracts.

cycle/emotional-weight.ts: pure-function score in [0..1] from tags +
takes (anglocentric default seed list; user-overridable via config key
emotional_weight.high_tags). cycle/anomaly.ts: meanStddev + cohort
threshold helpers with zero-stddev fallback (count > mean + 1) so rare
cohorts don't produce NaN sigmas.

Test coverage: migrate v34 structural assertions + 14-case formula
unit + 13-case anomaly stats unit. Codex review fixes baked in:
formula clamped to [0,1]; per-take weight clamped to [0,1] before
averaging; zero-stddev fallback finite, never NaN.
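The zero-stddev fallback described above can be sketched as follows. This is a hypothetical reconstruction from the commit message, not the shipped `cycle/anomaly.ts` code — function names and shapes are assumptions:

```typescript
// Population mean + stddev over a cohort's daily counts.
function meanStddev(counts: number[]): { mean: number; stddev: number } {
  const n = counts.length;
  const mean = counts.reduce((a, b) => a + b, 0) / n;
  const variance = counts.reduce((a, c) => a + (c - mean) ** 2, 0) / n;
  return { mean, stddev: Math.sqrt(variance) };
}

// With stddev = 0 a naive sigma = (count - mean) / stddev is NaN or Infinity.
// The fallback flags count > mean + 1 instead, so sigmas stay finite.
function isAnomalous(count: number, mean: number, stddev: number, sigma: number): boolean {
  if (stddev === 0) return count > mean + 1;
  return (count - mean) / stddev > sigma;
}
```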

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 engine: batch emotional-weight methods + listPages sort

BrainEngine adds 4 methods, both engines implement:

- batchLoadEmotionalInputs(slugs?): CTE-shaped read with per-table
  pre-aggregates. A page with N tags + M takes never produces N×M rows
  (codex C4#4) — page_tags + page_takes CTEs aggregate independently,
  then LEFT JOIN to pages.

- setEmotionalWeightBatch(rows): UPDATE FROM unnest($1::text[],
  $2::text[], $3::real[]) composite-keyed on (slug, source_id). Multi-
  source brains can't fan out (codex C4#3) — pages.slug is unique only
  within source_id. Same shape that v0.18 link batches use.

- getRecentSalience: time boundary computed in JS, bound as TIMESTAMPTZ.
  SQL identical across engines (codex C5/D5 — avoids dialect drift on
  $1::interval binding which has zero current uses on PGLite).

- findAnomalies: tag + type cohort baselines via generate_series-
  densified daily-count CTEs (codex C4#6). Sparse-day rare cohorts get
  correct (mean, stddev) instead of biased upward by zero-omission.
  Year cohort deferred to v0.30.

listPages threads the new PageFilters.sort enum through both engines.
Was hardcoded ORDER BY updated_at DESC; now PAGE_SORT_SQL whitelist
maps the 4 enum values to literal SQL fragments — no injection surface.
postgres.js uses sql.unsafe; PGLite splices the fragment directly.
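The whitelist pattern above, sketched minimally (the real `PAGE_SORT_SQL` fragments and enum names may differ — this just shows why there is no injection surface):

```typescript
// Enum values map to literal SQL fragments; caller input is only ever a
// lookup key, never spliced into the query string.
const PAGE_SORT_SQL: Record<string, string> = {
  updated_desc: 'ORDER BY updated_at DESC',
  updated_asc: 'ORDER BY updated_at ASC',
  created_desc: 'ORDER BY created_at DESC',
  slug: 'ORDER BY slug ASC',
};

function sortFragment(sort?: string): string {
  // Unsupported enum values fall back to the default (defense in depth).
  return PAGE_SORT_SQL[sort ?? 'updated_desc'] ?? PAGE_SORT_SQL.updated_desc;
}
```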

Regression tests (PGLite, no DATABASE_URL needed):

- multi-source-emotional-weight: same slug under two source_ids,
  setEmotionalWeightBatch on one of them, asserts the other survives
  untouched. Direct codex C4#3 guard.

- list-pages-regression (IRON RULE): old call shape (type, tag, limit)
  still returns updated_desc default; new sort=updated_asc reverses;
  sort=created_desc orders by created_at; sort=slug alphabetical;
  unsupported sort enum falls back to default (defense in depth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 cycle: new recompute_emotional_weight phase

Adds a 9th cycle phase between extract and embed. Sees the union of
syncPagesAffected + synthesizeWrittenSlugs for incremental mode (so
synthesize-written pages get their weight computed too — codex C2 caught
that the prior plan threaded only sync). Full mode (no incremental
anchors) walks every page; users hit this path on first upgrade via
gbrain dream --phase recompute_emotional_weight.

Phase orchestrator (cycle/recompute-emotional-weight.ts) is two SQL
round-trips total regardless of brain size:
  1. batchLoadEmotionalInputs(slugs?) → per-page tag/take inputs.
  2. computeEmotionalWeight in memory (pure function).
  3. setEmotionalWeightBatch(rows) → composite-keyed UPDATE FROM unnest.

Empty affectedSlugs short-circuits (no DB read, no write). Dry-run
computes weights and reports the would-write count without touching
the DB. Engine throw bubbles into status:fail with code
RECOMPUTE_EMOTIONAL_WEIGHT_FAIL — cycle continues to the next phase.
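The two-round-trip shape can be sketched like this. The engine interface and return shape here are assumptions for illustration; only the method names come from the commit message:

```typescript
interface EmotionalInput { slug: string; source_id: string; tags: string[]; take_count: number }
interface WeightRow { slug: string; source_id: string; weight: number }
interface Engine {
  batchLoadEmotionalInputs(slugs?: string[]): Promise<EmotionalInput[]>;
  setEmotionalWeightBatch(rows: WeightRow[]): Promise<void>;
}

// Load inputs (round-trip 1), score in memory, batch write (round-trip 2).
// Empty affectedSlugs short-circuits; dry-run skips the write.
async function recomputeEmotionalWeight(
  engine: Engine,
  computeWeight: (input: EmotionalInput) => number,
  slugs?: string[],
  dryRun = false,
): Promise<{ recomputed: number }> {
  if (slugs && slugs.length === 0) return { recomputed: 0 };
  const inputs = await engine.batchLoadEmotionalInputs(slugs);
  const rows = inputs.map((i) => ({
    slug: i.slug,
    source_id: i.source_id,
    weight: Math.min(1, Math.max(0, computeWeight(i))), // clamp to [0,1]
  }));
  if (!dryRun && rows.length > 0) await engine.setEmotionalWeightBatch(rows);
  return { recomputed: rows.length };
}
```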

Plumbing:
- CyclePhase type adds 'recompute_emotional_weight'.
- ALL_PHASES + NEEDS_LOCK_PHASES include it.
- CycleReport.totals adds pages_emotional_weight_recomputed (additive,
  schema_version stays "1").
- runCycle's totals rollup + status derivation honor the new field.
- synthesize.ts emits writtenSlugs in details so cycle.ts can union
  with syncPagesAffected for incremental backfill.

Tests: 7-case unit (fake-engine), 3-case PGLite e2e (full mode + dry-
run + ALL_PHASES position), 1000-page perf budget (<5s on PGLite).

Codex C2 → A: clean separation. Phase doesn't modify runExtractCore;
runs on its own seam after the existing 8 phases plus synthesize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 ops: get_recent_salience + find_anomalies + get_recent_transcripts

Three new MCP operations + a transcripts library:

- get_recent_salience: pages ranked by emotional + activity salience.
  Subagent-allow-listed. params: days (default 14), limit (default 20,
  capped 100), slugPrefix (renamed from `kind` per codex C4#10 to
  avoid collision with PageKind/TakeKind).

- find_anomalies: cohort-level activity outliers (tag + type).
  Subagent-allow-listed. Year cohort deferred to v0.30.

- get_recent_transcripts: raw .txt transcripts from the dream-cycle
  corpus dirs. LOCAL-ONLY: rejects ctx.remote === true with
  permission_denied (codex C3). NOT in the subagent allow-list — all
  subagent calls run with remote=true, would always reject (footgun if
  visible). Cycle's synthesize phase calls discoverTranscripts
  directly, so subagents that need transcripts go through the library
  function, not the op.

Tool descriptions extracted to src/core/operations-descriptions.ts so
they're pinnable in tests and stable for the Tier-2 LLM routing eval.
Redirects on query/search/list_pages: personal/emotional questions
should reach the new ops, not semantic search. Anti-flattery hint on
query: "Do NOT assume words like crazy, notable, or big mean
impressive — they often mean difficult or emotionally charged."

list_pages gains updated_after (string ISO) and sort enum params,
surfacing the engine threading from the prior commit.

src/core/transcripts.ts: filesystem walk shared by the gated MCP op
and the (commit 5) CLI command. Reuses discoverTranscripts corpus-dir
resolution + isDreamOutput from cycle/transcript-discovery.ts. Trust
gate lives in the op handler, not the library — the library is
trusted by both the gated op and the local CLI.

Allow-list: 11 → 13 (add salience + anomalies; transcripts excluded
per codex C3, with a comment explaining why).

Tests: 21-case description pin (catches accidental edits that change
LLM-facing surface); 11-case transcripts unit covering trust gate,
mtime window, dream-output skip, summary truncation, no corpus_dir;
2-case salience type-contract smoke (full Garry-test fixture in commit
6's e2e suite).

Codex C1: routing-eval fixtures (skills/<x>/routing-eval.jsonl)
deliberately NOT shipped — routing-eval.ts is substring-match on
resolver triggers, not MCP tool routing. Real coverage lands as
test/e2e/salience-llm-routing.test.ts in commit 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CLI: gbrain salience / anomalies / transcripts

Three new CLI commands wired into src/cli.ts dispatch + CLI_ONLY set +
help text:

- gbrain salience [--days N] [--limit N] [--kind PREFIX] [--json]
- gbrain anomalies [--since YYYY-MM-DD] [--lookback-days N] [--sigma N] [--json]
- gbrain transcripts recent [--days N] [--full] [--json]

Each command file mirrors src/commands/orphans.ts shape: pure data fn
+ JSON formatter + human formatter. Calls into engine.getRecentSalience
/ findAnomalies (already shipped) and src/core/transcripts.ts.

salience and anomalies show ranked rows with per-cohort
mean/stddev/sigma. transcripts honors `--full` (caps at 100KB/file)
vs default summary (first non-empty line + ~250 chars). All three
emit JSON with --json for agent consumption.

`--kind` is accepted as a slug-prefix shorthand on `gbrain salience`
even though the underlying op param is `slugPrefix` (kept the CLI
flag short; the MCP-facing param uses the more-explicit name to
align with PageKind/TakeKind/slugPrefix vocabulary).

CLI_ONLY set in src/cli.ts gains the three new command names so
they don't get forwarded to MCP-only routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: Garry-test fixtures + Postgres parity + LLM routing eval

PGLite e2e (no DATABASE_URL needed):

- salience-pglite: the Garry test. 7 wedding-tagged pages updated today
  + 100 background pages backdated across 30 days via raw SQL UPDATE
  (codex C4#7 — engine.putPage stamps updated_at = now(), so seeding
  via the engine alone can't reproduce historical recency windows).
  Asserts wedding pages outrank random-tag noise in the 7-day window;
  slugPrefix filter narrows correctly; days=0 boundary case; limit cap.

- anomalies-pglite: same fixture shape (7 wedding pages today, 100
  background backdated). findAnomalies with sigma=3 returns the
  wedding-tag cohort with sigma_observed > 3 vs near-zero baseline;
  page_slugs sample carries the wedding pages; date with no activity
  returns []; high sigma threshold suppresses borderline cohorts
  (zero-stddev fallback stays finite — no NaN sigma).

Postgres-gated e2e:

- engine-parity-salience: PGLite ↔ Postgres parity for getRecentSalience
  and findAnomalies. Same fixture into both engines; top-result and
  cohort-set match. Closes the v0.22.0-style parity gap for the new
  v0.29 SQL idioms (EXTRACT(EPOCH ...), generate_series, CTE chain).

Tier-2 LLM routing eval (ANTHROPIC_API_KEY-gated):

- salience-llm-routing: calls Claude with v0.29 tool descriptions and
  12 personal-query phrasings ("anything crazy lately", "what's been
  going on with me", etc.). Asserts the chosen tool is in the v0.29
  set, not query() / search(). ~$0.10 per CI run on Haiku. Tests the
  ACTUAL ship criterion — replaces the discarded fake-coverage
  routing-eval.jsonl fixtures (codex C1 → B).

This is the only test that proves the description edits drive routing.
Without it, we'd ship description changes and only learn from
production behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.0: ship-prep — VERSION + CHANGELOG + CLAUDE Key Files

VERSION + package.json bump 0.28.0 → 0.29.0.

CHANGELOG.md adds a v0.29.0 release-summary in the GStack/Garry voice
plus the "To take advantage of v0.29.0" block. Headline two-liner:
"The brain tells you what's hot without being asked. Salience +
anomaly detection ship. Search rewards hypotheses; salience surfaces
them." Numbers-that-matter table covers engine surface delta, MCP op
delta, allow-list delta, cycle-phase delta, schema migration, list_pages
param surface, and test count. Itemized changes section lists the
schema migration + new cycle phase + new MCP ops + redirect
descriptions + subagent allow-list rules + new tests + a contributor
note clarifying that routing-eval.ts is not the right surface for
testing MCP tool routing (use the Tier-2 LLM eval pattern instead).

CLAUDE.md Key Files updated for the v0.29 surface:

- src/core/engine.ts: notes the 4 new methods + PageFilters.sort threading.
- src/core/migrate.ts: v34 (pages_emotional_weight) entry.
- src/core/cycle.ts: 8 → 9 phases, recompute_emotional_weight inserted
  between patterns and embed; totals.pages_emotional_weight_recomputed.
- src/core/cycle/emotional-weight.ts (NEW): formula + override path.
- src/core/cycle/anomaly.ts (NEW): stats helpers + zero-stddev fallback.
- src/core/cycle/recompute-emotional-weight.ts (NEW): phase orchestrator.
- src/core/transcripts.ts (NEW): library shared by gated MCP op + CLI.
- src/core/operations-descriptions.ts (NEW): pinned tool descriptions.
- src/core/minions/tools/brain-allowlist.ts: 11 → 13 entries; comment
  on why get_recent_transcripts is excluded.
- src/commands/salience.ts / anomalies.ts / transcripts.ts (NEW): CLI surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1 feat: recency + salience as two orthogonal options on query op (#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)
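The hyperbolic decay with two strengths reduces to a small multiplier; a sketch under the stated halflives (the exact multiplier shape in the shipped `applyRecencyBoost` may differ):

```typescript
// strength follows SearchOpts.recencyBoost: 0 = off, 1 = moderate (30-day
// halflife), 2 = aggressive (7-day halflife). Decays from 1 toward 0 with age.
function recencyFactor(ageDays: number, strength: 0 | 1 | 2): number {
  if (strength === 0) return 1;
  const halflife = strength === 1 ? 30 : 7;
  return halflife / (halflife + ageDays);
}
```

A page exactly one halflife old scores half of a fresh page on this axis.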

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (planner can't use it for the
negative side). The expression index ((COALESCE(effective_date, updated_at)))
is what actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally a NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
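The chain above can be sketched as one loop over candidates (hypothetical reconstruction — the per-prefix override for daily/ and meetings/ and the created_at last resort are omitted for brevity):

```typescript
type EffectiveDate = { date: Date; source: string };

function computeEffectiveDate(
  fm: { event_date?: string; date?: string; published?: string },
  filename: string | null,
  updatedAt: Date,
): EffectiveDate {
  // Range validation: [1990-01-01, NOW + 1 year]; out-of-range drops through.
  const inRange = (d: Date) =>
    d.getTime() >= Date.UTC(1990, 0, 1) &&
    d.getTime() <= Date.now() + 365 * 86400_000;
  const candidates: Array<[string, string | undefined]> = [
    ['event_date', fm.event_date],
    ['date', fm.date],
    ['published', fm.published],
    ['filename', filename?.match(/^(\d{4}-\d{2}-\d{2})/)?.[1]],
  ];
  for (const [source, raw] of candidates) {
    if (!raw) continue;
    const d = new Date(raw);
    if (!Number.isNaN(d.getTime()) && inRange(d)) return { date: d, source };
  }
  return { date: updatedAt, source: 'fallback' };
}
```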

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).
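The keyset-paginated, checkpointed walk reduces to a loop like this (a sketch — the real library reads/writes the config table and runs computeEffectiveDate per row; the callback shapes here are assumptions):

```typescript
// loadBatch models: SELECT ... WHERE id > $1 ORDER BY id LIMIT $2.
// saveCheckpoint persists the last processed id so a killed process resumes.
async function backfillBatches(
  loadBatch: (lastId: number, limit: number) => Promise<Array<{ id: number }>>,
  saveCheckpoint: (lastId: number) => Promise<void>,
  startId = 0,
  limit = 1000,
): Promise<number> {
  let lastId = startId;
  let total = 0;
  for (;;) {
    const rows = await loadBatch(lastId, limit);
    if (rows.length === 0) break;
    total += rows.length;
    lastId = rows[rows.length - 1].id;
    await saveCheckpoint(lastId);
  }
  return total;
}
```

Re-running from any saved checkpoint revisits no completed rows, and a full re-walk produces the same writes — the idempotency claim above.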

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.
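The loud-failure env parser can be sketched as follows (hypothetical reconstruction of the triple format; error class name from the commit message):

```typescript
class RecencyDecayParseError extends Error {}

// Format: comma-separated prefix:halflifeDays:coefficient triples.
// Any malformed triple throws — no silent fallback.
function parseRecencyDecayEnv(
  raw: string,
): Record<string, { halflifeDays: number; coefficient: number }> {
  const out: Record<string, { halflifeDays: number; coefficient: number }> = {};
  for (const triple of raw.split(',')) {
    const parts = triple.split(':');
    if (parts.length !== 3) throw new RecencyDecayParseError(`bad triple: ${triple}`);
    const [prefix, h, c] = parts;
    const halflifeDays = Number(h);
    const coefficient = Number(c);
    if (!prefix || Number.isNaN(halflifeDays) || Number.isNaN(coefficient)) {
      throw new RecencyDecayParseError(`bad triple: ${triple}`);
    }
    out[prefix] = { halflifeDays, coefficient };
  }
  return out;
}
```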

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE longest-
prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.
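A minimal sketch of the CASE emitter (the shipped builder additionally takes the typed NowExpr and escapes literals; this simplified version assumes trusted prefixes and a trusted dateExpr):

```typescript
type Decay = { halflifeDays: number; coefficient: number };

// Longest-prefix-first WHEN ordering so 'media/x/' wins over 'media/';
// evergreen short-circuit emits a literal 0 branch.
function buildRecencyCaseSql(
  decayMap: Record<string, Decay>,
  fallback: Decay,
  dateExpr: string,
): string {
  const branch = ({ halflifeDays: h, coefficient: c }: Decay) =>
    h === 0 || c === 0
      ? '0'
      : `${c} * ${h}.0 / (${h}.0 + EXTRACT(EPOCH FROM (NOW() - ${dateExpr}))/86400.0)`;
  const whens = Object.entries(decayMap)
    .sort(([a], [b]) => b.length - a.length)
    .map(([prefix, d]) => `WHEN p.slug LIKE '${prefix}%' THEN ${branch(d)}`)
    .join(' ');
  return `(CASE ${whens} ELSE ${branch(fallback)} END)`;
}
```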

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
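The resolution rules above reduce to a single regex pass per axis; a rough sketch (the pattern lists here are illustrative subsets of the real trigger sets, and the exact precedence details may differ):

```typescript
type Axis = 'off' | 'on' | 'strong';

function suggestAxes(q: string): { salience: Axis; recency: Axis } {
  const canonical = /\b(who is|what is)\b/i.test(q);
  const temporalBound = /\b(today|right now|this week|just now)\b/i.test(q);
  const strongRecency = /\b(today|right now|this morning|just now)\b/i.test(q);
  const recencyOn = /\b(latest|recent|this week|catch up|remind me|status update)\b/i.test(q);
  const salienceOn = /\b(catch up|remind me|status update|what's going on|what matters)\b/i.test(q);
  // Canonical patterns force both axes off — unless a temporal bound wins.
  if (canonical && !temporalBound) return { salience: 'off', recency: 'off' };
  return {
    salience: salienceOn ? 'on' : 'off',
    recency: strongRecency ? 'strong' : recencyOn ? 'on' : 'off',
  };
}
```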

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + score) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).
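The two per-result factors, written out from the formulas above (a sketch — the shipped functions operate on result arrays; these express just the scalar math, with the salience score assumed as the log argument):

```typescript
// Pure mattering signal: no time component.
function salienceFactor(salienceScore: number, strength: 'on' | 'strong'): number {
  const k = strength === 'strong' ? 0.3 : 0.15;
  return 1 + k * Math.log(1 + salienceScore);
}

// Pure recency signal: per-prefix decay, independent of mattering.
function recencyBoostFactor(
  ageDays: number,
  halflifeDays: number,
  coefficient: number,
  strength: 'on' | 'strong',
): number {
  if (halflifeDays === 0) return 1; // evergreen prefixes skipped
  const mul = strength === 'strong' ? 1.5 : 1.0;
  return 1 + (mul * coefficient * halflifeDays) / (halflifeDays + ageDays);
}
```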

Deleted: src/core/search/recency.ts (from PR #618, now replaced) +
test/recency-boost.test.ts (its coverage moves to query-intent.test.ts +
future post-fusion tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date  — page fell back to updated_at even though
                           frontmatter has parseable event_date / date /
                           published. The "wrong but populated" residual
                           that earlier review iterations missed.
  future_dated            — effective_date > NOW() + 1 year (corrupt
                            or typo'd century).
  pre_1990                — effective_date < 1990-01-01 (epoch math gone
                            wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.
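
The two range classes reduce to a pure check. A minimal sketch (hypothetical
helper name; the shipped check runs this logic in SQL over the sample):

```typescript
// Sketch of the doctor's range classes (hypothetical helper, not the shipped SQL).
type DateHealth = "ok" | "future_dated" | "pre_1990";

function classifyEffectiveDate(effective: Date, now: Date = new Date()): DateHealth {
  const oneYearAhead = now.getTime() + 365 * 86400_000;
  if (effective.getTime() > oneYearAhead) return "future_dated"; // corrupt or typo'd century
  if (effective.getTime() < Date.UTC(1990, 0, 1)) return "pre_1990"; // epoch math gone wrong
  return "ok";
}
```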

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (missing column →
SQLSTATE 42703, undefined_column) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Wintermute <wintermute@garrytan.com>
garrytan added a commit that referenced this pull request May 8, 2026
… what's hot without being asked (#730)

* v0.29 foundation: emotional_weight column + formula + anomaly stats

Migration v34 adds pages.emotional_weight REAL DEFAULT 0.0 (column-only,
no index — salience query orders by computed score, not raw weight).
Embedded DDL (schema.sql + pglite-schema.ts + schema-embedded.ts)
mirrors the column so fresh installs don't need migration replay.

types.ts gains: PageFilters.sort enum + PAGE_SORT_SQL whitelist (engines
hardcoded ORDER BY updated_at DESC; threading lands in the next commit);
SalienceOpts/SalienceResult, AnomaliesOpts/AnomalyResult,
EmotionalWeightInputRow/EmotionalWeightWriteRow contracts.

cycle/emotional-weight.ts: pure-function score in [0..1] from tags +
takes (anglocentric default seed list; user-overridable via config key
emotional_weight.high_tags). cycle/anomaly.ts: meanStddev + cohort
threshold helpers with zero-stddev fallback (count > mean + 1) so rare
cohorts don't produce NaN sigmas.
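
The zero-stddev fallback is the load-bearing detail. A sketch of the stats
helpers under the shapes described above (names assumed from this message,
not the shipped code):

```typescript
// meanStddev: population mean + stddev over a cohort's daily counts.
function meanStddev(xs: number[]): { mean: number; stddev: number } {
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  const variance = xs.reduce((a, b) => a + (b - mean) ** 2, 0) / xs.length;
  return { mean, stddev: Math.sqrt(variance) };
}

// Zero-stddev fallback: when every historical count is identical, sigma math
// would divide by zero. Fall back to "count > mean + 1" so rare cohorts flag
// on any real jump instead of producing NaN sigmas.
function isAnomalous(count: number, mean: number, stddev: number, sigma: number): boolean {
  if (stddev === 0) return count > mean + 1;
  return (count - mean) / stddev > sigma;
}
```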

Test coverage: migrate v34 structural assertions + 14-case formula
unit + 13-case anomaly stats unit. Codex review fixes baked in:
formula clamped to [0,1]; per-take weight clamped to [0,1] before
averaging; zero-stddev fallback finite, never NaN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 engine: batch emotional-weight methods + listPages sort

BrainEngine adds 4 methods, both engines implement:

- batchLoadEmotionalInputs(slugs?): CTE-shaped read with per-table
  pre-aggregates. A page with N tags + M takes never produces N×M rows
  (codex C4#4) — page_tags + page_takes CTEs aggregate independently,
  then LEFT JOIN to pages.

- setEmotionalWeightBatch(rows): UPDATE FROM unnest($1::text[],
  $2::text[], $3::real[]) composite-keyed on (slug, source_id), so writes
  in multi-source brains can't fan out across sources (codex C4#3) —
  pages.slug is unique only within source_id. Same shape that v0.18 link
  batches use.

- getRecentSalience: time boundary computed in JS, bound as TIMESTAMPTZ.
  SQL identical across engines (codex C5/D5 — avoids dialect drift on
  $1::interval binding which has zero current uses on PGLite).

- findAnomalies: tag + type cohort baselines via generate_series-
  densified daily-count CTEs (codex C4#6). Sparse-day rare cohorts get
  correct (mean, stddev) instead of biased upward by zero-omission.
  Year cohort deferred to v0.30.

listPages threads the new PageFilters.sort enum through both engines.
Was hardcoded ORDER BY updated_at DESC; now PAGE_SORT_SQL whitelist
maps the 4 enum values to literal SQL fragments — no injection surface.
postgres.js uses sql.unsafe; PGLite splices the fragment directly.
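
The whitelist pattern can be sketched in a few lines (enum values and
fragments assumed from this message):

```typescript
// Enum values map to literal SQL fragments; caller input never reaches the
// query string directly, so there is no injection surface.
type PageSort = "updated_desc" | "updated_asc" | "created_desc" | "slug";

const PAGE_SORT_SQL: Record<PageSort, string> = {
  updated_desc: "ORDER BY updated_at DESC",
  updated_asc: "ORDER BY updated_at ASC",
  created_desc: "ORDER BY created_at DESC",
  slug: "ORDER BY slug ASC",
};

// Unsupported values fall back to the default — defense in depth.
function orderByFor(sort?: string): string {
  return PAGE_SORT_SQL[sort as PageSort] ?? PAGE_SORT_SQL.updated_desc;
}
```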

Regression tests (PGLite, no DATABASE_URL needed):

- multi-source-emotional-weight: same slug under two source_ids,
  setEmotionalWeightBatch on one of them, asserts the other survives
  untouched. Direct codex C4#3 guard.

- list-pages-regression (IRON RULE): old call shape (type, tag, limit)
  still returns updated_desc default; new sort=updated_asc reverses;
  sort=created_desc orders by created_at; sort=slug alphabetical;
  unsupported sort enum falls back to default (defense in depth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 cycle: new recompute_emotional_weight phase

Adds a 9th cycle phase between extract and embed. Sees the union of
syncPagesAffected + synthesizeWrittenSlugs for incremental mode (so
synthesize-written pages get their weight computed too — codex C2 caught
that the prior plan threaded only sync). Full mode (no incremental
anchors) walks every page; users hit this path on first upgrade via
gbrain dream --phase recompute_emotional_weight.

Phase orchestrator (cycle/recompute-emotional-weight.ts) is two SQL
round-trips total regardless of brain size:
  1. batchLoadEmotionalInputs(slugs?) → per-page tag/take inputs.
  2. computeEmotionalWeight in memory (pure function).
  3. setEmotionalWeightBatch(rows) → composite-keyed UPDATE FROM unnest.

Empty affectedSlugs short-circuits (no DB read, no write). Dry-run
computes weights and reports the would-write count without touching
the DB. Engine throw bubbles into status:fail with code
RECOMPUTE_EMOTIONAL_WEIGHT_FAIL — cycle continues to the next phase.

Plumbing:
- CyclePhase type adds 'recompute_emotional_weight'.
- ALL_PHASES + NEEDS_LOCK_PHASES include it.
- CycleReport.totals adds pages_emotional_weight_recomputed (additive,
  schema_version stays "1").
- runCycle's totals rollup + status derivation honor the new field.
- synthesize.ts emits writtenSlugs in details so cycle.ts can union
  with syncPagesAffected for incremental backfill.

Tests: 7-case unit (fake-engine), 3-case PGLite e2e (full mode + dry-run
+ ALL_PHASES position), 1000-page perf budget (<5s on PGLite).

Codex C2 → A: clean separation. Phase doesn't modify runExtractCore;
runs on its own seam after the existing 8 phases plus synthesize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 ops: get_recent_salience + find_anomalies + get_recent_transcripts

Three new MCP operations + a transcripts library:

- get_recent_salience: pages ranked by emotional + activity salience.
  Subagent-allow-listed. params: days (default 14), limit (default 20,
  capped 100), slugPrefix (renamed from `kind` per codex C4#10 to
  avoid collision with PageKind/TakeKind).

- find_anomalies: cohort-level activity outliers (tag + type).
  Subagent-allow-listed. Year cohort deferred to v0.30.

- get_recent_transcripts: raw .txt transcripts from the dream-cycle
  corpus dirs. LOCAL-ONLY: rejects ctx.remote === true with
  permission_denied (codex C3). NOT in the subagent allow-list — all
  subagent calls run with remote=true, would always reject (footgun if
  visible). Cycle's synthesize phase calls discoverTranscripts
  directly, so subagents that need transcripts go through the library
  function, not the op.

Tool descriptions extracted to src/core/operations-descriptions.ts so
they're pinnable in tests and stable for the Tier-2 LLM routing eval.
Redirects on query/search/list_pages: personal/emotional questions
should reach the new ops, not semantic search. Anti-flattery hint on
query: "Do NOT assume words like crazy, notable, or big mean
impressive — they often mean difficult or emotionally charged."

list_pages gains updated_after (string ISO) and sort enum params,
surfacing the engine threading from the prior commit.

src/core/transcripts.ts: filesystem walk shared by the gated MCP op
and the (commit 5) CLI command. Reuses discoverTranscripts corpus-dir
resolution + isDreamOutput from cycle/transcript-discovery.ts. Trust
gate lives in the op handler, not the library — the library is
trusted by both the gated op and the local CLI.

Allow-list: 11 → 13 (add salience + anomalies; transcripts excluded
per codex C3, with a comment explaining why).

Tests: 21-case description pin (catches accidental edits that change
LLM-facing surface); 11-case transcripts unit covering trust gate,
mtime window, dream-output skip, summary truncation, no corpus_dir;
2-case salience type-contract smoke (full Garry-test fixture in commit
6's e2e suite).

Codex C1: routing-eval fixtures (skills/<x>/routing-eval.jsonl)
deliberately NOT shipped — routing-eval.ts is substring-match on
resolver triggers, not MCP tool routing. Real coverage lands as
test/e2e/salience-llm-routing.test.ts in commit 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CLI: gbrain salience / anomalies / transcripts

Three new CLI commands wired into src/cli.ts dispatch + CLI_ONLY set +
help text:

- gbrain salience [--days N] [--limit N] [--kind PREFIX] [--json]
- gbrain anomalies [--since YYYY-MM-DD] [--lookback-days N] [--sigma N] [--json]
- gbrain transcripts recent [--days N] [--full] [--json]

Each command file mirrors src/commands/orphans.ts shape: pure data fn
+ JSON formatter + human formatter. Calls into engine.getRecentSalience
/ findAnomalies (already shipped) and src/core/transcripts.ts.

salience and anomalies show ranked rows with per-cohort
mean/stddev/sigma. transcripts honors `--full` (caps at 100KB/file)
vs default summary (first non-empty line + ~250 chars). All three
emit JSON with --json for agent consumption.

`--kind` is accepted as a slug-prefix shorthand on `gbrain salience`
even though the underlying op param is `slugPrefix` (kept the CLI
flag short; the MCP-facing param uses the more-explicit name to
align with PageKind/TakeKind/slugPrefix vocabulary).

CLI_ONLY set in src/cli.ts gains the three new command names so
they don't get forwarded to MCP-only routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: Garry-test fixtures + Postgres parity + LLM routing eval

PGLite e2e (no DATABASE_URL needed):

- salience-pglite: the Garry test. 7 wedding-tagged pages updated today
  + 100 background pages backdated across 30 days via raw SQL UPDATE
  (codex C4#7 — engine.putPage stamps updated_at = now(), so seeding
  via the engine alone can't reproduce historical recency windows).
  Asserts wedding pages outrank random-tag noise in the 7-day window;
  slugPrefix filter narrows correctly; days=0 boundary case; limit cap.

- anomalies-pglite: same fixture shape (7 wedding pages today, 100
  background backdated). findAnomalies with sigma=3 returns the
  wedding-tag cohort with sigma_observed > 3 vs near-zero baseline;
  page_slugs sample carries the wedding pages; date with no activity
  returns []; high sigma threshold suppresses borderline cohorts
  (zero-stddev fallback stays finite — no NaN sigma).

Postgres-gated e2e:

- engine-parity-salience: PGLite ↔ Postgres parity for getRecentSalience
  and findAnomalies. Same fixture into both engines; top-result and
  cohort-set match. Closes the v0.22.0-style parity gap for the new
  v0.29 SQL idioms (EXTRACT(EPOCH ...), generate_series, CTE chain).

Tier-2 LLM routing eval (ANTHROPIC_API_KEY-gated):

- salience-llm-routing: calls Claude with v0.29 tool descriptions and
  12 personal-query phrasings ("anything crazy lately", "what's been
  going on with me", etc.). Asserts the chosen tool is in the v0.29
  set, not query() / search(). ~$0.10 per CI run on Haiku. Tests the
  ACTUAL ship criterion — replaces the discarded fake-coverage
  routing-eval.jsonl fixtures (codex C1 → B).

This is the only test that proves the description edits drive routing.
Without it, we'd ship description changes and only learn from
production behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.0: ship-prep — VERSION + CHANGELOG + CLAUDE Key Files

VERSION + package.json bump 0.28.0 → 0.29.0.

CHANGELOG.md adds a v0.29.0 release-summary in the GStack/Garry voice
plus the "To take advantage of v0.29.0" block. Headline two-liner:
"The brain tells you what's hot without being asked. Salience +
anomaly detection ship. Search rewards hypotheses; salience surfaces
them." Numbers-that-matter table covers engine surface delta, MCP op
delta, allow-list delta, cycle-phase delta, schema migration, list_pages
param surface, and test count. Itemized changes section lists the
schema migration + new cycle phase + new MCP ops + redirect
descriptions + subagent allow-list rules + new tests + a contributor
note clarifying that routing-eval.ts is not the right surface for
testing MCP tool routing (use the Tier-2 LLM eval pattern instead).

CLAUDE.md Key Files updated for the v0.29 surface:

- src/core/engine.ts: notes the 4 new methods + PageFilters.sort threading.
- src/core/migrate.ts: v34 (pages_emotional_weight) entry.
- src/core/cycle.ts: 8 → 9 phases, recompute_emotional_weight inserted
  between patterns and embed; totals.pages_emotional_weight_recomputed.
- src/core/cycle/emotional-weight.ts (NEW): formula + override path.
- src/core/cycle/anomaly.ts (NEW): stats helpers + zero-stddev fallback.
- src/core/cycle/recompute-emotional-weight.ts (NEW): phase orchestrator.
- src/core/transcripts.ts (NEW): library shared by gated MCP op + CLI.
- src/core/operations-descriptions.ts (NEW): pinned tool descriptions.
- src/core/minions/tools/brain-allowlist.ts: 11 → 13 entries; comment
  on why get_recent_transcripts is excluded.
- src/commands/salience.ts / anomalies.ts / transcripts.ts (NEW): CLI surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1 feat: recency + salience as two orthogonal options on query op (#696)

* feat: recency boost for search (v0.27.0) — temporal intent auto-detection, date filters, configurable decay

New search pipeline stage: keyword + vector → RRF → cosine re-score → backlink boost → recency boost → dedup

- applyRecencyBoost: hyperbolic decay, two strengths (moderate 30-day halflife, aggressive 7-day halflife)
- Auto-enabled when intent.ts detects temporal/event queries (detail='high')
- Manual override via SearchOpts.recencyBoost (0/1/2)
- Date filtering: afterDate/beforeDate on all three search paths (keyword, keywordChunks, vector)
- getPageTimestamps on both Postgres and PGLite engines
- 15 tests passing (boost math + intent classification)

* v0.29.1 schema: pages.{effective_date, effective_date_source, import_filename, salience_touched_at} + expression index

Migration v38 adds 4 nullable columns to pages and an expression index on
COALESCE(effective_date, updated_at) to support the new since/until date
filters. All additive — no behavior change in the default search path; only
consulted when callers opt into the new salience='on' / recency='on' axes
or pass since/until.

  effective_date         — content date (event_date / date / published /
                           filename-date / fallback). Read by recency boost
                           and date-filter paths only. Auto-link doesn't
                           touch it (immune to updated_at churn).
  effective_date_source  — sentinel for the doctor's effective_date_health
                           check ('event_date' | 'date' | 'published' |
                           'filename' | 'fallback').
  import_filename        — basename without extension, captured at import.
                           Used for filename-date precedence on daily/,
                           meetings/. Older rows leave it NULL.
  salience_touched_at    — bumped by recompute_emotional_weight when
                           emotional_weight changes. Salience window uses
                           GREATEST(updated_at, salience_touched_at) so
                           newly-salient old pages enter the recent salience
                           query.

Index strategy: a partial index on effective_date alone wouldn't help the
COALESCE expression in since/until filters (planner can't use it for the
negative side). The expression index ((COALESCE(effective_date, updated_at)))
is what actually accelerates the filter.

Postgres uses CONCURRENTLY + v14-style pg_index.indisvalid pre-drop guard
for prior failed CONCURRENTLY runs; PGLite uses plain CREATE INDEX. Mirror
of v34's pattern.

src/schema.sql + src/core/pglite-schema.ts updated for fresh installs;
src/core/schema-embedded.ts regenerated via bun run build:schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: computeEffectiveDate helper + putPage integration

Pure helper computing a page's effective_date from frontmatter precedence:
  1. event_date (meeting/event pages)
  2. date (dated essays)
  3. published (writing/)
  4. filename-date (leading YYYY-MM-DD in basename)
  5. updated_at (fallback)
  6. created_at (last resort)

Per-prefix override: for daily/ and meetings/ slugs, filename-date jumps
to position 1 — the filename is the user's primary signal there.

Returns {date, source}. The source label powers the doctor's
effective_date_health check to detect "fell back to updated_at" rows that
look populated but are functionally a NULL.

Range validation: parsed value must be in [1990-01-01, NOW + 1 year].
Out-of-range values drop to the next chain element.
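
A condensed sketch of the precedence walk (simplified: loose date-shape
parsing and the created_at last resort are omitted; names are assumptions):

```typescript
interface Frontmatter { event_date?: string; date?: string; published?: string }

function computeEffectiveDate(
  slug: string,
  fm: Frontmatter,
  filename: string | null,
  updatedAt: Date,
  now: Date = new Date(),
): { date: Date; source: string } {
  const fromFilename = /^(\d{4}-\d{2}-\d{2})/.exec(filename ?? "")?.[1];
  // daily/ and meetings/ promote filename-date to the front of the chain.
  const filenameFirst = slug.startsWith("daily/") || slug.startsWith("meetings/");
  const chain: Array<[string, string | undefined]> = filenameFirst
    ? [["filename", fromFilename], ["event_date", fm.event_date], ["date", fm.date], ["published", fm.published]]
    : [["event_date", fm.event_date], ["date", fm.date], ["published", fm.published], ["filename", fromFilename]];
  const min = Date.UTC(1990, 0, 1);
  const max = now.getTime() + 365 * 86400_000;
  for (const [source, raw] of chain) {
    if (!raw) continue;
    const t = Date.parse(raw);
    if (Number.isNaN(t) || t < min || t > max) continue; // drop to next chain element
    return { date: new Date(t), source };
  }
  return { date: updatedAt, source: "fallback" };
}
```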

Wired into importFromContent + importFromFile. The put_page MCP op derives
filename from slug-tail when no caller-supplied filename is available.

putPage SQL on both engines extended to write the new columns. ON CONFLICT
uses COALESCE(EXCLUDED.x, pages.x) so callers that don't know about the
new columns (auto-link, code reindex) preserve existing values rather than
blanking them. SELECT projection extended to return them; rowToPage threads
them through.

21 unit tests covering: precedence chain default order, per-prefix override,
parse failure fall-through, range validation [1990, NOW+1y], parseDateLoose
shape variants. All pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: backfill orchestrator + library function for existing pages

src/core/backfill-effective-date.ts is the shared library function. Walks
pages in keyset-paginated batches (id > last_id ORDER BY id LIMIT 1000),
runs computeEffectiveDate per row, UPDATEs effective_date +
effective_date_source. Resumable via the `backfill.effective_date.last_id`
checkpoint key in the config table — a killed process can re-run and pick
up without re-doing rows. Idempotent: a full re-walk produces the same
writes.
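
The keyset walk itself is a small loop. A sketch with hypothetical callback
names (the real library runs SQL batches and writes the checkpoint to the
config table):

```typescript
interface Row { id: number }

function backfillWalk(
  loadBatch: (afterId: number, limit: number) => Row[], // SELECT ... WHERE id > $1 ORDER BY id LIMIT $2
  processBatch: (rows: Row[]) => void,
  saveCheckpoint: (lastId: number) => void,             // e.g. backfill.effective_date.last_id
  startAfterId = 0,
  batchSize = 1000,
): number {
  let lastId = startAfterId;
  let total = 0;
  for (;;) {
    const rows = loadBatch(lastId, batchSize);
    if (rows.length === 0) return total; // done
    processBatch(rows);
    lastId = rows[rows.length - 1].id;
    saveCheckpoint(lastId); // a killed process re-runs from here
    total += rows.length;
  }
}
```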

Postgres-only: SET LOCAL statement_timeout = '600s' per batch. Doesn't
refuse the migration on low session settings (codex pass-2 #16).

src/commands/migrations/v0_29_1.ts is the orchestrator (4 phases mirroring
v0_12_2). Phase A schema (gbrain init --migrate-only), Phase B backfill
(via the library function), Phase C verify (count NULL effective_date),
Phase D record (handled by runner). The library function is reusable from
the gbrain reindex-frontmatter CLI command in the next commit.

import_filename stays NULL for backfilled rows — pre-v0.29.1 imports
didn't capture it. computeEffectiveDate uses the slug-tail when filename
is NULL; daily/2024-03-15 backfilled gets effective_date from the slug.

Registered in src/commands/migrations/index.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: gbrain reindex-frontmatter CLI command

Recovery / explicit-rebuild path for pages.effective_date. Used when:
  - User edited frontmatter dates after import
  - Post-upgrade backfill orchestrator finished but the user wants to
    re-walk a subset (e.g. just meetings/) after fixing some frontmatter
  - Precedence rules change between releases

Thin wrapper over backfillEffectiveDate from commit 3 — same code path
the v0_29_1 orchestrator uses; one source of truth.

Flags mirror reindex-code:
  --source <id>      Scope to one sources row (placeholder; the library
                     doesn't filter by source today, tracked v0.30+)
  --slug-prefix P    Scope to slugs starting with P (e.g. 'meetings/')
  --dry-run          Print what WOULD change, no DB writes
  --yes              Skip confirmation prompt (required for non-TTY non-JSON)
  --json             Machine-readable result envelope
  --force            Re-apply even when computed value matches existing

Wired into src/cli.ts. CLI handles its own engine lifecycle (creates +
disconnects).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recency-decay map + buildRecencyComponentSql (pure, unused)

src/core/search/recency-decay.ts mirrors source-boost.ts in shape but
drives RECENCY ONLY (per D9 codex resolution). Salience is a separate
orthogonal axis; this map does not feed it.

DEFAULT_RECENCY_DECAY: 10 generic prefixes (no fork-specific names).
  - concepts/      evergreen (halflifeDays=0)
  - originals/     180d × 0.5 (long-tail decay; new essays nudged)
  - writing/       365d × 0.4
  - daily/         14d × 1.5  (aggressive — freshness IS the signal)
  - meetings/      60d × 1.0
  - chat/          7d × 1.0
  - media/x/       7d × 1.5
  - media/articles/ 90d × 0.5
  - people/companies/ 365d × 0.3
  - deals/         180d × 0.5

DEFAULT_FALLBACK: 90d × 0.5 for unmatched slugs.

Override priority: defaults < gbrain.yml recency: < env (GBRAIN_RECENCY_DECAY)
< per-call SearchOpts.recency_decay.

parseRecencyDecayEnv format: comma-separated prefix:halflifeDays:coefficient
triples. Refuses LOUD on parse error (RecencyDecayParseError) — codex
pass-2 #M3 finding. No silent fallback like source-boost's parser.
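
A sketch of the parser contract (shapes assumed from this message):

```typescript
// Refuses loudly on any malformed triple — no silent fallback.
class RecencyDecayParseError extends Error {}

interface Decay { halflifeDays: number; coefficient: number }

function parseRecencyDecayEnv(raw: string): Map<string, Decay> {
  const out = new Map<string, Decay>();
  for (const triple of raw.split(",")) {
    const parts = triple.split(":");
    if (parts.length !== 3) throw new RecencyDecayParseError(`bad triple: ${triple}`);
    const [prefix, h, c] = parts;
    const halflifeDays = Number(h);
    const coefficient = Number(c);
    if (!prefix || Number.isNaN(halflifeDays) || Number.isNaN(coefficient)) {
      throw new RecencyDecayParseError(`bad triple: ${triple}`);
    }
    out.set(prefix, { halflifeDays, coefficient });
  }
  return out;
}
```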

parseRecencyDecayYaml takes already-parsed YAML; throws on bad shape.

buildRecencyComponentSql in sql-ranking.ts emits a CASE expression with
longest-prefix-first ordering, evergreen short-circuit (literal 0 when
halflifeDays=0 or coefficient=0), and EXTRACT(EPOCH ...) for non-zero
branches. Output: ((CASE WHEN p.slug LIKE 'daily/%' THEN 1.5 * 14.0 /
(14.0 + EXTRACT(EPOCH FROM (NOW() - <dateExpr>))/86400.0) ... END))

Typed NowExpr enum prevents SQL injection (codex pass-1 #5). Tests pass
{ kind: 'fixed', isoUtc } for deterministic output; production NOW().
The 'fixed' branch escapes single quotes via escapeSqlLiteral.
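
A toy version of the CASE emitter shows the three behaviors — longest-prefix-
first ordering, evergreen short-circuit, and the hyperbolic decay branch
(NowExpr handling and quote escaping omitted; names are assumptions):

```typescript
interface Decay { halflifeDays: number; coefficient: number }

function buildRecencyCaseSql(decayMap: Map<string, Decay>, dateExpr: string): string {
  const branches = [...decayMap.entries()]
    .sort((a, b) => b[0].length - a[0].length) // longest prefix first, so 'media/x/' wins over 'media/'
    .map(([prefix, d]) => {
      const body =
        d.halflifeDays === 0 || d.coefficient === 0
          ? "0" // evergreen short-circuit: literal 0, no EXTRACT
          : `${d.coefficient} * ${d.halflifeDays}.0 / (${d.halflifeDays}.0 + EXTRACT(EPOCH FROM (NOW() - ${dateExpr}))/86400.0)`;
      return `WHEN p.slug LIKE '${prefix}%' THEN ${body}`;
    });
  return `(CASE ${branches.join(" ")} ELSE 0 END)`;
}
```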

25 unit tests covering: env parser shape, env error cases, yaml parser
shape, merge precedence (defaults < yaml < env < caller), CASE
longest-prefix-first ordering, evergreen short-circuit, NowExpr fixed/now,
single-quote injection defense, empty decayMap fallback path, default
map composition (no fork names, concepts/ evergreen, daily/ aggressive).

Pure module. Zero consumers in this commit; commit 6 wires it into
getRecentSalience, commit 10 wires it into the post-fusion stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: refactor getRecentSalience to consume buildRecencyComponentSql

Both engines (Postgres + PGLite) now build the salience formula's third
term via buildRecencyComponentSql instead of inlining 1.0 / (1 + days_old).
Parameters: empty decayMap + fallback { halflifeDays: 1, coefficient: 1.0 }.
Math expands to 1 * 1.0 / (1.0 + days_old) = 1 / (1 + days_old) — same
numeric output as v0.29.0.

This is a no-behavior-change refactor preparing for commit 7's recency_bias
param. recency_bias='flat' (default) reproduces v0.29.0 exactly; 'on'
swaps in DEFAULT_RECENCY_DECAY for per-prefix decay.

Single source of truth for the recency math: same builder feeds the
salience query AND (in commit 10) the post-fusion applyRecencyBoost stage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: get_recent_salience gains recency_bias param (default 'flat')

SalienceOpts.recency_bias: 'flat' | 'on' added; default 'flat' preserves
v0.29.0 ranking verbatim. Pass 'on' to opt into per-prefix decay map
(concepts/originals/writing/ evergreen; daily/, media/x/, chat/ aggressive
decay).

When recency_bias='on', the salience query reads
COALESCE(p.effective_date, p.updated_at) instead of bare p.updated_at, so
the recency component is immune to auto-link updated_at churn — old
concepts/ pages just-touched by auto-link don't suddenly look fresh.

Both engines (Postgres + PGLite) wire the param through. resolveRecencyDecayMap()
honors gbrain.yml + GBRAIN_RECENCY_DECAY env at runtime.

MCP op surface: get_recent_salience gains the param with a load-bearing
description teaching the agent when to use 'on' vs 'flat' (current state →
on; mattering across all time → flat).

No silent v0.29.0 behavior change — opt-in only (per D11 codex resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: recompute_emotional_weight writes salience_touched_at; window picks up newly-salient pages

setEmotionalWeightBatch on both engines now bumps salience_touched_at to
NOW() ONLY when the new emotional_weight differs from the existing one
(IS DISTINCT FROM, NULL-safe). No-op writes (same weight) leave the
column alone — preserves "actual change" semantics.

getRecentSalience window changes from
  WHERE p.updated_at >= boundary
to
  WHERE GREATEST(p.updated_at, COALESCE(p.salience_touched_at, p.updated_at)) >= boundary

Closes codex pass-1 finding #4: pages whose emotional_weight just changed
in the dream cycle (because tags or takes shifted) but whose updated_at
is older than the salience window now correctly enter the recent-salience
results. Without this, "Garry just added a take to a 6-month-old page"
stayed invisible to get_recent_salience until the next content edit.

COALESCE(salience_touched_at, p.updated_at) handles pre-v0.29.1 rows
where salience_touched_at is NULL — they fall back to p.updated_at and
behave identically to v0.29.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: merge intent.ts → query-intent.ts; emit 3 suggestions per query

D1 + D4 + D6 + D8: single regex-pass classifier returning
{intent, suggestedDetail, suggestedSalience, suggestedRecency}.

intent + suggestedDetail are v0.29.0 behavior verbatim (legacy intent.ts
deleted; classifyQueryIntent + autoDetectDetail compat shims preserved).

NEW for v0.29.1 — two orthogonal recency-axis suggestions:

  suggestedSalience: 'off' | 'on' | 'strong'
  suggestedRecency:  'off' | 'on' | 'strong'

Resolution rules (per D6 narrow temporal-bound exception):
  - CANONICAL patterns (who is X / what is Y / code / graph) → both off
  - UNLESS an EXPLICIT_TEMPORAL_BOUND also matches (today / right now /
    this week / since X / last N days), in which case temporal-bound wins
  - STRONG_RECENCY (today / right now / this morning / just now) → strong
  - RECENCY_ON (latest / recent / this week / meeting prep / catch up
    / remind me / status update) → on
  - SALIENCE_ON (catch up / remind me / status update / prep me /
    what's going on / what matters) → on
  - default → off for both axes (v0.29.1 prime-directive: pure opt-in)

Salience and recency are TRULY orthogonal (per D9). A query like
"latest news on AI" → recency='on' but salience='off' (the user wants
fresh, not emotionally-weighted). "What's going on with widget-co" →
both on. "Who is X right now" → both 'strong'/'on' (temporal bound
beats canonical 'who is').
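
Compressed sketch of the two new axes (pattern lists abbreviated; the real
classifier is a single regex pass that also returns intent + suggestedDetail):

```typescript
type Axis = "off" | "on" | "strong";

const CANONICAL = /\b(who is|what is)\b/i;
const EXPLICIT_TEMPORAL_BOUND = /\b(today|right now|this week|last \d+ days)\b/i;
const STRONG_RECENCY = /\b(today|right now|this morning|just now)\b/i;
const RECENCY_ON = /\b(latest|recent|this week|catch up|remind me|status update)\b/i;
const SALIENCE_ON = /\b(catch up|remind me|status update|what's going on|what matters)\b/i;

function suggestAxes(q: string): { salience: Axis; recency: Axis } {
  // Canonical patterns force both off — unless an explicit temporal bound wins.
  if (CANONICAL.test(q) && !EXPLICIT_TEMPORAL_BOUND.test(q)) return { salience: "off", recency: "off" };
  const recency: Axis = STRONG_RECENCY.test(q) ? "strong" : RECENCY_ON.test(q) ? "on" : "off";
  const salience: Axis = SALIENCE_ON.test(q) ? "on" : "off"; // orthogonal to recency
  return { salience, recency }; // default: both off — pure opt-in
}
```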

intent.ts deleted; test/intent.test.ts renamed → test/query-intent-legacy.test.ts
(unchanged behavior coverage). New test/query-intent.test.ts adds 21
cases covering all three axes' interactions: canonical wins on bare
'who is', temporal bound overrides, "catch me up" matches with up to 15
chars between, "today" → strong, intent vs recency independence.

Updated callers:
  - src/core/search/hybrid.ts (autoDetectDetail import)
  - test/recency-boost.test.ts (classifyQueryIntent import)
  - test/benchmark-search-quality.ts (autoDetectDetail import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: applySalienceBoost + applyRecencyBoost + runPostFusionStages wrapper

D9 + codex pass-1 #2 + #3 + pass-2 #4: salience and recency are TRULY
ORTHOGONAL post-fusion stages, both running from ALL THREE hybridSearch
return paths (keyword-only, embed-failure-fallback, full-hybrid).

NEW src/core/search/hybrid.ts exports:
  - applySalienceBoost(results, scores, strength)
      score *= 1 + k * log(1 + salience) where k = 0.15 (on) or 0.30 (strong)
      No time component. Pure mattering signal.
  - applyRecencyBoost(results, dates, strength, decayMap, fallback, nowMs?)
      Per-prefix decay factor: 1 + strengthMul * coefficient * halflife / (halflife + days_old)
      strengthMul: 1.0 (on) or 1.5 (strong)
      Evergreen prefixes (halflifeDays=0) skipped (factor 1.0).
      Pure recency signal. Independent of mattering.
  - runPostFusionStages(engine, results, opts)
      Wraps backlink + salience + recency. Called from EACH return path so
      keyless installs and embed failures get the same boost surface as
      the full hybrid path.

NEW engine methods (composite-keyed for multi-source isolation):
  - getEffectiveDates(refs: Array<{slug, source_id}>): Map<key, Date>
      Returns COALESCE(effective_date, updated_at, created_at). Key format:
      `${source_id}::${slug}`. Mirror of getBacklinkCounts shape.
  - getSalienceScores(refs: Array<{slug, source_id}>): Map<key, number>
      Returns emotional_weight × 5 + ln(1 + take_count). Composite key.
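
The two boost factors as pure math, with the constants quoted above (a
sketch, not the shipped implementation):

```typescript
type Strength = "on" | "strong";

// Salience: multiplier from the mattering score alone — no time component.
function salienceFactor(salienceScore: number, strength: Strength): number {
  const k = strength === "strong" ? 0.3 : 0.15;
  return 1 + k * Math.log(1 + salienceScore);
}

// Recency: per-prefix hyperbolic decay; evergreen prefixes (halflife 0) untouched.
function recencyFactor(
  daysOld: number,
  halflifeDays: number,
  coefficient: number,
  strength: Strength,
): number {
  if (halflifeDays === 0 || coefficient === 0) return 1.0;
  const strengthMul = strength === "strong" ? 1.5 : 1.0;
  return 1 + strengthMul * coefficient * halflifeDays / (halflifeDays + daysOld);
}
```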

Deprecated (kept for back-compat through v0.29.x):
  - SearchOpts.afterDate / beforeDate (alias for since/until)
  - SearchOpts.recencyBoost: 0|1|2 (alias for recency: 'off'|'on'|'strong')
  - getPageTimestamps (use getEffectiveDates instead)

NEW SearchOpts fields:
  - salience: 'off' | 'on' | 'strong'
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative, replaces afterDate)
  - until:    string (replaces beforeDate)

Resolution: caller-explicit > legacy alias (recencyBoost) > heuristic
(classifyQuery's suggestedSalience / suggestedRecency).

Deleted: src/core/search/recency.ts (from PR #618, now replaced) +
test/recency-boost.test.ts (its scope is covered by query-intent.test.ts +
future post-fusion tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

* v0.29.1: query op gains salience + recency + since + until params; PGLite since/until parity

Combines commits 12 + 13 of the plan.

Query op surface (src/core/operations.ts):
  - salience: 'off' | 'on' | 'strong' (with load-bearing description)
  - recency:  'off' | 'on' | 'strong'
  - since:    string (ISO-8601 or relative; replaces deprecated afterDate)
  - until:    string (replaces deprecated beforeDate)

Tool descriptions teach the calling agent:
  - salience axis = mattering, no time component
  - recency axis = age decay, no mattering signal
  - omit either to let gbrain auto-detect from query text via classifyQuery

hybrid.ts maps since/until → afterDate/beforeDate at the engine call
boundary so PR #618's existing engine plumbing keeps working without
rename. Codex pass-1 #10 finding closed.
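The boundary mapping is mechanical; a minimal sketch with illustrative type names (not the real SearchOpts):

```typescript
// Sketch of the hybrid.ts boundary mapping: the public op speaks
// since/until, the engine still speaks afterDate/beforeDate.
type OpDateOpts = { since?: string; until?: string };
type EngineDateOpts = { afterDate?: string; beforeDate?: string };

function toEngineDates(opts: OpDateOpts): EngineDateOpts {
  return { afterDate: opts.since, beforeDate: opts.until };
}
```

Keeping the rename at one call boundary means the engine plumbing from PR #618 never has to change, and the deprecated aliases can be dropped later by deleting one mapping site.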

PGLite engine (codex pass-1 #10): since/until parity added to all three
search methods (searchKeyword, searchKeywordChunks, searchVector). SQL
filter against COALESCE(p.effective_date, p.updated_at, p.created_at)
so date filtering matches user content-date intent (a meeting was on
event_date, not when it got reimported). Filter is applied INSIDE the
HNSW inner CTE in searchVector so HNSW's candidate pool already
excludes out-of-range pages — preserves pagination contract.

This also closes existing cross-engine drift: pre-v0.29.1 Postgres had
afterDate/beforeDate from PR #618; PGLite had nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: migration v39 — eval_candidates capture columns for replay reproducibility

D11 codex pass-2 resolution: extend eval_candidates with 7 new nullable
columns so `gbrain eval replay` can reproduce captured runs of agent-explicit
salience + recency choices.

Without these columns, replays of the new axis params drift. The live
behavior depends on the resolved {salience, recency} values; v0.29.0's
schema doesn't capture them.

  as_of_ts            TIMESTAMPTZ  — brain's logical NOW at capture
                                     (replay uses this instead of wall-clock)
  salience_param      TEXT         — what the caller passed (NULL if omitted)
  recency_param       TEXT         — same
  salience_resolved   TEXT         — final value applied
  recency_resolved    TEXT         — same
  salience_source     TEXT         — 'caller' or 'auto_heuristic'
  recency_source      TEXT         — same

All nullable + additive. Pre-v0.29.1 rows stay valid. NDJSON
schema_version STAYS at 1 — consumers ignore unknown fields (codex
pass-1 #C2 dissolves; no cross-repo coordination needed).

ADD COLUMN with no DEFAULT is metadata-only on PG 11+ and PGLite —
instant on tables of any size.

src/schema.sql + src/core/pglite-schema.ts mirror the additions for fresh
installs; src/core/schema-embedded.ts regenerated. eval_capture.ts
populates the new fields in commit 16 (docs + ship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: doctor checks — effective_date_health + salience_health

effective_date_health: sample-1000 scan detects three classes of
problems (codex pass-1 #5 resolution via the effective_date_source
sentinel column added in commit 1):

  fallback_with_fm_date  — page fell back to updated_at even though
                           frontmatter has parseable event_date / date /
                           published. The "wrong but populated" residual
                           that earlier review iterations missed.
  future_dated            — effective_date > NOW() + 1 year (corrupt
                            or typo'd century).
  pre_1990                — effective_date < 1990-01-01 (epoch math gone
                            wrong, bad parse).

Sample of last 1000 pages by default — fast on 200K-page brains. Fix
hint: gbrain reindex-frontmatter.
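The two purely date-based classes can be sketched as a classifier (thresholds from the description above; the fallback_with_fm_date class additionally needs the effective_date_source sentinel and frontmatter, so it is omitted here):

```typescript
// Sketch of the date-sanity classification in effective_date_health.
// Names and shape are illustrative, not the actual doctor-check code.
type DateAnomaly = 'future_dated' | 'pre_1990' | null;

function classifyEffectiveDate(effective: Date, now: Date): DateAnomaly {
  const oneYearAhead = new Date(now);
  oneYearAhead.setFullYear(oneYearAhead.getFullYear() + 1);
  if (effective > oneYearAhead) return 'future_dated';   // typo'd century etc.
  if (effective < new Date('1990-01-01')) return 'pre_1990'; // epoch math gone wrong
  return null;
}
```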

salience_health: detects pages with active takes whose emotional_weight
is still 0 (recompute_emotional_weight phase hasn't run since the
take landed). Reports the brain's non-zero emotional_weight count as
an informational baseline. Fix hint: gbrain dream --phase
recompute_emotional_weight.

Both checks gracefully skip on pre-v0.29.1 brains (column doesn't
exist → 42703) without surfacing as warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29.1: docs + skills convention + CHANGELOG + version bump

- VERSION 0.29.0 → 0.29.1
- package.json version bump
- CHANGELOG.md: full release-summary + itemized + "To take advantage"
  block per the project's voice rules. Two-line headline + concrete
  pathology framing (existing callers unchanged; new axes opt-in;
  agent in charge per the prime directive).
- skills/conventions/salience-and-recency.md: agent-readable decision
  rules. "Current state → on. Canonical truth → off." plus the narrow
  temporal-bound exception. Cross-cutting convention propagates to
  brain skills via RESOLVER.md.
- skills/migrations/v0.29.1.md: agent-readable upgrade instructions.
  Verify steps + behavior-change reference + recovery commands.

The build-time tool-description generator from D2 (extract decision
tables from skills/conventions/salience-and-recency.md, embed into
operations.ts at build time) is deferred to a follow-up commit. The
tool descriptions on the query op + get_recent_salience are inline in
operations.ts for v0.29.1; the auto-gen + CI staleness gate land in
v0.29.2 if drift becomes a problem in practice.

148 unit tests pass across the v0.29.1 surface (effective-date,
recency-decay, query-intent, migrate, salience, recompute-emotional-weight).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Wintermute <wintermute@garrytan.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 master-rebase fixups: renumber + drift cleanup

- v0.29.1 migrations renumber v38/v39 → v41/v42 (master shipped takes_table at
  v37 + access_tokens_permissions at v38; v0.27.1 took v39). My v0.29.0
  emotional_weight slots in at v40; v0.29.1's pages_recency_columns lands at
  v41 and eval_candidates_recency_capture at v42.
- src/core/utils.ts comment refs updated v37 → v40 (emotional_weight) and
  v38 → v41 (effective_date/etc).
- test/brain-allowlist.test.ts: size assertion 11 → 13 + the new
  get_recent_salience / find_anomalies positive checks + the explicit
  get_recent_transcripts negative check (v0.29 added the salience pair to
  the allow-list; transcripts are deliberately excluded because all
  subagent calls have remote=true and the v0.29 trust gate rejects them —
  visibility would be a footgun).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixups: privacy allow-list + cycle phase count + migration plan

Three CI test failures on PR #730, all caused by master-side state the
v0.29 cherry-picks didn't yet account for:

1. scripts/check-privacy.sh allow-lists test/recency-decay.test.ts
   The v0.29.1 recency-decay test asserts that DEFAULT_RECENCY_DECAY's
   keys do NOT include fork-specific path prefixes. Because the assertion
   has to name the banned tokens to assert their absence, the privacy
   guard flagged the literal occurrence. Same exception class as
   CHANGELOG.md, CLAUDE.md, and scripts/check-privacy.sh itself —
   meta-rule enforcement requires mentioning what the rule forbids.

2. test/core/cycle.serial.test.ts: 9 → 10 phases.
   The yieldBetweenPhases test was written for v0.26.5 (9 phases incl.
   purge). v0.29 added a 10th phase (recompute_emotional_weight)
   between patterns and embed; the test's expected hookCalls and
   report.phases.length needed bumping.

3. test/apply-migrations.test.ts: append '0.29.1' to skippedFuture lists.
   v0.29.1 added a new entry to src/commands/migrations/index.ts; the
   buildPlan test snapshots the exact ordered list of versions, so it
   needs the new entry in both the fresh-install case and the Codex H9
   regression case.

All three verified locally:
  - bash scripts/check-privacy.sh → exit 0
  - bun test test/apply-migrations.test.ts → 18/18 pass
  - bun test test/core/cycle.serial.test.ts → 28/28 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixup: regenerate llms-full.txt to match CLAUDE.md state

build-llms test asserts the committed llms.txt + llms-full.txt match
what the generator produces from the current source tree. CLAUDE.md
got new v0.29 Key Files entries (recompute_emotional_weight phase,
emotional-weight formula, anomaly stats, transcripts library, salience
ops, etc.) without a corresponding regen. `bun run build:llms` brings
llms-full.txt back in sync; llms.txt is byte-for-byte identical so
only the larger inline bundle changed.

Verified locally: bun test test/build-llms.test.ts → 7/7 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 e2e: cover tool-surfaces + MCP dispatch path

Two gaps were uncovered when reviewing v0.29 coverage against the new
contracts the cherry-picks landed onto master.

1. test/v0_29-tool-surfaces.test.ts (unit, 9 cases)

   Existing tests pin the description constants module and the
   BRAIN_TOOL_ALLOWLIST set membership, but nothing checked the two
   filters that ACT on those constants:

   - serve-http.ts:745 filters operations by !op.localOnly to build the
     HTTP MCP tool list. Without a test, anyone removing `localOnly: true`
     from get_recent_transcripts would silently expose it to remote
     callers, leaving the in-handler ctx.remote check as the only guard
     instead of defense-in-depth. Now pinned: get_recent_transcripts is
     hidden, salience + anomalies stay visible.

   - buildBrainTools surfaces the v0.29 ops as `brain_get_recent_salience`
     and `brain_find_anomalies`, and EXCLUDES `brain_get_recent_transcripts`
     (codex C3 footgun gate — all subagent calls are remote=true, the op
     would always reject). Now pinned.

   Both filters are pure functions; no DB / engine.connect needed.
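The first filter is a one-line predicate over op metadata; a sketch with stand-in shapes (the real Op type carries more fields):

```typescript
// Sketch of the remote-visibility filter the test pins: ops marked
// localOnly are dropped from the HTTP MCP tool list.
type Op = { name: string; localOnly?: boolean };

function remoteVisibleOps(ops: Op[]): string[] {
  return ops.filter((op) => !op.localOnly).map((op) => op.name);
}
```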

2. test/e2e/v0_29-mcp-dispatch-pglite.test.ts (e2e, 5 cases)

   Existing v0.29 e2e tests call engine methods directly. None went
   through the full dispatchToolCall pipeline that stdio MCP and HTTP
   MCP both use. The new file covers:

   - get_recent_salience returns ranked rows via dispatch (top result
     is the wedding-tagged page from the seeded fixture).
   - find_anomalies returns the AnomalyResult shape via dispatch.
   - get_recent_transcripts rejects with permission_denied when
     ctx.remote === true (the in-handler trust gate is the last line if
     localOnly ever drops).
   - get_recent_transcripts succeeds with ctx.remote === false (CLI
     path) and returns [] when no corpus dir is configured.
   - Unknown tool name returns the standard isError + "Unknown tool"
     envelope (regression guard for dispatch shape).

Verified locally — all 14 cases pass:
  bun test test/v0_29-tool-surfaces.test.ts                          → 9 pass
  bun test test/e2e/v0_29-mcp-dispatch-pglite.test.ts                → 5 pass

Re-ran the full v0.29 PGLite e2e suite to confirm no regressions:
  salience-pglite.test.ts                       5 pass
  anomalies-pglite.test.ts                      4 pass
  cycle-recompute-emotional-weight-pglite.test  3 pass
  list-pages-regression.test.ts                 6 pass
  multi-source-emotional-weight-pglite.test     4 pass
  backfill-perf-pglite.test.ts                  1 pass
  v0_29-mcp-dispatch-pglite.test.ts             5 pass
  -----
  Total: 28 pass / 0 fail
  Postgres parity test (DATABASE_URL gated)     7 skip (correct)
  LLM routing eval (ANTHROPIC_API_KEY gated)   12 skip (correct)
  bun run typecheck                             clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.29 CI fixup: drop unused PGLiteEngine in tool-surfaces test

scripts/check-test-isolation.sh's R3 + R4 lints flagged the new
test/v0_29-tool-surfaces.test.ts for instantiating PGLiteEngine outside
a beforeAll() block (R3) and lacking the matching afterAll(disconnect)
(R4). The intent of those rules is to prevent engine leaks across the
shard process — every PGLiteEngine must follow the canonical
beforeAll(connect+initSchema) / afterAll(disconnect) pattern.

The fix here is upstream of the rule, not a workaround: this test never
needed an engine. buildBrainTools doesn't issue any SQL at registry-build
time — it only reads `engine.kind` for the put_page namespace-wrap
branch. A `{ kind: 'pglite' } as unknown as BrainEngine` fake-engine
literal keeps the test pure-function: no WASM cold-start, no connect
lifecycle, no test-isolation rule fired.
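The fake-engine pattern in question, sketched with a stand-in interface (BrainEngine here is not the real surface, just enough to show the cast):

```typescript
// Sketch of the fake-engine literal: buildBrainTools only reads
// engine.kind at registry-build time, so a structural stand-in suffices.
interface BrainEngine {
  kind: string;
  connect(): Promise<void>;
  // ...rest of the real surface, unused at registry-build time
}

const fakeEngine = { kind: 'pglite' } as unknown as BrainEngine;

// Only the discriminant is ever read, e.g. for the put_page
// namespace-wrap branch (hypothetical helper name):
function namespaceWrapNeeded(engine: BrainEngine): boolean {
  return engine.kind === 'pglite';
}
```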

Verified locally:
  bash scripts/check-test-isolation.sh → OK (257 non-serial unit files)
  bun test test/v0_29-tool-surfaces.test.ts → 9 pass
  bun run typecheck → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Wintermute <wintermute@garrytan.com>
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 10, 2026
… (EXP-5)

Reproducible cross-modal quality eval for the takes layer. Three frontier
models score a sample against the 5-dim rubric, the runner aggregates to
PASS/FAIL/INCONCLUSIVE, the receipt persists to eval_takes_quality_runs.
Trend mode segregates by rubric_version; regress mode is a CI gate that
exits 1 when any dim regresses past --threshold.

Subcommands:
  run     [--limit N --cycles N --budget-usd N --slug-prefix P --models a,b,c]
  replay  <receipt-path> [--json]                 # NO BRAIN required
  trend   [--limit N --rubric-version V --json]
  regress --against <receipt> [--threshold T --json]

Codex review integrations (D7 — all 10 findings landed):

  garrytan#1 json-repair shim re-exports BOTH parseModelJSON AND the
     ParsedScore + ParsedModelResult types. The original plan only
     re-exported the function, which would have compile-broken
     cross-modal-eval/aggregate.ts:19's type import.

  garrytan#3 Receipt name binds (corpus_sha8, prompt_sha8, models_sha8,
     rubric_sha8) so a future rubric tweak segregates trend rows
     instead of silently corrupting the quality-over-time graph.
     RUBRIC_VERSION + rubric_sha8 are persisted in every receipt.

  garrytan#4 Pricing fail-closed: any model not in pricing.ts produces an
     actionable PricingNotFoundError before any HTTP call fires.
     Same drift problem as cross-modal-eval/runner.ts:estimateCost(),
     but explicit instead of silent zero.

  garrytan#5 Aggregate requires ALL 5 declared rubric dimensions per model.
     Cross-modal-eval v1's union-of-whatever-parsed pattern allowed a
     model to omit a dim and still PASS — that's a regression-gate
     hole. Now: missing-dim drops the contribution, treated identically
     to a parse failure. Empty-scores PASS regression guard preserved.

  garrytan#6 DB-authoritative receipt persistence. Original two-phase plan had
     a split-brain reconciliation gap (disk-success/DB-fail vanishes
     from trend; DB-success/disk-fail unreplayable). Now DB row is the
     source of truth (carries full receipt JSON in a JSONB column);
     disk artifact is best-effort. replay reads disk first; loadReceiptFromDb
     reconstructs from DB when the disk file is missing.

  garrytan#10 Brain-routing: replay is the only sub-subcommand that doesn't
      need a brain. cli.ts no-DB bypass routes "eval takes-quality replay"
      directly to runReplayNoBrain, which exits 0/1/2 cleanly without
      ever touching the engine. Other modes go through connectEngine.

Files added:
  src/core/eval-shared/json-repair.ts (hoisted from cross-modal-eval)
  src/core/takes-quality-eval/{rubric,pricing,aggregate,receipt-name,
                                receipt-write,receipt,replay,regress,trend,runner}.ts
  src/commands/eval-takes-quality.ts
  docs/eval-takes-quality.md (stable schema_version: 1 contract)
  10 test files (83 cases — aggregate / receipt-name / shim / pricing /
                 rubric / receipt-write / replay / trend / regress / cli)

Files modified:
  src/cli.ts: replay no-DB bypass + engine-required dispatch
  src/core/cross-modal-eval/json-repair.ts → re-export shim
  src/core/migrate.ts: append v47 (eval_takes_quality_runs table)
  src/core/pglite-schema.ts + src/schema.sql: mirror the v47 table for
    fresh-install path. RLS toggled on the new table.
  src/core/schema-embedded.ts: regenerated via build:schema
  test/migrate.test.ts: 6 structural cases for v47

186 tests pass; typecheck clean. Replay verified working end-to-end
(reads receipt JSON file without DATABASE_URL, exits with the verdict
code, prints actionable error on missing file).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>