
SQLite integration #6

Closed
anhldbk wants to merge 11 commits into garrytan:master from anhldbk:master

Conversation

@anhldbk anhldbk commented Apr 8, 2026

Overview

GBrain is a personal knowledge brain backed by Postgres + pgvector. This work adds SQLite as a zero-infrastructure alternative engine using bun:sqlite (built-in, no dependencies), FTS5 for full-text search, and vec0 for optional vector similarity — all behind the same BrainEngine interface so every command works identically regardless of backend.

What was built:

  • SQLiteEngine — full implementation of all 30 BrainEngine methods: pages, chunks, links, tags, timeline, raw data, versions, files, ingest log, config, stats, health, and graph traversal
  • SQLite schema — DDL with FTS5 virtual table and three sync triggers (insert/update/delete) to keep full-text search current automatically; WAL mode enabled at connect time
  • vec0 integration — loads the native extension at connect, creates a parallel chunks_vec vector table when available, degrades gracefully to keyword-only when not
  • gbrain init --sqlite — initializes a local brain at ~/.gbrain/brain.db (or a custom path)
  • Engine selection — CLI reads config.engine and routes to SQLite or Postgres transparently
  • Shared utilities — extracted validateSlug and contentHash from Postgres engine into src/core/utils.ts, fixing a slug validation regex bug along the way
  • File commands fix — gbrain files list/upload/sync/verify previously bypassed the engine entirely via a direct DB connection; refactored to use BrainEngine file methods that work with both backends
  • Hybrid search fix — hybridSearch unconditionally called embed() before keyword search; without an OpenAI API key this threw and skipped results entirely; now degrades gracefully to keyword-only
  • Docker fallback — Dockerfile.sqlite for environments where the vec0 native extension needs to be pre-installed
  • 39 new tests across utils.test.ts, sqlite-engine.test.ts, and fts5-query.test.ts
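The FTS5 sync-trigger pattern described above can be sketched as follows. This is a minimal illustration, not the PR's actual schema: the table and column names (`chunks`, `content`) are assumptions.

```typescript
// Hypothetical sketch of an external-content FTS5 table kept current by
// three triggers (insert/update/delete), as described in the schema bullet.
export const FTS_DDL = `
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
  content,
  content='chunks',      -- external-content mode: FTS stores no duplicate text
  content_rowid='id'
);

-- Three triggers keep the full-text index in lockstep with the base table.
CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
  INSERT INTO chunks_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
  INSERT INTO chunks_fts(chunks_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
END;
CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
  INSERT INTO chunks_fts(chunks_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
  INSERT INTO chunks_fts(rowid, content) VALUES (new.id, new.content);
END;
`;
```

With external-content tables, deletes and updates must use FTS5's special `'delete'` command row, since the index holds only derived tokens.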

Disclaimer

I built this with Claude Opus.

anhldbk and others added 11 commits April 8, 2026 06:55
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the complete BrainEngine interface using bun:sqlite (no npm
deps). Includes FTS5 keyword search, optional vec0 vector search,
pages CRUD with upsert/slug validation, tags, links, graph traversal,
timeline, versions, raw data, ingest log, config, and stats/health.
All 12 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion

files.ts was calling db.getConnection() directly, bypassing the engine
abstraction. This broke all files subcommands when using SQLiteEngine.

Added getFiles/upsertFile/findFileByHash to BrainEngine interface and
implemented in both PostgresEngine and SQLiteEngine.
garrytan added a commit that referenced this pull request Apr 10, 2026
…hema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.
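The build-time embedding step can be sketched like this. A hypothetical generator, not the PR's actual `build-schema.sh`; the function and constant names are assumptions.

```typescript
// Sketch: turn schema.sql into an importable TypeScript constant so the
// engine needs no fs access at runtime (works in compiled binaries and edge runtimes).
function embedSchema(sql: string): string {
  // JSON.stringify safely escapes quotes, backslashes, and newlines.
  return (
    "// AUTO-GENERATED - do not edit. Regenerate with `bun run build:schema`.\n" +
    `export const SCHEMA_SQL = ${JSON.stringify(sql)};\n`
  );
}
```

The engine then imports `SCHEMA_SQL` instead of calling `readFileSync`, keeping schema.sql as the single source of truth.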

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@garrytan garrytan mentioned this pull request Apr 10, 2026
garrytan added a commit that referenced this pull request Apr 11, 2026
* fix: 7 bug fixes from Issue #9 and #22

- fix(mcp): use ListToolsRequestSchema/CallToolRequestSchema instead of string literals (Issue #9, PR #25)
- fix(mcp): handleToolCall reads dry_run from params instead of hardcoding false (#22 Bug #11)
- fix(search): keyword search returns best chunk per page via DISTINCT ON, not all chunks (#22 Bug #8)
- fix(search): dedup layer 1 keeps top 3 chunks per page instead of collapsing to 1 (#22 Bug #12)
- fix(engine): transaction uses scoped engine via Object.create, no shared state mutation (#22 Bug #2)
- fix(engine): upsertChunks uses UPSERT instead of DELETE+INSERT, preserves existing embeddings (#22 Bug #1)
- fix(slugs): validateSlug normalizes to lowercase, pathToSlug lowercases consistently (#22 Bug #4)
- schema: add unique index on content_chunks(page_id, chunk_index) for UPSERT support
- schema: add access_tokens and mcp_request_log tables via migration
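The scoped-engine transaction fix (#22 Bug #2) can be illustrated with a simplified synchronous sketch. The `Engine` shape and names are assumptions, not the real interface.

```typescript
// Sketch of transactions via Object.create: the transaction gets a throwaway
// view of the engine whose `sql` is bound to the tx connection, so the shared
// engine object is never mutated and concurrent callers are unaffected.
interface Engine {
  sql: (q: string) => string;
}

function transaction<T>(
  engine: Engine,
  txSql: (q: string) => string,
  fn: (tx: Engine) => T
): T {
  // scoped inherits everything from engine but shadows `sql` locally.
  const scoped: Engine = Object.create(engine, { sql: { value: txSql } });
  return fn(scoped);
}
```

The real implementation is async and wraps BEGIN/COMMIT; the point here is only the no-shared-state-mutation pattern.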

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: embed schema.sql at build time, remove fs dependency from initSchema

initSchema() previously read schema.sql from disk at runtime via readFileSync,
which broke in compiled Bun binaries and Deno Edge Functions. Now uses a
generated schema-embedded.ts constant (run `bun run build:schema` to regenerate).

- Removes fs and path imports from postgres-engine.ts and db.ts
- Adds scripts/build-schema.sh for one-source-of-truth generation
- Adds build:schema npm script

Fixes Issue #22 Bug #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 5 more bug fixes from Issue #22

- fix(file_upload): call storage.upload() in all 3 paths (operation, CLI upload, CLI sync) with rollback semantics (#22 Bug #9)
- fix(import): use atomic index counter for parallel queue instead of array.shift() race, preserve checkpoint on errors (#22 Bug #3)
- fix(s3): replace unsigned fetch with @aws-sdk/client-s3 for proper SigV4 auth, supports R2/MinIO via forcePathStyle (#22 Bug #10)
- fix(redirect): verify remote file exists before deleting local copy, skip files not found in storage (#22 Bug #5)
- deps: add @aws-sdk/client-s3
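The atomic-index-counter fix for the parallel queue (#22 Bug #3) can be sketched as below. Names are illustrative assumptions.

```typescript
// Sketch: workers claim items via a shared cursor incremented synchronously.
// In single-threaded JS, `next++` inside one synchronous step is race-free
// even with many concurrent async workers, unlike shift() interleaved with awaits.
async function runWorkers<T>(
  items: T[],
  workerCount: number,
  handle: (item: T) => Promise<void>
): Promise<void> {
  let next = 0; // shared cursor; never advanced across an await boundary
  const worker = async (): Promise<void> => {
    while (true) {
      const i = next++; // claim an index before any await
      if (i >= items.length) return;
      await handle(items[i]);
    }
  };
  await Promise.all(Array.from({ length: workerCount }, worker));
}
```

Each item is claimed exactly once, so no two workers process the same page and none is skipped.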

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: remote MCP server via Supabase Edge Functions

Deploy GBrain as a serverless remote MCP endpoint on your existing Supabase
instance. One brain, accessible from Claude Desktop, Claude Code, Cowork,
Perplexity Computer, and any MCP client. Zero new infrastructure.

New files:
- supabase/functions/gbrain-mcp/index.ts — Edge Function with Hono + MCP SDK
- supabase/functions/gbrain-mcp/deno.json — Deno import map
- src/edge-entry.ts — curated bundle entry point (excludes fs-dependent modules)
- src/commands/auth.ts — standalone token management (create/list/revoke/test)
- scripts/deploy-remote.sh — one-script deployment
- .env.production.example — 3-value config template

Changes:
- config.ts: lazy-evaluate CONFIG_DIR (no homedir() at module scope)
- schema.sql: add access_tokens + mcp_request_log tables
- package.json: add build:edge script

Auth: bearer tokens via access_tokens table (SHA-256 hashed, per-client, revocable)
Transport: WebStandardStreamableHTTPServerTransport (stateless, Streamable HTTP)
Health: /health endpoint (unauth: 200/503, auth: postgres/pgvector/openai checks)
Excluded from remote: sync_brain, file_upload (may exceed 60s timeout)

Setup: clone, fill .env.production, run scripts/deploy-remote.sh, create token, done.
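The SHA-256 token scheme described above can be sketched as follows. Function and column names are assumptions; the point is that the `access_tokens` table stores only digests, so a leaked table reveals no usable secrets.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Sketch of hashed bearer-token auth: store hashToken(token), never the token.
export function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

export function tokenMatches(presented: string, storedHash: string): boolean {
  const a = Buffer.from(hashToken(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  // Constant-time comparison avoids leaking digest prefixes via timing.
  return a.length === b.length && timingSafeEqual(a, b);
}
```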

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: per-client MCP setup guides

- docs/mcp/DEPLOY.md — deployment walkthrough, auth, troubleshooting, latency table
- docs/mcp/CLAUDE_CODE.md — claude mcp add command
- docs/mcp/CLAUDE_DESKTOP.md — Settings > Integrations (NOT JSON config!)
- docs/mcp/CLAUDE_COWORK.md — remote + local bridge paths
- docs/mcp/PERPLEXITY.md — Perplexity Computer connector setup
- docs/mcp/CHATGPT.md — coming soon (requires OAuth 2.1, P0 TODO)
- docs/mcp/ALTERNATIVES.md — Tailscale Funnel + ngrok self-hosted options

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.0)

GBrain v0.6.0: Remote MCP server via Supabase Edge Functions + 12 bug fixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Remote MCP Server section to README

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: make document-release mandatory in CLAUDE.md, add MCP key files

Post-ship requirements section: document-release is NOT optional. Lists every
file that must be checked on every ship. A ship without updated docs is incomplete.

Also adds remote MCP server files to Key files section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: batch upsertChunks into single statement to prevent deadlocks

The per-chunk UPSERT loop caused deadlocks under parallel workers because
each INSERT ON CONFLICT acquired row-level locks sequentially. Multiple
workers upserting different pages could deadlock on the shared unique index.

Fix: batch all chunks into a single multi-row INSERT ON CONFLICT statement.
One round-trip, one lock acquisition. COALESCE preserves existing embeddings
when the new value is NULL.

Fixes CI failure: "E2E: Parallel Import > parallel import with --workers 4"
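The batched statement can be sketched as a placeholder builder. This is a simplified illustration (three columns only; the real table also carries content), with names taken from the schema bullet above.

```typescript
// Sketch: build one multi-row INSERT ... ON CONFLICT so all row locks are
// acquired in a single pass, instead of one UPSERT per chunk.
function buildChunkUpsert(chunkCount: number): string {
  const rows = Array.from({ length: chunkCount }, (_, i) => {
    const base = i * 3;
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  }).join(", ");
  return `INSERT INTO content_chunks (page_id, chunk_index, embedding)
VALUES ${rows}
ON CONFLICT (page_id, chunk_index)
DO UPDATE SET embedding = COALESCE(EXCLUDED.embedding, content_chunks.embedding)`;
}
```

The COALESCE keeps an existing embedding whenever the incoming value is NULL, matching the preserve-embeddings behavior described above.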

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: advisory lock in initSchema() prevents deadlock on concurrent DDL

When multiple processes call initSchema() concurrently (e.g., test setup +
CLI subprocess, or parallel workers during E2E tests), the schema SQL's
DROP TRIGGER + CREATE TRIGGER statements acquire AccessExclusiveLock on
different tables, causing deadlocks.

Fix: pg_advisory_lock(42) serializes all initSchema() calls within the
same database. The lock is session-scoped and released in a finally block.
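The lock-and-release flow can be sketched like this; the `query` signature stands in for the real client and is an assumption.

```typescript
// Sketch: serialize concurrent DDL with a Postgres advisory lock.
async function withSchemaLock<T>(
  query: (sql: string) => Promise<unknown>,
  fn: () => Promise<T>
): Promise<T> {
  await query("SELECT pg_advisory_lock(42)"); // blocks until no peer holds it
  try {
    return await fn();
  } finally {
    // Always release, even if schema creation throws; otherwise the session
    // holds the lock until disconnect and blocks every other initSchema().
    await query("SELECT pg_advisory_unlock(42)");
  }
}
```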

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add explicit test timeouts for CLI subprocess E2E tests

CLI subprocess tests (Setup Journey, Doctor Command, Parallel Import)
spawn `bun run src/cli.ts` which takes several seconds to JIT compile +
connect. The Bun test framework default 5000ms per-test timeout is too
tight for CI. Added 30-60s timeouts matching each subprocess's own
timeout to prevent false failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: infinite recursion in config.ts exported getConfigDir/getConfigPath

The replace_all refactor created recursive functions: the exported
getConfigDir() called the private getConfigDir() which called itself.
Renamed exports to configDir()/configPath() to avoid shadowing.

Also adds scripts/smoke-test-mcp.ts — verified all 8 MCP tool calls
work against a real Postgres database.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan (Owner) commented:

Thank you for this ambitious work on SQLite integration! We've decided to defer the local DB architecture decision (SQLite vs PGLite) to a future release so we can evaluate both approaches properly. The utility extractions and file command fixes in this PR are great ideas we'll revisit. Appreciate the comprehensive implementation!

@garrytan garrytan closed this Apr 11, 2026
anhldbk (Author) commented Apr 14, 2026

@garrytan ah yes, I think PGLite is more suitable. :)

garrytan added a commit that referenced this pull request Apr 15, 2026
- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 15, 2026
* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

skills-conformance.test.ts validates every skill has YAML frontmatter with
required fields, Contract, Anti-Patterns, and Output Format sections, and
manifest.json coverage. resolver.test.ts validates routing table categories,
skill path existence, and manifest-to-resolver coverage. 50 new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add 9 brain skills from Wintermute (Phase 2)

Generalized from Wintermute's battle-tested skills:
- signal-detector: always-on idea+entity capture on every message
- brain-ops: brain-first lookup, read-enrich-write loop, source attribution
- idea-ingest: links/articles/tweets with author people page mandatory
- media-ingest: video/audio/PDF/book with entity extraction (absorbs video/youtube/book)
- meeting-ingestion: transcripts with attendee enrichment chaining
- citation-fixer: audit and fix citation formatting
- repo-architecture: filing rules by primary subject
- skill-creator: create skills with conformance standard + MECE check
- daily-task-manager: task lifecycle with priority levels

All Garry-specific references generalized. Core workflows preserved.
Updated RESOLVER.md and manifest.json.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add operational infrastructure + identity layer (Phase 3)

Operational skills:
- daily-task-prep: morning prep with calendar context and open threads
- cross-modal-review: quality gate via second model with refusal routing
- cron-scheduler: schedule staggering, quiet hours, wake-up override, idempotency
- reports: timestamped reports with keyword routing
- testing: skill validation framework (conformance checks)
- soul-audit: 6-phase interview generating SOUL.md, USER.md, ACCESS_POLICY.md, HEARTBEAT.md
- webhook-transforms: external events to brain signals with dead-letter queue

Identity layer:
- SOUL.md template (agent identity, generated by soul-audit)
- USER.md template (user profile, generated by soul-audit)
- ACCESS_POLICY.md template (4-tier access control)
- HEARTBEAT.md template (operational cadence)
- cross-modal.yaml convention (review pairs, refusal routing chain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md with 24 skills, RESOLVER.md, conventions, templates

GBrain is now a GStack mod for agent platforms. Updated architecture description,
key files listing (16 new skill files, RESOLVER.md, conventions, templates), skills
section (24 skills organized by resolver categories), and testing section (new
conformance and resolver tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add GStack detection + mod status to gbrain init (Phase 4)

After brain initialization, gbrain init now reports:
- Number of skills loaded (from manifest.json)
- GStack detection (checks known host paths, uses gstack-global-discover if available)
- GStack install instructions if not found
- Resolver and soul-audit pointers

Also adds installDefaultTemplates() for SOUL.md/USER.md/ACCESS_POLICY.md/HEARTBEAT.md
deployment, and detectGStack() using gstack-global-discover with fallback to known paths
(DRY: doesn't reimplement GStack's host detection logic).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: v0.10.0 release documentation

- CHANGELOG: 24 skills, signal detector, RESOLVER.md, soul-audit, access control,
  conventions, conformance standard, GStack detection in init
- README: updated skill section with 24 skills, resolver, conventions
- TODOS: added runtime MCP access control (P1)
- VERSION: 0.9.2 → 0.10.0
- package.json + manifest.json version bumped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add skill table to CHANGELOG v0.10.0

16-row table detailing every new skill, what it does, and why it matters.
Written to sell the upgrade, not document the implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restore package.json version after merge conflict resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 -> 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: expand doctor with resolver validation, filesystem-first architecture

Doctor now runs filesystem checks (resolver health, skill conformance) before
connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only
when DB is unavailable. Adds schema_version: 2 to JSON output, composite health
score (0-100), and structured issues array with action strings for agent parsing.

Resolver health check calls checkResolvable() and surfaces actionable fix
instructions. Link integrity check uses engine.getHealth() dead_links count.

CLI routing split: doctor dispatched before connectEngine() so filesystem
checks always run. Fixes Codex-identified blocker where doctor required DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add adaptive load-aware throttling and fail-improve loop

backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem),
exponential backoff with 20-attempt max guard, active hours multiplier (2x
slower during waking hours), concurrent process limit (max 2). Windows-safe:
defaults to "proceed" when os.loadavg returns zeros.
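The capped exponential backoff with an active-hours multiplier can be sketched as below. The constants (base delay, cap) are assumptions, not backoff.ts's actual values.

```typescript
// Sketch: exponential backoff with a 20-attempt guard, an absolute cap,
// and a 2x slowdown during waking hours, as described for backoff.ts.
function backoffDelayMs(attempt: number, activeHours: boolean): number {
  const MAX_ATTEMPTS = 20; // guard: the exponent never grows past this
  const BASE_MS = 1_000;
  const CAP_MS = 5 * 60_000; // assumed ceiling: never wait more than 5 minutes
  const bounded = Math.min(attempt, MAX_ATTEMPTS);
  let delay = Math.min(BASE_MS * 2 ** bounded, CAP_MS);
  if (activeHours) delay *= 2; // 2x slower during active hours
  return delay;
}
```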

fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure
logging. Cascade failure handling: when both paths fail, throws LLM error and
logs both. Log rotation at 1000 entries. Call count tracking for deterministic
hit rate metrics. Auto-generates test cases from successful LLM fallbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcription service and enrichment-as-a-service

transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB
segmented via ffmpeg. Provider auto-detection from env vars. Clear error
messages for missing API keys and unsupported formats.

enrichment-service.ts: Global enrichment service callable from any ingest
pathway. Entity slug generation (people/jane-doe, companies/acme-corp),
mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based
on mention frequency and source diversity), batch enrichment with backoff
throttling, regex-based entity extraction from text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add data-research skill with recipe system, extraction, dedup, tracker

New skill: data-research — one parameterized pipeline for any email-to-
structured-data workflow (investor updates, donations, company metrics).
7-phase pipeline: define recipe, search, classify, extract (with extraction
integrity rule), archive, deduplicate, update tracker.

data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex
extraction (battle-tested patterns), dedup with configurable tolerance,
markdown tracker parsing/appending, quarterly/monthly date windowing,
6-phase HTML email stripping with 500KB ReDoS cap.

Registers data-research in manifest.json (25th skill) and RESOLVER.md.
Fixes backoff test robustness for high-load systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.10.0 infrastructure additions

CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve,
transcription, enrichment-service, data-research), 6 new test files, updated
skill count to 25, test file count to 34.

README.md: updated skill count to 25, added data-research to skills table.

CHANGELOG.md: added Infrastructure section documenting resolver validation,
doctor expansion, adaptive throttling, fail-improve loop, voice transcription,
enrichment service, and data-research skill.

TODOS.md: anonymized personal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: doctor.ts use ES module imports, harden backoff test

Replace require('fs') with ES module import in doctor.ts for consistency
with the rest of the file. Backoff test made resilient to parallel test
execution leaking module-level state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync --watch routing, dead_links parity, doctor command, embed --slugs

- Move sync to CLI_ONLY so --watch flag reaches runSync() (was routed through
  operation layer which only calls performSync single-pass)
- Hide sync_brain from CLI help (MCP still exposes it)
- Fix performFullSync missing sync state persistence (C1)
- Align Postgres dead_links query to match PGLite (count dangling links, not
  empty-content chunks) (C3)
- Fix doctor recommending nonexistent 'gbrain embed refresh' (C4)
- Refactor doctor outputResults to not call process.exit directly
- Add --slugs flag to embed for targeted page embedding
- Add sync auto-extract + auto-embed after performSync
- Add noExtract to SyncOpts
- Route extract, features, autopilot in CLI_ONLY
- Update help text with new commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract, features, and autopilot commands

- gbrain extract <links|timeline|all> — batch extraction of links and timeline
  entries from brain markdown files. Broad regex for all .md links (C7: filters
  external URLs). Frontmatter field parsing (company, investors, attendees).
  Directory-based link type inference. JSONL progress on stderr for agents.
  Sync integration hooks (extractLinksForSlugs, extractTimelineForSlugs).

- gbrain features [--json] [--auto-fix] — scan brain usage, pitch unused features
  with the user's own numbers. Priority 1 (data quality): missing embeddings,
  dead links. Priority 2 (unused features): zero links, zero timeline, low
  coverage, unconfigured integrations, no sync. Embedded recipe metadata for
  binary-safe integration detection. Persistence in ~/.gbrain/feature-offers.json.
  Doctor teaser hook. Upgrade hook.

- gbrain autopilot [--repo] [--interval N] — self-maintaining brain daemon.
  Pipeline: sync → extract → embed. Health-based adaptive scheduling
  (brain_score >= 90 doubles interval, < 70 halves it). --install/--uninstall
  for launchd (macOS) and crontab (Linux). Signal handling. Consecutive error
  tracking (stops at 5). Log to ~/.gbrain/autopilot.log.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: hook features scan into post-upgrade flow

After gbrain post-upgrade completes, automatically run gbrain features to show
the user what's new and what to fix. Best-effort (doesn't fail the upgrade).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: brain_score (0-100) in BrainHealth

Weighted composite score computed in getHealth() for both Postgres and PGLite:
  embed_coverage: 0.35, link_density: 0.25, timeline_coverage: 0.15,
  no_orphans: 0.15, no_dead_links: 0.10

Returns 0 for empty brains. Agents use brain_score as a health gate.
Autopilot uses it for adaptive scheduling (>=90 slows down, <70 speeds up).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: extract and features unit tests

25 tests covering:
- extractMarkdownLinks: relative links, external URL filtering, edge cases
- extractLinksFromFile: slug resolution, frontmatter parsing, directory-based
  type inference (works_at, deal_for, invested_in)
- extractTimelineFromContent: bullet format, header format with detail,
  em/en dash handling, empty content
- features: module exports, brain_score calculation weights, CLI routing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: instruction layer for extract, features, autopilot

Agent-facing tools are invisible without instruction-layer coverage.
- RESOLVER.md: add routing for extract, features, autopilot
- maintain/SKILL.md: add link graph extraction, timeline extraction,
  autopilot check sections

Without these, agents reading skills/ will never discover or run the
new commands. This is the #1 DX finding from the devex review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.10.1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sync CLAUDE.md with v0.10.1 additions

Add extract.ts, features.ts, autopilot.ts to key files.
Add extract.test.ts, features.test.ts to test list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: adversarial review fixes — 7 issues

- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* security: fix command injection + plaintext API key in daemon install

- #1: Crontab install used echo pipe with shell-interpolated values.
  Now uses a temp file via crontab(1) and single-quote escaping on all
  interpolated paths. No shell expansion possible.

- #2: OPENAI_API_KEY was baked as plaintext into the launchd plist
  (readable by any local process, backed up by Time Machine). Now uses
  a wrapper script (~/.gbrain/autopilot-run.sh) that sources ~/.zshrc
  at runtime. No secrets in plist or crontab.

- #16: extract.ts used a custom 20-line YAML parser that only handled
  single-line key:value pairs. Multi-line arrays (attendees list with
  - items) were silently ignored. Now uses the project's gray-matter
  parser via parseMarkdown() from src/core/markdown.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
esteban-dozsa commented:

@garrytan consider SurrealDB (https://github.com/surrealdb/surrealdb) for hybrid search all in one.

TFITZ57 added a commit to TFITZ57/gbrain that referenced this pull request Apr 23, 2026
* feat: GStackBrain — 16 new skills, resolver, conventions, identity layer (v0.10.0) (#120)


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: zero-based README rewrite for GStackBrain v0.10.0

Lead with GStack mod identity. 24 skills table organized by category.
Install block references RESOLVER.md and soul-audit. GBrain+GStack
relationship explained. Removed redundancy (733 -> 406 lines).
All essential content preserved: install, recipes, architecture,
search, commands, engines, voice, knowledge model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: extract install block to INSTALL_FOR_AGENTS.md, simplify README

The 30-line copy-paste install block becomes one line:
"Retrieve and follow INSTALL_FOR_AGENTS.md"

Benefits: agent always gets latest instructions (no stale copy-paste),
README stays clean, install details live where agents read them.

README now leads with what GBrain does ("gives your agent a brain")
instead of GStack relationship. Removed "requires frontier model" note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: 3 bugs in init.ts from merge conflict resolution

1. llstatSync typo (merge corruption) → lstatSync
2. __dirname undefined in ESM module → fileURLToPath polyfill
3. require('fs') in ESM → use imported readFileSync

All three would crash gbrain init at runtime. Caught by /review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add checkResolvable shared core function for resolver validation

Shared function at src/core/check-resolvable.ts validates that all skills
are reachable from RESOLVER.md, detects MECE overlaps (with whitelist for
always-on/router skills), finds gaps in frontmatter triggers, and scans
for DRY violations. Returns structured ResolvableIssue objects with
machine-parseable fix objects alongside human-readable action strings.
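
As an illustration, a ResolvableIssue might look like this. Only the human-readable action string and the machine-parseable fix object are named in this commit; every other field name here is an assumption for the sketch:

```typescript
// Hypothetical shape of a ResolvableIssue. Only "action" (human-readable)
// and "fix" (machine-parseable) are named in this commit; the rest is
// illustrative.
interface ResolvableIssue {
  kind: "unreachable" | "mece_overlap" | "trigger_gap" | "dry_violation";
  skill: string;                            // slug of the affected skill
  action: string;                           // human-readable fix instruction
  fix?: { file: string; change: string };   // machine-parseable fix object
}

// Example instance an agent could act on:
const example: ResolvableIssue = {
  kind: "trigger_gap",
  skill: "idea-ingest",
  action: "Add a frontmatter trigger covering tweet URLs",
  fix: { file: "skills/idea-ingest/SKILL.md", change: "add trigger" },
};
```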

Three call sites: bun test, gbrain doctor, skill-creator skill.

Cleans up test/resolver.test.ts: removes stale 9-line skip list, imports
from production check-resolvable.ts instead of reimplementing parsing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: expand doctor with resolver validation, filesystem-first architecture

Doctor now runs filesystem checks (resolver health, skill conformance) before
connecting to DB. New --fast flag skips DB checks. Falls back to filesystem-only
when DB is unavailable. Adds schema_version: 2 to JSON output, composite health
score (0-100), and structured issues array with action strings for agent parsing.

Resolver health check calls checkResolvable() and surfaces actionable fix
instructions. Link integrity check uses engine.getHealth() dead_links count.

CLI routing split: doctor dispatched before connectEngine() so filesystem
checks always run. Fixes Codex-identified blocker where doctor required DB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add adaptive load-aware throttling and fail-improve loop

backoff.ts: System load checking (CPU via os.loadavg, memory via os.freemem),
exponential backoff with 20-attempt max guard, active hours multiplier (2x
slower during waking hours), concurrent process limit (max 2). Windows-safe:
defaults to "proceed" when os.loadavg returns zeros.

fail-improve.ts: Deterministic-first, LLM-fallback pattern with JSONL failure
logging. Cascade failure handling: when both paths fail, throws LLM error and
logs both. Log rotation at 1000 entries. Call count tracking for deterministic
hit rate metrics. Auto-generates test cases from successful LLM fallbacks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add transcription service and enrichment-as-a-service

transcription.ts: Groq Whisper (default) with OpenAI fallback. Files >25MB
segmented via ffmpeg. Provider auto-detection from env vars. Clear error
messages for missing API keys and unsupported formats.

enrichment-service.ts: Global enrichment service callable from any ingest
pathway. Entity slug generation (people/jane-doe, companies/acme-corp),
mention counting via searchKeyword, tier auto-escalation (Tier 3→2→1 based
on mention frequency and source diversity), batch enrichment with backoff
throttling, regex-based entity extraction from text.
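
The tier auto-escalation rule reduces to something like this sketch. The 3→2→1 direction and the mention-frequency/source-diversity inputs are from this commit; the exact cutoffs are assumptions:

```typescript
// Sketch of tier auto-escalation. Thresholds (3 mentions for Tier 3→2;
// 10 mentions from 3+ sources for Tier 2→1) are illustrative assumptions.
function escalateTier(current: number, mentions: number, sources: number): number {
  let tier = current;
  if (tier === 3 && mentions >= 3) tier = 2;
  if (tier === 2 && mentions >= 10 && sources >= 3) tier = 1;
  return tier;
}
```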

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add data-research skill with recipe system, extraction, dedup, tracker

New skill: data-research — one parameterized pipeline for any
email-to-structured-data workflow (investor updates, donations, company metrics).
7-phase pipeline: define recipe, search, classify, extract (with extraction
integrity rule), archive, deduplicate, update tracker.

data-research.ts: Recipe validation, MRR/ARR/runway/headcount regex
extraction (battle-tested patterns), dedup with configurable tolerance,
markdown tracker parsing/appending, quarterly/monthly date windowing,
6-phase HTML email stripping with 500KB ReDoS cap.
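
The configurable-tolerance dedup can be sketched roughly like this. The record shape and the relative-tolerance interpretation are assumptions for illustration:

```typescript
// Two extracted records count as duplicates when their dates match and
// their amounts differ by at most `tolerance` (relative to the kept amount).
interface Extracted { date: string; amount: number }

function dedupe(rows: Extracted[], tolerance = 0.01): Extracted[] {
  const kept: Extracted[] = [];
  for (const row of rows) {
    const dup = kept.some(
      (k) =>
        k.date === row.date &&
        Math.abs(k.amount - row.amount) <=
          tolerance * Math.max(Math.abs(k.amount), 1),
    );
    if (!dup) kept.push(row);
  }
  return kept;
}
```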

Registers data-research in manifest.json (25th skill) and RESOLVER.md.
Fixes backoff test robustness for high-load systems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.10.0 infrastructure additions

CLAUDE.md: added 6 new core files (check-resolvable, backoff, fail-improve,
transcription, enrichment-service, data-research), 6 new test files, updated
skill count to 25, test file count to 34.

README.md: updated skill count to 25, added data-research to skills table.

CHANGELOG.md: added Infrastructure section documenting resolver validation,
doctor expansion, adaptive throttling, fail-improve loop, voice transcription,
enrichment service, and data-research skill.

TODOS.md: anonymized personal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: doctor.ts use ES module imports, harden backoff test

Replace require('fs') with ES module import in doctor.ts for consistency
with the rest of the file. Backoff test made resilient to parallel test
execution leaking module-level state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README rewrite with production brain stats, sample output, new infrastructure

Lead with the flex: 17,888 pages, 4,383 people, 723 companies, 526 meeting
transcripts built in 12 days. Show sample query output so readers see what
they'll get. Document self-improving infrastructure (tier auto-escalation,
fail-improve loop, doctor trajectory). Add data-research recipes to Getting
Data In. Update commands section with doctor --fix, transcribe, research
init/list. Fix stale "24" references to "25".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with YC President origin and production agent deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with skill philosophy and link to Thin Harness Fat Skills

Skills section now explains: skill files are code, they encode entire
workflows, they call deterministic TypeScript for the parts that shouldn't
be LLM judgment. Links to the tweet and the architecture essay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: link GStack repo, add 70K stars and 30K daily users

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: remove meeting transcript count from README (sensitive)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: README lead with YC President origin and production agent deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename political-donations recipe to expense-tracker (sensitivity)

Renamed the built-in data-research recipe from political-donations to
expense-tracker across README, CHANGELOG, SKILL.md, and reports routing.
Same extraction patterns (amounts, dates, recipients), neutral framing.
Also renamed social-radar keyword route to social-mentions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync pipeline, extract, features, autopilot (v0.10.1) (#129)

* feat: migrate 8 existing skills to conformance format

Add YAML frontmatter (name, version, description, triggers, tools, mutating),
Contract, Anti-Patterns, and Output Format sections to all existing skills.
Rename Workflow to Phases. Ingest becomes thin router delegating to specialized
ingestion skills (Phase 2).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add RESOLVER.md, conventions directory, and output rules

RESOLVER.md is the skill dispatcher modeled on Wintermute's AGENTS.md.
Categorized routing table: Always-on, Brain ops, Ingestion, Thinking,
Operational, Setup, Identity. Conventions directory extracts cross-cutting
rules (quality, brain-first lookup, model routing, test-before-bulk).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add skills conformance and resolver validation tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync --watch routing, dead_links parity, doctor command, embed --slugs

- Move sync to CLI_ONLY so the --watch flag reaches runSync() (it was routed
  through the operation layer, which only runs performSync as a single pass)
- Hide sync_brain from CLI help (MCP still exposes it)
- Fix performFullSync missing sync state persistence (C1)
- Align Postgres dead_links query to match PGLite (count dangling links, not
  empty-content chunks) (C3)
- Fix doctor recommending nonexistent 'gbrain embed refresh' (C4)
- Refactor doctor outputResults to not call process.exit directly
- Add --slugs flag to embed for targeted page embedding
- Add sync auto-extract + auto-embed after performSync
- Add noExtract to SyncOpts
- Route extract, features, autopilot in CLI_ONLY
- Update help text with new commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: extract, features, and autopilot commands

- gbrain extract <links|timeline|all> — batch extraction of links and timeline
  entries from brain markdown files. Broad regex for all .md links (C7: filters
  external URLs). Frontmatter field parsing (company, investors, attendees).
  Directory-based link type inference. JSONL progress on stderr for agents.
  Sync integration hooks (extractLinksForSlugs, extractTimelineForSlugs).

- gbrain features [--json] [--auto-fix] — scan brain usage, pitch unused features
  with the user's own numbers. Priority 1 (data quality): missing embeddings,
  dead links. Priority 2 (unused features): zero links, zero timeline, low
  coverage, unconfigured integrations, no sync. Embedded recipe metadata for
  binary-safe integration detection. Persistence in ~/.gbrain/feature-offers.json.
  Doctor teaser hook. Upgrade hook.

- gbrain autopilot [--repo] [--interval N] — self-maintaining brain daemon.
  Pipeline: sync → extract → embed. Health-based adaptive scheduling
  (brain_score >= 90 doubles interval, < 70 halves it). --install/--uninstall
  for launchd (macOS) and crontab (Linux). Signal handling. Consecutive error
  tracking (stops at 5). Log to ~/.gbrain/autopilot.log.
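
The health-based adaptive scheduling reduces to a small function like this sketch. The thresholds are from this commit; the one-minute floor is an assumption:

```typescript
// Adaptive scheduling: brain_score >= 90 doubles the interval,
// < 70 halves it (floored at 1 minute, an assumption), else unchanged.
function nextInterval(baseMs: number, brainScore: number): number {
  if (brainScore >= 90) return baseMs * 2;            // healthy: run less often
  if (brainScore < 70) return Math.max(baseMs / 2, 60_000); // unhealthy: speed up
  return baseMs;
}
```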

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: hook features scan into post-upgrade flow

After gbrain post-upgrade completes, automatically run gbrain features to show
the user what's new and what to fix. Best-effort (doesn't fail the upgrade).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: brain_score (0-100) in BrainHealth

Weighted composite score computed in getHealth() for both Postgres and PGLite:
  embed_coverage: 0.35, link_density: 0.25, timeline_coverage: 0.15,
  no_orphans: 0.15, no_dead_links: 0.10

Returns 0 for empty brains. Agents use brain_score as a health gate.
Autopilot uses it for adaptive scheduling (>=90 slows down, <70 speeds up).
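
With the weights above, the computation is roughly this sketch. Component names are per this commit; each input is assumed to be a normalized ratio in [0, 1]:

```typescript
interface HealthComponents {
  embed_coverage: number;
  link_density: number;
  timeline_coverage: number;
  no_orphans: number;
  no_dead_links: number;
}

// Weighted composite per the commit; returns 0 for empty brains.
function brainScore(c: HealthComponents, pageCount: number): number {
  if (pageCount === 0) return 0;
  const score =
    c.embed_coverage * 0.35 +
    c.link_density * 0.25 +
    c.timeline_coverage * 0.15 +
    c.no_orphans * 0.15 +
    c.no_dead_links * 0.10;
  return Math.round(score * 100); // 0-100
}
```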

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: extract and features unit tests

25 tests covering:
- extractMarkdownLinks: relative links, external URL filtering, edge cases
- extractLinksFromFile: slug resolution, frontmatter parsing, directory-based
  type inference (works_at, deal_for, invested_in)
- extractTimelineFromContent: bullet format, header format with detail,
  em/en dash handling, empty content
- features: module exports, brain_score calculation weights, CLI routing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: instruction layer for extract, features, autopilot

Agent-facing tools are invisible without instruction-layer coverage.
- RESOLVER.md: add routing for extract, features, autopilot
- maintain/SKILL.md: add link graph extraction, timeline extraction,
  autopilot check sections

Without these, agents reading skills/ will never discover or run the
new commands. This is the #1 DX finding from the devex review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.10.1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: sync CLAUDE.md with v0.10.1 additions

Add extract.ts, features.ts, autopilot.ts to key files.
Add extract.test.ts, features.test.ts to test list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: adversarial review fixes — 7 issues

- #3: autopilot extract step was a no-op (imported but never called)
- #6: PGLite orphan_pages query aligned with Postgres (check both inbound+outbound)
- #8: embedPage throws instead of process.exit (was killing sync/autopilot)
- #9: dead-links set auto_fixable=false (needs repo path we may not have)
- #10: JSON auto-fix output was dead code (unreachable !jsonMode check)
- #14: autopilot lock file prevents concurrent instances
- #20: --dir without value no longer crashes extract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* security: fix command injection + plaintext API key in daemon install

- #1: Crontab install used echo pipe with shell-interpolated values.
  Now uses a temp file via crontab(1) and single-quote escaping on all
  interpolated paths. No shell expansion possible.

- #2: OPENAI_API_KEY was baked as plaintext into the launchd plist
  (readable by any local process, backed up by Time Machine). Now uses
  a wrapper script (~/.gbrain/autopilot-run.sh) that sources ~/.zshrc
  at runtime. No secrets in plist or crontab.

- #16: extract.ts used a custom 20-line YAML parser that only handled
  single-line key:value pairs. Multi-line arrays (attendees list with
  - items) were silently ignored. Now uses the project's gray-matter
  parser via parseMarkdown() from src/core/markdown.ts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* security: fix wave 3 — 9 vulns (file_upload, SSRF, recipe trust, prompt injection) (#174)

* feat(engine): add cap parameter to clampSearchLimit (H6)

clampSearchLimit(limit, defaultLimit, cap = MAX_SEARCH_LIMIT) — third arg
is a caller-specified cap so operation handlers can enforce limits below
MAX_SEARCH_LIMIT. Backward compatible: existing two-arg callers still cap
at MAX_SEARCH_LIMIT.

This fixes a Codex-caught semantics bug: the prior signature took (limit,
defaultLimit) where the second arg was misread as a cap. clampSearchLimit(x, 20)
was actually allowing values up to 100, not 20.
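
A sketch of the new semantics (MAX_SEARCH_LIMIT is assumed to be 100, consistent with the "up to 100" behavior described above):

```typescript
const MAX_SEARCH_LIMIT = 100; // assumed value

// Third arg is a caller-specified cap; two-arg callers still cap at
// MAX_SEARCH_LIMIT, so existing call sites are unchanged.
function clampSearchLimit(
  limit: number | undefined,
  defaultLimit: number,
  cap: number = MAX_SEARCH_LIMIT,
): number {
  if (limit === undefined || !Number.isFinite(limit) || limit <= 0) {
    return Math.min(defaultLimit, cap);
  }
  return Math.min(Math.floor(limit), cap);
}
```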

* feat(integrations): SSRF defense + recipe trust boundary (B1, B2, Fix 2, Fix 4, B3, B4)

- B1: split loadAllRecipes into trusted (package-bundled) and untrusted
  (cwd/recipes, $GBRAIN_RECIPES_DIR) tiers. Only package-bundled recipes
  get embedded=true. Closes the fake trust boundary that let any cwd-local
  recipe bypass health-check gates.
- B2: hard-block string health_checks for non-embedded recipes (was previously
  only blocked when isUnsafeHealthCheck regex matched, which the cwd recipe
  exploit bypassed). Embedded recipes still get the regex defense.
- Fix 2: gate command DSL health_checks on isEmbedded. Non-embedded
  recipes cannot spawnSync.
- Fix 4 + B3 + B4: gate http DSL health_checks on isEmbedded; for embedded
  recipes, validate URLs via new isInternalUrl() before fetch:
  - Scheme allowlist (http/https only): blocks file:, data:, blob:, ftp:, javascript:
  - IPv4 range check covering hex/octal/decimal/single-integer bypass forms
  - IPv6 loopback ::1 + IPv4-mapped ::ffff: (canonicalized hex hextets handled)
  - Metadata hostnames (AWS, GCP, instance-data) blocked
  - fetch with redirect: 'manual' + per-hop re-validation up to 3 hops
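
A simplified sketch of the checks above — scheme allowlist plus dotted-quad private ranges only. The real isInternalUrl() also normalizes hex/octal IPv4 forms and IPv6, which this sketch omits; the function name here is illustrative:

```typescript
// Returns true when a URL must be blocked before fetch.
function isBlockedUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return true; // unparseable: block
  }
  // Scheme allowlist: blocks file:, data:, blob:, ftp:, javascript:
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname;
  if (host === "localhost" || host === "metadata.google.internal") return true;
  const m = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (m) {
    const a = Number(m[1]), b = Number(m[2]);
    if (a === 0 || a === 10 || a === 127) return true;        // loopback, RFC 1918
    if (a === 169 && b === 254) return true;                  // link-local + cloud metadata
    if (a === 172 && b >= 16 && b <= 31) return true;         // RFC 1918
    if (a === 192 && b === 168) return true;                  // RFC 1918
  }
  return false;
}
```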

Original PRs #105-109 by @garagon. Wave 3 collector branch reimplemented
the fixes after Codex outside-voice review found that PRs #106/#108 alone
did not actually gate cwd-local recipes (B1) and that PR #108 missed
redirect-following SSRF (B3) and non-http schemes (B4).

* feat(file_upload): path/slug/filename validation + remote-caller confinement (Fix 1, B5, H5, M4, Fix 5)

- Fix 1 + B5 + H1: validateUploadPath uses realpathSync + path.relative
  to defeat symlink-parent traversal. lstatSync alone (the original PR #105
  approach) only catches final-component symlinks; a symlinked parent dir
  still followed to /etc/passwd. Now the entire path chain is resolved.
- H5: validatePageSlug uses an allowlist regex (alphanumeric + hyphens,
  slash-separated segments). Closes URL-encoded traversal (%2e%2e%2f),
  Unicode lookalikes, backslashes, control chars implicitly.
- M4: validateFilename allowlist regex. Rejects control chars, backslash,
  RTL override (\u202E), leading dot/dash. Filename flows into storage_path
  so this matters for every storage backend.
- Fix 5: clamp list_pages and get_ingest_log limits at the operation layer
  via new clampSearchLimit cap parameter (list_pages caps at 100,
  get_ingest_log at 50). Internal bulk commands bypass the operation
  layer and remain uncapped.
- New OperationContext.remote flag distinguishes trusted local CLI from
  untrusted MCP callers. file_upload uses strict cwd confinement when
  remote=true (default), loose mode when remote=false (CLI). MCP stdio
  server sets remote=true; cli.ts and handleToolCall (gbrain call) set
  remote=false.

Original PR #105 by @garagon. Issue #139 reported by @Hybirdss.

* feat(search): query sanitization + structural prompt boundary (Fix 3, M1, M2, M3)

- M1: restructure callHaikuForExpansion to use a system message that declares
  the user query as untrusted data, plus an XML-tagged <user_query> boundary
  in the user message. Layered defense with the existing tool_choice constraint
  (3 layers vs 1).
- Fix 3 (regex sanitizer, defense-in-depth): sanitizeQueryForPrompt strips
  triple-backtick code fences, XML/HTML tags, leading injection prefixes,
  and caps at 500 chars. Original query is still used for downstream search;
  only the LLM-facing copy is sanitized.
- M2: sanitizeExpansionOutput validates the model's alternative_queries array
  before it flows into search. Strips control chars, caps length, dedupes
  case-insensitively, drops empty/non-string items, caps to 2 items.
- M3: console.warn on stripped content NEVER logs the query text — privacy-safe
  debug signal only.
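
Following the M2 rules, a sketch (the 120-char per-item cap is an assumption; the other rules are listed above):

```typescript
// Validate the model's alternative_queries before they flow into search:
// drop non-strings and empties, strip control chars, cap item length,
// dedupe case-insensitively, keep at most 2 items.
function sanitizeExpansionOutput(items: unknown[]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const item of items) {
    if (typeof item !== "string") continue;
    const cleaned = item.replace(/[\x00-\x1f\x7f]/g, "").trim().slice(0, 120);
    if (!cleaned) continue;
    const key = cleaned.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(cleaned);
    if (out.length === 2) break;
  }
  return out;
}
```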

Original PR #107 by @garagon. M1/M2/M3 are wave 3 hardening per Codex review.

* chore: bump version and changelog (v0.10.2)

Security wave 3: 9 vulnerabilities closed across file_upload, recipe trust
boundary, SSRF defense, prompt injection, and limit clamping. See CHANGELOG
for full details.

Contributors:
- @garagon (PRs #105-109)
- @Hybirdss (Issue #139)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync documentation with v0.10.2 security wave 3

- CLAUDE.md: document OperationContext.remote, new security helpers
  (validateUploadPath, validatePageSlug, validateFilename, isInternalUrl,
  parseOctet, hostnameToOctets, isPrivateIpv4, getRecipeDirs,
  sanitizeQueryForPrompt, sanitizeExpansionOutput), updated clampSearchLimit
  signature, recipe trust boundary, new test files
- docs/integrations/README.md: replace string-form health_check example
  with typed DSL (string checks now hard-block for non-embedded recipes);
  add recipe trust boundary subsection
- docs/mcp/DEPLOY.md: document file_upload remote-caller cwd confinement,
  symlink rejection, slug/filename allowlists

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Minions v7 + v0.11.1 canonical migration + skillify (#130)

* feat: add minion_jobs schema, migration v5, and executeRaw to BrainEngine

Foundation for the Minions job queue system. Adds:
- minion_jobs table (20 columns) with CHECK constraints, partial indexes,
  and RLS. Inspired by BullMQ's job model, adapted for Postgres.
- Migration v5 creates the table for existing databases.
- executeRaw<T>() method on BrainEngine interface for raw SQL access,
  needed by the Minions module for claim queries (FOR UPDATE SKIP LOCKED),
  token-fenced writes, and atomic stall detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions job queue — queue, worker, backoff, types

BullMQ-inspired Postgres-native job queue built into GBrain. No Redis.
No external dependencies. Postgres transactions replace Lua scripts.

- MinionQueue: submit, claim (FOR UPDATE SKIP LOCKED), complete/fail
  (token-fenced), atomic stall detection (CTE), delayed promotion,
  parent-child resolution, prune, stats
- MinionWorker: handler registry, lock renewal, graceful SIGTERM,
  exponential backoff with jitter, UnrecoverableError bypass
- MinionJobContext: updateProgress(), log(), isActive() for handlers
- 8-state machine: waiting/active/completed/failed/delayed/dead/
  cancelled/waiting-children

Patterns stolen from: BullMQ (lock tokens, stall detection, flows),
Sidekiq (dead set, backoff formula), Inngest (checkpoint/resume).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: 43 tests for Minions job queue

Full coverage of the Minions module against PGLite in-memory:
- Queue CRUD (9): submit, get, list, remove, cancel, retry, duplicate
- State machine (6): waiting→active→completed/failed, retry→delayed→waiting
- Backoff (4): exponential, fixed, jitter range, attempts_made=0 edge
- Stall detection (3): detect stalled, counter increment, max→dead
- Dependencies (5): parent waits, fail_parent, continue, remove_dep, orphan
- Worker lifecycle (5): register, start-without-handlers, claim+execute,
  non-Error throws, UnrecoverableError bypass
- Lock management (3): renewal, token mismatch, claim sets lock fields
- Claim mechanics (4): empty queue, priority ordering, name filtering,
  delayed promotion timing
- Cancel & retry (2): cancel active, retry dead

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions CLI commands and MCP operations

Wire Minions into the GBrain CLI and MCP layer:

CLI (gbrain jobs):
  submit <name> [--params JSON] [--follow] [--dry-run]
  list [--status S] [--queue Q] [--limit N]
  get <id> — detailed view with attempt history
  cancel/retry/delete <id>
  prune [--older-than 30d]
  stats — job health dashboard
  work [--queue Q] [--concurrency N] — Postgres-only worker daemon

6 MCP operations (contract-first, auto-exposed via MCP server):
  submit_job, get_job, list_jobs, cancel_job, retry_job, get_job_progress

Built-in handlers: sync, embed, lint, import. --follow runs inline.
Worker daemon blocked on PGLite (exclusive file lock).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for Minions job queue

CLAUDE.md: added Minions files to key files, updated operation count (36),
BrainEngine method count (38), test file count (45), added jobs CLI commands.
CHANGELOG.md: added Minions entry to v0.10.0 (background jobs, retry, stall
detection, worker daemon).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: Minions v2 — agent orchestration primitives (pause/resume, inbox, tokens, replay)

Adds the foundation for Minions as universal agent orchestration infrastructure.
GBrain's Postgres-native job queue now supports durable, observable, steerable
background agents. The OpenClaw plugin (separate repo) will consume these via
library import, not MCP, for zero-latency local integration.

## New capabilities

- **Concurrent worker** — Promise pool replaces sequential loop. Per-job
  AbortController for cooperative cancellation. Graceful shutdown waits for
  all in-flight jobs via Promise.allSettled.
- **Pause/resume** — pauseJob clears the lock and fires AbortSignal on active
  jobs. Handlers check ctx.signal.aborted and exit cleanly. resumeJob returns
  paused jobs to waiting. Catch block skips failJob when signal.aborted.
- **Inbox (separate table)** — minion_inbox table for sidechannel messages.
  sendMessage with sender validation (parent job or admin). readInbox is
  token-fenced and marks read_at atomically. Separate table avoids row bloat
  from rewriting JSONB on every send.
- **Token accounting** — tokens_input/tokens_output/tokens_cache_read columns.
  updateTokens accumulates; completeJob rolls child tokens up to parent.
  USD cost computed at read time (no cost_usd column — pricing too volatile).
- **Job replay** — replayJob clones a terminal job with optional data overrides.
  New job, fresh attempts, no parent link.

## Handler contract additions

MinionJobContext now provides:
- `signal: AbortSignal` — cooperative cancellation
- `updateTokens(tokens)` — accumulate token usage
- `readInbox()` — check for sidechannel messages
- `log()` — now accepts string or TranscriptEntry
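
A cooperative handler under this contract might look like the following sketch — `exampleHandler` and its context shape are illustrative, not from the codebase:

```typescript
// A handler that checks ctx.signal between units of work and exits cleanly
// when paused or cancelled, as the contract above requires.
async function exampleHandler(ctx: {
  signal: AbortSignal;
  log: (msg: string) => void;
}): Promise<{ done: number }> {
  for (let i = 0; i < 100; i++) {
    if (ctx.signal.aborted) {
      ctx.log("abort observed — exiting cleanly");
      return { done: i };
    }
    // ...one unit of work...
  }
  return { done: 100 };
}
```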

## MCP operations added

pause_job, resume_job, replay_job, send_job_message — all auto-generate CLI
commands and MCP server endpoints.

## Library exports

package.json exports map adds ./minions and ./engine-factory paths so plugins
can `import { MinionQueue } from 'gbrain/minions'` for direct library use.

## Instruction layer (the teaching)

- skills/minion-orchestrator/SKILL.md — when/how to use Minions, decision
  matrix, lifecycle management, anti-patterns
- skills/conventions/subagent-routing.md — cross-cutting rule: all background
  work goes through Minions
- RESOLVER.md — trigger entries for agent orchestration
- manifest.json — registered

## Schema migration v6

Additive: 3 token columns, paused status, minion_inbox table with unread index.
Full Postgres + PGLite support. No backfill needed.

## Tests

65 tests (was 43): pause/resume (5), inbox (6), tokens (4), replay (4),
concurrent worker context (3), plus all existing coverage.

## What's NOT in this commit

Deferred to follow-up PRs:
- LISTEN/NOTIFY subscribe (needs real Postgres E2E)
- Resource governor (depends on concurrent worker stress testing)
- Routing eval harness (needs API keys + benchmark data)
- OpenClaw plugin (separate @gbrain/openclaw-minions-plugin repo)

See docs/designs/MINIONS_AGENT_ORCHESTRATION.md for full CEO-approved design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(minions): migration v7 — agent_parity_layer schema

Adds columns on minion_jobs (depth, max_children, timeout_ms, timeout_at,
remove_on_complete, remove_on_fail, idempotency_key) plus the new
minion_attachments table. Three partial indexes for bounded scans:
idx_minion_jobs_timeout, idx_minion_jobs_parent_status, and
uniq_minion_jobs_idempotency. Check constraints enforce non-negative depth
and positive child cap / timeout.

Additive migration — existing installs pick it up via ensureSchema on next
use. No user action required.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): extend types for v7 parity layer

Extends MinionJob with depth/max_children/timeout_ms/timeout_at/
remove_on_complete/remove_on_fail/idempotency_key. Extends MinionJobInput
with the same options plus max_spawn_depth override. Adds MinionQueueOpts
(maxSpawnDepth default 5, maxAttachmentBytes default 5 MiB). Adds
AttachmentInput/Attachment shapes and ChildDoneMessage in the InboxMessage
union. rowToMinionJob updated to pick up the new columns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): attachments validator

New module validateAttachment() gates every attachment write. Rejects empty
filenames, path traversal (.., /, \), null bytes, oversized content (5 MiB
default, per-queue override), invalid base64, and implausible content_type
headers. Returns normalized { filename, content_type, content (Buffer),
sha256, size } on success.

The DB also enforces UNIQUE (job_id, filename) as defense-in-depth for
concurrent addAttachment races — JS-only checks are not sufficient.
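
An illustrative subset of those checks, sketched with hypothetical names — the real validateAttachment() also validates content_type plausibility and returns the normalized { filename, content_type, content, sha256, size } shape:

```typescript
const MAX_ATTACHMENT_BYTES = 5 * 1024 * 1024; // 5 MiB default, per-queue override

// Returns a rejection reason, or null when the attachment passes.
function checkAttachment(filename: string, base64Content: string): string | null {
  if (!filename.trim()) return "empty filename";
  if (filename.includes("..") || filename.includes("/") || filename.includes("\\"))
    return "path traversal";
  if (filename.includes("\0")) return "null byte";
  // Buffer.from silently drops bad characters, so validate the alphabet first.
  if (!/^[A-Za-z0-9+/]*={0,2}$/.test(base64Content) || base64Content.length % 4 !== 0)
    return "invalid base64";
  if (Buffer.from(base64Content, "base64").length > MAX_ATTACHMENT_BYTES)
    return "oversized";
  return null;
}
```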

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): queue v7 — depth, child cap, timeouts, cascade, idempotency, child_done

Wraps completeJob and failJob in engine.transaction() so parent hook
invocations (resolveParent, failParent, removeChildDependency) fold into
the same transaction as the child update. A process crash between child
and parent can't strand the parent in waiting-children anymore.

Adds v7 behaviors:
- Depth tracking. add() computes depth = parent.depth + 1 and rejects
  past maxSpawnDepth (default 5).
- Per-parent child cap. add() takes SELECT ... FOR UPDATE on the parent,
  counts non-terminal children, rejects when count >= max_children.
  NULL max_children = no cap.
- Per-job wall-clock timeout. claim() populates timeout_at when
  timeout_ms is set. New handleTimeouts() dead-letters expired rows with
  error_text='timeout exceeded'. Terminal — no retry.
- Cascade cancel. cancelJob() walks descendants via recursive CTE with
  depth-100 runaway cap. Returns the root row. Re-parented descendants
  (parent_job_id NULL) are naturally excluded.
- Idempotency. add() uses INSERT ... ON CONFLICT (idempotency_key) DO
  NOTHING RETURNING; falls back to SELECT when RETURNING is empty. Same
  key always yields the same job id.
- child_done inbox. completeJob inserts {type:'child_done', child_id,
  job_name, result} into the parent's inbox in the same transaction as
  the token rollup, guarded by EXISTS so terminal/deleted parents skip
  without FK violation. New readChildCompletions(parent_id, lock_token,
  since?) helper; token-fenced like readInbox.
- removeOnComplete / removeOnFail. Deletes the row after the parent hook
  fires, so parent policy sees consistent state.
- Attachment methods. addAttachment validates via validateAttachment
  then INSERTs; UNIQUE (job_id, filename) backs the JS dup check.
  listAttachments, getAttachment, deleteAttachment round out the API.
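
The idempotent-add flow can be modeled in miniature — this in-memory sketch mirrors the INSERT ... ON CONFLICT DO NOTHING RETURNING / fallback-SELECT shape (the Map stands in for the unique index; this is not the actual queue code):

```typescript
const jobIdByKey = new Map<string, number>(); // stands in for uniq_minion_jobs_idempotency
let nextJobId = 1;

function addJob(idempotencyKey?: string): number {
  if (idempotencyKey !== undefined) {
    // INSERT ... ON CONFLICT (idempotency_key) DO NOTHING RETURNING id
    // yields no row on conflict — so fall back to SELECT for the winner.
    const existing = jobIdByKey.get(idempotencyKey);
    if (existing !== undefined) return existing;
    jobIdByKey.set(idempotencyKey, nextJobId);
  }
  return nextJobId++;
}
```

Either way, the same key always resolves to the same job id.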

Fixes pre-existing inverted status bug: add() now puts children in
waiting/delayed (not waiting-children) and atomically flips the parent
to waiting-children in the same transaction. Tests no longer need
manual UPDATE workarounds.

Two correctness fixes:
- Sibling completion race. Under READ COMMITTED, two grandchildren
  completing concurrently each saw the other as still-active in the
  pre-commit snapshot and neither flipped the parent. Fixed by taking
  SELECT ... FOR UPDATE on the parent row at the start of completeJob
  and failJob transactions, serializing siblings on the parent lock.
- JSONB double-encode. postgres.js conn.unsafe(sql, params) auto-
  JSON-encodes parameters. Calling JSON.stringify(obj) first stored a
  JSON string literal (jsonb_typeof=string) and broke payload->>'key'
  queries silently. Removed JSON.stringify from three call sites
  (child_done inbox post, updateProgress, sendMessage). PGLite tolerated
  both forms so unit tests missed it — real-PG E2E caught it.
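
The double-encode failure is reproducible in plain JavaScript — a sketch of why payload->>'key' came back empty (driver behavior paraphrased from this commit, not a postgres.js API demo):

```typescript
const payload = { key: "value" };

// What lands in the jsonb column when the driver encodes a raw object:
const storedCorrect = JSON.stringify(payload);
// What lands when the caller pre-stringifies and the driver encodes AGAIN —
// a JSON string literal, so jsonb_typeof = 'string':
const storedDouble = JSON.stringify(JSON.stringify(payload));

const decodedCorrect = JSON.parse(storedCorrect); // an object: ->> 'key' works
const decodedDouble = JSON.parse(storedDouble);   // a string: ->> 'key' is NULL
```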

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(minions): worker — timeout safety net + handleTimeouts tick

Worker tick now calls handleStalled() first, then handleTimeouts() — stall
requeue wins over timeout dead-letter when both could fire in the same
cycle. handleTimeouts() guards on lock_until > now() so stalled jobs take
the retryable path.

launchJob schedules a per-job setTimeout(timeout_ms) that fires ctx.signal
as a best-effort handler interrupt. The timer is always cleared in .finally
so process exit isn't delayed by a dangling timer. Handlers that respect
AbortSignal stop cleanly; handlers that ignore it still get dead-lettered
by the DB-side handleTimeouts.
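
The timer discipline above reduces to a few lines — a hypothetical helper; the real worker wires this into launchJob alongside the DB-side handleTimeouts:

```typescript
async function runWithTimeout<T>(
  handler: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const ctrl = new AbortController();
  // Best-effort interrupt: cooperative handlers watch signal.aborted.
  const timer = setTimeout(() => ctrl.abort(), timeoutMs);
  try {
    return await handler(ctrl.signal);
  } finally {
    clearTimeout(timer); // always cleared — no dangling timer delaying exit
  }
}
```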

Removed post-completeJob and post-failJob parent-hook calls from the worker
— those are now inside the queue method transactions. Worker becomes
simpler and crash-safer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(minions): 33 new unit tests for v7 parity layer

Covers depth cap, per-parent child cap, timeout dead-letter, cascade
cancel (including the re-parent edge case), removeOnComplete /
removeOnFail, idempotency (single + concurrent), child_done inbox
(posted in txn + survives child removeOnComplete + since cursor),
attachment validation (oversize, path traversal, null byte, duplicates,
base64), AbortSignal firing on pause mid-handler, catch-block skipping
failJob when aborted, worker in-flight bookkeeping, token-rollup guard
when parent already terminal, and setTimeout safety-net cleanup.

Existing tests updated to remove the inverted-status manual UPDATE
workarounds that the add() fix made obsolete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(e2e): Minions v7 concurrency + OpenClaw resilience coverage

minions-concurrency.test.ts spins two MinionWorker instances against the
test Postgres, submits 20 jobs, and asserts zero double-claims (every job
runs exactly once). This is the only test that actually proves FOR UPDATE
SKIP LOCKED under real concurrency — PGLite runs on a single connection
and can't exercise the race.

minions-resilience.test.ts covers the six OpenClaw daily pains:
1. Spawn storms — caps enforced under concurrent submit.
2. Agent stalls — handleStalled() requeues; handleTimeouts() skips
   (lock_until guard).
3. Forgotten dispatches — recoverable via the child_done inbox.
4. Cascade cancel — stops grandchildren mid-flight.
5. Deep tree fan-in — parent → 3 children → 2 grandchildren each
   completes with the full inbox chain.
6. Parent crash/recovery — resumes from persisted state.

helpers.ts extends ALL_TABLES with minion_attachments, minion_inbox, and
minion_jobs (FK dependents first) so E2E teardown doesn't leak rows
between runs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: release v0.11.0 — Minions v7 agent orchestration primitives

Bumps VERSION / package.json to 0.11.0. Adds CHANGELOG entry covering
depth tracking, max_children, per-job timeouts, cascade cancel,
idempotency keys, child_done inbox, removeOnComplete/Fail, attachments,
migration v7, plus the two correctness fixes (sibling completion race
and JSONB double-encode).

TODOS.md captures the four v7 follow-ups: per-queue rate limiting,
repeat/cron scheduler, worker event emitter, and waitForChildren
convenience helpers.

1066 unit + 105 E2E = 1171 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): unify JSONB inserts, tighten nullish coalescing

Three non-blocker cleanups from post-ship review of v0.11.0:

- queue.ts add() and completeJob(): pre-stringifying with JSON.stringify
  while other sites pass raw objects with $n::jsonb casts. postgres.js
  double-encodes if you stringify first — works on PGLite (text→JSONB
  auto-cast), fails silently on real PG. Unify on raw object + explicit
  $n::jsonb cast.
- queue.ts readChildCompletions: since clause used sent_at > $2 relying
  on PG's implicit text→TIMESTAMPTZ coercion. Explicit $2::timestamptz
  is safer and clearer.
- types.ts rowToMinionJob: parent_job_id used || which coerces 0 to null.
  Harmless today (SERIAL IDs start at 1) but ?? is semantically correct.
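
The coercion difference in one line each — why || is wrong for numeric IDs even while today's IDs start at 1:

```typescript
const parentJobId = 0; // hypothetical id value — SERIAL never emits it today
const withOr = parentJobId || null;      // null: || coerces 0 away
const withNullish = parentJobId ?? null; // 0:    ?? only treats null/undefined as absent
```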

All 110 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(minions): updateProgress missed $1::jsonb cast in unification

Residual from c502b7e — updateProgress was the only remaining JSONB write
without the explicit ::jsonb cast. Not broken (implicit cast works) but
breaks the convention the prior commit unified everywhere else.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc: Minions v7 skill count + jobs subcommands (26 skills)

README: bump skill count 25 → 26, add minion-orchestrator row, add
`gbrain jobs` command family block so v0.11.0's headline feature is
actually discoverable from the top-level commands reference.

CLAUDE.md: unit test count 48 → 49 (minions.test.ts expanded), skill
count 25 → 26, add minion-orchestrator to Key files + skills categorization,
expand MinionQueue one-liner to cover v7 primitives (depth/child-cap,
timeouts, idempotency, child_done inbox, removeOnComplete/Fail).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: Minions adoption UX — smoke test + migration + pain-triggered routing

Teach OpenClaw when to reach for Minions vs native subagents. Ship three
pieces so upgrading from v0.10.x actually lands for real users:

- `gbrain jobs smoke` — one-command health check that submits a `noop` job,
  runs a worker, verifies completion, and prints engine-aware guidance
  (PGLite installs get the "daemon needs Postgres, use --follow" note).
  Fails loudly if the schema is below v7 so the user knows to run `gbrain init`.

- `skills/migrations/v0.11.0.md` — post-upgrade migration file the
  auto-update agent reads. Six steps: apply schema, run smoke, ask user
  via AskUserQuestion which mode they want (always / pain_triggered / off),
  write to `~/.gbrain/preferences.json`, sanity-check handlers, mark done.
  Completeness scores on each option so the recommendation is explicit.

- `skills/conventions/subagent-routing.md` rewritten — was a "MUST use
  Minions for ALL background work" mandate, now reads preferences.json
  on every routing decision and branches on three modes. Mode B
  (pain_triggered) is the default: keep subagents until gateway drops
  state, parallel > 3, runtime > 5min, or user expresses frustration.
  Then pitch the switch in-session with a specific script.

Rename pass: "Minions v7" → "Minions" in README (JOBS block), TODOS.md
(P1 section header + depends-on), CHANGELOG.md v0.11.0 entry. v7 stays
as the internal schema version in code/migration contexts. The product
name is just Minions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* doc(readme): promote Minions — 6 OpenClaw pains + how each is fixed

The one-line mention in the skills table wasn't doing the work. Added a
dedicated section between "How It Works" and "Getting Data In" that leads
with the six multi-agent failures every OpenClaw user hits daily (spawn
storms, hung handlers, forgotten dispatches, unstructured debugging,
gateway crashes, runaway grandchildren) and maps each pain to the
specific Minions primitive that fixes it.

Includes the smoke test command, the adoption default (pain_triggered),
and a pointer to skills/minion-orchestrator for the full patterns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(bench): add harness for Minions vs OpenClaw subagent dispatch

Shared harness (openclawDispatch + minionsHandler) using matching
claude-haiku-4-5 calls on both sides so the delta measures queue+
dispatch overhead on top of identical LLM work. Includes
statsFromResults (p50/p95/p99) and formatStats helpers. Uses
`openclaw agent --local` embedded mode; does not test gateway
multi-agent fan-out (documented in the harness header).

* test(bench): durability under SIGKILL — Minions vs OpenClaw --local

Headline bench for the claim: when the orchestrator dies mid-dispatch,
Minions rescues via PG state + stall detection; OpenClaw --local loses
in-flight work outright.

Minions side: seed 10 active+expired-lock rows (exact state a SIGKILLed
worker leaves) then run a rescue worker. Expect 10/10 completed.
OpenClaw side: spawn 10 `openclaw agent --local` in parallel, SIGKILL
each at 500ms, count pre-kill delivered output. Expect 0/10 — no
persistence layer, nothing to recover.

Budget: ~$0 (Minions handlers sleep 10ms; OC calls die at 500ms so
partial LLM billing is negligible).

* test(bench): per-dispatch throughput — Minions vs OpenClaw --local

20 serial dispatches each side, identical claude-haiku-4-5 call with the
same trivial prompt. p50/p95/p99 reported via statsFromResults. Serial
(not parallel) so the per-dispatch cost is measured honestly and LLM
token spend stays bounded (~$0.08 total).

Minions: one queue, one worker, one concurrency. Submit → poll to
completion before next submit. OpenClaw: N sequential
`openclaw agent --local` spawns.

* test(bench): fan-out — Minions 10-wide concurrency vs 10 parallel OC spawns

Parent dispatches 10 children, waits for all to return. Minions uses
worker concurrency=10 sharing one warm process; OpenClaw parallel
`openclaw agent --local` spawns, each boots its own runtime.

3 runs × 10 children per run. Reports ok count and wall time per run
plus summary. Honest caveat documented: does not test OC gateway
multi-agent fan-out — that needs a custom WS client and LLM-backed
parent agent. This measures what users script today.

Budget: ~$0.12 LLM spend.

* test(bench): memory — 10 in-flight subagents, single-proc vs 10-proc cost

Measures resident memory for keeping 10 subagents in flight. Minions:
one worker process, concurrency=10 with handlers that park on a
promise — sample RSS of the test process via process.memoryUsage().
OpenClaw: 10 parallel `openclaw agent --local` processes, sum their
RSS via `ps -o rss=`.

Handlers are cheap sleeps, no LLM — we want harness memory, not LLM
client state. Budget: $0.

* test(bench): fan-out — don't gate on OC success rate, report numbers

Initial run showed OC parallel `--local` at 10-wide hits 40% failure
rate (17/30 across 3 runs). That's the finding, not a test bug —
process startup stampede + LLM rate limits. Bench now prints error
samples and reports the numbers instead of gating.

Minions side still gates at 90% (30/30 observed in practice).

* doc(benchmarks): Minions vs OpenClaw --local subagent dispatch

Real numbers on four claims: durability, throughput, fan-out, memory.
Same claude-haiku-4-5 call on both sides so the delta is queue+dispatch+
process cost on top of identical LLM work.

Headline: Minions rescues 10/10 from a SIGKILLed worker in 458ms while
OpenClaw --local loses all 10; ~10× faster per dispatch (778ms p50 vs
8086ms p50); ~21× faster at 10-wide fan-out AND 100% reliable vs OC's
43% failure rate; 2 MB vs 814 MB to keep 10 subagents in flight.

Honest caveats section covers what this doesn't test (OC gateway
multi-agent, load tests, other models). Fully reproducible via
test/e2e/bench-vs-openclaw/.

* doc(readme): inject Minions vs OpenClaw bench numbers

Headline deltas now in the Minions section: 10/10 vs 0/10 on crash,
~10× faster per dispatch, ~21× faster fan-out at 10-wide with 0%
failure vs 43%, ~400× less memory. Links to the full bench doc.

Prose first said Minions "fixes all six pains." Now it shows the
numbers that prove it.

* bench: production Wintermute benchmark — Minions 753ms vs sub-agent timeout

Real deployment: 45K-page brain on Render+Supabase. Task: pull 99 tweets,
write brain page, commit, sync. Minions: 753ms, $0. Sub-agent: gateway
timeout (>10s, couldn't even spawn under production load).

Also: 19,240 tweets backfilled across 36 months in 15 min at $0.
Sub-agents would cost $1.08 and fail 40% of spawns.

* bench: tweet ingestion — Minions 719ms vs OpenClaw 12.5s (17×)

Production benchmark with runnable test code:
- test/e2e/bench-vs-openclaw/tweet-ingest.bench.ts (reusable)
- docs/benchmarks/2026-04-18-tweet-ingestion.md (publishable)

Task: pull 100 tweets from X API, write brain page, commit, sync.
Minions: 719ms mean, $0, 100% success.
OpenClaw: 12,480ms mean, $0.03/run, 60% success (gateway timeouts).
At scale: 36-month backfill, 19K tweets, 15 min, $0 vs est. $1.08.

* doc(benchmarks): Wintermute production data point for Minions vs OpenClaw

Adds a production-environment data point to the Minions README section:
one month of tweet ingest on Wintermute (Render + Supabase + 45K-page brain)
ran end-to-end in 753ms for $0.00 via Minions, while the equivalent
sessions_spawn hit the 10s gateway timeout and produced nothing.

Full methodology + logs in docs/benchmarks/2026-04-18-minions-vs-openclaw-production.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core): preferences.ts + cli-util.ts — foundations for v0.11.1

Adds two foundational modules that apply-migrations (Lane A-4), the
v0.11.0 orchestrator (Lane C-1), and the stopgap script (Lane C-4) all
depend on.

- src/core/preferences.ts: atomic-write ~/.gbrain/preferences.json
  (mktemp + rename, 0o600, forward-compatible for unknown keys) with
  validateMinionMode, loadPreferences, savePreferences. Plus
  appendCompletedMigration + loadCompletedMigrations for the
  ~/.gbrain/migrations/completed.jsonl log (tolerates malformed lines).
  Uses process.env.HOME || homedir() so $HOME overrides work in CI and
  tests; Bun's os.homedir() caches the initial value and ignores later
  mutations.
- src/core/cli-util.ts: promptLine(prompt) helper, extracted from
  src/commands/init.ts:212-224. Shared so init, apply-migrations, and
  the v0.11.0 orchestrator's mode prompt don't each reinvent it.
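
The atomic-write pattern itself is small — a sketch with an illustrative helper name (the real savePreferences targets ~/.gbrain/preferences.json and also handles directory creation and forward-compat merging):

```typescript
import { mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write a sibling temp file, then rename over the destination. rename(2) is
// atomic on the same filesystem, so readers never see a half-written file.
function atomicWriteJson(destPath: string, data: unknown): void {
  const tmpPath = `${destPath}.${process.pid}.tmp`;
  writeFileSync(tmpPath, JSON.stringify(data, null, 2) + "\n", { mode: 0o600 });
  renameSync(tmpPath, destPath);
}
```

Writing the temp file next to the destination (rather than in a system temp dir) is what keeps the rename on one filesystem.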

test/preferences.test.ts: 21 unit tests covering load/save atomicity,
0o600 perms, forward-compat for unknown keys, minion_mode validation,
completed.jsonl JSONL append idempotence, auto-ts population, malformed-
line tolerance in loadCompletedMigrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(init): add --migrate-only flag (schema-only, no saveConfig)

Context: v0.11.0 migration orchestrators need a safe way to re-apply the
schema against an existing brain without risking a config flip. Today
running bare `gbrain init` with no flags defaults to PGLite and calls
saveConfig, which would silently overwrite an existing Postgres
database_url — caught by Codex in the v0.11.1 plan review as a
show-stopper data-loss bug.

The new --migrate-only path:
  - loadConfig() reads the existing config (does NOT call saveConfig)
  - errors out with a clear "run gbrain init first" if no config exists
  - connects via the already-configured engine, calls engine.initSchema(),
    disconnects
  - --json emits structured success/error payloads

Everything downstream in the v0.11.1 migration chain (apply-migrations,
the stopgap bash script, the package.json postinstall hook) will invoke
this flag rather than bare gbrain init.

test/init-migrate-only.test.ts: 4 tests covering the no-config error
path, --json error payload shape, happy-path with a PGLite fixture
(verifies config.json content is byte-identical after the call — the
real invariant), and idempotent rerun.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(migrations): TS registry replaces filesystem migration scan

Context: Codex flagged that bun build --compile produces a self-contained
binary, and the existing findMigrationsDir() in upgrade.ts:145 walks
skills/migrations/v*.md on disk — which fails on a compiled install
because the markdown files aren't bundled. The plan's fix is a TS
registry: migrations are code, imported directly, visible to both source
installs and compiled binaries.

- src/commands/migrations/types.ts: shared Migration, OrchestratorOpts,
  OrchestratorResult types.
- src/commands/migrations/index.ts: exports the migrations[] array,
  getMigration(version), and compareVersions() (semver comparator).
  The feature_pitch data that lived in the MD file frontmatter now
  lives here as a code constant on each Migration, so runPostUpgrade's
  post-upgrade pitch printer can consume it without a filesystem read.
- src/commands/migrations/v0_11_0.ts: stub orchestrator + pitch. The
  full phase implementation lands in Lane C-1; for now the stub throws
  a clear "not yet implemented" so apply-migrations --list (Lane A-4)
  can still enumerate the migration.
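
A comparator with the behavior those tests pin down might look like this — hypothetical body, but numeric per-segment comparison is the point (so 0.9.0 sorts before 0.11.0, unlike a string compare):

```typescript
// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a: string, b: string): number {
  const pa = a.replace(/^v/, "").split(".").map(Number);
  const pb = b.replace(/^v/, "").split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}
```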

test/migrations-registry.test.ts: 9 tests covering ascending-semver
ordering, feature_pitch shape invariants, getMigration lookup, and
compareVersions edge cases (equal / newer / older / single-digit
across major bumps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): gbrain apply-migrations — migration runner CLI

Reads ~/.gbrain/migrations/completed.jsonl, diffs against the TS migration
registry, runs pending orchestrators. Resumes status:"partial" entries
(the stopgap bash script writes these so v0.11.1 apply-migrations can
pick up where it left off). Idempotent: rerunning when up-to-date exits 0.

Flags:
  --list                    Show applied + partial + pending + future.
  --dry-run                 Print the plan; take no action.
  --yes / --non-interactive Skip prompts (used by runPostUpgrade + postinstall).
  --mode <a|p|o>            Preset minion_mode (bypasses the Phase C TTY prompt).
  --migration vX.Y.Z        Force-run one specific version.
  --host-dir <path>         Include $PWD in host-file walk (default is
                            $HOME/.claude + $HOME/.openclaw only).
  --no-autopilot-install    Skip Phase F.

Diff rule (Codex H9): apply when no status:"complete" entry exists AND
migration.version ≤ installed VERSION. Previously proposed rule was
"version > currentVersion", which would SKIP v0.11.0 when running v0.11.1;
regression test in apply-migrations.test.ts pins the correct semantics.
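
The corrected rule as a predicate — illustrative names, assuming a semver comparator is available:

```typescript
type CompletedEntry = { version: string; status: "complete" | "partial" };

// Pending ⇔ no status:"complete" entry exists AND migration.version <= installed
// VERSION. (The rejected "version > currentVersion" rule would skip v0.11.0
// when running v0.11.1.)
function isPending(
  migrationVersion: string,
  installedVersion: string,
  completed: CompletedEntry[],
  cmp: (a: string, b: string) => number,
): boolean {
  const done = completed.some(
    (e) => e.version === migrationVersion && e.status === "complete",
  );
  return !done && cmp(migrationVersion, installedVersion) <= 0;
}
```

Partial entries are deliberately not "done": they re-enter the plan so a stopgap run can be resumed.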

Registered in src/cli.ts CLI_ONLY Set; dispatched before connectEngine so
each phase owns its own engine/subprocess lifecycle (no double-connect
when the orchestrator shells out to init --migrate-only or jobs smoke).

test/apply-migrations.test.ts: 18 unit tests covering parseArgs for every
flag, indexCompleted/statusForVersion correctness (including stopgap-then-
complete transition), and buildPlan's four buckets (applied / par…
garrytan added a commit that referenced this pull request Apr 23, 2026
PR 0 plumbing for connected gbrains. Adds an optional brainId field that
identifies which database an operation targets and ensures subagents
inherit the parent job's brain instead of process-wide defaults. No
dispatch-path changes in this commit — that is PR 1 (registry wiring at
MCP + CLI entry points). The fields exist so callers can set them now
and downstream code respects them.

Changes:

- src/core/operations.ts: OperationContext grows `brainId?: string`.
  Optional for back-compat. 'host' is the implicit default when absent.
  Orthogonal to v0.18.0's source_id (source = which repo within the
  brain, brain = which database). See docs/architecture/brains-and-sources.md.

- src/core/minions/types.ts: SubagentHandlerData gains `brain_id?: string`.
  Parent jobs set this when submitting a child subagent to lock the
  child into a specific brain. Omitted = host (unchanged behavior).

- src/core/minions/handlers/subagent.ts: buildBrainTools call site
  reads data.brain_id and passes it through. Child subagents spawned
  from this handler will see the same brainId unless they override in
  their own data.

- src/core/minions/tools/brain-allowlist.ts: BuildBrainToolsOpts +
  OpContextDeps grow brainId; buildOpContext stamps it on every
  OperationContext the subagent builds for tool calls. Addresses Codex
  finding #6 (brain-allowlist hardwired parent config without brain
  awareness, so switching brain only in subagent.ts was not enough).

Tests: 166 affected tests green (subagent suite + minions + brain
registry + resolver). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 23, 2026
…d RESOLVER) (#372)

* feat(mounts): connected-gbrains PR 0 foundation — registry + resolver + CLI

Lays the foundation for connected gbrains (v0.19.0) per the approved plan.
This is PR 0 — minimal runtime for direct-transport, path-mounted brains.

What this slice ships:
- src/core/brain-registry.ts — keyed BrainRegistry with lazy engine init,
  schema-validated mounts.json loader, DuplicateMountPathError (load-bearing
  identity check per Codex finding #9 correction), UnknownBrainError with
  actionable available-id list. Pure: no AsyncLocalStorage, no singleton
  mutation. ~280 LOC.

- src/core/brain-resolver.ts — 6-tier brain-id resolution mirroring
  v0.18.0's source-resolver.ts so agents learn ONE mental model:
    1. --brain <id>     2. GBRAIN_BRAIN_ID env      3. .gbrain-mount dotfile
    4. longest-path match over registered mounts    5. (reserved v2 default)
    6. 'host' fallback
  Orthogonal to --source: --brain picks which DB, --source picks the repo
  within that DB. Corruption-resistant: mounts.json load failures fall
  through to 'host' instead of breaking every CLI invocation.

- src/commands/mounts.ts — `gbrain mounts add|list|remove` (direct transport
  only). Validates on add (path exists on disk, id regex, no dupes). WARNS
  but does not block on same db_url/db_path across ids (teams may
  legitimately alias a remote brain). Password redaction in list output.
  Atomic write via temp+rename. 0600 perms. PR 1 adds pin/sync/enable;
  PR 2 adds --mcp-url + OAuth.

- src/cli.ts — wires `gbrain mounts` into handleCliOnly (no DB required
  for the config-only subcommands).

- test/brain-registry.test.ts (28 cases): schema validation across every
  malformed-input branch, ALS-free resolution, duplicate id + path detection,
  disabled-mount exclusion, UnknownBrainError context.

- test/brain-resolver.test.ts (22 cases): priority order (explicit > env >
  dotfile > path-prefix > fallback), dotfile walk-up, malformed dotfile
  recovery, longest-prefix match, sibling-path false-positive guard,
  loader-failure defense.

- test/mounts-cli.test.ts (17 cases): parseAddArgs surface, redactUrl,
  atomic write, add/list/remove roundtrip via temp HOME.

67 new tests, all green. Typecheck clean. Depends on mcp-key-mgmt (base
branch) for the OAuth/scope annotations that PR 2 will leverage.

Next in this branch: PR 0 still needs (a) the deep host-brain-bias audit
(postgres-engine internal singleton fallback + a few operations.ts
callers), (b) OperationContext threading to make ctx.brainId populated at
dispatch, (c) composeResolvers + composeManifests, (d) aggregated
~/.gbrain/mounts-cache/ for host-agent runtime ownership.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mounts): brains-and-sources mental model + agent routing convention

Two orthogonal axes organize GBrain knowledge. Users AND agents need to
understand both, or queries misroute silently.

  --brain  → WHICH DATABASE    (host + mounts)
  --source → WHICH REPO IN DB  (v0.18.0 sources: wiki, gstack, ...)

Both axes use the same 6-tier resolution (explicit > env > dotfile >
path-prefix > default > fallback), so learning one teaches both.
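The shared cascade can be sketched as a nullish-coalescing chain. This is a hypothetical illustration with made-up names — the real resolvers also walk parent directories for the dotfile tier and do longest-prefix matching over registered mounts:

```typescript
// Hypothetical sketch of the shared 6-tier resolution order.
// Tier 5 (reserved v2 default) is elided; tier 6 is the fallback.
interface ResolutionInputs {
  flag?: string;      // tier 1: explicit --brain / --source
  env?: string;       // tier 2: env var (e.g. GBRAIN_BRAIN_ID)
  dotfile?: string;   // tier 3: dotfile (e.g. .gbrain-mount)
  pathMatch?: string; // tier 4: longest-path match over mounts
}

function resolveId(inputs: ResolutionInputs, fallback = "host"): string {
  return inputs.flag ?? inputs.env ?? inputs.dotfile ?? inputs.pathMatch ?? fallback;
}
```

The same shape serves both axes, which is what makes learning one teach the other.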

Ships:

- docs/architecture/brains-and-sources.md — canonical mental model doc.
  Covers four topologies with ASCII diagrams:
    1. Single-person developer (one brain, one source)
    2. Personal brain with multiple repos (one brain, N sources)
    3. Personal + one team brain mount (2 brains)
    4. Senior user with multiple team memberships (N mounted team brains
       alongside personal) — the CEO-class topology
  Explicit "when to move each axis" decision table. Generic example names
  throughout per the project's privacy rule.

- skills/conventions/brain-routing.md — agent-facing decision table.
  Rules for when to switch brain (team-owned question, explicit name,
  data owner changes) vs switch source (working in a repo, topic scoped
  to one repo). Cross-brain federation is latent-space only in v0.19 —
  the agent fans out; the DB never does. Anti-patterns listed: silent
  brain jumps, writing to host when data is team-owned, missing brain
  prefix in citations, ignoring .gbrain-mount dotfiles.

- CLAUDE.md — adds "Two organizational axes (read this first)" section
  at the top pointing at both new docs.

- AGENTS.md — adds brains-and-sources.md + brain-routing.md to the
  "read this order" (positions 3 and 4, before RESOLVER.md).

- skills/RESOLVER.md — adds brain-routing.md to the Conventions section
  so it appears alongside quality.md, brain-first.md, subagent-routing.md.

No code changes. Pre-existing check-resolvable warnings unchanged (2
warnings on base unrelated to this work). 67 PR-0 tests still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): thread brainId through OperationContext + subagent chain

PR 0 plumbing for connected gbrains. Adds an optional brainId field that
identifies which database an operation targets and ensures subagents
inherit the parent job's brain instead of process-wide defaults. No
dispatch-path changes in this commit — that is PR 1 (registry wiring at
MCP + CLI entry points). The fields exist so callers can set them now
and downstream code respects them.

Changes:

- src/core/operations.ts: OperationContext grows `brainId?: string`.
  Optional for back-compat. 'host' is the implicit default when absent.
  Orthogonal to v0.18.0's source_id (source = which repo within the
  brain, brain = which database). See docs/architecture/brains-and-sources.md.

- src/core/minions/types.ts: SubagentHandlerData gains `brain_id?: string`.
  Parent jobs set this when submitting a child subagent to lock the
  child into a specific brain. Omitted = host (unchanged behavior).

- src/core/minions/handlers/subagent.ts: buildBrainTools call site
  reads data.brain_id and passes it through. Child subagents spawned
  from this handler will see the same brainId unless they override in
  their own data.

- src/core/minions/tools/brain-allowlist.ts: BuildBrainToolsOpts +
  OpContextDeps grow brainId; buildOpContext stamps it on every
  OperationContext the subagent builds for tool calls. Addresses Codex
  finding #6 (brain-allowlist hardwired parent config without brain
  awareness, so switching brain only in subagent.ts was not enough).

Tests: 166 affected tests green (subagent suite + minions + brain
registry + resolver). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): composeResolvers + composeManifests + aggregated cache

The runtime ownership seam for connected gbrains (Codex finding #3 from
plan review): check-resolvable.ts VALIDATES RESOLVER.md; it does not
DISPATCH skills. Host agents (Wintermute/OpenClaw/Claude Code) read
skills/RESOLVER.md directly to route user requests. Without an aggregated
resolver, mounted team brains cannot contribute skills to the host
agent's routing table.

This commit adds the aggregation:

- src/core/mounts-cache.ts (NEW): pure composeResolvers + composeManifests
  functions plus filesystem writers for ~/.gbrain/mounts-cache/. The
  aggregated files carry every host skill plus every mount skill,
  namespace-prefixed (e.g. `yc-media::ingest`). Host skills always beat
  a same-named mount skill (locked decision 1); bare-name collisions
  between two mounts surface as structured ambiguity info so doctor can
  warn (PR 1).

  Also addresses Codex finding #8: manifests compose alongside the
  resolver, else doctor conformance breaks on remote skills.

- src/commands/mounts.ts: refreshMountsCache() called on `mounts add`
  and `mounts remove` (the latter clearing the cache entirely when the
  last mount goes away). Uses findRepoRoot() to locate the host skills
  dir; skips with a stderr note when run outside a gbrain repo so the
  user isn't confused by a "cache not refreshed" error in the wrong
  cwd.

- test/mounts-cache.test.ts (NEW): 23 unit tests covering empty world,
  host-only, single mount, two-mount ambiguity, host-shadows-mount,
  disabled mount excluded, missing RESOLVER.md is a no-op, manifest
  composition with same-name collision, render shape, atomic rewrite,
  clear on missing dir.

Output format for ~/.gbrain/mounts-cache/RESOLVER.md adds a Brain column
so host agents can see which brain each trigger routes to at a glance,
plus Shadows and Ambiguous sections when those conditions exist.

Tests: 90 PR 0 tests green (brain-registry + resolver + mounts-cache +
mounts-cli). Full suite regression pending in task 11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): force instance-level pool for mount brains + CI guard

Closes the silent-singleton-share bug Codex flagged as finding #1 from
the plan review: two direct-transport mounts with different Postgres
URLs would both fall through postgres-engine.ts's `get sql()` getter to
db.getConnection() and quietly share whichever singleton connected
first. Your yc-media writes end up in garrys-list or vice versa. No
error at the call site — just wrong data.

The fix:

- src/core/brain-registry.ts: initMountBrain now passes poolSize when
  calling engine.connect(). That forces postgres-engine.ts:33-60 down
  the instance-level path (setting this._sql) instead of the module
  singleton path (calling db.connect). Hard-coded 5 for PR 0 — per-mount
  override is PR 1. PGLite ignores poolSize (no pool concept), so this
  is Postgres-specific.

  Host brain still uses the singleton path via initHostBrain (unchanged).
  That is fine for PR 0: the singleton is "the host's one connection"
  by definition. PR 1 removes the singleton entirely once every CLI
  command is engine-injectable.

- scripts/check-no-legacy-getconnection.sh (NEW): CI grep guard against
  new db.getConnection() / db.connect() calls landing in src/core/ or
  src/commands/ (the multi-brain dispatch surface). Has an explicit
  ALLOWED list grandfathering today's legitimate callers, each marked
  "PR 1 refactors" so the list shrinks over time. Skips comment lines
  so the grep doesn't trip on doc references to the old pattern.

- package.json: scripts.test chains the new guard after the existing
  check-jsonb-pattern + check-progress-to-stdout guards. `bun run test`
  now fails the build on singleton regression.

Tests: 295 affected pass (registry, resolver, mounts-cache, mounts-cli,
minions, pglite-engine). Typecheck clean. CI guard reports "ok: no new
singleton callers" on current tree.

Left for PR 1: remove the singleton fallback in postgres-engine.ts's
`get sql()` entirely; refactor src/commands/doctor.ts, files.ts,
repair-jsonb.ts, serve-http.ts, init.ts, and the 3 localOnly ops in
operations.ts (file_list, file_upload, file_url) to accept ctx.engine
explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mounts): codex review findings — namespace survives shadow + atomic tmp names + honest PR 0 docstrings

Codex outside-voice review on PR #372 found 5 issues. Real bugs fixed, overclaims
rewritten. Details:

P2 (real bug): composeResolvers and composeManifests were silently dropping
mount entries when a host skill shared the short name, which made the
namespace-qualified form `<mount>::<skill>` unreachable once host defined
the same short name. That defeated the entire namespace-disambiguation
model — if host had `ingest`, no mount could ship an `ingest` skill even
with explicit `yc-media::ingest`. Fix: always keep namespace-qualified
mount entries in the composed output. Shadow tracking moves to metadata
(`shadows[]`) that doctor can warn on, but never drops routing.

  Before:  host ingest + yc-media ingest → only 1 entry (host), yc-media::ingest unreachable
  After:   host ingest + yc-media ingest → 2 entries: bare `ingest` = host, `yc-media::ingest` = mount
  Verified live: gbrain mounts add of a mount with `ingest` now shows
  `team-demo::ingest` alongside host `ingest` in the aggregated manifest.
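The corrected shadow semantics can be sketched roughly as follows — hypothetical function and types, not the shipped composeResolvers (which also tracks bare-name ambiguity between two mounts as separate metadata):

```typescript
// Hypothetical sketch: host wins the bare name, but the namespaced
// mount entry always survives so explicit routing stays reachable.
function composeSkillTable(
  hostSkills: string[],
  mountSkills: Record<string, string[]>, // mountId -> skill names
): { routes: Map<string, string>; shadows: string[] } {
  const routes = new Map<string, string>(); // name -> owning brain
  const shadows: string[] = [];
  for (const s of hostSkills) routes.set(s, "host");
  for (const [mountId, skills] of Object.entries(mountSkills)) {
    for (const s of skills) {
      routes.set(`${mountId}::${s}`, mountId); // namespaced form always routes
      if (routes.has(s)) {
        shadows.push(`${mountId}::${s}`);      // bare name taken: record, never drop
      } else {
        routes.set(s, mountId);                // bare name free: mount claims it
      }
    }
  }
  return { routes, shadows };
}
```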

P1 (real bug): writeMountsFile + writeMountsCache used fixed `.tmp`
filenames. Two concurrent `gbrain mounts add` invocations (e.g. from
parallel terminals or CI) would clobber each other's temp file and
one writer's update would be lost. Fix: tmp filenames include
`process.pid + random suffix` so every writer has its own scratch file.
The atomic rename is self-contained per-writer. (Full lock + read-modify-
write safety deferred to PR 1 under `gbrain mounts sync --lock`.)

P1 (honesty): `SubagentHandlerData.brain_id` +
`BuildBrainToolsOpts.brainId` docstrings claimed child jobs inherit the
parent's brain and brain tools target the resolved brain. True for the
`ctx.brainId` field only — `ctx.engine` is still the worker's base
engine at dispatch time because `buildOpContext` doesn't yet do the
registry lookup, and `gbrain agent run` doesn't yet accept `--brain` to
populate the field on submission. Rewrote both docstrings to state the
PR 0 behavior explicitly (field plumbed, engine routing is PR 1) so
nobody reads the code thinking multi-brain subagents already work.

Also cleaned up two `require('fs')` runtime imports left over from the
initial PR — swapped for ESM named imports (renameSync). Pre-existing
style issue surfaced by the self-review pass.

Tests: 90 PR-0 tests pass. Updated two shadow-related test cases to
assert the corrected semantics (both entries survive, host wins bare
name, namespace form routes to mount).

Not fixed in this commit (documented as known PR 0 limitations):
- `file_list` / `file_upload` / `file_url` in operations.ts still hit the
  singleton (localOnly + admin, never reachable from HTTP MCP — safe in
  practice, refactor in PR 1 alongside command-level cleanups).
- writeMountsCache's two-file swap (RESOLVER.md + manifest.json) is not
  atomic across files; readers can briefly observe mismatched pairs.
  Acceptable because the cache is recomputable at any time from
  mounts.json. Generation-directory swap is PR 1 work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): bump hook timeouts for 21-migration PGLite init under full-suite load

Root cause of 19 pre-existing full-suite flakes (CHANGELOG v0.18.0 noted
"17 pre-existing master timeouts"): every PGLite test does

  beforeAll/beforeEach(async () => {
    engine = new PGLiteEngine();
    await engine.connect({});
    await engine.initSchema();  // runs 21 migrations through v0.18.2
  });

In isolation this takes ~5s. Under full-suite contention (128 files,
process-shared FS and CPU) it exceeds bun's default 5000ms hook timeout,
beforeEach times out, engine stays undefined, then afterEach crashes
with `TypeError: undefined is not an object (evaluating 'engine.disconnect')`.
That single hook failure reports as the whole test "failing" even though
the test body never executed, which is why the failure count sometimes
looked inflated compared to the number of genuinely-broken tests.

Fix applied across 7 test files:

- Raise setup hook timeout to 30_000 (6x the default) — gives migration
  init enough headroom even under worst-case load without masking real
  regressions in a post-migration test.
- Raise teardown hook timeout to 15_000 — engine.disconnect() is usually
  fast but can stall when PGLite's WASM runtime is still completing a
  migration at shutdown.
- Add `if (engine) await engine.disconnect()` guard so afterEach doesn't
  double-fault when beforeEach already failed. This was the source of
  the opaque "(unnamed)" failures — they were disconnect crashes,
  not test-body failures.

Files:
  test/dream.test.ts                (5 beforeEach + 5 afterEach blocks)
  test/orphans.test.ts              (1 pair)
  test/brain-allowlist.test.ts      (1 pair)
  test/oauth.test.ts                (1 pair)
  test/extract-db.test.ts           (1 pair)
  test/multi-source-integration.test.ts (1 pair)
  test/core/cycle.test.ts           (1 pair)

Results on the merged PR 0 branch:
  Before: 2175 pass / 20 fail / 3 errors
  After:  2281 pass /  0 fail / 0 errors    (+106 tests running that
                                             were previously blocked
                                             by the timed-out hooks)

No changes to production code. No test assertions changed. Just
timeout-bump + null-guard discipline that should have been in these
hooks from the start. The real longer-term fix is reusing an engine
across tests where possible (brain-allowlist.test.ts already does this
via beforeAll+DELETE-pages pattern), but that's per-file structural
work — out of scope for this cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt for brains-and-sources + brain-routing docs

The test/build-llms.test.ts test validates that the committed llms.txt
and llms-full.txt match the current generator output. PR 0 added
docs/architecture/brains-and-sources.md content paths and updated
CLAUDE.md + skills/RESOLVER.md in earlier commits, but the generated
bundle file wasn't regenerated alongside. This caused one of the 20
fails we chased down today — a straight content mismatch, not a runtime
bug. Running `bun run build:llms` picks up the new section content so
the bundle matches the sources again.

No functional change. Only the compiled doc bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 24, 2026
… exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)
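The lock acquisition in the first bullet can be sketched as follows — a hypothetical helper assuming Node-style fs APIs, where openSync's 'wx' flag maps to O_CREAT|O_EXCL:

```typescript
// Hypothetical sketch of an atomic PID lock with stale-file recovery.
import { openSync, writeSync, closeSync, readFileSync, unlinkSync } from "node:fs";

function acquirePidLock(path: string, pid = process.pid): boolean {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const fd = openSync(path, "wx"); // O_CREAT|O_EXCL: fails with EEXIST if present
      writeSync(fd, String(pid));
      closeSync(fd);
      return true;
    } catch (err: any) {
      if (err.code !== "EEXIST") throw err;
      const holder = Number(readFileSync(path, "utf8"));
      try {
        process.kill(holder, 0); // signal 0 = liveness probe only
        return false;            // a live supervisor holds the lock
      } catch {
        unlinkSync(path);        // stale lock: holder is gone, reclaim and retry
      }
    }
  }
  return false;
}
```

The exclusive create is what closes the race: two processes cannot both win the openSync, unlike a check-then-write sequence.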

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 24, 2026
…nager (#364)

* feat: add `gbrain jobs supervisor` — self-healing worker process manager

Adds a first-class supervisor command that:
- Spawns `gbrain jobs work` as a child process
- Restarts on crash with exponential backoff (1s→60s cap)
- Resets crash counter after 5min of stable operation
- PID file locking prevents duplicate supervisors
- Periodic health checks (stalled jobs, completion gaps)
- Graceful shutdown (SIGTERM→35s→SIGKILL)
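The backoff schedule can be sketched as exponential doubling under a cap — hypothetical helper names, consistent with the 1s→60s figures above and with the later note that calculateBackoffMs(-1) accidentally yielded 500ms before the crash-counter reset was changed:

```typescript
// Hypothetical sketch: exponential restart backoff from 1s, capped at 60s.
function calculateBackoffMs(exponent: number): number {
  return Math.min(1_000 * 2 ** exponent, 60_000);
}

function backoffForCrash(crashCount: number): number {
  return calculateBackoffMs(crashCount - 1); // crash 1 -> 1s, crash 2 -> 2s, ...
}
```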

Usage:
  gbrain jobs supervisor --concurrency 4

Replaces ad-hoc nohup patterns in bootstrap scripts.
The autopilot command's internal supervisor can be migrated
to use this in a follow-up.

Tests: 7 pass (backoff calc, PID management, crash tracking)

* supervisor: atomic PID lock, queue-scoped health, env safety, unified exit

Lane A of PR #364 review fixes (20-item multi-lane plan). Addresses the
codex-tier + CEO + Eng findings on src/core/minions/supervisor.ts:

Safety + correctness:
- Atomic O_CREAT|O_EXCL PID lock via openSync('wx') with stale-file
  liveness check. Prevents two supervisors racing on the same PID file.
  (codex #1)
- Health check now queries status='active' AND lock_until < now()
  matching queue.ts:848's authoritative stalled definition. The prior
  `status = 'stalled'` predicate returned zero rows forever because
  'stalled' is not a persisted value in the schema. (codex #2)
- All health queries scoped to WHERE queue = $1 via opts.queue binding.
  Multi-queue installs no longer see cross-queue false positives.
  (codex #3)
- Class default allowShellJobs flipped true→false AND explicit
  `delete env.GBRAIN_ALLOW_SHELL_JOBS` when false, so child workers
  don't silently inherit the var from the parent shell. (eng #8, codex #9)
- Unified shutdown(reason, exitCode) — max-crashes now routes through
  the same drain path as SIGTERM. Single source of truth for lifecycle
  cleanup; prerequisite for trustworthy audit events (Lane C). (eng #1)
- Default PID path moves from /tmp to ~/.gbrain/supervisor.pid with
  mkdirSync recursive + GBRAIN_SUPERVISOR_PID_FILE env override.
  Matches the rest of the product's ~/.gbrain/ convention; fresh
  installs no longer hit ENOENT. (CEO #2 + codex #6)

Refinements:
- crashCount = 1 after 5-min stable-run reset (was 0, produced
  calculateBackoffMs(-1) = 500ms by accident). Now reads as 'first
  crash of a new cycle' with a clean 1s backoff. (Nit 1)
- Top-of-file POSTGRES-ONLY docstring documenting why the supervisor
  can't run against PGLite. (Nit 2)
- inBackoff flag suppresses 'worker not alive' warn during the
  expected null-child window (crash → sleep → next spawn). (eng #2)
- Tracked listener refs for SIGTERM/SIGINT removed in shutdown() so
  integration tests spinning up/tearing down multiple supervisors on
  one process don't leak handlers. (eng #3)
- Single FILTER query replaces two SELECT counts — one round-trip
  instead of two, three metrics in one pass. (eng #10)
- child.on('error') listener emits worker_spawn_failed event for
  ENOENT/EACCES; exit handler still increments crashCount as usual
  so max-crashes bounds permanent misconfigurations. (codex #7)
- healthInFlight boolean guard with try/finally prevents overlapping
  health checks from stacking on a hung DB. (codex #8)

Documented exit codes (ExitCodes const):
  0 CLEAN, 1 MAX_CRASHES, 2 LOCK_HELD, 3 PID_UNWRITABLE
  Agent can branch on exit=2 ('another supervisor, I'm fine') vs
  exit=1 ('escalate to human').

Event emitter surface:
  - started / worker_spawned / worker_exited / worker_spawn_failed
  - backoff / health_warn / health_error / max_crashes_exceeded
  - shutting_down / stopped
  Plumbed through emit() with an onEvent callback hook for Lane C's
  audit writer. json:false is the default; Lane C's --json mode
  flips it and writes JSONL to stderr.

CLI changes (src/commands/jobs.ts):
- `gbrain jobs supervisor` gains --allow-shell-jobs (explicit opt-in
  mirroring the env-var gate), --cli-path (override auto-resolution
  for exotic setups), and --json (JSONL lifecycle events on stderr).
- Expanded --help body with description, 3 examples, and exit-code
  table. (DX Fix A per review)
- Three-tier PID path resolution: --pid-file > GBRAIN_SUPERVISOR_PID_FILE
  > ~/.gbrain/supervisor.pid (via exported DEFAULT_PID_FILE).
- Removed the catch-fallback to process.argv[1] — resolveGbrainCliPath()
  throws its own actionable install-hint error, which is what dev users
  need instead of a cryptic spawn failure on a .ts path. (codex #5)

Tests: existing 7 supervisor.test.ts cases continue to pass.
Integration tests (crash-restart, max-crashes, SIGTERM-during-backoff,
env-inheritance regression) land in Lane E.

Out of scope for this lane (tracked in follow-up lanes):
- Audit file writer at ~/.gbrain/audit/supervisor-YYYY-Www.jsonl (Lane C)
- Documentation pass (Lane B)
- supervisor start/status/stop subcommands (Lane C)
- gbrain doctor supervisor check (Lane D)
- /ship release hygiene (Lane F)
- autopilot.ts migration to MinionSupervisor (deferred to follow-up PR
  per codex — requires non-blocking start() API redesign, not ~30 lines)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: supervisor as canonical worker deployment pattern

Lane B of PR #364 review fixes. Reframes docs/guides/minions-deployment.md
around `gbrain jobs supervisor` as the default answer (blocker 7), deletes
the 68-line legacy bash watchdog (F10), and updates README + deployment
snippets to match.

docs/guides/minions-deployment.md:
- New 'Worker supervision' section at the top with the canonical 3-command
  agent pattern (start --detach / status --json / stop) and a documented
  exit-code table (0 clean, 1 max-crashes, 2 lock-held, 3 PID-unwritable).
- 'Which supervisor when?' decision table: container = supervisor as
  PID 1, Linux VM = systemd-over-supervisor, dev laptop = bare terminal.
- New 'Agent usage' section for OpenClaw / Hermes / Cursor / Codex — the
  3-turn discover-start-maintain workflow that replaces shell archaeology
  with machine-parseable JSON events + an audit file at
  ~/.gbrain/audit/supervisor-YYYY-Www.jsonl.
- Dropped the 'Option 1: watchdog cron' path entirely; replaced with a
  straightforward upgrade migration block (stop script, remove cron line,
  start supervisor, verify via doctor).
- Preconditions now check Postgres connectivity directly (supervisor is
  Postgres-only; the CLI rejects PGLite with a clear error).

Snippets:
- systemd.service: ExecStart now invokes `gbrain jobs supervisor` instead
  of raw `gbrain jobs work`. Two-layer supervision (systemd → supervisor
  → worker) buys automatic restart on reboot plus fast crash recovery.
  ReadWritePaths expanded to cover $HOME/.gbrain (supervisor PID + audit).
- Procfile + fly.toml.partial: same change — platform restarts the
  container on host events, supervisor restarts the worker on crashes.
- minion-watchdog.sh: deleted (git history retains it for anyone in an
  exotic deployment). Supervisor subsumes every capability it had plus
  atomic PID locking, structured audit events, queue-scoped health
  checks, and graceful drain on SIGTERM.

README.md:
- Added a paragraph under the Minions section pointing to `gbrain jobs
  supervisor` as canonical, noting the --detach / status / stop surface
  and the audit file path, with a link to the full deployment guide.
  Kept `gbrain jobs work` documented for direct raw invocation but
  flagged 'prefer supervisor' for any long-running use.

The supervisor `--help` body itself (3 examples + exit-code table in
src/commands/jobs.ts) landed with Lane A — this lane finishes the
discoverability story by making the supervisor findable via doc grep,
README landing, and deployment-guide landing paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* supervisor: daemon-manager subcommands + JSONL audit writer

Lane C of PR #364 review fixes. Adds the daemon-manager CLI surface so
agents can drive `gbrain jobs supervisor` in 3 turns instead of 10, and
the audit writer that makes lifecycle events inspectable across process
restarts. (Blocker 8, closes DX Fix A/B/C.)

New: src/core/minions/handlers/supervisor-audit.ts
  - writeSupervisorEvent(emission, supervisorPid) appends JSONL to
    `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`.
    ISO-week rotation via a `computeSupervisorAuditFilename()` helper
    that mirrors `shell-audit.ts` exactly (year-boundary ISO week math,
    Thursday anchor, etc.).
  - readSupervisorEvents({sinceMs}) returns parsed events from the
    current week's file, oldest-first, for Lane D's doctor check.
    Malformed lines are skipped silently (disk-full truncation is
    already best-effort at write time).
  - Reuses `resolveAuditDir()` from shell-audit.ts so the
    `GBRAIN_AUDIT_DIR` env var override works identically across all
    gbrain audit trails.
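The Thursday-anchor ISO-week math can be sketched like this — a hypothetical stand-in for computeSupervisorAuditFilename, shown only to illustrate the rotation rule:

```typescript
// Hypothetical sketch of ISO-week filename rotation: shift to the
// Thursday of the current week, then count weeks from that year's Jan 1.
// The Thursday anchor handles year boundaries (Jan 1 can belong to
// week 52/53 of the prior year).
function isoWeekFilename(d: Date, prefix = "supervisor"): string {
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const day = t.getUTCDay() || 7;         // Mon=1 .. Sun=7
  t.setUTCDate(t.getUTCDate() + 4 - day); // jump to this week's Thursday
  const yearStart = Date.UTC(t.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((t.getTime() - yearStart) / 86_400_000 + 1) / 7);
  return `${prefix}-${t.getUTCFullYear()}-W${String(week).padStart(2, "0")}.jsonl`;
}
```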

src/commands/jobs.ts: supervisor subcommand dispatcher
  - `gbrain jobs supervisor [start] [--detach] [--json] ...` — default
    subcommand. Without --detach, runs foreground as before. With
    --detach, forks a background child (inheriting stderr so the caller
    can still tail JSONL events), writes a stdout payload:
      {"event":"started","supervisor_pid":N,"pid_file":"...","detached":true}
    and exits 0. Stdin/stdout on the detached child are /dev/null so
    the parent shell isn't held open.
  - `gbrain jobs supervisor status [--json]` — reads the PID file,
    checks liveness via `kill -0`, then reads the last 24h from the
    supervisor audit file to compute crashes_24h / last_start /
    max_crashes_exceeded. Exits 0 if running, 1 if not. JSON output
    is machine-parseable; human output is a 5-line ASCII report.
  - `gbrain jobs supervisor stop [--json]` — reads PID, sends SIGTERM,
    polls `kill -0` every 250ms for up to 40s (supervisor's own 35s
    worker-drain + 5s slack). Reports outcome: drained / timeout_40s
    / pid_file_missing / pid_file_corrupt / process_gone. Exit 0 on
    clean stop.
  - `--json` flag is already plumbed through to the supervisor opts
    from Lane A — this lane adds the onEvent audit-writer callback
    so every supervisor emission (started, worker_spawned,
    worker_exited, worker_spawn_failed, backoff, health_warn,
    health_error, max_crashes_exceeded, shutting_down, stopped) lands
    in the JSONL file with the supervisor's PID attached.
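
The liveness probe and the stop-polling loop above boil down to a small
pattern; a hedged sketch (function names assumed) — signal 0 probes a PID
without delivering anything, and stop polls until the process is gone or
the drain window expires:

```typescript
// Signal 0 = existence check only; throws if the PID is gone.
function isAlive(pid: number): boolean {
  try { process.kill(pid, 0); return true; } catch { return false; }
}

// Poll every intervalMs until exit or timeout (maps to drained / timeout_40s).
async function waitForExit(pid: number, timeoutMs = 40_000, intervalMs = 250): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (isAlive(pid)) {
    if (Date.now() >= deadline) return false; // timeout_40s outcome
    await new Promise(r => setTimeout(r, intervalMs));
  }
  return true; // drained
}
```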

--help body updated:
  - Three separate usage lines (start / status / stop).
  - SUBCOMMANDS block with one-line summaries each.
  - EXIT CODES block (unchanged from Lane A, moved under SUBCOMMANDS).
  - EXAMPLES block updated with status --json + stop + --detach forms.

Tests: existing 127 supervisor + minions tests continue to pass.
Integration tests for the new subcommands + audit writer land with
Lane E.

Follow-up (Lane D): `gbrain doctor` will read readSupervisorEvents()
from this module to surface a `supervisor` health check alongside its
existing checks (DB connectivity, schema version, queue health).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* doctor: add supervisor health check

Lane D of PR #364 review fixes. Closes the observability loop: now that
Lane C writes supervisor lifecycle events to
`${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`,
`gbrain doctor` surfaces a `supervisor` check alongside its existing
health indicators.

Implementation (src/commands/doctor.ts, filesystem-only block 3b-bis):
- Resolves DEFAULT_PID_FILE via the same three-tier logic as the start
  path (--pid-file > GBRAIN_SUPERVISOR_PID_FILE > ~/.gbrain/supervisor.pid).
- Reads the PID file + `kill -0 <pid>` for liveness.
- Calls readSupervisorEvents({sinceMs: 24h}) from the audit module to
  derive last_start / crashes_24h / max_crashes_exceeded.
- Suppresses the check entirely when the user has never invoked the
  supervisor (no PID file AND no audit events) — avoids noise on
  installs that don't use the feature.

Status thresholds:
  fail   max_crashes_exceeded event seen in last 24h
         (supervisor gave up; operator needs to restart or triage)
  warn   supervisor not running but audit shows prior use
         (unexpected stop — likely crash or manual kill)
  warn   running but > 3 crashes in last 24h
         (supervisor recovering but worker is unstable)
  ok     running + ≤ 3 crashes + no max_crashes event

All failure paths emit a paste-ready recovery command. Read/import
errors are swallowed (best-effort like the other doctor checks).
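
The threshold table transcribes directly into a decision function; a
sketch (the function shape is assumed — doctor.ts may structure this
differently). It assumes the caller has already applied the suppression
rule above, i.e. the check only runs when there is prior supervisor use:

```typescript
type CheckStatus = "ok" | "warn" | "fail";

function supervisorCheck(running: boolean, crashes24h: number, maxCrashesExceeded: boolean): CheckStatus {
  if (maxCrashesExceeded) return "fail"; // supervisor gave up in the last 24h
  if (!running) return "warn";           // prior use, but unexpectedly stopped
  if (crashes24h > 3) return "warn";     // recovering, but the worker is unstable
  return "ok";
}
```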

Tests: all 127 supervisor + minions tests still green; 13 existing
doctor tests unaffected.

F3 done. All four lanes A/B/C/D are now committed; Lane E (integration
tests) and Lane F (/ship v0.20.2) remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: 4 critical integration tests for supervisor lifecycle

Lane E of PR #364 review fixes (blocker 10). Fills the ~15% coverage
gap flagged in the eng review by actually exercising the code paths
that will break in production — crash-restart loop, max-crashes exit,
SIGTERM-during-backoff, env-var inheritance — via real spawn() calls
against fake shell-script workers. No mocks: real fork, real signals,
real env propagation, real audit file writes.

test/fixtures/supervisor-runner.ts (new, 55 lines):
  A standalone bun script that constructs a MinionSupervisor from env
  vars (SUP_PID_FILE / SUP_CLI_PATH / SUP_MAX_CRASHES / SUP_BACKOFF_FLOOR_MS
  / SUP_HEALTH_INTERVAL_MS / SUP_ALLOW_SHELL_JOBS / SUP_AUDIT_DIR) and
  calls start(). Mock engine returns empty rows for executeRaw (health
  check path still exercised without Postgres). Tests spawn this as a
  subprocess because MinionSupervisor.start() calls process.exit() on
  shutdown — can't run it in the test runner's own process.

test/supervisor.test.ts (existing; 91 → 300 lines):
  - Added IntegrationHarness helper: creates a unique tmpdir per test,
    a fake worker shell script, a PID-file path, and an audit-dir path;
    cleanup runs in finally.
  - spawnSupervisor() forks bun on the runner with env vars set.
  - readAudit() reads the supervisor-YYYY-Www.jsonl file via the
    existing readSupervisorEvents() helper (Lane C), threading
    GBRAIN_AUDIT_DIR through so tests don't collide on ~/.gbrain.
  - waitFor(pred, timeoutMs): polling helper for event-driven tests.
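
A minimal sketch of that polling helper, assuming the signature named
above plus a default poll interval not stated in the commit:

```typescript
// Poll pred until it returns true or the timeout elapses.
async function waitFor(pred: () => boolean | Promise<boolean>, timeoutMs: number, intervalMs = 25): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await pred()) return;
    if (Date.now() >= deadline) throw new Error(`condition not met within ${timeoutMs}ms`);
    await new Promise(r => setTimeout(r, intervalMs));
  }
}
```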

Four integration tests (with _backoffFloorMs=5 for <1s suite runs):

  1. "respawns the worker after a crash and eventually exits with
     max-crashes code=1"
     Worker always `exit 1`. maxCrashes=3. Asserts: exit code 1, PID
     file cleaned up, audit contains started + 3x worker_spawned +
     3x worker_exited + max_crashes_exceeded + shutting_down + stopped,
     and the stopped event carries {reason:'max_crashes', exit_code:1}.
     Locks in blockers 1 (PID lock), 2+3+6 (health SQL doesn't 500),
     5 (unified shutdown emits right events), F8 (spawn errors counted).

  2. "receives SIGTERM while sleeping between crashes and exits 0 cleanly"
     Worker always `exit 1`, backoff floor 800ms to catch the sleep.
     Asserts: SIGTERM during backoff → exit code 0 (not 1) in <5s,
     no signal kill (process.exit via shutdown), audit contains
     shutting_down {reason:'SIGTERM'} + stopped, PID file cleaned up.
     Locks in eng Issue 1 (unified exit path), eng Issue 3 (signal
     handlers don't accumulate across shutdowns).

  3. "strips inherited GBRAIN_ALLOW_SHELL_JOBS when allowShellJobs=false,
     even if parent has it set"  ⚠ CRITICAL regression test
     Parent env has GBRAIN_ALLOW_SHELL_JOBS=1. SUP_ALLOW_SHELL_JOBS=0.
     Worker writes $GBRAIN_ALLOW_SHELL_JOBS (or 'UNSET' if absent) to
     an OUT_FILE. Asserts child sees 'UNSET'. Locks in codex #9 + eng
     #8: the `else delete env.GBRAIN_ALLOW_SHELL_JOBS` branch from
     Lane A is load-bearing for the supervisor's security posture;
     this test prevents a future refactor silently re-opening the
     inheritance hole.

  4. "DOES pass GBRAIN_ALLOW_SHELL_JOBS to child when allowShellJobs=true"
     Positive-path companion to #3. SUP_ALLOW_SHELL_JOBS=1 → worker
     sees '1'. Confirms the else-branch doesn't over-strip and that
     operators who explicitly opt in still get shell-exec enabled.

Plus two audit-format unit tests:
  - computeSupervisorAuditFilename format (regex match)
  - Year-boundary ISO week: 2027-01-01 → supervisor-2026-W53.jsonl
    (matches the shell-audit.ts pattern exactly)

Before: 7 tests covering backoff math + PID helpers (~15% behavioral
coverage per eng review).
After: 13 tests across all critical lifecycle paths (crash-restart,
max-crashes, SIGTERM, env-inheritance, audit rotation).

All 146 tests in supervisor + minions + doctor suites green in ~8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.20.2)

Lane F of PR #364 review fixes. Closes the multi-lane plan with release
hygiene: VERSION bump 0.19.0 → 0.20.2, package.json sync, CHANGELOG entry
in GStack voice with release summary + "numbers that matter" table +
"To take advantage of v0.20.2" migration block + itemized changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: escape template-literal interpolation in supervisor --help

The --help body in src/commands/jobs.ts is one big backtick template
literal. The supervisor subcommand description I added in Lane B used
both `${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}` (parsed as a template
interpolation into an undefined variable) and inline `code` backticks
(parsed as nested template literals). CI caught it with ~200 tsc parse
errors across the file.

Fix:
- Escape `${...}` → `\${...}` so the audit-file path renders literally.
- Replace prose inline-code backticks with plain single-quote fences
  (`gbrain jobs work` → 'gbrain jobs work', `~/.gbrain/supervisor.pid`
  → ~/.gbrain/supervisor.pid). `--help` output is human prose; the
  single-quote form reads cleanly in a terminal without needing to
  smuggle nested backticks through a template literal.
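
For reference, the escaped form really does render literally — a tiny
demonstration (the help string itself is illustrative, not the actual
jobs.ts body):

```typescript
// Escaped `\${...}` inside a template literal is plain text, not interpolation,
// so the shell-style default expansion survives verbatim in --help output.
const helpLine = `Audit files land in \${GBRAIN_AUDIT_DIR:-~/.gbrain/audit}/supervisor-YYYY-Www.jsonl`;
```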

`bunx tsc --noEmit` is clean. 146 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt after Lane B doc rewrite

CI drift guard caught that `llms-full.txt` didn't match the current
generator output. Root cause: the Lane B rewrite of
`docs/guides/minions-deployment.md` (supervisor as canonical, watchdog
deleted) changed content that gets inlined into `llms-full.txt`, but I
didn't run `bun run build:llms` to regenerate.

`bun test test/build-llms.test.ts` now clean (7/7 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: root <root@localhost>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request Apr 26, 2026
Bumped 0.22.0 → 0.26.0 to slot above master's v0.21 chain with headroom
for v0.23/0.24/0.25 to ship from master between now and merge.

Security fixes (all from CSO finding writeups):

#1 cookie-parser middleware — admin dashboard auth was silently broken.
   Express 5 has no built-in cookie parsing; req.cookies was always
   undefined, so /admin/login set the cookie but every subsequent admin
   API call returned 401. Added cookie-parser@^1.4.7 + @types/cookie-parser
   as direct + dev deps. app.use(cookieParser()) wired before CORS.

#2 + #3 TOCTOU races — exchangeAuthorizationCode and exchangeRefreshToken
   used SELECT-then-DELETE, letting concurrent requests with the same
   code/refresh both pass the SELECT before either ran DELETE, both
   issuing token pairs. Switched to atomic DELETE...RETURNING. RFC 6749
   §10.5 (codes) + §10.4 (refresh detection) violations closed. Added
   regression tests that fire 10 concurrent exchanges and assert exactly
   one wins — both pass.
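
   An in-memory analogue of the exactly-one-winner property the regression
   tests assert (the real fix is a single atomic SQL statement against
   Postgres; this sketch only demonstrates why take-and-delete in one step
   closes the race):

```typescript
// Analogue of DELETE ... RETURNING: get+delete happens in one step, so
// only one of N racing exchanges can observe the row. (SELECT-then-DELETE
// lets several callers pass the SELECT before any DELETE runs.)
const codes = new Map<string, { clientId: string }>();

function exchangeCode(code: string): { clientId: string } | null {
  const row = codes.get(code) ?? null;
  if (row) codes.delete(code);
  return row;
}

codes.set("auth-code-1", { clientId: "agent-a" });
const winners = Array.from({ length: 10 }, () => exchangeCode("auth-code-1")).filter(Boolean);
```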

#5 pgArray escape + DCR redirect_uri validation — pgArray() did
   `arr.join(',')` with no escaping, so an element containing a comma
   would be parsed by Postgres as TWO array elements. With --enable-dcr
   on, this could smuggle a second redirect_uri into a registered client
   and steal auth codes. Now every element is double-quoted with `"` and
   `\` escaped. Added validateRedirectUri() per RFC 6749 §3.1.2.1:
   redirect_uris must be https:// or loopback (localhost / 127.0.0.1).
   Wired into the DCR registerClient path; CLI registration trusts the
   operator and bypasses. Regression test confirms a comma-in-URI element
   round-trips as 1 element, not 2.
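
   Hedged sketches of both fixes (signatures assumed from this summary):
   every array element is double-quoted with `\` and `"` escaped, and
   redirect URIs must be https or plaintext-loopback:

```typescript
// Double-quote each element, escaping `\` then `"`, so an embedded comma
// cannot split one element into two in the Postgres array literal.
function pgArray(arr: string[]): string {
  const quoted = arr.map(el => `"${el.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`);
  return `{${quoted.join(",")}}`;
}

// RFC 6749 §3.1.2.1: https always allowed; http only for loopback hosts.
function validateRedirectUri(uri: string): boolean {
  let u: URL;
  try { u = new URL(uri); } catch { return false; }
  if (u.protocol === "https:") return true;
  return u.protocol === "http:" && (u.hostname === "localhost" || u.hostname === "127.0.0.1");
}
```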

#6 --public-url flag — issuerUrl was hardcoded to http://localhost:{port}.
   Behind reverse proxies / ngrok / production deploys, the issuer claim
   in tokens wouldn't match the discovery URL clients hit (RFC 8414 §3.3).
   New --public-url URL flag on `gbrain serve --http`, propagates through
   serve.ts → serve-http.ts → ServeHttpOptions.publicUrl → issuerUrl.
   Startup banner surfaces the configured issuer.

Findings #4 (admin requests filter dead code), #7 (admin register-client
hardcoded grant_types), #8 (legacy token grandfathering posture) are
documentation / minor functional fixes and are deferred per user direction.

Tests: oauth.test.ts now 34 cases (was 27). 7 new:
- single-use TOCTOU regression (10 concurrent code exchanges)
- single-use TOCTOU regression (10 concurrent refresh exchanges)
- redirect_uri http://localhost passes
- redirect_uri https://example.com passes
- redirect_uri http://example.com (non-loopback plaintext) rejected
- redirect_uri non-URL rejected
- redirect_uri with embedded comma stored as single element

Files:
- VERSION, package.json: 0.22.0 → 0.26.0
- CHANGELOG.md: heading + table + "To take advantage" + "pre-v0.22" → v0.26;
  new "Security hardening (post-/cso pass)" subsection at top of itemized
  changes; CLI flag list updated for --public-url.
- src/core/oauth-provider.ts: pgArray escape, validateRedirectUri,
  registerClient enforces validation, DELETE...RETURNING in
  exchangeAuthorizationCode + exchangeRefreshToken.
- src/commands/serve-http.ts: cookie-parser import + wire-up,
  publicUrl option, issuerUrl honors it, startup banner shows issuer.
- src/commands/serve.ts: parses --public-url and threads through.
- src/cli.ts: help text adds --public-url URL flag.
- test/oauth.test.ts: +7 regression tests (now 34 total).
- llms-full.txt: regenerated.

Typecheck clean. 34 oauth + 14 cli tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
momoiicom mentioned this pull request Apr 28, 2026
garrytan added a commit that referenced this pull request May 3, 2026
…oard (#358)

* feat: OAuth 2.1 schema tables + shared token utilities

Add oauth_clients, oauth_tokens, oauth_codes tables to both PGLite and
Postgres schemas. Migration v5 creates tables for existing databases.
PGLite now includes auth infrastructure (access_tokens, mcp_request_log,
OAuth tables) because `serve --http` makes it network-accessible.

Extract hashToken() and generateToken() to src/core/utils.ts for DRY
reuse across auth.ts and oauth-provider.ts.
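
Plausible shapes for those two helpers (exact signatures are assumptions;
the commit only names them): tokens are high-entropy random strings, and
only their SHA-256 digests are ever stored.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Random bearer token; base64url keeps it URL/header safe.
function generateToken(bytes = 32): string {
  return randomBytes(bytes).toString("base64url");
}

// One-way digest stored in place of the raw token.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}
```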

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GBrainOAuthProvider — MCP SDK OAuthServerProvider implementation

Implements OAuthServerProvider backed by raw SQL (PGLite or Postgres).
Supports client credentials, authorization code with PKCE, token refresh
with rotation, revocation, and legacy access_tokens fallback.

Key decisions from eng review:
- Uses raw SQL connection, not BrainEngine (OAuth is infrastructure)
- All tokens/secrets SHA-256 hashed before storage
- Legacy tokens grandfathered as read+write+admin
- sweepExpiredTokens() wrapped in try/catch (non-blocking startup)
- Client credentials: no refresh token per RFC 6749 §4.4.3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: scope + localOnly annotations on all 30 operations

Add AuthInfo, scope ('read'|'write'|'admin'), and localOnly fields to
Operation interface. Per-operation audit:
- 14 read ops, 9 write ops, 2 admin ops, 4 admin+localOnly ops
- sync_brain, file_upload, file_list, file_url: admin + localOnly
- Scope enforcement happens in serve-http.ts before handler dispatch
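
A sketch of the pre-dispatch check (flat scopes assumed, since the audit
above counts read/write/admin as distinct grants rather than a hierarchy;
the actual serve-http.ts logic may differ):

```typescript
type Scope = "read" | "write" | "admin";
interface OpMeta { scope: Scope; localOnly?: boolean }

// localOnly operations are never reachable over HTTP; otherwise the
// token's granted scopes must include the operation's scope.
function authorize(granted: Set<Scope>, op: OpMeta, transport: "stdio" | "http"): boolean {
  if (op.localOnly && transport === "http") return false;
  return granted.has(op.scope);
}
```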

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: HTTP MCP server with OAuth 2.1 + 27 OAuth tests

gbrain serve --http starts Express 5 server with:
- MCP SDK mcpAuthRouter (authorize, token, register, revoke endpoints)
- Custom client_credentials handler (SDK doesn't support CC grant)
- Bearer auth + scope enforcement on /mcp tool calls
- Admin dashboard auth via HTTP-only cookie + bootstrap token
- SSE live activity feed at /admin/events
- DCR default OFF (--enable-dcr to enable)
- Rate limiting on /token (50/15min)
- localOnly operations excluded from HTTP

CLI: gbrain serve --http [--port 3131] [--token-ttl 3600] [--enable-dcr]

Dependencies: express@5.2.1, express-rate-limit@7.5.1, cors@2.8.6
SDK pinned to exact 1.29.0 (was ^1.0.0)

27 new tests covering OAuth provider, scope enforcement, auth code flow,
refresh rotation, token revocation, legacy fallback, and sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: React admin dashboard — 7 screens, dark theme, Krug-designed

Admin SPA at /admin with client-side routing (#login, #dashboard,
#agents, #log). Built with Vite + React, served from admin/dist/.

Screens:
- Login: one field, one button, zero happy talk
- Dashboard: metrics bar, SSE live activity feed, token health panel
- Agents: table with scopes/badges, + Register Agent button
- Register: modal form (name, scopes), 3 mindless choices
- Credentials: full-screen modal, copy buttons, download JSON, warning
- Request Log: paginated table (50/page), time-relative timestamps
- Agent Detail: slide-out drawer, config export tabs (Perplexity/Claude/JSON)

Design tokens: #0a0a0f bg, Inter + JetBrains Mono, 4-32px spacing.
Build: bun run build:admin (Vite, 65KB gzipped).
Admin API: /admin/api/register-client endpoint for dashboard registration.
SPA serving: Express static + index.html fallback for client-side routing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: add admin SPA lockfile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.0.0.0)

Milestone release: multi-agent GBrain with OAuth 2.1, HTTP server,
and React admin dashboard. See CHANGELOG.md for details.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: update project documentation for v1.0.0.0

Sync README, CLAUDE.md, and docs/mcp/ with the OAuth 2.1 + HTTP server
+ admin dashboard surface that shipped in v1.0.0.0.

- README.md: new "Remote MCP with OAuth 2.1" section covering
  gbrain serve --http, admin dashboard, scoped operations, legacy
  bearer fallback; add serve --http + auth notes to the commands
  reference.
- CLAUDE.md: add src/commands/serve-http.ts, src/core/oauth-provider.ts,
  admin/ directory as key files; document scope + localOnly additions
  to Operation contract; add oauth.test.ts (27 cases) to the test list;
  add v1.0.0 key-commands section clarifying that OAuth client
  registration is via the /admin dashboard or SDK (no CLI subcommand).
- docs/mcp/DEPLOY.md: promote --http as the recommended remote path,
  add OAuth 2.1 Setup section, list ChatGPT in supported clients,
  remove the "not yet implemented" footer.
- docs/mcp/CHATGPT.md (new): unblocks the P0 TODO. Full ChatGPT
  connector setup via OAuth 2.1 + PKCE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: wire gbrain auth subcommand with OAuth register-client

Previously auth.ts was a standalone script invoked via
`bun run src/commands/auth.ts`. CHANGELOG and README documented
`gbrain auth ...` commands that didn't actually work.

- Export `runAuth(args)` from auth.ts (keeps standalone entry intact
  via `import.meta.url === file://${process.argv[1]}` check)
- Add `auth` to CLI_ONLY + dispatch in handleCliOnly
- New subcommand `gbrain auth register-client <name> [--grant-types]
  [--scopes]` wraps GBrainOAuthProvider.registerClientManual
- Lazy DB check: only subcommands that need DATABASE_URL error out

Now the documented CLI flow works end to end:
  gbrain auth register-client perplexity --grant-types client_credentials --scopes "read write"
  gbrain serve --http --port 3131

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: reflect wired gbrain auth register-client CLI

After /ship, the doc subagent wrote docs assuming `gbrain auth
register-client` did not exist (it said so explicitly in CLAUDE.md:184).
A follow-up commit (c4a86ce) wired it into src/cli.ts + src/commands/auth.ts.
These docs were now contradicting reality.

- CLAUDE.md: removed "There is no gbrain auth register-client CLI
  subcommand" claim, documented the three registration paths
  (CLI / dashboard / SDK).
- README.md: replaced `bun run src/commands/auth.ts` hint with
  `gbrain auth create|list|revoke|test` and `gbrain auth register-client`.
- docs/mcp/DEPLOY.md: added CLI registration example above the
  programmatic example.
- TODOS.md: moved "ChatGPT MCP support (OAuth 2.1)" P0 item to
  Completed with v1.0.0.0 completion note. Closes the P0 that had been
  blocking the "every AI client" promise since v0.6.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: enable RLS on OAuth tables + loosen v24-exact test assertion

CI Tier 1 (Mechanical) was failing on 4 E2E tests after the v0.18.1 RLS
hardening landed on master (PR #343). Our v25 oauth_infrastructure migration
adds 3 new public tables (oauth_clients, oauth_tokens, oauth_codes) but
didn't enable RLS, so gbrain doctor's new check flagged them and the
"RLS on every public table" assertion failed.

Fixes:
- src/schema.sql: ALTER TABLE ... ENABLE ROW LEVEL SECURITY for the 3 OAuth
  tables inside the existing BYPASSRLS-gated DO block (fresh installs).
- src/core/migrate.ts v25: append a BYPASSRLS-gated DO block after the OAuth
  CREATE TABLE statements (existing installs on upgrade). Mirrors the v24
  rls_backfill gating pattern — RAISE WARNING if the current role lacks
  BYPASSRLS, so migrations don't silently lock the operator out.
- src/core/schema-embedded.ts: regenerated via `bun run build:schema`.
- test/e2e/mechanical.test.ts: one unrelated v24 test asserted the post-
  migration version equals exactly '24'. That breaks when any later
  migration exists (like our v25). Relaxed to `>= 24` since the test's
  intent is "v24 didn't abort the chain", not "v24 is the final version".

Verified locally: 78/78 E2E tests pass against real Postgres 16 + pgvector.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: regenerate llms-full.txt for v1.0.0 docs

CI test/build-llms.test.ts > committed llms.txt + llms-full.txt match
current generator output failed. The committed llms-full.txt was built
before the v1.0.0 doc updates landed (OAuth 2.1 README section, new
docs/mcp/CHATGPT.md, CLAUDE.md serve-http references, etc.), so the
regen-drift guard flagged it.

Ran `bun run build:llms`. llms.txt is unchanged (skinny index still
matches); llms-full.txt picks up 166 net-new lines of bundled content.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* connected-gbrains PR 0 — minimal runtime (mounts, registry, aggregated RESOLVER) (#372)

* feat(mounts): connected-gbrains PR 0 foundation — registry + resolver + CLI

Lays the foundation for connected gbrains (v0.19.0) per the approved plan.
This is PR 0 — minimal runtime for direct-transport, path-mounted brains.

What this slice ships:
- src/core/brain-registry.ts — keyed BrainRegistry with lazy engine init,
  schema-validated mounts.json loader, DuplicateMountPathError (load-bearing
  identity check per Codex finding #9 correction), UnknownBrainError with
  actionable available-id list. Pure: no AsyncLocalStorage, no singleton
  mutation. ~280 LOC.

- src/core/brain-resolver.ts — 6-tier brain-id resolution mirroring
  v0.18.0's source-resolver.ts so agents learn ONE mental model:
    1. --brain <id>     2. GBRAIN_BRAIN_ID env      3. .gbrain-mount dotfile
    4. longest-path match over registered mounts    5. (reserved v2 default)
    6. 'host' fallback
  Orthogonal to --source: --brain picks which DB, --source picks the repo
  within that DB. Corruption-resistant: mounts.json load failures fall
  through to 'host' instead of breaking every CLI invocation.
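
  The priority chain condenses to a few lines; a hedged sketch (input
  shape and function name assumed — the real resolver reads env, dotfiles,
  and mounts.json itself):

```typescript
interface ResolveInputs {
  flag?: string;                           // 1. --brain <id>
  env?: string;                            // 2. GBRAIN_BRAIN_ID
  dotfile?: string;                        // 3. nearest .gbrain-mount
  mounts: { id: string; path: string }[];  // 4. longest-path match
  cwd: string;
}

function resolveBrainId(i: ResolveInputs): string {
  if (i.flag) return i.flag;
  if (i.env) return i.env;
  if (i.dotfile) return i.dotfile;
  // Longest-prefix wins; the "/" suffix guards against sibling-path
  // false positives (/w/teamX must not match mount /w/team).
  const hit = i.mounts
    .filter(m => i.cwd === m.path || i.cwd.startsWith(m.path + "/"))
    .sort((a, b) => b.path.length - a.path.length)[0];
  return hit ? hit.id : "host";            // 6. fallback (tier 5 reserved)
}
```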

- src/commands/mounts.ts — `gbrain mounts add|list|remove` (direct transport
  only). Validates on add (path exists on disk, id regex, no dupes). WARNS
  but does not block on same db_url/db_path across ids (teams may
  legitimately alias a remote brain). Password redaction in list output.
  Atomic write via temp+rename. 0600 perms. PR 1 adds pin/sync/enable;
  PR 2 adds --mcp-url + OAuth.

- src/cli.ts — wires `gbrain mounts` into handleCliOnly (no DB required
  for the config-only subcommands).

- test/brain-registry.test.ts (28 cases): schema validation across every
  malformed-input branch, ALS-free resolution, duplicate id + path detection,
  disabled-mount exclusion, UnknownBrainError context.

- test/brain-resolver.test.ts (22 cases): priority order (explicit > env >
  dotfile > path-prefix > fallback), dotfile walk-up, malformed dotfile
  recovery, longest-prefix match, sibling-path false-positive guard,
  loader-failure defense.

- test/mounts-cli.test.ts (17 cases): parseAddArgs surface, redactUrl,
  atomic write, add/list/remove roundtrip via temp HOME.

67 new tests, all green. Typecheck clean. Depends on mcp-key-mgmt (base
branch) for the OAuth/scope annotations that PR 2 will leverage.

Next in this branch: PR 0 still needs (a) the deep host-brain-bias audit
(postgres-engine internal singleton fallback + a few operations.ts
callers), (b) OperationContext threading to make ctx.brainId populated at
dispatch, (c) composeResolvers + composeManifests, (d) aggregated
~/.gbrain/mounts-cache/ for host-agent runtime ownership.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mounts): brains-and-sources mental model + agent routing convention

Two orthogonal axes organize GBrain knowledge. Users AND agents need to
understand both, or queries misroute silently.

  --brain  → WHICH DATABASE    (host + mounts)
  --source → WHICH REPO IN DB  (v0.18.0 sources: wiki, gstack, ...)

Both axes use the same 6-tier resolution (explicit > env > dotfile >
path-prefix > default > fallback), so learning one teaches both.

Ships:

- docs/architecture/brains-and-sources.md — canonical mental model doc.
  Covers four topologies with ASCII diagrams:
    1. Single-person developer (one brain, one source)
    2. Personal brain with multiple repos (one brain, N sources)
    3. Personal + one team brain mount (2 brains)
    4. Senior user with multiple team memberships (N mounted team brains
       alongside personal) — the CEO-class topology
  Explicit "when to move each axis" decision table. Generic example names
  throughout per the project's privacy rule.

- skills/conventions/brain-routing.md — agent-facing decision table.
  Rules for when to switch brain (team-owned question, explicit name,
  data owner changes) vs switch source (working in a repo, topic scoped
  to one repo). Cross-brain federation is latent-space only in v0.19 —
  the agent fans out; the DB never does. Anti-patterns listed: silent
  brain jumps, writing to host when data is team-owned, missing brain
  prefix in citations, ignoring .gbrain-mount dotfiles.

- CLAUDE.md — adds "Two organizational axes (read this first)" section
  at the top pointing at both new docs.

- AGENTS.md — adds brains-and-sources.md + brain-routing.md to the
  "read this order" (positions 3 and 4, before RESOLVER.md).

- skills/RESOLVER.md — adds brain-routing.md to the Conventions section
  so it appears alongside quality.md, brain-first.md, subagent-routing.md.

No code changes. Pre-existing check-resolvable warnings unchanged (2
warnings on base unrelated to this work). 67 PR-0 tests still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): thread brainId through OperationContext + subagent chain

PR 0 plumbing for connected gbrains. Adds an optional brainId field that
identifies which database an operation targets and ensures subagents
inherit the parent job's brain instead of process-wide defaults. No
dispatch-path changes in this commit — that is PR 1 (registry wiring at
MCP + CLI entry points). The fields exist so callers can set them now
and downstream code respects them.

Changes:

- src/core/operations.ts: OperationContext grows `brainId?: string`.
  Optional for back-compat. 'host' is the implicit default when absent.
  Orthogonal to v0.18.0's source_id (source = which repo within the
  brain, brain = which database). See docs/architecture/brains-and-sources.md.

- src/core/minions/types.ts: SubagentHandlerData gains `brain_id?: string`.
  Parent jobs set this when submitting a child subagent to lock the
  child into a specific brain. Omitted = host (unchanged behavior).

- src/core/minions/handlers/subagent.ts: buildBrainTools call site
  reads data.brain_id and passes it through. Child subagents spawned
  from this handler will see the same brainId unless they override in
  their own data.

- src/core/minions/tools/brain-allowlist.ts: BuildBrainToolsOpts +
  OpContextDeps grow brainId; buildOpContext stamps it on every
  OperationContext the subagent builds for tool calls. Addresses Codex
  finding #6 (brain-allowlist hardwired parent config without brain
  awareness, so switching brain only in subagent.ts was not enough).

Tests: 166 affected tests green (subagent suite + minions + brain
registry + resolver). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): composeResolvers + composeManifests + aggregated cache

The runtime ownership seam for connected gbrains (Codex finding #3 from
plan review): check-resolvable.ts VALIDATES RESOLVER.md; it does not
DISPATCH skills. Host agents (Wintermute/OpenClaw/Claude Code) read
skills/RESOLVER.md directly to route user requests. Without an aggregated
resolver, mounted team brains cannot contribute skills to the host
agent's routing table.

This commit adds the aggregation:

- src/core/mounts-cache.ts (NEW): pure composeResolvers + composeManifests
  functions plus filesystem writers for ~/.gbrain/mounts-cache/. The
  aggregated files carry every host skill plus every mount skill,
  namespace-prefixed (e.g. `yc-media::ingest`). Host skills always beat
  a same-named mount skill (locked decision 1); bare-name collisions
  between two mounts surface as structured ambiguity info so doctor can
  warn (PR 1).

  Also addresses Codex finding #8: manifests compose alongside the
  resolver, else doctor conformance breaks on remote skills.

- src/commands/mounts.ts: refreshMountsCache() called on `mounts add`
  and `mounts remove` (the latter clearing the cache entirely when the
  last mount goes away). Uses findRepoRoot() to locate the host skills
  dir; skips with a stderr note when run outside a gbrain repo so the
  user isn't confused by a "cache not refreshed" error in the wrong
  cwd.

- test/mounts-cache.test.ts (NEW): 23 unit tests covering empty world,
  host-only, single mount, two-mount ambiguity, host-shadows-mount,
  disabled mount excluded, missing RESOLVER.md is a no-op, manifest
  composition with same-name collision, render shape, atomic rewrite,
  clear on missing dir.

Output format for ~/.gbrain/mounts-cache/RESOLVER.md adds a Brain column
so host agents can see which brain each trigger routes to at a glance,
plus Shadows and Ambiguous sections when those conditions exist.

Tests: 90 PR 0 tests green (brain-registry + resolver + mounts-cache +
mounts-cli). Full suite regression pending in task 11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mounts): force instance-level pool for mount brains + CI guard

Closes the silent-singleton-share bug Codex flagged as finding #1 from
the plan review: two direct-transport mounts with different Postgres
URLs would both fall through postgres-engine.ts's `get sql()` getter to
db.getConnection() and quietly share whichever singleton connected
first. Your yc-media writes end up in garrys-list or vice versa. No
error at the call site — just wrong data.

The fix:

- src/core/brain-registry.ts: initMountBrain now passes poolSize when
  calling engine.connect(). That forces postgres-engine.ts:33-60 down
  the instance-level path (setting this._sql) instead of the module
  singleton path (calling db.connect). Hard-coded 5 for PR 0 — per-mount
  override is PR 1. PGLite ignores poolSize (no pool concept), so this
  is Postgres-specific.

  Host brain still uses the singleton path via initHostBrain (unchanged).
  That is fine for PR 0: the singleton is "the host's one connection"
  by definition. PR 1 removes the singleton entirely once every CLI
  command is engine-injectable.

- scripts/check-no-legacy-getconnection.sh (NEW): CI grep guard against
  new db.getConnection() / db.connect() calls landing in src/core/ or
  src/commands/ (the multi-brain dispatch surface). Has an explicit
  ALLOWED list grandfathering today's legitimate callers, each marked
  "PR 1 refactors" so the list shrinks over time. Skips comment lines
  so the grep doesn't trip on doc references to the old pattern.

- package.json: scripts.test chains the new guard after the existing
  check-jsonb-pattern + check-progress-to-stdout guards. `bun run test`
  now fails the build on singleton regression.

Tests: 295 affected pass (registry, resolver, mounts-cache, mounts-cli,
minions, pglite-engine). Typecheck clean. CI guard reports "ok: no new
singleton callers" on current tree.

Left for PR 1: remove the singleton fallback in postgres-engine.ts's
`get sql()` entirely; refactor src/commands/doctor.ts, files.ts,
repair-jsonb.ts, serve-http.ts, init.ts, and the 3 localOnly ops in
operations.ts (file_list, file_upload, file_url) to accept ctx.engine
explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(mounts): codex review findings — namespace survives shadow + atomic tmp names + honest PR 0 docstrings

Codex outside-voice review on PR #372 found 5 issues. Real bugs fixed, overclaims
rewritten. Details:

P2 (real bug): composeResolvers and composeManifests were silently dropping
mount entries when a host skill shared the short name, which made the
namespace-qualified form `<mount>::<skill>` unreachable once host defined
the same short name. That defeated the entire namespace-disambiguation
model — if host had `ingest`, no mount could ship an `ingest` skill even
with explicit `yc-media::ingest`. Fix: always keep namespace-qualified
mount entries in the composed output. Shadow tracking moves to metadata
(`shadows[]`) that doctor can warn on, but never drops routing.

  Before:  host ingest + yc-media ingest → only 1 entry (host), yc-media::ingest unreachable
  After:   host ingest + yc-media ingest → 2 entries: bare `ingest` = host, `yc-media::ingest` = mount
  Verified live: gbrain mounts add of a mount with `ingest` now shows
  `team-demo::ingest` alongside host `ingest` in the aggregated manifest.
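The corrected compose semantics, sketched with hypothetical shapes (the real composeResolvers/composeManifests operate on richer entries; whether an unshadowed mount skill also gets a bare alias is a sketch assumption):

```typescript
type Composed = { entries: Map<string, string>; shadows: string[] };

function composeSkills(
  host: Map<string, string>, // skill name -> handler id
  mounts: Map<string, Map<string, string>>, // mount name -> its skills
): Composed {
  const entries = new Map(host); // bare names: host always wins
  const shadows: string[] = [];
  for (const [mount, skills] of mounts) {
    for (const [skill, handler] of skills) {
      entries.set(`${mount}::${skill}`, handler); // never dropped
      if (host.has(skill)) {
        shadows.push(`${mount}::${skill}`); // metadata for doctor to warn on
      } else if (!entries.has(skill)) {
        entries.set(skill, handler); // sketch assumption: unshadowed skills get a bare alias
      }
    }
  }
  return { entries, shadows };
}
```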

P1 (real bug): writeMountsFile + writeMountsCache used fixed `.tmp`
filenames. Two concurrent `gbrain mounts add` invocations (e.g. from
parallel terminals or CI) would clobber each other's temp file and
one writer's update would be lost. Fix: tmp filenames include
`process.pid + random suffix` so every writer has its own scratch file.
The atomic rename is self-contained per-writer. (Full lock + read-modify-
write safety deferred to PR 1 under `gbrain mounts sync --lock`.)
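The naming scheme in isolation — a sketch with a hypothetical helper name (the real code inlines this in writeMountsFile/writeMountsCache):

```typescript
import { randomBytes } from "node:crypto";

// pid distinguishes concurrent processes; the random suffix distinguishes
// concurrent writers inside one process (and guards against recycled pids).
// Each writer renames its own scratch file into place, so the atomic
// rename is self-contained per-writer.
function tmpPathFor(target: string): string {
  return `${target}.tmp.${process.pid}.${randomBytes(4).toString("hex")}`;
}
```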

P1 (honesty): `SubagentHandlerData.brain_id` +
`BuildBrainToolsOpts.brainId` docstrings claimed child jobs inherit the
parent's brain and brain tools target the resolved brain. True for the
`ctx.brainId` field only — `ctx.engine` is still the worker's base
engine at dispatch time because `buildOpContext` doesn't yet do the
registry lookup, and `gbrain agent run` doesn't yet accept `--brain` to
populate the field on submission. Rewrote both docstrings to state the
PR 0 behavior explicitly (field plumbed, engine routing is PR 1) so
nobody reads the code thinking multi-brain subagents already work.

Also cleaned up two `require('fs')` runtime imports left over from the
initial PR — swapped for ESM named imports (renameSync). Pre-existing
style issue surfaced by the self-review pass.

Tests: 90 PR-0 tests pass. Updated two shadow-related test cases to
assert the corrected semantics (both entries survive, host wins bare
name, namespace form routes to mount).

Not fixed in this commit (documented as known PR 0 limitations):
- `file_list` / `file_upload` / `file_url` in operations.ts still hit the
  singleton (localOnly + admin, never reachable from HTTP MCP — safe in
  practice, refactor in PR 1 alongside command-level cleanups).
- writeMountsCache's two-file swap (RESOLVER.md + manifest.json) is not
  atomic across files; readers can briefly observe mismatched pairs.
  Acceptable because the cache is recomputable at any time from
  mounts.json. Generation-directory swap is PR 1 work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): bump hook timeouts for 21-migration PGLite init under full-suite load

Root cause of 19 pre-existing full-suite flakes (CHANGELOG v0.18.0 noted
"17 pre-existing master timeouts"): every PGLite test does

  beforeAll/beforeEach(async () => {
    engine = new PGLiteEngine();
    await engine.connect({});
    await engine.initSchema();  // runs 21 migrations through v0.18.2
  });

In isolation this takes ~5s. Under full-suite contention (128 files,
process-shared FS and CPU) it exceeds bun's default 5000ms hook timeout,
beforeEach times out, engine stays undefined, then afterEach crashes
with `TypeError: undefined is not an object (evaluating 'engine.disconnect')`.
That single hook failure reports as the whole test "failing" even though
the test body never executed, which is why the failure count sometimes
looked inflated compared to the number of genuinely broken tests.

Fix applied across 7 test files:

- Raise setup hook timeout to 30_000 (6x the default) — gives migration
  init enough headroom even under worst-case load without masking real
  regressions in a post-migration test.
- Raise teardown hook timeout to 15_000 — engine.disconnect() is usually
  fast but can stall when PGLite's WASM runtime is still completing a
  migration at shutdown.
- Add `if (engine) await engine.disconnect()` guard so afterEach doesn't
  double-fault when beforeEach already failed. This was the source of
  the opaque "(unnamed)" failures — they were disconnect crashes,
  not test-body failures.
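The guard half of the fix, in isolation — a sketch with a fake engine (`teardown` stands in for the afterEach body; the timeout bumps are separate and not shown):

```typescript
interface TestEngine {
  disconnect(): Promise<void>;
}

// Hook state: beforeEach assigns this; if beforeEach times out, it stays
// undefined and an unguarded teardown would throw
// "undefined is not an object (evaluating 'engine.disconnect')".
let engine: TestEngine | undefined;

async function teardown(): Promise<void> {
  // The null-guard: skip disconnect when setup never completed, so the
  // original timeout surfaces instead of an opaque TypeError.
  if (engine) await engine.disconnect();
}
```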

Files:
  test/dream.test.ts                (5 beforeEach + 5 afterEach blocks)
  test/orphans.test.ts              (1 pair)
  test/brain-allowlist.test.ts      (1 pair)
  test/oauth.test.ts                (1 pair)
  test/extract-db.test.ts           (1 pair)
  test/multi-source-integration.test.ts (1 pair)
  test/core/cycle.test.ts           (1 pair)

Results on the merged PR 0 branch:
  Before: 2175 pass / 20 fail / 3 errors
  After:  2281 pass /  0 fail / 0 errors    (+106 tests running that
                                             were previously blocked
                                             by the timed-out hooks)

No changes to production code. No test assertions changed. Just
timeout-bump + null-guard discipline that should have been in these
hooks from the start. The real longer-term fix is reusing an engine
across tests where possible (brain-allowlist.test.ts already does this
via beforeAll+DELETE-pages pattern), but that's per-file structural
work — out of scope for this cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate llms-full.txt for brains-and-sources + brain-routing docs

The test/build-llms.test.ts test validates that the committed llms.txt
and llms-full.txt match the current generator output. PR 0 added
docs/architecture/brains-and-sources.md content paths and updated
CLAUDE.md + skills/RESOLVER.md in earlier commits, but the generated
bundle file wasn't regenerated alongside. This caused one of the 20
fails we chased down today — a straight content mismatch, not a runtime
bug. Running `bun run build:llms` picks up the new section content so
the bundle matches the sources again.

No functional change. Only the compiled doc bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Bump version 1.0.0.0 → 0.22.0

OAuth + admin dashboard is meaningful but doesn't quite warrant the
major-version reset to 1.0. Renumber as v0.22.0, slotting cleanly above
master's v0.21.0 (Cathedral II).

Touched:
- VERSION, package.json: 1.0.0.0 → 0.22.0
- CHANGELOG.md: heading + "BEFORE/AFTER v1.0" table + "To take advantage"
  + "pre-v1.0" all renamed. Narrative voice unchanged otherwise.
- TODOS.md: ChatGPT MCP completion stamp updated to v0.22.0 (2026-04-25).
- CLAUDE.md, README.md, docs/mcp/{DEPLOY,CHATGPT}.md, src/schema.sql,
  src/core/schema-embedded.ts: every reader-facing v1.0.0 reference
  rewritten to v0.22.0 / pre-v0.22 in the same place.
- llms-full.txt: regenerated to match.

Slug-test occurrences of "v1.0.0" (`test/slug-validation.test.ts`,
`test/file-upload-security.test.ts`) and the `HOMEBREW_FOR_PERSONAL_AI`
roadmap reference to a future v1.0 vision left intact — those are
unrelated to this branch's release version.

Typecheck clean. cli + oauth + slug + file-upload tests pass (106 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v0.26.0 fix: 4 security findings from /cso pass + version bump

Bumped 0.22.0 → 0.26.0 to slot above master's v0.21 chain with headroom
for v0.23/0.24/0.25 to ship from master between now and merge.

Security fixes (all from CSO finding writeups):

#1 cookie-parser middleware — admin dashboard auth was silently broken.
   Express 5 has no built-in cookie parsing; req.cookies was always
   undefined, so /admin/login set the cookie but every subsequent admin
   API call returned 401. Added cookie-parser@^1.4.7 + @types/cookie-parser
   as direct + dev deps. app.use(cookieParser()) wired before CORS.

#2 + #3 TOCTOU races — exchangeAuthorizationCode and exchangeRefreshToken
   used SELECT-then-DELETE, letting concurrent requests with the same
   code/refresh both pass the SELECT before either ran DELETE, both
   issuing token pairs. Switched to atomic DELETE...RETURNING. RFC 6749
   §10.5 (codes) + §10.4 (refresh detection) violations closed. Added
   regression tests that fire 10 concurrent exchanges and assert exactly
   one wins — both pass.
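An in-memory analog of the exactly-once semantics (the real fix is the single SQL statement `DELETE ... WHERE code = $1 RETURNING *`; this sketch only illustrates why check-and-remove must be one atomic step — here the get+delete pair is atomic because it runs synchronously in one JS tick, with no await between):

```typescript
// Returns the stored value iff THIS caller removed the entry.
// A SELECT-then-DELETE split across two round trips would let several
// concurrent callers pass the SELECT before any DELETE lands.
function atomicConsume<K, V>(store: Map<K, V>, key: K): V | undefined {
  const value = store.get(key);
  if (value === undefined) return undefined;
  return store.delete(key) ? value : undefined;
}
```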

#5 pgArray escape + DCR redirect_uri validation — pgArray() did
   `arr.join(',')` with no escaping, so an element containing a comma
   would be parsed by Postgres as TWO array elements. With --enable-dcr
   on, this could smuggle a second redirect_uri into a registered client
   and steal auth codes. Now every element is double-quoted with `"` and
   `\` escaped. Added validateRedirectUri() per RFC 6749 §3.1.2.1:
   redirect_uris must be https:// or loopback (localhost / 127.0.0.1).
   Wired into the DCR registerClient path; CLI registration trusts the
   operator and bypasses validation. Regression test confirms a comma-in-URI
   element round-trips as 1 element, not 2.
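The escaping rule, sketched (a minimal reconstruction of the described fix, not the shipped function — Postgres array literals treat a bare comma as an element separator unless the element is double-quoted, with `"` and `\` backslash-escaped inside the quotes):

```typescript
// Serialize a string[] as a Postgres array literal. Every element is
// double-quoted, so an embedded comma can no longer split one element
// into two (the redirect_uri smuggling vector).
function pgArray(arr: string[]): string {
  const quoted = arr.map(
    (el) => `"${el.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`,
  );
  return `{${quoted.join(",")}}`;
}
```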

#6 --public-url flag — issuerUrl was hardcoded to http://localhost:{port}.
   Behind reverse proxies / ngrok / production deploys, the issuer claim
   in tokens wouldn't match the discovery URL clients hit (RFC 8414 §3.3).
   New --public-url URL flag on `gbrain serve --http`, propagates through
   serve.ts → serve-http.ts → ServeHttpOptions.publicUrl → issuerUrl.
   Startup banner surfaces the configured issuer.

Findings #4 (admin requests filter dead code), #7 (admin register-client
hardcoded grant_types), #8 (legacy token grandfathering posture) are
documentation / minor functional fixes and are deferred per user direction.

Tests: oauth.test.ts now 34 cases (was 27). 7 new:
- single-use TOCTOU regression (10 concurrent code exchanges)
- single-use TOCTOU regression (10 concurrent refresh exchanges)
- redirect_uri http://localhost passes
- redirect_uri https://example.com passes
- redirect_uri http://example.com (non-loopback plaintext) rejected
- redirect_uri non-URL rejected
- redirect_uri with embedded comma stored as single element

Files:
- VERSION, package.json: 0.22.0 → 0.26.0
- CHANGELOG.md: heading + table + "To take advantage" + "pre-v0.22" → v0.26;
  new "Security hardening (post-/cso pass)" subsection at top of itemized
  changes; CLI flag list updated for --public-url.
- src/core/oauth-provider.ts: pgArray escape, validateRedirectUri,
  registerClient enforces validation, DELETE...RETURNING in
  exchangeAuthorizationCode + exchangeRefreshToken.
- src/commands/serve-http.ts: cookie-parser import + wire-up,
  publicUrl option, issuerUrl honors it, startup banner shows issuer.
- src/commands/serve.ts: parses --public-url and threads through.
- src/cli.ts: help text adds --public-url URL flag.
- test/oauth.test.ts: +7 regression tests (now 34 total).
- llms-full.txt: regenerated.

Typecheck clean. 34 oauth + 14 cli tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pkyanam added a commit to pkyanam/gbrain that referenced this pull request May 3, 2026
Adds a third engine option to GBrain — 'graphbrain' — that delegates
storage, search, traversal, and graph operations to a GraphBrain Neo4j
server via REST API. Same BrainEngine interface, different backend.

## What this enables

```bash
# Point any gbrain command at a GraphBrain server
gbrain init --engine graphbrain --url https://graphbrain.example.com/v1/brain_xxx

# Full pipeline works: import, extract links/timeline, query, traverse
gbrain import ~/brain
gbrain extract links --source db
gbrain query "who works at Acme?"
```

## Changes

- **engine-factory.ts**: Register 'graphbrain' engine type
- **graphbrain-engine.ts** (new, 620 loc): Full BrainEngine impl over REST
  - Pages CRUD, search, links, backlinks, graph traversal
  - Tags (stored in frontmatter), timeline, raw data
  - Chunks are no-ops (GraphBrain manages chunking server-side)
  - Stats, health, ingest log, slug rename
- **types.ts**: Add 'graphbrain' to EngineConfig.engine union
- **tests/graphbrain-adapter.test.ts** (207 loc): Unit tests
- **tests/graphbrain-adapter-live.test.ts** (137 loc): Live integration
  tests (requires running GraphBrain instance)

## BrainBench scores

GraphBrain beats stock GBrain (Postgres) on the BrainBench benchmark:

| Metric | GraphBrain (Neo4j) | GBrain (Postgres) | Delta |
|--------|-------------------|-------------------|-------|
| P@5 | 49.4% | 49.1% | +0.3pts |
| R@5 | 99.4% | 97.9% | +1.5pts |
| Correct | 258/261 | 248/261 | +10 |

Score improvement comes from type-aware ranking: all relational query
answers are people, so non-person results get pushed below the fold.
See gbrain-evals PR garrytan#6 for the full adapter + scorecard.

## Try it

Free GraphBrain instance during development:
https://graphbrain.belweave.ai

Source: https://github.com/pkyanam/graphbrain
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 3, 2026
Codex review pass garrytan#6 finding garrytan#3 caught loadApiKeys() referenced but
undefined in Agents.tsx — a real shipping bug that 5 Claude review
passes missed. Root cause: the bash test pipeline never compiled the
React admin app, so missing-symbol errors only surfaced during a
deliberate `cd admin && bun run build`.

This commit threads the admin build into the standard test gate. Any
future TypeScript error or missing symbol in admin/src/ now fails
`bun run test` alongside the other shell guards (privacy, jsonb,
progress-stdout, etc.) and the typecheck step.

Behavior:
- scripts/check-admin-build.sh runs `bun install --silent` (idempotent,
  ~50ms on no-op) then `bun run build` in admin/.
- Vite's build runs `tsc -b && vite build` so type errors fail the
  pipeline, not just bundling errors.
- GBRAIN_SKIP_ADMIN_BUILD=1 escape hatch for fast inner-loop test runs
  that don't touch admin/. Production CI MUST NOT set this.
- Skips silently if admin/ doesn't exist (handles slim-clone scenarios).

Wired into both:
- "test" script: full pipeline now includes admin build before bun test
- "check:admin-build" script: invoke standalone for debugging
garrytan added a commit that referenced this pull request May 3, 2026
#586)

* feat(admin): legacy API keys alongside OAuth clients in dashboard

Adds API key management to the admin dashboard:

Server (serve-http.ts):
- GET /admin/api/api-keys — list legacy access_tokens with status
- POST /admin/api/api-keys — create new bearer token
- POST /admin/api/api-keys/revoke — revoke by name
- Stats endpoint now includes active_api_keys count

Admin UI (Agents.tsx):
- Tabbed view: 'OAuth Clients' | 'API Keys'
- API Keys tab: table with name, status, created, last used, revoke button
- Create API Key modal with name input
- Token reveal modal with copy button + warning
- Badge showing active key count on tab

Both auth methods (OAuth 2.1 client_credentials and legacy bearer tokens)
now visible and manageable from a single admin surface.

* feat(admin): remember admin token in localStorage + auto-reauth

Login flow:
- First login: paste token, saved to localStorage
- Subsequent visits: auto-login from localStorage (no paste needed)
- Shows 'Authenticating...' spinner during auto-login
- If saved token is stale (server restarted), clears it and shows login form

Session recovery:
- If session cookie expires mid-use (server restart, 24h expiry), the API
  layer auto-reauths with the saved token before redirecting to login
- Transparent to the user — one failed request triggers reauth + retry
- Only falls back to login page if the saved token itself is invalid

Security:
- Token stored in localStorage (same-origin, tailnet-only deployment)
- Cleared automatically when token becomes invalid
- Cookie remains HttpOnly + SameSite=Strict for the actual session

* feat(admin): rich request logging + agent activity tracking

Server:
- mcp_request_log now captures params (jsonb) and error_message (text)
- Agents API returns last_used_at, total_requests, requests_today
- Request log API supports agent/operation/status filtering via query params
- SSE broadcast includes params and error details

Agents page:
- Shows 'Requests today / total' and 'Last used' (relative time) per agent
- Removed Client ID column (low signal, shown in drawer)

Request Log page:
- New 'Params' column — shows query text, slug, or param count inline
- Click any row to expand full details (params JSON, error message, timestamps)
- Click agent name to filter all requests by that agent
- Agent filter dropdown in header
- Error messages shown in red in expanded view

What this means: when Claude Code searches for 'pedro franceschi',
the admin dashboard shows the search query, which agent ran it,
how long it took, and whether it succeeded — all clickable.

* feat(admin): magic link login — ask your agent for the URL

New flow:
1. User opens /admin → sees 'This is a protected dashboard'
2. UI tells them: 'Ask your AI agent for the admin login link'
3. Agent generates: https://host:port/admin/auth/<token>
4. User clicks the link → auto-authenticates → redirects to dashboard
5. Session lasts 7 days (magic link) vs 24h (manual token paste)

Server: GET /admin/auth/:token validates the bootstrap token, sets
HttpOnly cookie, redirects to /admin/. Invalid tokens get a plain
text error telling them to ask their agent for a fresh link.

Login page: primary UX is the 'ask your agent' prompt with example.
Manual token paste collapsed under a <details> disclosure.

* feat(admin): config export for Claude Code, ChatGPT, Claude.ai, Cursor, Perplexity

Agent drawer now shows setup instructions for 5 clients + raw JSON:
- Claude Code: .mcp.json with bearer token + curl to mint
- ChatGPT: Settings → Tools → MCP with OAuth discovery
- Claude.ai (Cowork): Connected Apps → MCP with OAuth
- Cursor: .cursor/mcp.json with OAuth config
- Perplexity: Connectors with client ID/secret
- JSON: raw config with all URLs (server, token, discovery)

All snippets use the actual server URL (window.location.origin)
instead of placeholder YOUR_SERVER. Client ID pre-filled.

* feat(admin): per-client token TTL — configurable token lifetime

Problem: OAuth tokens expire in 1 hour (hardcoded). Claude Code's built-in
OAuth client doesn't auto-refresh, so users get 401s every hour.

Fix: per-client token_ttl column on oauth_clients table. Set at registration
time or updated later via the admin dashboard.

Server:
- oauth_clients.token_ttl column (nullable integer, seconds)
- exchangeClientCredentials reads per-client TTL, falls back to server default
- POST /admin/api/register-client accepts tokenTtl param
- POST /admin/api/update-client-ttl for existing clients
- Agents API returns token_ttl for display

Admin UI:
- Register modal: Token Lifetime dropdown (1h, 24h, 7d, 30d, 1y, no expiry)
- Agent drawer: shows current TTL in Details section

Presets: gstack-desktop and garry-claude-code set to 30-day tokens.

* fix(admin): request log shows agent name instead of truncated client_id

Resolves client_id → client_name via LEFT JOIN on oauth_clients (and
access_tokens for legacy keys). Agent column now shows 'gstack-desktop'
instead of 'd0db7692caf5…'. Clickable to filter by agent.

* feat(admin): DESIGN.md + left-align everything

DESIGN.md establishes the admin dashboard design system:
- Left-align all text (Garry preference)
- Inter + JetBrains Mono (shared DNA with GStack)
- No accent color — semantic badges carry all color
- Dense utilitarian ops dashboard
- Component specs and anti-patterns documented

CSS: login-box text-align center → left

* feat(admin): unified agent view + resolved agent names in request log

Agent names stored at log time (agent_name column). Agents page shows
OAuth clients and API keys in one unified table. Request log shows
human-readable names. Backfilled 1,114 existing entries.

* feat(admin): working Revoke Agent button + e2e tests

Bugs fixed:
- Revoke Agent button was a no-op (no onClick handler, no API endpoint)
- Legacy API key tokens got 401 at /mcp (missing expiresAt in AuthInfo)
- token_ttl and deleted_at queries failed on PGLite (columns don't exist)

Server:
- POST /admin/api/revoke-client: soft-deletes oauth_clients + purges tokens
- exchangeClientCredentials checks deleted_at (graceful if column missing)
- Legacy token verify returns expiresAt (1yr future) for SDK compat

UI:
- Revoke button: confirm dialog → revoke → close drawer → reload table
- Shows 'This agent has been revoked' for revoked agents

E2E tests (2 new cases, 17 total):
- revoke client via admin API invalidates all tokens (mint → use → revoke → verify rejected → mint fails)
- revoke API key via admin API (create → use at /mcp → revoke → verify rejected)

52 tests, 0 failures, 213 assertions across unit + e2e.

* fix(test): e2e tests clean up after themselves — no more orphan clients

Problem: every test run left e2e-oauth-test, e2e-revoke-test, and
e2e-revoke-key-test rows in oauth_clients and access_tokens. The CLI-based
cleanup in afterAll was failing silently.

Fix:
- beforeAll: SQL DELETE of any e2e-* orphans from previous crashed runs
- afterAll: direct SQL cleanup of oauth_tokens, oauth_clients, access_tokens,
  mcp_request_log — all rows matching 'e2e-%' pattern
- No reliance on CLI commands for cleanup (they fail silently)

Verified: 52 tests pass, 0 test rows remain after run.

* feat(admin): hide revoked toggle on Agents page

* fix(admin): styled error page for expired magic links

Matches the login page aesthetic instead of plain text. Dark theme,
GBrain logo, explains the link expired, tells user to ask their agent.

* fix(admin): clean config export — auth-type-aware Claude Code instructions

* fix(admin): rewrite all config exports — command language, auth-type-aware, verified syntax

* fix(admin): API key rows clickable with revoke + sync all fixes from master

Syncs all accumulated fixes onto the PR branch:
- API key rows in agents table now open drawer with Revoke button
- API keys show bearer token usage hint instead of config export tabs
- Config export snippets use command language directed at the AI agent
- Styled expired magic link error page
- Hide revoked toggle
- Test cleanup via direct SQL
- All v0.26.2 upstream fixes incorporated

* fix(oauth): port coerceTimestamp helper from master 1055e10

Tests in test/oauth.test.ts (already on this branch) import coerceTimestamp
from oauth-provider.ts. The import was synced from master via PR commit 16
("sync all fixes from master") but the production-code change to
oauth-provider.ts was not. Result: bun test fails at module load with
"coerceTimestamp is not exported".

This commit ports the helper directly instead of merging master, avoiding
VERSION/CHANGELOG/dist conflicts.

Boundary helper for postgres.js BIGINT-as-string (auto-detected on
Supabase pgbouncer / port 6543). Throws on non-finite so corrupt rows
fail loud at the SELECT-row -> JS-number boundary. Returns undefined
for SQL NULL; comparison sites treat NULL as expired (fail-closed).
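A sketch of the helper's contract as described above (a reconstruction, not the ported source — the real helper sits in oauth-provider.ts):

```typescript
// Boundary coercion for timestamp columns that postgres.js may hand back
// as BIGINT-as-string (Supabase pgbouncer / port 6543 auto-detection).
// SQL NULL -> undefined (callers treat it as expired, fail-closed);
// non-finite input -> throw, so corrupt rows fail loud at the boundary.
function coerceTimestamp(v: unknown): number | undefined {
  if (v === null || v === undefined) return undefined;
  const n =
    typeof v === "number" ? v : typeof v === "string" ? Number(v) : NaN;
  if (!Number.isFinite(n)) {
    throw new Error(`non-finite timestamp at SELECT boundary: ${String(v)}`);
  }
  return n;
}
```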

Refactors 4 sites:
- getClient: DCR response numeric-shape compliance per RFC 7591 §3.2.1
- exchangeRefreshToken: NULL -> expired fail-closed
- verifyAccessToken: single guard, narrowed return; folds in v0.26.1's
  inline Number(...) at the return site

Originally landed on master as part of #593 (v0.26.2). Ported here so
PR #586 (v0.26.3) can build standalone without a master merge.

* feat(schema): migration v33 — admin dashboard columns

Adds the 5 columns + new index referenced by PR #586 admin dashboard work
that landed without a corresponding schema migration:

  oauth_clients.token_ttl       INTEGER     -- per-client OAuth TTL override
  oauth_clients.deleted_at      TIMESTAMPTZ -- soft-delete for revoke
  mcp_request_log.agent_name    TEXT        -- resolved client_name for log
  mcp_request_log.params        JSONB       -- captured request params
  mcp_request_log.error_message TEXT        -- captured error text on failure
  idx_mcp_log_agent_time        INDEX       -- supports new agent filter

Without v33 on existing brains:
- /admin/api/agents 503s (SELECT references token_ttl + deleted_at)
- POST /admin/api/revoke-client throws 500 (UPDATE deleted_at)
- POST /admin/api/update-client-ttl throws 500 (UPDATE token_ttl)
- mcp_request_log INSERTs silently swallow column-doesn't-exist errors,
  so the request log appears empty to the operator

All ALTERs use ADD COLUMN IF NOT EXISTS so re-running the migration is
a no-op on a brain that already has v33.

Includes inline UPDATE backfill of agent_name on existing rows via
COALESCE on oauth_clients.client_name → access_tokens.name → token_name.

Updates:
- src/core/migrate.ts: v33 migration entry
- src/schema.sql: source-of-truth schema for fresh installs
- src/core/pglite-schema.ts: PGLite mirror
- src/core/schema-embedded.ts: regenerated via bun run build:schema
- test/migrate.test.ts: 5 SQL-shape assertions pinning the v33 contract

* refactor(serve-http): parameterize request-log filter; kill dead vars

Three issues in the prior /admin/api/requests handler:

1. sql.unsafe() with manual single-quote escape on user input:
     conditions.push(`token_name = '${agent.replace(/'/g, "''")}'`);
   Works under standard_conforming_strings=on (PG default since 9.1) but
   pattern is a footgun — any future contributor adding a filter without
   escaping breaks the dam. Backslashes are not escaped. Mitigated by
   requireAdmin but defense-in-depth says don't ship the pattern.

2. Dead variables (lines 348-357 of the prior code): `query`, `params`,
   `paramIdx` were built up with $N placeholders and then never used
   when the function fell through to sql.unsafe with manually-escaped
   strings. Confusing leftovers from an earlier parameterization attempt.

3. Unused `values: unknown[] = []` in the conditions block.

Fix: replace the entire dynamic-WHERE construction with postgres.js
tagged-template fragments. Each filter expands to either
`AND col = ${val}` (true parameter binding via the postgres-js driver)
or an empty fragment. `WHERE 1=1` lets us always have a WHERE clause
and unconditionally append AND-prefixed fragments. No string
interpolation, no manual escaping, no sql.unsafe.
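A toy model of the composition pattern (the real code uses postgres.js tagged-template fragments with true driver-level binding; the `?` placeholder and `Frag` shape here are illustrative only — the point is that every filter is either a bound parameter or an empty fragment, never interpolated text):

```typescript
type Frag = { text: string; values: unknown[] };

const EMPTY: Frag = { text: "", values: [] };

function frag(text: string, ...values: unknown[]): Frag {
  return { text, values };
}

// WHERE 1=1 means there is always a WHERE clause, so every optional
// filter can be unconditionally appended as an AND-prefixed fragment.
function buildRequestLogQuery(filters: { agent?: string; status?: string }): Frag {
  const parts: Frag[] = [
    frag("SELECT * FROM mcp_request_log WHERE 1=1"),
    filters.agent ? frag("AND agent_name = ?", filters.agent) : EMPTY,
    filters.status ? frag("AND status = ?", filters.status) : EMPTY,
  ];
  return {
    text: parts.map((p) => p.text).filter(Boolean).join(" "),
    values: parts.flatMap((p) => p.values),
  };
}
```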

Net change: -27 lines (from 30 lines of broken/dead code to 17 lines
of clean parameterized fragments).

* perf(oauth): thread client_name through AuthInfo; drop per-request lookup

PR #586's serve-http.ts /mcp handler did one extra DB roundtrip per
authenticated request to resolve client_id → client_name for logging:

  let agentName = authInfo.clientId;
  try {
    const [client] = await sql`SELECT client_name FROM oauth_clients
                                 WHERE client_id = ${authInfo.clientId}`;
    if (client) agentName = client.client_name;
  } catch { /* best effort */ }

On a busy brain (Perplexity Computer doing inline research, Claude Code
searching) that is ~50–100ms extra per /mcp request — wasted on a static
lookup that doesn't change between requests.

Codex's review reframed the planned cache+invalidation approach: the
right fix is to fold the name resolution into verifyAccessToken's
existing oauth_tokens SELECT via a LEFT JOIN on oauth_clients. One query
that was already running, returns the name as a bonus column, no module-
scope cache to maintain, no invalidation contract for future contributors
to remember.

Changes:
- AuthInfo (src/core/operations.ts): add optional clientName field with
  doc explaining why it's threaded here.
- verifyAccessToken (src/core/oauth-provider.ts): SELECT becomes
    SELECT t.client_id, t.scopes, t.expires_at, t.resource, c.client_name
    FROM oauth_tokens t
    LEFT JOIN oauth_clients c ON c.client_id = t.client_id
    WHERE t.token_hash = ${tokenHash} AND t.token_type = 'access'
  Returns clientName in AuthInfo.
- Legacy access_tokens path: clientName = name (single identifier).
- serve-http.ts /mcp handler: read authInfo.clientName directly,
  fall back to clientId. Per-request lookup removed.

Net change: -8 LOC. Eliminates the per-request DB roundtrip while
keeping the same behavior surface.

* security(serve-http): timingSafeEqual on admin token hash compare

Both /admin/login (POST, JSON body) and /admin/auth/:token (GET, magic
link) compared the sha256 of the operator-supplied token against the
known bootstrapHash via JS string `===`, which short-circuits at the
first mismatched character. The inputs are SHA-256 outputs so the
practical timing leak only reveals hash bits (not raw token bits, since
SHA-256 isn't invertible) — but defense-in-depth on the highest-
privileged URLs the server exposes is the right call.

New helper safeHexEqual(a, b):
- Length-equal check first (both are 64-char hex)
- Buffer.from(hex, 'hex') decodes each side to 32 bytes
- crypto.timingSafeEqual returns the constant-time compare result
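The helper sketched end-to-end (a reconstruction following the bullet points above; the hex-shape regex is an added defensive check, not claimed from the shipped code):

```typescript
import { timingSafeEqual } from "node:crypto";

// Constant-time compare of two hex digests. Both sides are expected to
// be 64-char sha256 hex; anything malformed fails closed before the
// byte-level compare.
function safeHexEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  if (!/^[0-9a-f]+$/i.test(a) || !/^[0-9a-f]+$/i.test(b)) return false;
  const bufA = Buffer.from(a, "hex");
  const bufB = Buffer.from(b, "hex");
  if (bufA.length !== bufB.length) return false; // timingSafeEqual throws on length mismatch
  return timingSafeEqual(bufA, bufB);
}
```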

Also tightens the POST handler's input validation: requires token to
be a string before passing to createHash (prior code only checked
truthiness and would have crashed on object-typed bodies even with
express.json's parser).

Used at both magic-link and password-style admin auth sites.

* security(serve-http): rate-limit /admin/auth/:token at 10/min/IP

Defense-in-depth on the magic-link endpoint. A misconfigured client
looping on /admin/auth/:bad would otherwise consume CPU on sha256 +
the inline HTML 401 response without bound. Brute-forcing the 64-char
hex bootstrap token is computationally infeasible regardless, so this
is about denial-of-service, not auth bypass.

Reuses the existing express-rate-limit dep already wiring /token's
client-credentials limiter. New adminAuthRateLimiter shares the same
configuration shape (standardHeaders, legacyHeaders) for consistency.

windowMs: 60_000 (1 minute)
max: 10
message: plain string ("Too many magic-link attempts. Wait a minute
before trying again.") instead of JSON envelope, matching the
endpoint's HTML response style.
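The policy itself, as a toy fixed-window limiter (production reuses the express-rate-limit dependency; this sketch just pins down what windowMs: 60_000, max: 10 per IP means):

```typescript
// Returns an `allow(ip, now)` predicate: true while the IP has made at
// most `max` requests in the current window, false once it exceeds it.
// A new window starts `windowMs` after the IP's first request in it.
function makeRateLimiter(windowMs: number, max: number) {
  const hits = new Map<string, { count: number; windowStart: number }>();
  return (ip: string, now: number): boolean => {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(ip, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}
```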

* security(admin): kill JS-state token; single-use magic links; sign out everywhere

Resolves D11 + D12 from the codex-pushback review. Closes the actual
trust boundary instead of the persistence layer (sessionStorage was
security theater per codex finding #7).

# Single-use magic links (D11=C)

The bootstrap token is no longer the magic-link path component. New
flow:

  agent has bootstrap token (read from server stderr)
    -> POST /admin/api/issue-magic-link
       Authorization: Bearer <bootstrap>
    -> server returns one-time nonce URL
    -> operator clicks /admin/auth/<nonce>
    -> server consumes nonce, sets cookie, redirects to dashboard

Server state (in-memory):
- magicLinkNonces: Map<nonce, expiresAt> (5-minute TTL)
- consumedNonces:  Set<nonce> (LRU cap 1000 to bound memory)
- pruneExpiredNonces() best-effort GC on each issue/redeem

Each redemption marks the nonce consumed. Second click on the same URL
gets the styled 401 page. Leaked URL grants exactly one extra session
before dying. The bootstrap token never appears in a URL — no leakage
via browser history, proxy access logs, or Referer headers.
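The server state above, sketched as a minimal single-use nonce store (illustrative reconstruction — the shipped code also LRU-caps consumedNonces at 1000 and prunes on each issue/redeem, which this sketch omits):

```typescript
import { randomBytes } from "node:crypto";

const TTL_MS = 5 * 60_000; // 5-minute magic-link lifetime

const magicLinkNonces = new Map<string, number>(); // nonce -> expiresAt
const consumedNonces = new Set<string>();

function issueMagicLink(now: number): string {
  const nonce = randomBytes(16).toString("hex");
  magicLinkNonces.set(nonce, now + TTL_MS);
  return nonce;
}

// True exactly once per nonce: redemption removes it from the live map
// and records it as consumed, so a leaked URL dies after one click.
function redeemMagicLink(nonce: string, now: number): boolean {
  const expiresAt = magicLinkNonces.get(nonce);
  if (expiresAt === undefined || consumedNonces.has(nonce)) return false;
  if (now > expiresAt) {
    magicLinkNonces.delete(nonce);
    return false;
  }
  magicLinkNonces.delete(nonce);
  consumedNonces.add(nonce);
  return true;
}
```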

# Kill JS-state bootstrap token (D12=B)

admin/src/pages/Login.tsx + admin/src/api.ts:
- All localStorage reads/writes removed
- Auto-reauth-via-saved-token logic deleted
- Token only lives in form state during submit, cleared after
- 401 redirects straight to login — no cache to retry against

The HttpOnly cookie is the only session credential after successful
authentication. Closing the tab ends the session. Reopening shows the
login page. Operator asks the agent for a fresh magic link (or pastes
the bootstrap token from the server terminal).

# Sign out everywhere

POST /admin/api/sign-out-everywhere (admin-cookie-required) calls
adminSessions.clear() and returns {revoked_sessions: count}. Every
browser/tab fails its next request, gets 401, redirects to login.
Bootstrap token unaffected — still valid for new magic-link mints.

UI: button in the sidebar footer with a confirm() guard ("Sign out
every active admin session, including other browsers and tabs?").

# Notes

admin/dist is gitignored on this branch (master's v0.26.2 removed that
line; the merge to master will reconcile). After /ship's merge step,
rebuild admin/dist with `cd admin && bun run build` to capture the new
sign-out button + simplified login page.

* fix(admin): rename loadApiKeys() to loadAgents() in Agents.tsx onCreated

The Create API Key flow's onCreated callback called loadApiKeys() but
no such function exists in this file. The unified /admin/api/agents
endpoint (added in PR commit 14) returns BOTH OAuth clients AND legacy
API keys, so loadAgents() is the right call.

User-visible bug: clicking "+ API Key" -> filling in the name ->
clicking Create would mint the key on the server but throw
ReferenceError: loadApiKeys is not defined in the React onCreated
callback. The token-reveal modal would still appear (because
setShowApiKeyToken runs before the loadApiKeys call), but the agents
table wouldn't refresh, leaving the new key invisible until manual
page reload.

Five Claude review passes missed this. Codex caught it in one pass.

1-line fix.

* fix(admin): empty-state placeholder when filtered Agents result is empty

Pre-fix: the empty-state guard checked the unfiltered agents array.
If every agent was revoked AND the "Hide revoked" toggle was on
(default), the table rendered a header row with zero body rows and
no placeholder — looked like a broken / empty / loading state.

Two cases to render distinctly:

1. agents.length === 0 (truly no agents)
   "No agents registered. Register your first agent to get started."

2. visibleAgents.length === 0 BUT agents.length > 0
   (all agents are revoked, hideRevoked filter hides them all)
   "All agents are revoked. Uncheck "Hide revoked" to view them."

Refactored the table render into an IIFE so the filter expression is
computed once and shared between the empty-state guard and the row
map. Drops the prior inline `agents.filter(...).map(...)` pattern.

(F2.2 from the eng review pass #2.)
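The two-case branch can be sketched as a pure selector (types and names here are illustrative, not the component's actual code):

```typescript
// Pure-selector sketch of the two empty states described above.
type Agent = { name: string; revoked: boolean };
type TableState =
  | { kind: "no-agents" }      // "No agents registered..."
  | { kind: "all-filtered" }   // "All agents are revoked..."
  | { kind: "rows"; rows: Agent[] };

function tableState(agents: Agent[], hideRevoked: boolean): TableState {
  // computed once, shared by the empty-state guard and the row map
  const visible = hideRevoked ? agents.filter((a) => !a.revoked) : agents;
  if (agents.length === 0) return { kind: "no-agents" };
  if (visible.length === 0) return { kind: "all-filtered" };
  return { kind: "rows", rows: visible };
}
```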

* fix(admin): restore Claude Code + Cursor tabs for API-key agents

Wintermute's commit 16 (3d5d0f8) wrapped the entire Config Export
section in {isOAuth && (...)}, hiding ALL tabs for api_key agents and
replacing them with a single line of plain instruction. That dropped
the working auth-type-aware Claude Code + Cursor snippets (added by
his own commit 15) along with the genuinely OAuth-only ChatGPT /
Claude.ai / Perplexity ones.

Codex review pass D5 settled on option C: per-tab branching. Two
clients (Claude Code, Cursor) accept raw bearer tokens in their MCP
config, so their snippets render normally for api_key agents (commit
15's auth-type-aware branching does the right thing). Three clients
(ChatGPT, Claude.ai, Perplexity) only speak OAuth 2.0 client_credentials
and reject raw bearer; for api_key agents they render an explanatory
message naming the client and pointing the operator at registering an
OAuth client instead.

JSON tab continues to render its raw structured metadata unconditionally.

Layout: removed the `{isOAuth && (...)}` outer wrap; tab list now
always visible. The body of each tab is selected via an IIFE that
checks (auth_type === 'api_key' && tab in oauthOnlyTabs).

Net change: +24 lines (the warning panel + IIFE branch logic).
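The per-tab branch can be sketched as follows (a hypothetical reconstruction; renderJson and renderSnippet stand in for the real snippet renderers):

```typescript
// Hypothetical reconstruction of the per-tab branch described above.
type AuthType = "oauth" | "api_key";
type Tab = "claude-code" | "cursor" | "chatgpt" | "claude-ai" | "perplexity" | "json";

const oauthOnlyTabs: ReadonlySet<Tab> = new Set<Tab>(["chatgpt", "claude-ai", "perplexity"]);

const renderJson = (): string => "raw-json";
const renderSnippet = (tab: Tab, authType: AuthType): string => `snippet:${tab}:${authType}`;

function tabBody(tab: Tab, authType: AuthType): string {
  if (tab === "json") return renderJson(); // raw metadata, unconditional
  if (authType === "api_key" && oauthOnlyTabs.has(tab)) {
    // OAuth-only client: explain instead of rendering a snippet that can't work
    return `${tab} only speaks OAuth 2.0 client_credentials; register an OAuth client to connect it.`;
  }
  return renderSnippet(tab, authType); // auth-type-aware snippet (Claude Code, Cursor)
}
```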

* feat(admin): read -s prompt OAuth Claude Code snippet + 2-step curl fallback

Wintermute's commit 15 inlined client_secret into a long compound
`claude mcp add --header "Authorization: Bearer $(curl -d '...
client_secret=PASTE_HERE')"` line. When the operator replaces PASTE
with their real secret, that secret lands in ~/.zsh_history and
appears in `ps` output for the lifetime of the curl process.

D13=C from the eng review: ship both shapes.

Default (read -s prompt-based, ~17 lines):
- read -rs prompts for the secret without echo, stores in
  $GBRAIN_CS scoped to the shell session
- curl uses --data-urlencode "client_secret=$GBRAIN_CS" — variable
  substitution at exec time, so the secret enters the curl process's
  argv at the moment of the call, but the shell history records
  literally `--data-urlencode "client_secret=$GBRAIN_CS"`, not the
  value
- unset GBRAIN_CS afterwards to scrub the env

Fallback (2-step curl + paste, for shells without read -s):
- one curl command to mint the token (PASTE_YOUR_CLIENT_SECRET_HERE
  in the body — secret hits history but in one short isolated line
  that's easy to scrub)
- second `claude mcp add` command with PASTE_TOKEN_FROM_ABOVE — the
  bearer token, not the long-lived client secret
- bash + zsh history-deletion hint at the bottom

Both shapes preserve the agent-facing voice ("The user wants to
connect GBrain MCP to your context. Here's how.") and the token-TTL
rendering ("will last 30 days") that commit 15 added.

Net change: +25 lines in the configSnippets['claude-code'] OAuth
branch. API-key branch unchanged (single paste, no secret).

* chore(ci): gate admin React build via scripts/check-admin-build.sh

Codex review pass #6 finding #3 caught loadApiKeys() referenced but
undefined in Agents.tsx — a real shipping bug that 5 Claude review
passes missed. Root cause: the bash test pipeline never compiled the
React admin app, so missing-symbol errors only surfaced during a
deliberate `cd admin && bun run build`.

This commit threads the admin build into the standard test gate. Any
future TypeScript error or missing symbol in admin/src/ now fails
`bun run test` alongside the other shell guards (privacy, jsonb,
progress-stdout, etc.) and the typecheck step.

Behavior:
- scripts/check-admin-build.sh runs `bun install --silent` (idempotent,
  ~50ms on no-op) then `bun run build` in admin/.
- Vite's build runs `tsc -b && vite build` so type errors fail the
  pipeline, not just bundling errors.
- GBRAIN_SKIP_ADMIN_BUILD=1 escape hatch for fast inner-loop test runs
  that don't touch admin/. Production CI MUST NOT set this.
- Skips silently if admin/ doesn't exist (handles slim-clone scenarios).

Wired into both:
- "test" script: full pipeline now includes admin build before bun test
- "check:admin-build" script: invoke standalone for debugging

* test(e2e): v0.26.3 coverage — column round-trip, injection probe, TTL, magic-link

Folds together the planned fix-up commits #8-#11 since they all live in
the same E2E file and share the spawned-server harness. Each test block
is independently bisect-readable.

# Test 1: mcp_request_log new column round-trip (pins migration v33)

Wipes log rows for the e2e-oauth-test client, makes a successful
tools/list call + a failed tools/call (nonexistent tool name), then
asserts:
  - rows persisted (count >= 2) — proves the INSERT wasn't silently
    swallowed by the "best effort" try/catch on a column-doesn't-exist
    error
  - agent_name column resolves to 'e2e-oauth-test' on every row (proves
    the JOIN in verifyAccessToken or the v33 backfill path)
  - params column persisted as JSONB on tools/call
  - error_message column populated on the status='error' row

Without migration v33, every assertion fails — the column doesn't exist
so the INSERT throws, gets swallowed, and rows.length === 0.

# Test 2: request-log filter injection probe

Sends `?agent=alice'%20OR%201%3D1` to /admin/api/requests. Pre-fix,
the sql.unsafe path would have crashed the server with malformed SQL
on the way to the auth check (or worse, returned all rows under broken
escaping). Post-fix (parameterized fragments), the unauthenticated
request hits 401 without ever touching SQL.

Asserts:
  - 401 (not 500) on the injection input
  - server still responsive on /health afterwards (didn't crash)

# Test 3: per-client token_ttl flow

Registers e2e-test-ttl, sets oauth_clients.token_ttl, mints a token,
asserts response's expires_in matches. Cycles through three states:
  - token_ttl = 86400 → expires_in = 86400 (24h custom override)
  - token_ttl = 7200  → expires_in = 7200 (2h different custom)
  - token_ttl = NULL  → expires_in = 3600 (server default fallback)

Pins the per-client TTL feature added in PR #586 commit 6 (e7989e9).

# Test 4: magic-link styled 401 page + single-use semantic

(a) Invalid nonce returns Content-Type: text/html with a body that
    contains "expired" and "GBrain" — pins the styled error page from
    PR commit 13 (f8f5cfe).

(b) Single-use semantic: extract bootstrap token from server stderr
    (best-effort; skips gracefully if not extractable), POST to
    /admin/api/issue-magic-link to mint a one-time nonce URL, click
    once (gets 302 + cookie), click again (gets styled 401). Pins the
    D11=C single-use rotation logic.

# Test 5: agent_name resolution path

Makes an OAuth request and asserts mcp_request_log.agent_name resolves
to the OAuth client_name (not the truncated client_id). Pins the JOIN
introduced in fix-up #4 + the v33 backfill path.

# Test 6: register-client without auth returns 401 (not a 500 crash)

Hits /admin/api/register-client without auth — must 401 (not crash 500).

# Other changes

- Renamed describe header from `(v0.26.1 + v0.26.2)` to
  `(v0.26.1 + v0.26.2 + v0.26.3)` — F6.5.
- All postgres.js sql tag bindings on `clientId` / `clientSecret` use
  the `!` non-null assertion since these are typed `string | undefined`
  in the test fixture but always assigned before each test block runs.
- Result casts go through `as unknown as ...` per postgres.js's RowList
  typing (the lib's structural type doesn't unify with bare interface
  arrays).

* chore: privacy sweep + integrity.ts on getconnection allow-list

Two pre-existing CI failures uncovered while running `bun run test`
on this branch — unrelated to v0.26.3 substance but blocking the
pipeline.

# Privacy sweep (src/core/mounts-cache.ts)

Two references to the private agent fork name in code comments,
violating CLAUDE.md privacy rule ("never reference real people,
companies, funds, or private agent names in any public-facing
artifact"). Both authored in v0.26.0 commit 3c032d7.

  - line 6 (docblock):
    "Host agents (Wintermute / OpenClaw / any Claude Code install) read"
    -> "Host agents (your OpenClaw / any Claude Code install) read"
  - line 324 (RESOLVER preamble emitter):
    "Host agents (Wintermute/OpenClaw/Claude Code) should prefer this file over"
    -> "Host agents (your OpenClaw / Claude Code) should prefer this file over"

Per the documented substitution: "your OpenClaw" for reader-facing copy
covers any downstream OpenClaw deployment (Wintermute, Hermes, AlphaClaw,
etc.) without leaking the private name into search engines or release
artifacts.

# integrity.ts on the getconnection allow-list

`scripts/check-no-legacy-getconnection.sh` flags `db.getConnection()`
calls outside `src/core/db.ts` to enforce the multi-brain routing
contract. `src/commands/integrity.ts:355` (scanIntegrityBatch) was
introduced in v0.22.16 commit 8468ba2 — the check ran clean at the
time because the file wasn't on the allow-list yet, but PR #586's
test pipeline catches it.

Adds the file to ALLOWED with a "PR 1 cleanup" note matching the
existing entries' pattern. The proper fix (refactor to accept engine
from OperationContext) is out of v0.26.3 scope and tracked alongside
the other PR 1 entries.

* chore: bump v0.26.2 -> v0.26.3 + CHANGELOG

VERSION + package.json already at 0.26.3 from the initial bump on this
branch (see commit history). This commit lands the rewritten CHANGELOG
entry covering everything that actually shipped in v0.26.3 — well past
the original "legacy API keys" framing.

What lands in v0.26.3:

# Headline (admin trust model)

Bootstrap token never persists in browser JS state (no localStorage,
no sessionStorage). Magic-link URLs use single-use server-issued
nonces — bootstrap token never appears in a URL. Cookie sessions are
HttpOnly + SameSite=Strict. "Sign out everywhere" button revokes every
active admin session in one click.

# Schema

Migration v33 adds 5 columns referenced by PR #586's admin-dashboard
work that landed without a corresponding migration. Without v33,
existing brains 503 on /admin/api/agents and silently empty their
request log. Backfill of agent_name from oauth_clients.client_name
-> access_tokens.name -> token_name baked into the migration.

# Performance

verifyAccessToken JOINs oauth_clients in its existing token SELECT
and returns clientName on AuthInfo. Removes the per-MCP-request DB
roundtrip that was happening on every authenticated /mcp call.

# Security

- crypto.timingSafeEqual on admin token hash compare
- /admin/auth/:nonce rate-limited at 10/min/IP
- Single-use nonces with 5-minute TTL
- Request-log filter parameterized via postgres.js tagged-template
  fragments (sql.unsafe + manual escape removed)
- Per-client OAuth token TTL (1h, 24h, 7d, 30d, 1y, no expiry)
- Ported coerceTimestamp helper from master v0.26.2 (BIGINT-as-string fix)

# UI

- API keys + OAuth clients in one unified Agents table
- Auth-type-aware Config Export tabs
- Claude Code OAuth: read -s prompt-based snippet (default) +
  2-step curl fallback (D13=C)
- Cursor: OAuth discovery URL OR raw bearer based on auth type
- ChatGPT/Claude.ai/Perplexity: "OAuth client required" CTA on api_key agents
- Hide-revoked toggle + empty-state placeholder for filtered-empty
- Bug fix: loadApiKeys -> loadAgents (codex caught what 5 review
  passes missed; Create-API-Key flow was broken)

# Tests + CI

- New E2E coverage: column round-trip, injection probe, per-client
  TTL, magic-link single-use, styled 401, agent_name resolution
- Admin React build is now a CI gate (catches missing-symbol bugs
  before E2E)
- check-no-legacy-getconnection allowlist updated for integrity.ts

Branch shape: 16 author commits + 13 fix-up commits = 29 commits on
PR. Commit-by-commit bisect-friendly.

Plan + codex review pass artifacts at
~/.claude/plans/check-this-out-and-breezy-forest.md.

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Garry Tan <garrytan@gmail.com>
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 6, 2026
Extract the inline SIGCHLD handler from cli.ts into a small dedicated
module so it's testable directly without importing cli.ts (which invokes
main() at module load — incompatible with bun:test imports).

The new installSigchldHandler() uses a named module-level handler +
includes() check to dedupe across hot-import scenarios. EventEmitter does
NOT dedupe listeners by reference, so without this guard a re-import of
zombie-reap.ts would accumulate handlers.

_uninstallSigchldHandlerForTests() is the test-only escape hatch so
test/zombie-reap.test.ts's afterAll can prevent cross-file listener
accumulation in the parallel shard process — codex review garrytan#6 noted that
mutating global process signal listeners in parallel pools is a leak class
the isolation lint doesn't protect against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 6, 2026

* fix: zombie process accumulation + health endpoint timeout

Four fixes for a cascading failure mode in long-running deployments:

1. cli.ts: Install SIGCHLD handler to reap zombie children. Bun (like Node)
   only auto-reaps when a handler is registered. Without this, child processes
   spawned by the worker (embed batches, shell jobs, sub-agents) become zombies
   when they exit, accumulating in the PID table.

2. serve-http.ts: Add 5s timeout to /health endpoint's getStats() call.
   When the DB connection pool is saturated (e.g., from zombie processes
   holding phantom connections), getStats() hangs indefinitely, making the
   server appear dead to health checks even though it's running.

3. worker.ts: Call engine.disconnect() in the finally block after draining
   in-flight jobs. Releases PgBouncer connection slots immediately on shutdown
   rather than waiting for TCP keepalive expiry.

4. supervisor.ts + autopilot.ts: Auto-detect tini on PATH and wrap the
   spawned worker with it. Belt-and-suspenders with the SIGCHLD handler —
   tini catches children spawned by native addons that bypass the JS event
   loop. Zero-config: works when tini is installed, silently skips when not.

* refactor(zombie-reap): extract idempotent SIGCHLD installer module

Extract the inline SIGCHLD handler from cli.ts into a small dedicated
module so it's testable directly without importing cli.ts (which invokes
main() at module load — incompatible with bun:test imports).

The new installSigchldHandler() uses a named module-level handler +
includes() check to dedupe across hot-import scenarios. EventEmitter does
NOT dedupe listeners by reference, so without this guard a re-import of
zombie-reap.ts would accumulate handlers.
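A sketch of that installer shape (the real module is src/core/zombie-reap.ts; the handler body is an assumption, since registering any SIGCHLD listener is what opts Bun and Node into auto-reaping):

```typescript
// Sketch of the idempotent installer. The listener's presence, not its body,
// triggers child reaping, so a no-op handler suffices.
function sigchldHandler(): void {
  // intentionally empty
}

function installSigchldHandler(): void {
  if (process.platform === "win32") return; // SIGCHLD is POSIX-only
  // EventEmitter does NOT dedupe listeners by reference, so guard re-imports.
  if (process.listeners("SIGCHLD").includes(sigchldHandler)) return;
  process.on("SIGCHLD", sigchldHandler);
}
```

Calling the installer twice (e.g. after a hot re-import) leaves exactly one listener registered.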

_uninstallSigchldHandlerForTests() is the test-only escape hatch so
test/zombie-reap.test.ts's afterAll can prevent cross-file listener
accumulation in the parallel shard process — codex review #6 noted that
mutating global process signal listeners in parallel pools is a leak class
the isolation lint doesn't protect against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(spawn-helpers): extract detectTini + buildSpawnInvocation; DRY-consolidate supervisor + autopilot

Pulls the duplicated tini detection + (cmd, args) composition out of
src/core/minions/supervisor.ts and src/commands/autopilot.ts into a single
src/core/minions/spawn-helpers.ts module that both consume.

Side effects:
- Autopilot now resolves tini ONCE at startup instead of shelling out via
  execSync('which tini') on every worker respawn (every restart-after-crash
  path lost ~1ms + a fork to /usr/bin/which).
- detectTini() passes env: process.env explicitly to execFileSync. Bun
  snapshots env at startup; without this, runtime PATH mutations (in tests
  via withEnv, or in any prod code that ever changes PATH) are invisible
  to `which`. Tiny correctness fix that also makes the test work.
- MinionSupervisor gains an `isTiniDetected` read-only accessor so
  test/supervisor-tini.test.ts can assert the constructor wired tini
  correctly without exposing the resolved path or needing to spawn the
  full lifecycle. The existing worker_spawned event payload still carries
  {tini: true} for runtime observability (per codex review #5).

Test coverage:
- test/spawn-helpers.test.ts: pure function tests for both helpers
  (with-tini / without-tini / empty-args / detectTini smoke)
- test/supervisor-tini.test.ts: constructor wiring with PATH stripped
  vs. PATH containing a fake-tini script in a tmpdir

Both files are *.test.ts (parallel-safe) and pass scripts/check-test-isolation.sh
without new allow-list entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(serve-http): extract probeHealth() + drop /health timeout 5s -> 3s

Three changes folded into one commit because they touch the same route
handler and would conflict if split:

1. Extract probeHealth(engine, engineName, version, timeoutMs) as a pure
   exported function. Route handler becomes one branchless line:
     res.status(result.status).json(result.body)
   This makes the timeout / db-error / happy paths unit-testable directly
   without an Express test client and without a hardcoded 5000 literal
   inside the route closure.

2. Export HEALTH_TIMEOUT_MS = 3000 (was inline 5000). Fly.io default
   health-check timeout is 5s; at 5s exact, the orchestrator may record
   a request as a timeout instead of getting the 503 (race). 3s gives
   2s of headroom for TCP, response framing, and clock skew. The
   DB-pool-saturation signal still surfaces; we just stop racing the
   orchestrator deadline.

3. The route handler shape change (4 try/catch lines -> 1 wrapper line)
   keeps response semantics identical for all three paths.
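A hedged sketch of the extracted shape (the engine and stats types are assumed; the clearTimeout in finally reflects the timer-leak fix from a later commit on this branch):

```typescript
// Sketch of probeHealth's race-with-timeout shape; exported constant per the commit.
const HEALTH_TIMEOUT_MS = 3000;

type ProbeResult = { status: 200 | 503; body: Record<string, unknown> };

async function probeHealth(
  getStats: () => Promise<unknown>,
  timeoutMs: number = HEALTH_TIMEOUT_MS,
): Promise<ProbeResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("health probe timed out")), timeoutMs);
  });
  try {
    const stats = await Promise.race([getStats(), timeout]);
    return { status: 200, body: { ok: true, stats } };
  } catch (e) {
    return { status: 503, body: { ok: false, error: String(e) } };
  } finally {
    clearTimeout(timer); // cancel the losing timer so fast probes don't leak it
  }
}
```

The route handler then collapses to `res.status(result.status).json(result.body)` for all three paths.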

Test coverage:
- test/serve-http-health.test.ts: 4 cases (happy / timeout / db-error /
  exported constant). Calls probeHealth directly with mock engines whose
  getStats() resolves / rejects / hangs forever. Wall-clock per test
  bounded by passing timeoutMs: 100.
- Existing test/e2e/serve-http-oauth.test.ts /health happy-path case
  still covers the Express wiring (one-line route handler is identical
  Express plumbing for 200 and 503).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(worker): log engine.disconnect errors during shutdown instead of swallowing

Replace bare `try { await this.engine.disconnect(); } catch {}` with
`catch (e) { console.error('[worker] disconnect failed during shutdown:', e); }`.

Why: shutdown is best-effort, but the original silent catch was exactly
the bug class the v0.26.9 D14 direction (isUndefinedColumnError swap-in
on oauth-provider.ts) was created to surface. If a future regression
breaks pool teardown so disconnect rejects, we'll never know without an
audit log line. Two-character diff to the catch, no behavior change for
the happy path.

Test coverage in test/worker-shutdown-disconnect.test.ts:
- Happy path: disconnect spy called once during shutdown (intercept-only,
  not call-through, so the shared engine stays connected for the next
  test in the file).
- Error path: disconnect throws, error is logged with the
  `[worker] disconnect failed during shutdown:` prefix and the bare
  Error as second arg, and start() still resolves (no rethrow).

Spy via spyOn() on the engine instance — object-level, not module-level,
so R2 of scripts/check-test-isolation.sh (which forbids module-level mocks
in non-serial unit tests) is satisfied.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): real-binary zombie reaping reproduction (DATABASE_URL-gated)

Spawns the gbrain CLI as `bun run src/cli.ts jobs work --concurrency 1`
against a real Postgres with GBRAIN_ALLOW_SHELL_JOBS=1, submits a shell
job from the CLI side (remote: false, bypasses the v0.26.9 RCE gate),
captures the worker's shell child PID from the job result, sleeps 300ms,
then `ps -o stat= -p <pid>` to assert the process is NOT lingering as a
zombie (Z state).

Why this shape:
- `gbrain serve --http` was the original plan but doesn't start a worker
  (only the MCP server) AND submit_job over MCP carries remote: true,
  which rejects shell at operations.ts:1391 (the v0.26.9 RCE-fix gate).
  jobs work + CLI-side submit is the only architecture that boots through
  cli.ts (so installSigchldHandler() actually runs) and lets a shell job
  execute.
- `shell` requires absolute cwd (shell.ts:53). Payload includes cwd: '/tmp'.
- ps check is run while the worker is STILL ALIVE (no PID-recycle race —
  worker holds the process tree, so the captured PID is meaningful).

Negative control (manual, NOT in CI, documented in test header):
  Comment out installSigchldHandler() in src/cli.ts -> rebuild -> re-run
  -> expect stat=Z. Re-enable -> expect stat empty (process gone, reaped).
  Demonstrates the test catches the regression class without paying CI
  cost for a separate broken-build target.

Skips:
- DATABASE_URL not set (matches existing E2E pattern in helpers.ts)
- Windows (POSIX-only; tini and SIGCHLD don't exist there)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(postgres-engine): make disconnect() idempotent so it doesn't clobber the module-level singleton

PostgresEngine.disconnect() was non-idempotent: after the first call ended
`_sql` and set it to null, a second call fell through to the `else` branch
that calls db.disconnect() — which clears the GLOBAL module-level
connection used by helpers.ts, the CLI main path, and every test that
hadn't opted into a private pool.

This bit minions-shell.test.ts and the entire downstream E2E suite when
commit 671ef09 (in this branch) added engine.disconnect() to
MinionWorker.start()'s finally block. Tests that did:

  await worker.start();          // worker disconnects (was the new behavior)
  await engine.disconnect();     // test cleanup; pre-fix fell through
                                 // to db.disconnect() and killed
                                 // the global connection

…would silently kill the helpers.ts singleton, and the next test in the
file would fail in its beforeEach with "No database connection".

Fix: track `_connectionStyle` ('instance' | 'module' | null) on the engine
and only call db.disconnect() when this engine actually owns the global.
After ending an instance-pool, _connectionStyle stays 'instance' so a
second disconnect() is a no-op rather than a side-effect.
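The guard can be sketched like this (EngineSketch stands in for PostgresEngine, and the moduleDisconnects counter stands in for the db.disconnect() side effect; the real method does more):

```typescript
// Sketch of the _connectionStyle idempotency guard described above.
type ConnectionStyle = "instance" | "module" | null;

class EngineSketch {
  private _sql: { end(): Promise<void> } | null = null;
  private _connectionStyle: ConnectionStyle = null;
  public moduleDisconnects = 0; // stand-in for db.disconnect() side effect

  connectInstance(pool: { end(): Promise<void> }): void {
    this._sql = pool;
    this._connectionStyle = "instance";
  }

  useModuleSingleton(): void {
    this._connectionStyle = "module";
  }

  async disconnect(): Promise<void> {
    if (this._sql) {
      await this._sql.end();
      this._sql = null; // _connectionStyle stays 'instance'
    } else if (this._connectionStyle === "module") {
      this.moduleDisconnects += 1; // only the owner of the global touches it
      this._connectionStyle = null; // second call becomes a no-op
    }
    // instance-pool engine, second call: neither branch fires; a no-op
  }
}
```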

Test coverage: test/e2e/postgres-engine-disconnect-idempotency.test.ts
pins both contracts:
  - instance-pool engine: second disconnect MUST NOT clobber the module
    singleton (the bug above).
  - module-singleton engine: second disconnect is a no-op (resolves
    cleanly, no throw).

Required for: minions-shell.test.ts to keep passing alongside the worker
changes on this branch. Discovered during E2E sweep after the unit-test
green light. Commit 7 in this branch then walks back the worker-side
disconnect entirely (engine ownership belongs to the CLI handler) but
this idempotency fix stays in place as a defense-in-depth guard against
any future code calling disconnect twice on the same engine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: move engine.disconnect() from worker.start() to gbrain jobs work CLI handler (engine ownership)

Commit 671ef09 (the original fix in this branch) put
\`await this.engine.disconnect()\` inside MinionWorker.start()'s finally
block to free PgBouncer pool slots immediately on shutdown. That was the
right intent on the wrong layer: the worker doesn't own the engine, the
CLI handler that creates the engine does.

The mismatched ownership broke every test that shares a single engine
across multiple worker.start() / worker.stop() cycles:

  - test/e2e/minions-shell-pglite.test.ts → shared PGLite engine, second
    test failed with "PGLite not connected"
  - test/e2e/worker-abort-recovery.test.ts → 3 tests, same shape
  - test/e2e/minions-shell.test.ts → 3 Postgres tests broken by the
    second-disconnect-clobbers-global-singleton symptom (commit 6 of
    this branch fixed the underlying engine non-idempotency, but the
    worker-disconnect call was still wrong on its own)

Fix:
  - worker.ts: remove the engine.disconnect() call. Add a comment
    documenting WHY the worker doesn't disconnect (ownership invariant)
    so a future contributor doesn't put it back.
  - src/commands/jobs.ts case 'work': wrap worker.start() in a
    try/finally that calls engine.disconnect() on shutdown. The CLI
    created the engine (line 631 area), so the CLI disposes of it.
    Disconnect failure logs to stderr with the
    "[gbrain jobs work] engine disconnect failed during shutdown:" prefix
    rather than the bare `catch {}` of earlier waves — matches the
    v0.26.9 D14 direction of preferring loud-but-best-effort over silent.
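The ownership split can be sketched as follows (interface and function names are illustrative; the real handler lives in src/commands/jobs.ts):

```typescript
// Sketch of the ownership invariant: the CLI handler that creates the
// engine disposes of it; worker.start() never does.
interface Engine { disconnect(): Promise<void> }
interface Worker { start(): Promise<void> }

async function runJobsWork(engine: Engine, worker: Worker): Promise<void> {
  try {
    await worker.start(); // worker never calls engine.disconnect()
  } finally {
    try {
      await engine.disconnect(); // owner disposes, freeing pool slots promptly
    } catch (e) {
      // loud-but-best-effort: log, never rethrow during shutdown
      console.error("[gbrain jobs work] engine disconnect failed during shutdown:", e);
    }
  }
}
```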

Test:
  - test/worker-shutdown-disconnect.test.ts now pins the inverse
    invariant: worker.start() MUST NOT call engine.disconnect(), and
    the engine MUST remain queryable after start() returns. Two tests,
    instance-level spy, parallel-safe (no module mocking).

End state: gbrain jobs work in production still frees pool slots
immediately on shutdown (intent of 671ef09 preserved), tests that share
an engine don't break (regression class fixed), and the engine ownership
invariant is now codified in code AND in the test suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: clearTimeout in probeHealth race + platform guard SIGCHLD on Windows

Two adversarial-review auto-fixes from /ship's pre-landing review pass.
Both reviewers (Claude adversarial subagent + Codex adversarial) flagged
the timer leak independently; Codex additionally caught the Windows
crash risk.

1. probeHealth race timer leak (serve-http.ts):
   `Promise.race([getStats(), setTimeout(...)])` doesn't cancel the loser.
   Without `clearTimeout`, every fast /health request leaves a 3s pending
   timer in the event loop until it fires. Under sustained probe rates
   (Fly.io polls every ~10s, orchestrator load balancers can be much
   tighter), this builds a rolling backlog of timers and avoidable event
   loop wakeups in the hottest endpoint. Capture the timer handle, clear
   it in a `finally` block. No-op when the timer already fired.

2. SIGCHLD platform guard (zombie-reap.ts):
   SIGCHLD is POSIX-only. On Windows, `process.on('SIGCHLD', ...)` throws
   ENOTSUP because Windows doesn't have signals. Bun behaves the same.
   Without this guard, any future Windows port of a gbrain CLI tool
   would crash at boot before main() even runs. The zombie-reaping fix
   is itself POSIX-only (tini, ps, /proc), so the guard is consistent
   with the platform's capability set.

NOT in this commit (intentionally out of scope):
- Cancelling engine.getStats() when /health times out. Both reviewers
  noted this would need AbortController support in the engine layer
  which doesn't exist yet. The 503 timeout already improves on master's
  hang behavior; full cancellation is a follow-up.
- Switching /health to a lighter probe (SELECT 1 instead of count(*)
  across 6 tables). Pre-existing behavior; refactoring the probe shape
  is wider blast radius than this branch's zombie-reaping scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.28.1)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md for v0.28.1 zombie reaping + health + engine ownership

Add v0.28.1 file annotations covering:
- src/core/zombie-reap.ts (new) — Layer 1 SIGCHLD reaper module
- src/core/minions/spawn-helpers.ts (new) — pure detectTini + buildSpawnInvocation helpers
- src/core/minions/worker.ts — engine-ownership invariant (no engine.disconnect)
- src/core/minions/supervisor.ts — consumes spawn-helpers, exposes isTiniDetected
- src/commands/serve-http.ts — probeHealth() + HEALTH_TIMEOUT_MS = 3000
- src/commands/jobs.ts — case 'work' owns engine lifecycle via try/finally
- src/commands/autopilot.ts — resolves tini once at startup
- src/core/postgres-engine.ts — disconnect() is idempotent via _connectionStyle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Wintermute <wintermute@garrytan.com>
Co-authored-by: Garry Tan <garrytan@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 8, 2026
…e ping/doctor + topologies) (#732)

* feat(config): add remote_mcp field + isThinClient() helper

Adds a top-level optional remote_mcp config block to GBrainConfig
(issuer_url, mcp_url, oauth_client_id, oauth_client_secret) for
thin-client installs that consume a remote `gbrain serve --http` over
MCP instead of running a local engine.

isThinClient(config) returns true when remote_mcp is set; used by the
CLI dispatch guard, doctor branch, and init re-run guard. The engine
field stays as today (postgres|pglite); thin-client mode is a separate
config field, NOT an engine kind extension (codex outside-voice review
flagged the engine='remote' extension as overreach).

GBRAIN_REMOTE_CLIENT_SECRET env var overrides the config-file value at
load time so the secret can stay off disk for headless agents.

Foundation commit for multi-topology v1; no behavior change yet.
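The config shape and helper can be sketched like this (field names are from the commit; the real GBrainConfig carries more fields, and resolveClientSecret is a hypothetical helper illustrating the override order):

```typescript
// Shape sketch of the remote_mcp block + isThinClient helper.
interface RemoteMcpConfig {
  issuer_url: string;
  mcp_url: string;
  oauth_client_id: string;
  oauth_client_secret?: string;
}

interface GBrainConfigSketch {
  engine: "postgres" | "pglite"; // unchanged; thin-client is NOT an engine kind
  remote_mcp?: RemoteMcpConfig;
}

function isThinClient(config: GBrainConfigSketch): boolean {
  return config.remote_mcp !== undefined;
}

// Hypothetical helper: env var wins so the secret can stay off disk.
function resolveClientSecret(
  config: GBrainConfigSketch,
  env: Record<string, string | undefined>,
): string | undefined {
  return env.GBRAIN_REMOTE_CLIENT_SECRET ?? config.remote_mcp?.oauth_client_secret;
}
```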

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(probe): outbound OAuth + MCP smoke probes

Adds three pure async functions over the standard fetch API:
  - discoverOAuth(issuerUrl): GET /.well-known/oauth-authorization-server
  - mintClientCredentialsToken(tokenEndpoint, id, secret): POST /token
  - smokeTestMcp(mcpUrl, accessToken): POST /mcp initialize

Discriminated 'ok=true' / 'ok=false + reason' return shapes so callers
render error messages consistently. No SDK dependency to keep init's
setup-flow scope tight; Lane B's mcp-client.ts will pull in the
official @modelcontextprotocol/sdk Client for full session semantics.

Used by both 'gbrain init --mcp-only' (Lane A's setup smoke) and
runRemoteDoctor (Lane A's thin-client doctor checks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(init): --mcp-only branch + re-run guard

Adds 'gbrain init --mcp-only' for thin-client setup. Required flags
(or env vars):
  --issuer-url     OAuth root (e.g. https://host:3001)
  --mcp-url        MCP tool dispatch path (e.g. https://host:3001/mcp)
  --oauth-client-id, --oauth-client-secret

Pre-flight runs three smoke probes (discovery, token round-trip, MCP
initialize) BEFORE writing the config — fail-fast on bad URL beats
fail-late on bad credentials. On success, writes ~/.gbrain/config.json
with remote_mcp set and NO local DB created.

Re-run guard (A8): when ~/.gbrain/config.json already has remote_mcp,
'gbrain init' (any flag set) refuses without --force. Catches the
scripted-setup-loop friction from the user-reported scenario where
re-running setup-gbrain on a thin-client machine kept trying to
re-create a local DB.

Two URLs in config (issuer + mcp) instead of one because OAuth
discovery + /token live at the issuer root while tool dispatch is at
/mcp — they compose from a common base in practice but reverse-proxy
setups need them explicit (codex review #2).

Tests: 15 cases covering happy path, env-var-supplied secret stays
out of disk, all four required-flag missing-error paths, three
smoke-failure paths, network-unreachable path, and the four re-run
guard variants (default/--pglite/--mcp-only without --force / with
--force). Uses async Bun.spawn (NOT execFileSync) — sync exec
deadlocks against in-process HTTP fixtures because the parent's
event loop can't accept connections while sync-blocked on a child.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor): runRemoteDoctor for thin-client mode

Replaces every DB-bound check from runDoctor() with a tighter set
scoped to 'is the remote MCP we configured actually reachable?'.
Five checks:
  - config_integrity (URL fields well-formed)
  - oauth_credentials (secret resolvable from env or config file)
  - oauth_discovery (GET /.well-known/oauth-authorization-server)
  - oauth_token (POST /token client_credentials)
  - mcp_smoke (POST /mcp initialize)

Output shape matches the local doctor's Check surface so JSON
consumers can union the two without conditional logic. schema_version
is 2 (matches local doctor).

collectRemoteDoctorReport() is the pure data collector;
runRemoteDoctor() is the print/exit wrapper. Tests pin the data
collector so we don't have to intercept stdout / process.exit.

Tests: 12 cases over a tiny in-process HTTP fixture covering happy
path, every probe failure mode (404/parse/auth/network/server-error),
malformed-URL config integrity, missing-secret short-circuit, and
the env-var-overrides-config-file secret resolution. withEnv() helper
used for env mutations to satisfy the test-isolation lint.

Module is added but not yet wired into the CLI doctor branch; the
wiring lands in the next commit (cli dispatch guard + doctor routing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): thin-client dispatch guard + doctor routing

Adds a single canonical refusal at the top of handleCliOnly() for the
9 DB-bound commands when ~/.gbrain/config.json has remote_mcp set:
  sync, embed, extract, migrate, apply-migrations, repair-jsonb,
  orphans, integrity, serve

Single dispatch check (not 9 sprinkled assertLocalEngine calls per
codex review #1) — avoids the blast radius of letting commands enter
connectEngine before the check fires. Refused commands exit 1 with a
canonical error naming the remote mcp_url.

doctor branch routes to runRemoteDoctor when isThinClient(config)
returns true; falls through to the existing local-doctor flow
otherwise. Wires the module added in the previous commit into the
user-facing CLI surface.

Safe commands (init, auth, --version, --help, etc.) still work in
thin-client mode and are NOT in the refused set.

Tests: 14 cases — 9 refused commands × 1 each, 2 safe commands, 1
doctor-routing assertion (fingerprints the thin-client output by
'mode:"thin-client"' in JSON), 2 regression tests asserting local
config still passes through normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(topologies): multi-topology architecture guide + setup skill Phase A.5

New docs/architecture/topologies.md covering three deployment shapes:
  1. Single brain (today's default)
  2. Cross-machine thin client (consume a remote brain over MCP)
  3. Split-engine per-worktree (Conductor users with per-worktree
     code engines + shared remote artifacts brain)

Each topology gets an ASCII diagram, when-it-fits guidance, and
concrete setup recipes. Topology 3's alias-level routing footgun
(wrong alias = silent wrong-brain writes) is called out explicitly
per codex review #6.

Topology 3 needs zero gbrain code changes — GBRAIN_HOME already
overrides ~/.gbrain and 'gbrain serve --http --port N' already runs
on any port. gstack composes these primitives on its side.

skills/setup/SKILL.md gets Phase A.5 BEFORE the local-engine phases.
Asks the user which topology fits, walks thin-client setup through
'gbrain init --mcp-only', skips Phases B/C/C.5/H entirely for thin
clients (host's autopilot handles sync/extract/embed).

README.md gets a one-line link to the topology doc from the
Architecture section.

llms-full.txt regenerated to include the new doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): thin-client end-to-end skeleton

Spins up 'gbrain serve --http' against real Postgres, registers a
client with read,write,admin scope, runs 'gbrain init --mcp-only'
from a separate tempdir GBRAIN_HOME, exercises the canonical
thin-client flows:

  - init --mcp-only succeeds against the live host
  - doctor reports mode: thin-client + all checks green
  - sync is refused with the canonical thin-client error
  - re-running init refuses without --force

Tier B flows (gbrain remote ping / doctor) will be added alongside
their Lane B implementation. Skips when DATABASE_URL unset (matches
the e2e gate convention used across the suite).

Async Bun.spawn (NOT execFileSync) so the test event loop stays
responsive — execFileSync deadlocks against in-process HTTP fixtures
because the parent's event loop can't accept connections while
sync-blocked on a child process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(doctor): doctorReportRemote core for thin-client + run_doctor op

Adds three new exports to src/commands/doctor.ts that the run_doctor MCP
op + gbrain remote doctor CLI both consume:

  - DoctorReport interface       schema_version=2 stable shape
  - computeDoctorReport(checks)  status + health_score math
  - doctorReportRemote(engine)   focused 5-check thin-client surface

doctorReportRemote runs:
  1. connection      (engine reachable + page count via getStats)
  2. schema_version  (engine.getConfig('version') vs LATEST_VERSION)
  3. brain_score     (the 5-component composite)
  4. sync_failures   (file-plane JSONL count from gbrainPath('sync-failures.jsonl'))
  5. queue_health    (Postgres-only: stalled active jobs > 1h)

Engine-agnostic: works on both Postgres and PGLite via engine.executeRaw +
engine.getConfig + engine.getHealth — no reliance on db.getConnection()
which is Postgres-only.

Deliberately a focused subset of the local doctor surface, NOT a full
mirror. Generalizing to lint/integrity/orphans is filed as follow-up
pending demand. Local doctor (runDoctor) is unchanged; operators on the
host machine still get the full check set.

schema_version=2 matches the local doctor's --json output schema, so JSON
consumers can union the two without conditional logic.

Tests: 11 unit cases against PGLite covering the 5-check happy path,
schema version reporting (latest), PGLite-specific queue_health
informational message, and the score+status math via computeDoctorReport.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
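The commit references "status + health_score math" without showing it, so the following is a plausible sketch only. The `schema_version=2` constant and the `computeDoctorReport(checks)` signature come from the message above; the `Check` shape, the status values, and the pass-fraction scoring are assumptions.

```typescript
// Hypothetical sketch of check aggregation; the real math is not shown
// in the commit. Statuses and scoring here are assumed.
interface Check {
  name: string;
  ok: boolean;
}

interface DoctorReport {
  schema_version: 2;
  status: "ok" | "degraded";
  health_score: number; // assumed: fraction of passing checks
  checks: Check[];
}

function computeDoctorReport(checks: Check[]): DoctorReport {
  const passing = checks.filter((c) => c.ok).length;
  return {
    schema_version: 2,
    status: passing === checks.length ? "ok" : "degraded",
    health_score: checks.length === 0 ? 0 : passing / checks.length,
    checks,
  };
}
```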

* feat(mcp-client): outbound HTTP MCP client over @modelcontextprotocol/sdk

New src/core/mcp-client.ts wraps the official SDK's Client +
StreamableHTTPClientTransport with OAuth client_credentials minting,
in-process token caching with expires_at, and refresh-on-401 retry.

Public surface:
  - callRemoteTool(config, toolName, args)   tool call w/ auto-refresh
  - unpackToolResult(res)                    parse content[0].text JSON
  - RemoteMcpError                           discriminated by `reason`

Token cache: module-level Map keyed by mcp_url. CLI processes are
short-lived; the cache amortizes when one invocation makes multiple
calls (gbrain remote ping submits then polls). Persisting to disk would
be a credential-on-disk surface for marginal benefit since /token
round-trip is sub-100ms.

401 retry: ONLY for mid-session token rotation (initial good token →
stale → 401). If the FIRST mint fails auth, surface immediately as
RemoteMcpError(auth) — retry won't help when credentials are wrong from
the start. If a fresh-mint-after-401 still 401s, surface as
RemoteMcpError(auth_after_refresh) which the CLI renders with a hint
pointing the operator at gbrain auth register-client.

Used by gbrain remote ping (submit_job + get_job poll) and gbrain
remote doctor (run_doctor). Test-only _clearMcpClientTokenCache export
for fixture isolation.

Tests: 13 unit cases over an in-process HTTP fixture mimicking gbrain
serve --http (OAuth discovery + /token + /mcp JSON-RPC handshake).
Covers happy path, token cache reuse + force-refresh, args passthrough,
config-error paths (no remote_mcp / no secret), token mint 401, network
unreachable, tool isError envelope, and unpackToolResult parse failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(operations): add run_doctor MCP op (admin scope, HTTP-reachable)

New op in src/core/operations.ts wraps doctorReportRemote() and returns
the structured DoctorReport JSON over MCP.

  scope:     'admin'       (system-state read; not for routine consumers)
  localOnly: false         (reachable over HTTP)
  mutating:  false         (safe to call repeatedly)
  params:    {}            (no caller arguments needed)

First read-only diagnostic op exposed over HTTP MCP. Used by gbrain
remote doctor — the matching client-side renderer lives in
src/commands/remote.ts.

Precedent: doctor only. Generalizing run_lint / run_integrity /
run_orphans to MCP is filed as follow-up work pending demand. Local
doctor stays unchanged; this op is the operator-friendly subset for
remote callers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(remote): gbrain remote ping + gbrain remote doctor

Two thin-client convenience commands that round-trip through the host's
HTTP MCP endpoint:

  - gbrain remote ping     submit_job(autopilot-cycle) → poll get_job →
                           exit when terminal. The "I just wrote markdown,
                           tell the host to re-index" affordance.
  - gbrain remote doctor   run_doctor MCP op → render the host's
                           DoctorReport → exit 0/1 based on status.

Both require a thin-client install (~/.gbrain/config.json with
remote_mcp). Local installs get a clear error pointing at the local
equivalents.

Polling backoff (ping): 1s × 30s, then 5s × 5min, then 10s. Default cap
15min, configurable via `--timeout`. Without backoff, a 5-min cycle
would burn 300 round-trips against the host's rate limiter.

Payload uses `data: {phases: [...]}`, NOT `params:` — the submit_job op
shape takes `data`. Codex review #8 catch.

NO `repo` arg passed to autopilot-cycle — uses the server's configured
brain repo. This sidesteps TODO #1144 (sync_brain repo-path validation
for caller-controlled paths) entirely.

src/cli.ts wires the `remote` subcommand into CLI_ONLY + the dispatch.
Help (`gbrain remote --help`) and unknown-subcommand handling included.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
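The ping backoff schedule quoted above (1s for the first 30s, 5s up to the 5-minute mark, 10s after) can be sketched as a pure function of elapsed time. The function name and elapsed-milliseconds signature are assumptions.

```typescript
// Sketch of the stated polling schedule: 1s x 30s, then 5s x 5min,
// then 10s. Boundary handling at exactly 30s / 5min is assumed.
function pollDelayMs(elapsedMs: number): number {
  if (elapsedMs < 30_000) return 1_000;
  if (elapsedMs < 5 * 60_000) return 5_000;
  return 10_000;
}
```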

* test(e2e): thin-client Tier B + scope-mismatch regression

Extends the existing test/e2e/thin-client.test.ts with three new cases:

  1. gbrain remote doctor returns the host's DoctorReport — pins the
     run_doctor MCP op round-trip. Asserts schema_version=2, all 5
     check names present, connection + schema_version ok against a
     fresh host.
  2. gbrain remote ping triggers autopilot-cycle and returns terminal
     state — pins the submit_job → poll → terminal wire path. Accepts
     any terminal state (success / failed / dead / cancelled / timeout)
     because autopilot on an empty no-repo brain may fail-fast in the
     sync phase. What this test pins is the JSON shape (job_id present,
     state populated), NOT cycle success on a no-repo fixture.
  3. read+write client cannot call run_doctor — codex review #7
     regression guard. Registers a separate client with
     `--scopes "read write"` (no admin), runs `gbrain remote doctor`
     against it, asserts exit 1 with auth/auth_after_refresh/tool_error
     reason. Keeps the verification flow honest: the canonical setup
     MUST require admin scope.

`gbrain auth register-client` doesn't have --json, so the test parses
the human output for "Client ID:" and "Client Secret:" lines via a
helper.

Test-level timeout bumped 60s → 120s for the ping wait + auth/init
overhead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.29.2)

v0.29.2 ships thin-client mode: gbrain init --mcp-only, gbrain remote
ping/doctor, run_doctor MCP op, and the docs/architecture/topologies.md
deployment guide.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit to garrytan-agents/gbrain that referenced this pull request May 10, 2026
… (EXP-5)

Reproducible cross-modal quality eval for the takes layer. Three frontier
models score a sample against the 5-dim rubric, the runner aggregates to
PASS/FAIL/INCONCLUSIVE, the receipt persists to eval_takes_quality_runs.
Trend mode segregates by rubric_version; regress mode is a CI gate that
exits 1 when any dim regresses past --threshold.

Subcommands:
  run     [--limit N --cycles N --budget-usd N --slug-prefix P --models a,b,c]
  replay  <receipt-path> [--json]                 # NO BRAIN required
  trend   [--limit N --rubric-version V --json]
  regress --against <receipt> [--threshold T --json]

Codex review integrations (D7 — all 10 findings landed):

  garrytan#1 json-repair shim re-exports BOTH parseModelJSON AND the
     ParsedScore + ParsedModelResult types. The original plan only
     re-exported the function, which would have compile-broken
     cross-modal-eval/aggregate.ts:19's type import.

  garrytan#3 Receipt name binds (corpus_sha8, prompt_sha8, models_sha8,
     rubric_sha8) so a future rubric tweak segregates trend rows
     instead of silently corrupting the quality-over-time graph.
     RUBRIC_VERSION + rubric_sha8 are persisted in every receipt.

  garrytan#4 Pricing fail-closed: any model not in pricing.ts produces an
     actionable PricingNotFoundError before any HTTP call fires.
     Same drift problem as cross-modal-eval/runner.ts:estimateCost(),
     but explicit instead of silent zero.

  garrytan#5 Aggregate requires ALL 5 declared rubric dimensions per model.
     Cross-modal-eval v1's union-of-whatever-parsed pattern allowed a
     model to omit a dim and still PASS — that's a regression-gate
     hole. Now: missing-dim drops the contribution, treated identically
     to a parse failure. Empty-scores PASS regression guard preserved.

  garrytan#6 DB-authoritative receipt persistence. Original two-phase plan had
     a split-brain reconciliation gap (disk-success/DB-fail vanishes
     from trend; DB-success/disk-fail unreplayable). Now DB row is the
     source of truth (carries full receipt JSON in a JSONB column);
     disk artifact is best-effort. replay reads disk first; loadReceiptFromDb
     reconstructs from DB when the disk file is missing.

  garrytan#10 Brain-routing: replay is the only sub-subcommand that doesn't
      need a brain. cli.ts no-DB bypass routes "eval takes-quality replay"
      directly to runReplayNoBrain, which exits 0/1/2 cleanly without
      ever touching the engine. Other modes go through connectEngine.

Files added:
  src/core/eval-shared/json-repair.ts (hoisted from cross-modal-eval)
  src/core/takes-quality-eval/{rubric,pricing,aggregate,receipt-name,
                                receipt-write,receipt,replay,regress,trend,runner}.ts
  src/commands/eval-takes-quality.ts
  docs/eval-takes-quality.md (stable schema_version: 1 contract)
  10 test files (83 cases — aggregate / receipt-name / shim / pricing /
                 rubric / receipt-write / replay / trend / regress / cli)

Files modified:
  src/cli.ts: replay no-DB bypass + engine-required dispatch
  src/core/cross-modal-eval/json-repair.ts → re-export shim
  src/core/migrate.ts: append v47 (eval_takes_quality_runs table)
  src/core/pglite-schema.ts + src/schema.sql: mirror the v47 table for
    fresh-install path. RLS toggled on the new table.
  src/core/schema-embedded.ts: regenerated via build:schema
  test/migrate.test.ts: 6 structural cases for v47

186 tests pass; typecheck clean. Replay verified working end-to-end
(reads receipt JSON file without DATABASE_URL, exits with the verdict
code, prints actionable error on missing file).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
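The all-dims-required aggregation rule from finding #5 above can be sketched as a filter: a model's scores contribute only when every declared rubric dimension is present, so a missing dim is treated like a parse failure. The dimension names and function name here are placeholders; only the rule itself comes from the commit.

```typescript
// Sketch of the missing-dim-drops-contribution rule. Dimension names
// are placeholders, not the real rubric.
const RUBRIC_DIMS = ["dim_a", "dim_b", "dim_c", "dim_d", "dim_e"];

function usableContributions(
  perModelScores: Array<Record<string, number>>,
): Array<Record<string, number>> {
  // Require ALL declared dims per model; a partial score set is dropped,
  // closing the union-of-whatever-parsed regression-gate hole.
  return perModelScores.filter((scores) =>
    RUBRIC_DIMS.every((dim) => typeof scores[dim] === "number"),
  );
}
```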
garrytan added a commit that referenced this pull request May 10, 2026
…closes #413, #446) (#801)

* fix(serve): clean up stdio MCP server on client disconnect

The PGLite write lock leaked indefinitely when the parent of `gbrain serve`
disconnected. Three root causes: serve.ts never called engine.disconnect()
after startMcpServer() resolved; cli.ts short-circuited with a "serve doesn't
disconnect" comment; and the MCP SDK's StdioServerTransport only listens for
'data'/'error' on stdin, never 'end'/'close', so even a clean stdin EOF never
reached the SDK.

Net effect: the next `gbrain serve` waited for the in-process 5-minute stale-
lock check or hung indefinitely.

stdio path now installs a unified lifecycle:
- SIGTERM/SIGINT/SIGHUP all funnel into one idempotent shutdown path
  (SIGHUP coverage matters for Claude Desktop on macOS / MCP gateway
  restarts; SIGINT for Ctrl-C; SIGTERM for daemon shutdown).
- stdin 'end' (clean EOF) and 'close' (parent SIGKILL with pipe still
  open) both trigger the same graceful path. TTY stdin skips the watchers
  so interactive `gbrain serve` is unaffected.
- Parent-process watchdog polls the live kernel parent PID via spawnSync
  ('ps','-o','ppid=','-p',PID) every 5s. process.ppid is cached at process
  creation by Bun (and Node) and never refreshes on re-parent — empirical
  evidence on macOS shows ps reports the new parent within one tick while
  process.ppid stays at the original PID indefinitely (oven-sh/bun#30305).
- Watchdog fires on `getParentPid() !== initialParentPid` (any reparent),
  not just `=== 1`. Catches launchd / systemd / tmux / parent-shell-with-
  PR_SET_CHILD_SUBREAPER cases where the kernel re-anchors us to a non-1
  subreaper PID. Codex review caught the original `=== 1` was incomplete.
- One-shot startup probe verifies `spawnSync('ps')` actually works on this
  host. If the probe fails (stripped containers / busybox without procps),
  we skip installing the watchdog interval entirely AND emit a loud stderr
  line — the operator sees "watchdog disabled" instead of an installed-
  but-never-fires phantom that silently falls back to cached process.ppid.
- 5-second cleanup deadline: if engine.disconnect() wedges (PGLite WASM
  stall, etc.), the process still calls process.exit(0). The abandoned
  lock dir is reclaimed on the next start by the existing stale-lock
  check in pglite-lock.ts.
- Optional `--stdio-idle-timeout <sec>`: default OFF safety net for
  parents that leak the pipe but never close it. Strict parsing rejects
  `abc` / `30junk` / `-1` / `1.5` / blank values explicitly so a typo
  doesn't silently disable the safety net (closes #446).

Test seam: ServeOptions { stdin, signals, exit, log, startMcpServer,
getParentPid, setInterval, clearInterval, probeWatchdog } lets the
lifecycle be unit-tested deterministically without spawning a real Bun
child or booting the MCP SDK.

22 test cases covering signals, stdin EOF, TTY skip, watchdog reparent
(both PID-1 and subreaper-PID-N cases), ps-unavailable degraded mode,
idle timeout, idempotent shutdown, and cleanup-deadline behavior.

Closes #413, #446. Supersedes #591.

Co-Authored-By: Aragorn2046 <noreply@github.com>
Co-Authored-By: seungsu-kr <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
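The strict `--stdio-idle-timeout` parsing described above can be sketched as follows: only whole positive second counts pass, so `abc`, `30junk`, `-1`, `1.5`, and blank values are rejected rather than silently disabling the safety net. The function name and the null-on-reject convention are assumptions.

```typescript
// Sketch of strict timeout-flag parsing. Whole positive integers only;
// anything else returns null so the caller can error loudly.
function parseIdleTimeoutSeconds(raw: string): number | null {
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) return null; // rejects sign, decimals, junk, blank
  const seconds = Number(trimmed);
  return seconds > 0 ? seconds : null; // zero treated as invalid (assumed)
}
```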

* fix(auth): route HTTP auth/admin SQL through active engine

`gbrain auth` and `gbrain serve --http` previously routed every SQL
through the postgres.js singleton in src/core/db.ts, which silently fell
back to a file-backed PGLite when DATABASE_URL was set but the config
file disagreed. The HTTP transport's verbatim use of the singleton also
made `gbrain serve --http` Postgres-only, even though the
`access_tokens` and `mcp_request_log` tables exist in both engine
schemas.

Auth, OAuth, admin, file uploads, and HTTP-transport SQL now run through
`engine.executeRaw` via a deliberately narrow tagged-template adapter
(`src/core/sql-query.ts`). The contract is scalar-binds-only — adding
JSONB or fragment composition would invite the adapter to drift into a
partial postgres.js clone. JSONB writes use a separate
`executeRawJsonb(engine, sql, scalarParams, jsonbParams)` helper that
composes positional `$N::jsonb` casts and passes objects through
`engine.executeRaw`. The CI guard at `scripts/check-jsonb-pattern.sh`
doesn't fire because the helper is a method call, not the banned
`${JSON.stringify(x)}::jsonb` template-literal interpolation, and the
v0.12.0 double-encode bug class doesn't apply to positional binding via
`postgres.js`'s `unsafe()` (verified by
`test/e2e/auth-permissions.test.ts:67` on Postgres and the new
`test/sql-query.test.ts` on PGLite).

Migrated call sites:
  - src/commands/auth.ts: takes-holders writes (lines 52, 86) →
    executeRawJsonb. List, revoke, register-client, revoke-client →
    SqlQuery via withConfiguredSql() helper that opens an engine, runs
    the callback, disconnects.
  - src/commands/serve-http.ts: ~25 call sites including the four
    mcp_request_log.params INSERTs (now write real JSONB objects, not
    JSON-encoded strings — the read side `params->>'op'` returns the
    operation name, closing CLAUDE.md's outstanding "JSON-string-into-
    JSONB" note as a side effect). The /admin/api/requests dynamic
    filter pattern (postgres.js fragment composition) is rewritten as
    parametrized SQL string + params array.
  - src/mcp/http-transport.ts: legacy bearer-auth path. The
    Postgres-only fail-fast at startup is removed because both schemas
    now carry access_tokens + mcp_request_log.
  - src/core/oauth-provider.ts: SqlQuery / SqlValue types relocated
    from here to sql-query.ts as the canonical home (Codex finding #8).
  - src/commands/files.ts: all 5 db.getConnection() sites (lines 104,
    139, 252, 326, 355). The line-256 INSERT into files.metadata uses
    executeRawJsonb; the other four are scalar-only SqlQuery (Codex
    finding #6 — scope was bigger than the plan's "lone INSERT" framing).
  - src/core/config.ts: env-var DATABASE_URL inference. When dbUrl is
    set, infer Postgres engine and clear the stale database_path.

Engine-internal sql.json() sites in src/core/postgres-engine.ts (5
sites: lines 520, 1689, 1728, 1790, 2313) STAY UNCHANGED. They live
inside PostgresEngine itself, where the postgres.js template-tag
sql.json() pattern is correct — those methods are only loaded when
Postgres is the active engine, so there's no PGLite-routing concern.

Migration v45 (mcp_request_log_params_jsonb_normalize): one-shot UPDATE
that lifts pre-v0.31 string-shaped JSONB rows to objects so the
/admin/api/requests endpoint at serve-http.ts:605 returns one
consistent shape to the admin SPA. Idempotent (subsequent runs find no
rows where jsonb_typeof = 'string'). Closes the mixed-shape window
that would otherwise have made post-deploy admin reads break.

Tests:
  - test/sql-query.test.ts: 7 cases covering scalar binds, the
    .json() rejection (defense in depth — SqlQuery is scalar-only),
    JSONB round-trip with `jsonb_typeof = 'object'` and `->>`
    semantics, the v0.12.0 double-encode regression guard, null
    JSONB handling, and the scalars-then-jsonb call shape.
  - test/config-env.test.ts: migrated from PR's manual `restoreEnv()`
    in afterEach to the canonical `withEnv()` helper at
    test/helpers/with-env.ts (CLAUDE.md R1 / codex finding D3).
    Five cases covering DATABASE_URL precedence, GBRAIN_DATABASE_URL
    operator override, file-only config, env-only config, and the
    no-config null path.
  - test/e2e/auth-takes-holders-pglite.test.ts: 6 cases against
    in-memory PGLite (no DATABASE_URL gate). Covers create / update /
    read of access_tokens.permissions, mcp_request_log.params object
    + null writes, and the migration v45 normalizer (seed
    string-shaped row, run UPDATE, assert object shape; second-run
    no-op for idempotency).
  - test/http-transport.test.ts: mock updated to intercept
    engine.executeRaw (the new code path) instead of the postgres.js
    template tag. 24 cases pass.

Plan reference: ~/.claude/plans/system-instruction-you-are-working-peppy-moore.md.
Codex outside-voice review applied: D-codex-1, D-codex-2, D-codex-5,
D-codex-8, D-codex-9, D-codex-10 (and D1, D5 reversed by codex).

Closes the architectural intent of #681. Supersedes its branch.

Co-Authored-By: codex-bot <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
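The `executeRawJsonb` call shape described above can be sketched as follows: scalars bind first, JSONB objects follow, and the SQL carries explicit `$N::jsonb` casts for the tail positions. `EngineLike` and the placeholder helper are stand-ins; only the scalars-then-jsonb ordering and the objects-pass-through-unstringified rule come from the commit.

```typescript
// Sketch of positional jsonb binding, assuming the call shape above.
// Objects pass through as-is; no ${JSON.stringify(x)}::jsonb template
// interpolation, so the double-encode bug class doesn't apply.
interface EngineLike {
  executeRaw(sql: string, params: unknown[]): Promise<unknown[]>;
}

// Convenience for composing the $N::jsonb casts for the tail positions
// (helper name assumed).
function jsonbCasts(scalarCount: number, jsonbCount: number): string[] {
  return Array.from(
    { length: jsonbCount },
    (_, i) => `$${scalarCount + i + 1}::jsonb`,
  );
}

async function executeRawJsonb(
  engine: EngineLike,
  sql: string, // assumed to already contain the $N::jsonb casts
  scalarParams: unknown[],
  jsonbParams: object[],
): Promise<unknown[]> {
  return engine.executeRaw(sql, [...scalarParams, ...jsonbParams]);
}
```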

* docs: update CLAUDE.md key files for v0.31.3

Annotate the v0.31.3 changes in the canonical Key Files section:
new src/core/sql-query.ts adapter (#681), src/commands/serve.ts stdio
cleanup (#676), v0.31.3 amendments to auth.ts / serve-http.ts /
oauth-provider.ts surfaces, and migration v46 normalizer in migrate.ts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: regenerate llms-full.txt for v0.31.3 docs sync

CI's build-llms test asserts the committed llms.txt + llms-full.txt
match what scripts/build-llms.ts produces from current source state.
CLAUDE.md was amended by /document-release post-merge (new entries for
src/core/sql-query.ts and src/commands/serve.ts; amended notes on
auth.ts / serve-http.ts / migrate.ts), so the inlined-bundle fell out
of sync. Regenerated via `bun run build:llms`.

llms.txt unchanged (curated index — no new web URLs added).
llms-full.txt updated to inline the new CLAUDE.md content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Aragorn2046 <noreply@github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>