v0.40.7.0 Schema Cathedral v3: agent-on-ramp + production rebuild of PR #1321 (#1327)

Merged: garrytan merged 12 commits into master from garrytan/pr1321-production-build on May 23, 2026
garrytan commented May 23, 2026

Summary

Your agents can now author your brain's schema pack themselves: no more shell-out, no more hand-editing YAML. This wave closes community PR #1321 (by @garrytan-agents) and replaces it with a production-grade rebuild. Wintermute (or any agent connected via MCP) can now say "my brain has 4000 untyped meeting pages, add a meeting type and backfill them" and have it done safely. The safety chain is: a per-pack atomic file lock; a validation gate that catches dangling references pre-write; an atomic .tmp + fsync + rename so a crash never leaves the pack half-written; a privacy-redacted audit log carrying the agent's identity (actor: mcp:<clientId8>); and a chunked UPDATE in 1000-row batches that never wedges concurrent writers.

The wave (7 commits, 6,300 net LOC, 255+ schema-pack tests green):

  • Foundations: pack-lock.ts (atomic O_CREAT|O_EXCL, NOT TOCTOU page-lock pattern), mutate-audit.ts (privacy-redacted JSONL — sha8 type names, first-slug-only prefixes, both success+failure events), registry.ts cache invalidation with extends-chain reverse-graph cascade (codex C6 fix) + stat-mtime TTL gate for cross-process invalidation, best-effort.ts loader (empty-filter on pack-load failure, NOT silent defaults per D4), 11 pure lint rules, query-cache-invalidator.ts (codex C9 — sync --apply invalidates stale cached results).
  • Mutation: withMutation 8-step skeleton wrapping 11 primitives (add/remove/update type + alias + prefix + link_type + flag setters). Codex C14 reference check on removeType. Inline JSON→YAML emitter so YAML packs stay YAML.
  • Data plane: stats.ts (multi-source aware, dead-prefix detection) + sync.ts (D14 chunked UPDATE, codex C5 write-side scoping).
  • CLI: 14 new gbrain schema * verbs wired with rich help text + lifecycle grouping.
  • MCP: 9 new ops including batched schema_apply_mutations (admin scope, NOT localOnly per D2 so Wintermute reaches over normal HTTPS MCP).
  • T1.5 wiring: gbrain whoknows + find_experts MCP now consult the active pack for expert types. Pack-load failure → empty filter (loud) instead of silent fallback to hardcoded ['person', 'company'].
  • Agent on-ramp: skills/schema-author/SKILL.md with explicit boundary callouts to brain-taxonomist and eiirp; skills/conventions/schema-evolution.md decision tree; RESOLVER routing.
  • Bookkeeping: VERSION 0.40.6.0 → 0.40.7.0 (queue collision with master's v0.40.6.0 release, PR #1324 "feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR #1314)"), CHANGELOG ELI10 lead, TODOS 10 follow-ups filed.

Test Coverage

84 new test cases across 9 new test files; 255 total schema-pack-related tests green:

| Test file | Cases |
| --- | --- |
| test/schema-pack-pack-lock.test.ts | 20 (atomic acquire, stale-PID, TTL refresh, force-steal, cleanup) |
| test/schema-pack-mutate-audit.test.ts | 13 (privacy redaction, success+failure logging, ISO-week rotation) |
| test/schema-pack-registry-reload.test.ts | 18 (extends-chain cascade, stat-mtime TTL, cross-process detection) |
| test/schema-pack-best-effort.test.ts | 4 (empty-filter contract) |
| test/schema-pack-lint-rules.test.ts | 35 (11 rules × happy + violation + multi-violation) |
| test/schema-pack-query-cache-invalidator.test.ts | 4 (PGLite+Postgres parity) |
| test/schema-pack-mutate.test.ts | 34 (withMutation skeleton + 11 primitives + atomic invariants) |
| test/schema-pack-stats.test.ts | 13 (multi-source, soft-delete, dead-prefix, JSON envelope) |
| test/schema-pack-sync.test.ts | 13 (dry-run, apply, chunked UPDATE, idempotency, source scoping) |
| test/operations-schema-pack.test.ts | 23 (9 MCP ops + batched mutation atomicity + audit actor capture) |

Pre-Landing Review

Already passed /plan-eng-review (5 sections + outside-voice codex challenge, 21 decisions captured) BEFORE implementation began. Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md. No new findings during implementation — every architectural decision is grounded in the pre-shipped review.

Plan Completion

7/7 phases shipped (Phase 0: PR close, Phase 1: 6 foundations, Phase 2: mutate, Phase 3: stats+sync, Phase 4: CLI, Phase 5+6+7: lint + MCP ops, Phase 8: T1.5 wiring partial, Phase 10: skill, Phase 12: docs). 4 deliverables deferred to v0.40.7+ follow-ups with TODOs filed:

  • Enrichment-service union widening (3-site surgery, ~2hr CC)
  • facts/eligibility.ts pack-aware wiring (behavior change, needs careful regression coverage)
  • 3 doctor checks (schema_pack_coverage, _writability, _mutation_audit)
  • T16 + T16.1 hermetic eval gates

TODOS

10 follow-up entries filed under ## v0.40.7.0 Schema Cathedral v3 follow-ups (v0.40.7+) in TODOS.md covering the deferred deliverables above plus v0.41+ items (T19 federated read closure, T20 extends merging, T21 YAML comment preservation, T22 admin SPA, T23 schema:write scope, T24 multi-tenant federation).

Documentation

/document-release ran on the final commit and surfaced one doc-debt item. /document-generate then closed the tutorial gap:

  • docs/schema-author-tutorial.md (NEW) — 8-step walkthrough from gbrain schema fork through add-type researcher + sync --apply + proving the T1.5 wiring via gbrain whoknows. Time-to-first-result < 3 steps. Placeholder pages only (alice/bob/charlie-example) so it runs against any brain without affecting real content.
  • README.md — added v0.40.7.0 lead paragraph + tutorial pointer at the end of that paragraph.
  • CLAUDE.md — added "Schema Cathedral v3 (v0.40.7.0)" section between thin-client routing and Commands. 14-bullet Key Files cluster citing every design decision (D2/D4/D6/D8/D9/D10/D11/D13/D14/D20) and codex finding (C5/C6/C8/C9/C13/C14).
  • skills/schema-author/SKILL.md — added "## Tutorial" callout pointing at the new tutorial doc, just above the existing workflow phases.
  • llms-full.txt — regenerated (537KB → 545KB) after CLAUDE.md edits.
  • CHANGELOG.md — already polished during /ship.
  • TODOS.md — already filed 10 v0.40.7+ follow-ups during /ship.

Diataxis coverage map (all four quadrants now present):

| Surface | Reference | How-to | Tutorial | Explanation |
| --- | --- | --- | --- | --- |
| 14 new CLI verbs | ✅ CLAUDE.md cluster + gbrain schema --help | ✅ SKILL.md 7-phase workflow | ✅ docs/schema-author-tutorial.md | ✅ CLAUDE.md cluster |
| 9 new MCP ops | ✅ operations.ts descriptions | ✅ SKILL.md tools: list | ✅ (tutorial "Next steps" → MCP path) | ✅ CLAUDE.md cluster |
| schema-author skill | ✅ SKILL.md + manifest + RESOLVER | ✅ 7-phase workflow | ✅ tutorial | ✅ Non-goals + boundaries |
| schema-evolution convention | ✅ conventions/schema-evolution.md | ✅ decision tree | n/a | ✅ "Don'ts" section |
| T1.5 wiring (whoknows, find_experts) | ✅ inline code + CLAUDE.md | | ✅ tutorial Step 8 proves it | ✅ CLAUDE.md cluster |

No more documentation debt. Every shipped surface has reference + how-to + tutorial + explanation coverage (or n/a where it doesn't apply).

Test plan

  • All schema-pack unit tests pass (255/255, 562 expects)
  • Typecheck clean
  • End-to-end agent journey verified: fork gbrain-base → use mine → add-type researcher --primitive entity --prefix people/researchers/ --extractable --expert → active → stats
  • MCP smoke verified via 23-case operations-schema-pack.test.ts including batched schema_apply_mutations atomicity + audit client_id capture

Migration safety

  • Pre-v0.40.7 brains: zero breaking changes. The 9 new MCP ops are additive; the existing 16 schema verbs unchanged.
  • Existing OAuth clients with read scope: can call the new read-scope schema ops out of the box.
  • Existing OAuth clients with admin scope: can call schema_apply_mutations and reload_schema_pack immediately.
  • The mutation primitives refuse to touch bundled packs (gbrain-base, gbrain-recommended); run `gbrain schema fork gbrain-base mine` first.

🤖 Generated with Claude Code

garrytan and others added 12 commits May 23, 2026 10:15
…lidation + lint rules + best-effort

Six new primitives that Phase 2's withMutation skeleton (next commit) depends on.
No consumers yet; all callers wire up in Phase 4. Foundations ship first per
codex C1 phase-ordering finding from /plan-eng-review.

1.1 pack-lock.ts (18 cases)
  Atomic acquire via openSync(path, 'wx') = O_CREAT|O_EXCL. Kernel-level
  atomic, NO TOCTOU window. Codex C8 caught that page-lock.ts:79+96 has
  existsSync+writeFileSync (TOCTOU) — we deliberately do NOT copy it.
  Stale detection via TTL (60s default) + kill(pid, 0) liveness probe.
  TTL refresh every 10s while withPackLock(fn) runs so long DB-aware
  lint/stats on big brains don't go stale. --force = "steal stale lock"
  (NOT "skip locking"). Lock path per-pack so two packs never block.
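
The acquire step in 1.1 can be sketched as follows. This is a minimal illustration, not the contents of pack-lock.ts: `tryAcquire`, `release`, and the lock-file payload are hypothetical names chosen here, and stale detection, TTL refresh, and --force are left out. The only thing it relies on is Node's `openSync` with the `'wx'` flag, which fails with EEXIST when the file already exists.

```typescript
import { closeSync, mkdtempSync, openSync, unlinkSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: returns true if this process now holds the lock.
function tryAcquire(lockPath: string): boolean {
  try {
    // 'wx' = O_CREAT|O_EXCL: create-and-fail-if-exists is a single kernel call,
    // so there is no check-then-write window for another process to slip into.
    const fd = openSync(lockPath, "wx");
    writeSync(fd, JSON.stringify({ pid: process.pid, acquiredAt: Date.now() }));
    closeSync(fd);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err; // anything else (permissions, missing directory) is a real error
  }
}

// Hypothetical helper: drop the lock by removing the file.
function release(lockPath: string): void {
  unlinkSync(lockPath);
}

const lockDir = mkdtempSync(join(tmpdir(), "pack-lock-"));
const lockPath = join(lockDir, "mine.lock"); // one lock file per pack

const firstAttempt = tryAcquire(lockPath);  // lock is free
const secondAttempt = tryAcquire(lockPath); // lock is already held
release(lockPath);
const thirdAttempt = tryAcquire(lockPath);  // free again after release
release(lockPath);

console.log({ firstAttempt, secondAttempt, thirdAttempt });
```

An `existsSync` check followed by `writeFileSync` would look similar but leaves a gap between the two calls, which is the TOCTOU pattern the commit avoids.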

1.2 mutate-audit.ts (13 cases)
  ISO-week JSONL at ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl.
  Privacy redacted per D20: type names → sha8, prefixes → first slug
  segment only. Matches candidate-audit.ts privacy posture. Both verbose
  surfaces gate on GBRAIN_SCHEMA_AUDIT_VERBOSE=1 (same env). Logs BOTH
  success AND failure events so Phase 9's schema_pack_writability doctor
  check has signal to read (closes codex C11). summarizeMutations()
  primitive shipped for cross-surface parity between doctor + future
  audit CLI.
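
A rough sketch of the redaction and file naming described in 1.2. The event field names (`type_sha8`, `prefix_root`, `ok`) and the helper names are assumptions made for illustration, as is reading "sha8" as the first 8 hex characters of a SHA-256 digest; the real JSONL shape is not shown in this commit message.

```typescript
import { createHash } from "node:crypto";

// Assumption: "sha8" = first 8 hex characters of a SHA-256 digest.
function sha8(value: string): string {
  return createHash("sha256").update(value).digest("hex").slice(0, 8);
}

// Keep only the first path segment, e.g. "people/researchers/" -> "people".
function firstSlugSegment(prefix: string): string {
  const segments = prefix.split("/").filter((s) => s.length > 0);
  return segments[0] ?? "";
}

// ISO-8601 week label: weeks start on Monday, week 1 contains the first Thursday.
function isoWeekLabel(date: Date): string {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const weekday = d.getUTCDay() || 7;          // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - weekday);  // move to the Thursday of this ISO week
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  const weekText = week < 10 ? `0${week}` : String(week);
  return `${d.getUTCFullYear()}-W${weekText}`;
}

// Hypothetical event shape: no raw type name or full prefix leaves this function.
function redactedEvent(op: string, typeName: string, prefix: string, clientId: string, ok: boolean) {
  return {
    op,
    type_sha8: sha8(typeName),
    prefix_root: firstSlugSegment(prefix),
    actor: `mcp:${clientId.slice(0, 8)}`,
    ok,
  };
}

const mergeDay = new Date(Date.UTC(2026, 4, 23));
const auditFileName = `schema-mutations-${isoWeekLabel(mergeDay)}.jsonl`;
const auditEvent = redactedEvent(
  "add_type", "researcher", "people/researchers/", "remoteAgentClient12345678", true,
);
const auditLine = JSON.stringify(auditEvent);
console.log(auditFileName, auditLine);
```

The week label uses the ISO year of the week's Thursday, so dates in early January can land in the previous year's last week.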

1.3 registry.ts cache invalidation + stat-mtime TTL (10 cases)
  invalidatePackCache(name?) walks the extends-chain reverse-graph
  (every cached entry whose chain contains name is evicted). This is the
  codex C6 fix — pre-v0.40.6, editing a parent pack silently left
  children stale because cache identity was child-bytes-only. New
  per-name CacheEntry tracks the file-stat snapshot of every file in
  the extends chain. tryCachedPack(name) is the TTL-gated fast path:
  inside STAT_TTL_MS (1000ms default, env GBRAIN_PACK_STAT_TTL_MS)
  returns cached without statting. Outside the window: stats every file
  and cascade-invalidates on any mtime change (D11 cross-process
  detection). resolvePack reference-equality preserved on byte-identical
  re-build. ASCII state-machine diagram in file header (D9).
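
The cascade in 1.3 can be illustrated with a small in-memory model. The `CacheEntry` shape below is an assumption (it only stores the extends chain); the stat snapshots and the TTL gate are omitted, so treat this as a sketch of the eviction rule rather than of registry.ts itself.

```typescript
// Hypothetical cache entry: the resolved extends chain, child first.
interface CacheEntry {
  chain: string[];
}

const packCache = new Map<string, CacheEntry>();

// Evict every entry whose chain contains `name`; with no name, evict everything.
function invalidatePackCache(name?: string): string[] {
  const evicted: string[] = [];
  packCache.forEach((entry, key) => {
    if (name === undefined || entry.chain.indexOf(name) !== -1) {
      evicted.push(key);
    }
  });
  for (const key of evicted) packCache.delete(key);
  return evicted.sort();
}

packCache.set("gbrain-base", { chain: ["gbrain-base"] });
packCache.set("mine", { chain: ["mine", "gbrain-base"] }); // mine extends gbrain-base
packCache.set("unrelated", { chain: ["unrelated"] });

// Editing the parent must also evict the child that extends it.
const evictedOnParentEdit = invalidatePackCache("gbrain-base");
const survivors = Array.from(packCache.keys());
console.log(evictedOnParentEdit, survivors);
```

Keying invalidation on chain membership, rather than on the child's own bytes, is what stops a child from serving a stale copy of its parent.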

1.4 best-effort.ts (4 cases)
  loadActivePackBestEffort(ctx) returns ResolvedPack | null. Single
  source of truth for the 4 T1.5 wiring sites (Phase 8). null means
  "EMPTY FILTER" semantics, NOT "fall back to hardcoded defaults" —
  pack-load failure must be loud, per D4. Never throws.
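
A minimal sketch of the empty-filter contract from 1.4. The `ResolvedPack` shape, the `expert` flag, and both function names are simplified stand-ins invented for this example; only the behavior (never throw, return null on failure, treat null as an empty filter) comes from the description above.

```typescript
// Simplified, hypothetical shapes for illustration only.
interface PackType {
  name: string;
  expert?: boolean;
}
interface ResolvedPack {
  types: PackType[];
}

// Never throws: a failed load is logged loudly and reported as null.
function loadBestEffort(load: () => ResolvedPack): ResolvedPack | null {
  try {
    return load();
  } catch (err) {
    console.warn(`[schema-pack] active pack failed to load: ${String(err)}`);
    return null;
  }
}

// null means EMPTY filter, not a fallback to hardcoded defaults.
function expertTypesOf(pack: ResolvedPack | null): string[] {
  if (pack === null) return [];
  return pack.types.filter((t) => t.expert === true).map((t) => t.name);
}

const healthy = loadBestEffort(() => ({
  types: [{ name: "researcher", expert: true }, { name: "meeting" }],
}));
const broken = loadBestEffort(() => {
  throw new Error("pack.yaml is unreadable");
});

const healthyExperts = expertTypesOf(healthy);
const brokenExperts = expertTypesOf(broken);
console.log({ healthyExperts, brokenExperts });
```

Returning an empty list makes a broken pack visible as "no expert results" instead of quietly answering from a default set the user never declared.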

1.5 lint-rules.ts (35 cases)
  11 pure rule functions extracted from CLI handlers per codex C13/D16.
  9 file-plane rules + 2 DB-aware rules (extractable_empty_corpus, mutation_count_anomaly).
  Phase 2 withMutation pre-write gate composes file-plane subset.
  runAllLintRules() returns {ok, errors, warnings} structured report
  ready for CLI + MCP.
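
The "pure rule function" shape from 1.5 might look roughly like this. The two rules shown are hypothetical examples (modeled on the duplicate-name and missing-prefix checks mentioned elsewhere in this PR), and `runRules` is a stand-in for the real runner; only the `{ok, errors, warnings}` report shape is taken from the text.

```typescript
type Severity = "error" | "warning";

interface Finding {
  rule: string;
  severity: Severity;
  message: string;
}
interface PackType {
  name: string;
  prefixes: string[];
}
interface PackDoc {
  types: PackType[];
}

// A rule is a pure function: pack in, findings out, no I/O.
type LintRule = (pack: PackDoc) => Finding[];

// Hypothetical rule: the same type name declared more than once.
const duplicateTypeName: LintRule = (pack) => {
  const seen = new Set<string>();
  const findings: Finding[] = [];
  for (const t of pack.types) {
    if (seen.has(t.name)) {
      findings.push({ rule: "duplicate_type_name", severity: "error", message: `type "${t.name}" declared twice` });
    }
    seen.add(t.name);
  }
  return findings;
};

// Hypothetical rule: a type with no path prefix can never match a page.
const missingPrefix: LintRule = (pack) =>
  pack.types
    .filter((t) => t.prefixes.length === 0)
    .map((t) => ({ rule: "missing_prefix", severity: "warning" as Severity, message: `type "${t.name}" has no prefix` }));

function runRules(pack: PackDoc, rules: LintRule[]) {
  const findings: Finding[] = [];
  for (const rule of rules) findings.push(...rule(pack));
  const errors = findings.filter((f) => f.severity === "error");
  const warnings = findings.filter((f) => f.severity === "warning");
  return { ok: errors.length === 0, errors, warnings };
}

const lintReport = runRules(
  { types: [
    { name: "person", prefixes: ["people/"] },
    { name: "person", prefixes: ["humans/"] },
    { name: "meeting", prefixes: [] },
  ] },
  [duplicateTypeName, missingPrefix],
);
console.log(JSON.stringify(lintReport, null, 2));
```

Because rules take only the pack document, the same list can back a CLI handler, an MCP op, and a pre-write gate without duplication.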

1.6 query-cache-invalidator.ts (4 cases)
  invalidateQueryCache(engine, sourceId?) DELETEs query_cache rows so
  cached search results bound to old page types don't survive a
  schema mutation. Reuses SemanticQueryCache.clear() so we don't
  reinvent the PGLite+Postgres parity. Codex C9 fix.

Tests: 84 new cases across 6 test files. All 153 schema-pack tests green.

Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md
Closes: half of T2-T7 from the plan's Implementation Tasks JSONL.
Successor to: closed PR #1321 (community PR; author garrytan-agents).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Builds on Phase 1 foundations (pack-lock, mutate-audit, lint-rules,
cache invalidation, query-cache-invalidator).

withMutation(packName, opts, mutator, op, ctx): 8-step skeleton wrapping
every primitive. Atomic .tmp+fsync+rename. Per-pack file lock. Pre-write
file-plane lint validation gate. Audit log on success AND failure. Pack
cache + query cache invalidation hooks. ASCII state-machine diagram in
file header per D9.

11 primitives, each ~5-line wrapper around withMutation:
  add_type, remove_type (with codex C14 reference check), update_type
  add_alias, remove_alias, add_prefix, remove_prefix
  add_link_type, remove_link_type (rejects removal while fm_links
  still reference the link type), set_extractable, set_expert_routing

Inline minimal JSON→YAML emitter so mutating a YAML pack stays YAML.
The emitter's array-of-mappings nesting was tricky: the first key sits
inline with the `- ` (e.g. `- name: person`), subsequent keys live at
indent+1, and nested arrays inside the mapping keep their relative
depth (the v0.40.6 emitter bug I fixed pre-commit: trim+prefix lost
internal indent of nested arrays like path_prefixes).

YAML round-trip: emitted YAML reparses cleanly through parseYamlMini.
Comments and formatting NOT preserved (documented in plan; pin pack.json
if you care about layout).

Codex C14 reference check: removeType refuses if any other type's
aliases/enrichable_types/link_types/frontmatter_links references the
target. STILL_REFERENCED error names every reference for cleanup.
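
As an illustration of the C14 check, the sketch below scans the four reference fields named above and reports every hit. It assumes each field is a plain array of type names, which the commit message does not confirm, and it returns a result object rather than throwing so the example stays small.

```typescript
// Assumption: each reference field is a flat list of type names.
interface TypeDef {
  name: string;
  aliases?: string[];
  enrichable_types?: string[];
  link_types?: string[];
  frontmatter_links?: string[];
}

const REFERENCE_FIELDS = ["aliases", "enrichable_types", "link_types", "frontmatter_links"] as const;

// List every "<type>.<field>" on OTHER types that still names the target.
function findReferences(types: TypeDef[], target: string): string[] {
  const references: string[] = [];
  for (const t of types) {
    if (t.name === target) continue;
    for (const field of REFERENCE_FIELDS) {
      const values = t[field] ?? [];
      if (values.indexOf(target) !== -1) references.push(`${t.name}.${field}`);
    }
  }
  return references;
}

type RemoveResult =
  | { ok: true; types: TypeDef[] }
  | { ok: false; code: "STILL_REFERENCED"; references: string[] };

function removeType(types: TypeDef[], target: string): RemoveResult {
  const references = findReferences(types, target);
  if (references.length > 0) return { ok: false, code: "STILL_REFERENCED", references };
  return { ok: true, types: types.filter((t) => t.name !== target) };
}

const declared: TypeDef[] = [
  { name: "person" },
  { name: "company", enrichable_types: ["person"] },
  { name: "meeting", frontmatter_links: ["person"] },
];

const blocked = removeType(declared, "person");
const blockedRefs = blocked.ok ? [] : blocked.references;
const allowed = removeType(declared, "meeting");
const remainingNames = allowed.ok ? allowed.types.map((t) => t.name) : [];
console.log({ blockedRefs, remainingNames });
```

Naming every reference in one response lets the caller clean them all up in a single follow-up batch instead of discovering them one failure at a time.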

Validation gate composes runFilePlaneLintRules from Phase 1.5 — a
mutation that would create a dangling ref or prefix collision fails
BEFORE the .tmp write (the invariant: pack file on disk is NEVER
partial).
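
The ordering described here (validate, then write to .tmp, fsync, rename) can be sketched with Node's fs module. `writePackAtomically` and the toy validator are hypothetical; the real gate composes the file-plane lint rules and runs inside withMutation under the pack lock.

```typescript
import {
  closeSync, existsSync, fsyncSync, mkdtempSync, openSync,
  readFileSync, renameSync, writeFileSync, writeSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// A validator returns a list of problems; an empty list means "safe to write".
type Validator = (contents: string) => string[];

function writePackAtomically(target: string, contents: string, validate: Validator): void {
  // Gate first: a rejected mutation never creates the .tmp file at all.
  const problems = validate(contents);
  if (problems.length > 0) {
    throw new Error(`validation failed: ${problems.join("; ")}`);
  }
  const tmpPath = `${target}.tmp`;
  const fd = openSync(tmpPath, "w");
  try {
    writeSync(fd, contents);
    fsyncSync(fd); // flush to disk before the rename makes the file visible
  } finally {
    closeSync(fd);
  }
  // rename within one filesystem replaces the target in a single step.
  renameSync(tmpPath, target);
}

// Toy validator standing in for the lint gate.
const rejectEmpty: Validator = (c) => (c.trim().length === 0 ? ["pack is empty"] : []);

const packDir = mkdtempSync(join(tmpdir(), "pack-write-"));
const packPath = join(packDir, "pack.yaml");
writeFileSync(packPath, "name: mine\n");

let rejected = false;
try {
  writePackAtomically(packPath, "   ", rejectEmpty);
} catch {
  rejected = true;
}
const afterRejected = readFileSync(packPath, "utf8");
const tmpLeftBehind = existsSync(`${packPath}.tmp`);

writePackAtomically(packPath, "name: mine\ntypes: []\n", rejectEmpty);
const afterAccepted = readFileSync(packPath, "utf8");
console.log({ rejected, tmpLeftBehind, afterAccepted });
```

A crash between the write and the rename leaves a stray .tmp file but never a half-written pack, because readers only ever open the target path.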

Tests: 34 cases pinning every primitive + skeleton invariant. Bundled
guard, codex C14, atomicity (crash-mid-write leaves original untouched,
lock auto-released after mutator throw), YAML round-trip, validation
gate firing on prefix collision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both ship as runStatsCore / runSyncCore pure functions so Phase 4 CLI
handlers (next commit) and Phase 7 MCP ops (later) both compose without
duplicating logic. Codex C13 / D16 prereq for the MCP exposure phase.

stats.ts (17 cases):
  Multi-source aware: sourceIds[] (federated read) OR sourceId (single)
  OR neither (whole-brain aggregate). NULLIF(type, '') normalizes
  empty-string + NULL to one untyped bucket (pages.type is NOT NULL in
  the schema so empty string is the legacy "untyped" representation).
  Soft-delete exclusion. by_type sorted by count desc, ties by name asc.
  Empty-brain coverage:1.0 (vacuous truth, matches getBrainScore).
  Dead-prefix detection: pack-declared prefixes with zero matching
  pages surface as DeadPrefixHint[] (agent's drilldown signal for
  mis-declared paths). Best-effort: pack-load failure leaves
  pack_identity:null + dead_prefixes:[].
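
The aggregation rules for stats.ts can be shown on an in-memory list of page types. This sketch replaces the SQL with plain TypeScript and assumes soft-deleted pages were already filtered out; `computeStats`, `total`, and `untyped` are names invented here, while `by_type` and `coverage` follow the text.

```typescript
interface TypeCount {
  type: string;
  count: number;
}
interface StatsReport {
  total: number;
  untyped: number;
  coverage: number;
  by_type: TypeCount[];
}

// pageTypes holds one entry per live page; "" is the legacy "untyped" value.
function computeStats(pageTypes: string[]): StatsReport {
  const counts = new Map<string, number>();
  let untyped = 0;
  for (const t of pageTypes) {
    if (t === "") {
      untyped += 1;
      continue;
    }
    counts.set(t, (counts.get(t) ?? 0) + 1);
  }
  const by_type: TypeCount[] = [];
  counts.forEach((count, type) => {
    by_type.push({ type, count });
  });
  // Count descending, ties broken by name ascending.
  by_type.sort((a, b) => b.count - a.count || (a.type < b.type ? -1 : a.type > b.type ? 1 : 0));
  const total = pageTypes.length;
  // An empty brain counts as fully covered (vacuous truth).
  const coverage = total === 0 ? 1.0 : (total - untyped) / total;
  return { total, untyped, coverage, by_type };
}

const sampleStats = computeStats(["person", "", "company", "person", "company", "meeting"]);
const emptyStats = computeStats([]);
console.log(sampleStats, emptyStats.coverage);
```

The explicit tie-break keeps the output stable between runs, which matters when agents diff stats before and after a sync.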

sync.ts (13 cases):
  D14 chunked UPDATE: 1000-row batches per prefix. Each batch:
  WITH win AS (SELECT id FROM pages WHERE untyped+prefix LIMIT $batch),
  upd AS (UPDATE ... WHERE id IN win RETURNING 1) SELECT COUNT(*). Loop
  until zero rows. Concurrent writers never block on the row-set for
  more than ~100ms per batch (vs the multi-second monolithic UPDATE
  shape PR #1321 had).
  Codex C5 write-side scoping: sourceId param directly, NOT
  sourceScopeOpts which is read-side and inherits OAuth federation
  reads. Phase 7 MCP op (schema_apply_mutations) enforces at dispatch.
  Dry-run by default: per-prefix probe returns would_apply + 10-slug
  sample (the drilldown signal). Apply path returns total_applied.
  Idempotency contract pinned: second apply finds zero matching rows.
  Soft-delete exclusion on both probe + update. Dead-prefix flag set
  when probe returns count=0. JSON envelope schema_version:1.
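
The batch loop can be simulated without a database. In the sketch below an array of pages stands in for the pages table and `updateOneBatch` stands in for one CTE-based UPDATE; all names are hypothetical. It shows the loop-until-zero shape and why a second apply is a no-op.

```typescript
interface Page {
  slug: string;
  type: string;     // "" means untyped
  deleted: boolean; // soft-delete flag
}

// Stand-in for one batched UPDATE: type at most `batchSize` matching rows.
function updateOneBatch(pages: Page[], prefix: string, newType: string, batchSize: number): number {
  let updated = 0;
  for (const page of pages) {
    if (updated >= batchSize) break;
    const matches = page.type === "" && !page.deleted && page.slug.indexOf(prefix) === 0;
    if (matches) {
      page.type = newType;
      updated += 1;
    }
  }
  return updated;
}

// Loop until a batch touches zero rows.
function backfillType(pages: Page[], prefix: string, newType: string, batchSize = 1000) {
  let totalApplied = 0;
  let batches = 0;
  for (;;) {
    const n = updateOneBatch(pages, prefix, newType, batchSize);
    if (n === 0) break;
    totalApplied += n;
    batches += 1;
  }
  return { total_applied: totalApplied, batches };
}

const pages: Page[] = [];
for (let i = 0; i < 2500; i += 1) {
  pages.push({ slug: `meetings/m-${i}`, type: "", deleted: false });
}
pages.push({ slug: "meetings/archived", type: "", deleted: true });
pages.push({ slug: "people/alice-example", type: "", deleted: false });

const firstRun = backfillType(pages, "meetings/", "meeting");
const secondRun = backfillType(pages, "meetings/", "meeting"); // idempotent: nothing left
console.log({ firstRun, secondRun });
```

Each batch only selects rows that are still untyped, so the work set shrinks on its own and no offset bookkeeping is needed between batches.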

Tests use canonical PGLite block per CLAUDE.md test-isolation rules.
seedPage helper auto-seeds sources(id) row before FK insert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thin handlers wrapping Phase 2's mutation primitives + Phase 3's
stats/sync cores. CLI is the human surface; Phase 7 wires the same
cores into MCP for agent use.

New verbs:
  Authoring:
    add-type <name> --primitive P --prefix dir/ [--extractable]
                    [--expert] [--alias A]* [--pack <name>]
    remove-type <name>             [--pack <name>]
    update-type <name>             [--extractable BOOL] [--expert BOOL]
                                   [--primitive P] [--pack <name>]
    add-alias <type> <alias>       [--pack <name>]
    remove-alias <type> <alias>    [--pack <name>]
    add-prefix <type> <prefix>     [--pack <name>]
    remove-prefix <type> <prefix>  [--pack <name>]
    add-link-type <name> [--inverse V] [--page-type T] [--target-type T]
                                   [--pack <name>]
    remove-link-type <name>        [--pack <name>]
    set-extractable <type> BOOL    [--pack <name>]
    set-expert-routing <type> BOOL [--pack <name>]
  Activation:
    reload [--pack <name>]         Flush in-process cache; --pack scopes
  Discovery + repair:
    stats [--source <id>]          Per-type counts + coverage + dead prefixes
    sync [--apply] [--source <id>] Backfill page.type (chunked UPDATE)

cli.ts: schema added to CLI_ONLY_SELF_HELP so `gbrain schema --help`
routes to printHelp() instead of the generic one-line stub.

withConnectedEngine defensive fix retained from PR #1321:
EngineConfig is built once and passed to BOTH createEngine and
engine.connect, to future-proof against engine implementations that
read the URL at connect time.

End-to-end agent journey verified:
  fork gbrain-base mine → use mine →
  add-type researcher --primitive entity --prefix people/researchers/
    --extractable --expert →
  active (shows 23 page types) →
  stats (shows 100% coverage on empty brain, vacuous truth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is the marquee commit: Wintermute and any other remote OAuth agent
can now author + introspect schema packs over normal HTTPS MCP. Phases
5+6 collapse into Phase 7 because the new MCP ops compose Phase 1.5's
lint rules and Phase 2/3's mutation/stats/sync cores directly — no
extra extraction needed (D6 from /plan-eng-review).

Phase 5: schema lint CLI wired to runAllLintRules from Phase 1.5
  Replaces the prior 2-rule check (duplicate names + missing prefix)
  with the full 11-rule suite. New --with-db flag opts into the 2
  DB-aware rules (extractable_empty_corpus, mutation_count_anomaly).
  JSON envelope shape stable. Exit code 1 on any error.

Phase 7: 9 new MCP operations
  Read-scope (NOT localOnly — read scope is safe to expose remote):
    get_active_schema_pack — identity packet (pack name, sha8, counts).
    list_schema_packs       — bundled + installed names.
    schema_stats            — composes runStatsCore from Phase 3.
    schema_lint             — composes runAllLintRules; --with-db is
                              CLI-only (DB-aware rules need engine).
    schema_graph            — JSON {nodes, edges} from link_types
                              inference + frontmatter_links.
    schema_explain_type     — settings for one declared type.
    schema_review_orphans   — untyped pages drilldown.
  Admin-scope (NOT localOnly per D2 — Wintermute reaches via OAuth):
    schema_apply_mutations  — BATCHED per D10. Single MCP tool taking
                              a mutations[] array; composes all 11
                              mutate primitives. Atomic batch_id; outer
                              withPackLock wraps the whole batch so no
                              other writer can slip in mid-iteration.
                              Partial-results returned on mid-batch
                              failure for forensic agent debugging.
                              Audit log records actor=mcp:<clientId8>
                              (D20 privacy-redacted shape).
    reload_schema_pack      — flush in-process cache + extends-chain
                              cascade (codex C6 fix from Phase 1.3).

withConnectedEngine defensive fix applied to schema.ts:withConnectedEngine
  (PR #1321 closed) — EngineConfig built once and passed to BOTH
  createEngine AND engine.connect for defense in depth.

Test seams:
  - operationsByName lookup pinned for every new op.
  - All 9 ops have scope + localOnly declarations pinned to lock in
    the trust posture.
  - Batched mutation atomicity tested: partial-failure returns
    {error: mutation_failed, partial_results: [...]} with one batch_id
    across all results.
  - Audit log actor=mcp:<clientId.slice(0,8)> capture verified
    end-to-end (audit JSONL read back after the op handler runs).
  - Empty mutations[] rejected with invalid_request.
  - Unknown op surfaced via SchemaPackMutationError INVALID_RESULT.
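
The batch behavior these seams pin down can be approximated as below. This is a simplified model: the handler map, the result fields, and the `ok` discriminant are invented for the example, the outer withPackLock is omitted, and an unknown op is treated as an ordinary failed mutation rather than surfacing the specific error code mentioned above.

```typescript
interface Mutation {
  op: string;
  name: string;
}
interface MutationResult {
  batch_id: string;
  op: string;
  name: string;
  ok: boolean;
}
type BatchOutcome =
  | { ok: true; batch_id: string; results: MutationResult[] }
  | { ok: false; error: "invalid_request" | "mutation_failed"; partial_results: MutationResult[] };

// A handler applies one mutation and throws on failure.
type Handler = (name: string) => void;

function applyMutations(
  mutations: Mutation[],
  handlers: Record<string, Handler | undefined>,
  batchId: string,
): BatchOutcome {
  if (mutations.length === 0) {
    return { ok: false, error: "invalid_request", partial_results: [] };
  }
  const results: MutationResult[] = [];
  for (const m of mutations) {
    try {
      const handler = handlers[m.op];
      if (handler === undefined) throw new Error(`unknown op: ${m.op}`);
      handler(m.name);
      results.push({ batch_id: batchId, op: m.op, name: m.name, ok: true });
    } catch {
      // Stop at the first failure and hand back everything attempted so far.
      results.push({ batch_id: batchId, op: m.op, name: m.name, ok: false });
      return { ok: false, error: "mutation_failed", partial_results: results };
    }
  }
  return { ok: true, batch_id: batchId, results };
}

const applied: string[] = [];
const handlers: Record<string, Handler | undefined> = {
  add_type: (name) => { applied.push(name); },
  remove_type: () => { throw new Error("STILL_REFERENCED"); },
};

const failedBatch = applyMutations(
  [{ op: "add_type", name: "researcher" }, { op: "remove_type", name: "person" }, { op: "add_type", name: "lab" }],
  handlers,
  "batch-001",
);
const emptyBatch = applyMutations([], handlers, "batch-002");
const failedPartials = failedBatch.ok ? [] : failedBatch.partial_results;
const emptyError = emptyBatch.ok ? "" : emptyBatch.error;
console.log(JSON.stringify(failedBatch), emptyError);
```

Returning the partial results with a shared batch_id lets an agent see exactly which step failed and which earlier steps already took effect.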

Coverage: 23 new cases for the 9 ops (operations-schema-pack.test.ts).
All 255 schema-pack-related tests green.

Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md
Successor to: closed PR #1321.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final wave commit. Brings the cathedral from "shipped but undiscoverable"
to "shipped + agents find it + agents use it."

Phase 8 (partial T1.5 wiring — agent-facing surfaces):
  - whoknows CLI (src/commands/whoknows.ts:340) consults the active pack
    via loadActivePackBestEffort + expertTypesFromPack. Pack-load failure
    → EMPTY filter (NOT hardcoded ['person', 'company'] defaults) per
    D4. A researcher type declared --expert in a custom pack now
    surfaces in `gbrain whoknows "ML"` results. Pre-v0.40.6 it silently
    never matched.
  - find_experts MCP op (src/core/operations.ts:2820) same wiring so
    OAuth clients (Wintermute etc.) inherit pack-aware expert routing
    over HTTP MCP, not just CLI.
  - facts/eligibility.ts and enrichment-service.ts union widening
    deferred to v0.40.7+ (filed in TODOS.md as 2 follow-up entries);
    their blast radius is larger than this wave's context budget allowed.

Phase 10 (skill + RESOLVER + Convention — the discoverability layer):
  - skills/schema-author/SKILL.md — agent dispatcher for "evolve the
    schema pack." 36 trigger phrases route here. Explicit Non-goals
    section names brain-taxonomist (filing one page) and eiirp
    (schema-check during iteration) so agents pick the right surface.
    7-phase workflow: brain → assess → propose → apply → sync → verify
    → commit. Lists every gbrain schema CLI verb + every MCP op the
    skill uses. brain_first: exempt frontmatter (this skill IS the
    brain-first path for schema authoring).
  - skills/conventions/schema-evolution.md — decision tree for "when to
    add a type vs alias vs prefix." <20 pages → don't pack-codify;
    20-100 → alias or narrow prefix; 100+ → first-class type. Don'ts
    section + "when to remove a type" + "when to commit the pack" all
    answered from one place.
  - skills/RESOLVER.md entry with full functional-area dispatcher line
    (compressed routing pattern per v0.32.3 dispatcher convention).
  - schema-evolution.md added to the cross-cutting Conventions list.
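
The page-count thresholds in the schema-evolution decision tree reduce to a small function. The function name and return labels are invented for this sketch, and because the ranges "20-100" and "100+" overlap at exactly 100, the code assumes 100 counts as first-class; the convention doc may draw that line differently.

```typescript
type Recommendation = "do-not-codify" | "alias-or-narrow-prefix" | "first-class-type";

// Assumption: exactly 100 pages is treated as "100+".
function recommendForPageCount(pageCount: number): Recommendation {
  if (pageCount < 20) return "do-not-codify";
  if (pageCount < 100) return "alias-or-narrow-prefix";
  return "first-class-type";
}

const recommendations = [5, 20, 99, 100, 4000].map((n) => `${n}: ${recommendForPageCount(n)}`);
console.log(recommendations.join("\n"));
```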

Phase 12 (ship bookkeeping):
  - VERSION → 0.40.6.0
  - package.json → 0.40.6.0
  - CHANGELOG.md entry with ELI10 lead per CLAUDE.md voice rules
    (250+ words explaining the wave in plain English before any
    file/function name appears), full "To take advantage of v0.40.6.0"
    paste-ready commands block, itemized changes by category, credit
    to @garrytan-agents (PR #1321 author).
  - TODOS.md gains 10 new follow-up entries grouped under
    "v0.40.6.0 Schema Cathedral v3 follow-ups (v0.40.7+)" covering:
    enrichment-service union widening, facts/eligibility wiring, 3
    doctor checks, T16 + T16.1 evals, T19 federated closure, T20
    extends merging, T21 YAML comments, T22 admin SPA, T23
    schema:write scope, T24 multi-tenant federation.
  - llms-full.txt regenerated via bun run build:llms (CLAUDE.md
    edits trigger the test/build-llms.test.ts gate — required per
    repo discipline).

Verification:
  - bun run typecheck clean.
  - Full agent journey smoke-tested end-to-end in Phase 4 commit
    (fork → use → add-type → active → stats — all green).
  - All 255+ schema-pack tests green from Phases 1-7.

Total wave: 6 commits, ~5000 net LOC, 84 new tests, 21 design
decisions captured. PR #1321 closed with successor pointer comment.

Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md

Co-Authored-By: garrytan-agents <garrytan-agents@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…1324)

Master shipped v0.40.6.0 as "feat(sync): parallel sync --all" (PR #1324)
while this branch was building Schema Cathedral v3 against the same slot.
Standard queue-aware re-bump: my entries move up one slot to v0.40.7.0;
master's v0.40.6.0 stays as-is. Both CHANGELOG entries kept, mine on top.

Resolved files:
  - VERSION:      0.40.6.0 → 0.40.7.0
  - package.json: 0.40.6.0 → 0.40.7.0
  - CHANGELOG.md: two entries kept (v0.40.7.0 + v0.40.6.0)
  - TODOS.md:     two new sections kept (mine + master's sync follow-ups)
  - llms-full.txt: regenerated via bun run build:llms
  - llms.txt:     regenerated

Conflict trio audit: VERSION + package.json + CHANGELOG header all 0.40.7.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…l conformance

CI failures from PR #1327 first run:

1. check:privacy script flagged 4 'Wintermute' name leaks (CLAUDE.md:550 rule —
   never use the private OpenClaw fork name in public artifacts):
   - src/core/operations.ts:3816 → 'your OpenClaw and similar remote agents'
   - src/core/operations.ts:4015 → 'your OpenClaw, etc.' (in description)
   - src/core/operations.ts:4225 → 'your OpenClaw, etc.' (in comment)
   - test/operations-schema-pack.test.ts:325 → clientId 'remoteAgentClient12345678'
     (matching audit-actor regex updated: 'mcp:remoteAg' instead of 'mcp:wintermu')

2. skills/manifest.json missing schema-author entry. Added between
   brain-taxonomist and skillify per alphabetical-ish grouping.

3. skills/schema-author/SKILL.md missing 3 conformance sections per
   test/skills-conformance.test.ts:
   - ## Contract (inputs/outputs/side effects/idempotency/trust/atomicity)
   - ## Anti-Patterns (don't mutate bundled packs, don't add types for one-off
     directories, don't conflate filing vs. schema authoring, etc.)
   - ## Output Format (per-mutation JSON, per-batch JSON, stats JSON, sync
     dry-run JSON, human format, error envelope codes)

   The 3 sections were inserted ABOVE the existing 'Failure modes' section so
   the existing failure-mode bullets are still adjacent to the new error
   envelope codes in Output Format.

Verified locally:
- bun run check:privacy → clean
- bun test test/skills-conformance.test.ts test/check-resolvable.test.ts test/check-resolvable-cli.test.ts test/regression-v0_22_4.test.ts → 286/286 pass
- bun test test/operations-schema-pack.test.ts → 23/23 pass
- bun run verify → clean (privacy + skill_brain_first + fuzz-purity + typecheck)

llms.txt + llms-full.txt regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Doc-debt cleanup from the v0.40.7.0 ship (Phase 12 had deferred these to
fit context budget; /document-release surfaced the gap):

- README.md: new "What's new in v0.40.7.0" lead paragraph above the
  v0.36.4.0 entry. ELI10 lead: "Your agents can now author your brain's
  schema pack themselves" + the agent journey + 14 CLI verbs + 9 MCP
  ops + schema-author skill boundary callouts.

- CLAUDE.md: new "Schema Cathedral v3 (v0.40.7.0)" section between the
  thin-client routing cluster and the Commands section. 14-bullet
  Key Files cluster covering pack-lock / mutate-audit / registry /
  best-effort / lint-rules / query-cache-invalidator / mutate / stats /
  sync / schema.ts CLI / operations.ts MCP / whoknows T1.5 wiring /
  schema-author skill / schema-evolution convention. Each bullet
  references the design decisions (D2/D4/D6/D8/D9/D10/D11/D13/D14/D20)
  and codex findings (C5/C6/C8/C9/C13/C14) captured during /plan-eng-review.
  Closes the "CLAUDE.md has zero v0.40.7.0 mentions" doc debt.

- llms-full.txt + llms.txt regenerated.

Privacy check clean (no Wintermute leaks in the new prose — used "your
OpenClaw" per CLAUDE.md:550 rule). test/build-llms.test.ts 7/7 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the tutorial gap surfaced by /document-release's Diataxis coverage
map. The schema-pack cathedral shipped with reference (CLAUDE.md cluster),
how-to (SKILL.md 7-phase workflow), and explanation (conventions/
schema-evolution.md decision tree), but no tutorial — no concrete
"your first schema mutation" walkthrough.

docs/schema-author-tutorial.md ships exactly that:
- 8 numbered steps, time-to-first-result < 3 (active pack visible by step 2)
- Walks from `gbrain schema fork gbrain-base mine` through `add-type
  researcher` + `sync --apply` + proving the T1.5 wiring via `gbrain
  whoknows` surfacing the new type
- Every step shows the exact command and expected output
- Placeholder pages (alice-example, bob-example, charlie-example) so any
  brain can run the tutorial without affecting real content
- "What you built" section recaps state on disk + active wiring
- "Next steps" cover add-link-type, add-alias, lint --with-db, commit to
  source control, MCP path for agents
- "Related docs" cross-links to reference (CLAUDE.md cluster) + how-to
  (SKILL.md workflow) + explanation (schema-evolution.md)

Cross-linked:
- README.md "What's new in v0.40.7.0" paragraph gets a "Walkthrough:"
  pointer at the end
- skills/schema-author/SKILL.md gets a "## Tutorial" callout just above
  the workflow phases — agents that hit the skill via RESOLVER routing
  see the tutorial pointer first

Closes the Diataxis quadrant matrix to full coverage:
- Tutorial:     ✅ docs/schema-author-tutorial.md (NEW)
- How-to:       ✅ skills/schema-author/SKILL.md workflow
- Reference:    ✅ CLAUDE.md cluster + gbrain schema --help
- Explanation:  ✅ skills/conventions/schema-evolution.md

Privacy check clean. Typecheck clean. llms-full.txt regenerated (545KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ument)

The schema-author tutorial walks through HOW to mutate a pack. This new
doc explains WHY agents and users should care, with concrete killer use
cases on real corpus shapes:

1. The 4000 invisible meetings — untyped pages skip every structural
   surface (whoknows, find_experts, recall, think). Adding a `meeting`
   type + sync flips them from invisible to queryable. Same content,
   completely different agent experience.

2. The founder ops brain — 4 type-adds + 4 link-types build a
   CRM-shaped query surface. `gbrain whoknows "Series A SaaS"` routes
   through investor + portco specifically; `graph-query` walks intro
   chains. Downstream of notes, not parallel to them.

3. The research brain — researcher / paper / lab / grant / dataset
   types + cites / authored / uses link verbs turn a reading-list-as-
   markdown into a queryable research graph.

4. The legal brain (or anything where claims have numbers) — typed
   `damages=5000000`, `filed_date=...` become comparable across pages
   of the same type. Generic note systems can't do this because they
   don't know which numbers belong to which type.

5. The team brain — each mounted brain has its own schema pack. Two
   engineers searching the same brain get DIFFERENT routing because
   their personal packs declare different expert types.

6. The agent-co-curates pattern — the NEW thing in v0.40.7.0. Agent
   watches your ingestion stream, runs `gbrain schema detect`
   periodically, proposes a new type when a pattern accumulates, applies
   it via batched MCP `schema_apply_mutations` after one approval.
   Brain learns. Audit log captures the agent's client_id as
   `actor: mcp:<clientId8>`.

7. Before-vs-after on real content — pick a corpus, note top-3
   whoknows results, add the type via sync, re-run. The numerical
   delta IS the win.

Then the structural argument: types matter at query time. Untyped
content is invisible content. The schema is queryable AND mutable AND
auditable — that's the production-system difference from "vibes-based
knowledge management."

Closes with the v0.40.7.0-specific list of what changed (withMutation
skeleton, O_CREAT|O_EXCL atomic lock vs page-lock.ts TOCTOU pattern,
privacy-redacted audit log, 9 MCP ops, T1.5 wiring, cross-process
invalidation via stat-mtime TTL gate).
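
To make the lock distinction concrete, here is a hedged sketch of the create-exclusive technique using Node's `fs` module. The function names and lock-file layout are assumptions, not the contents of pack-lock.ts. A TOCTOU-prone version would check for the file first and create it second, leaving a window between the two calls; opening with the `"wx"` flag (which includes O_CREAT | O_EXCL) makes check-and-create a single operation.

```typescript
import * as fs from "node:fs";

// Returns a file descriptor when the lock is acquired, or null when another
// holder already has it. Hypothetical helper, for illustration only.
function tryAcquireLock(lockPath: string): number | null {
  try {
    // "wx" fails with EEXIST if the path exists, so there is no gap between
    // "does the lock exist?" and "create the lock".
    const fd = fs.openSync(lockPath, "wx");
    fs.writeSync(fd, String(process.pid)); // record the holder for debugging
    return fd;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return null;
    throw err; // permissions, missing directory, etc. are real errors
  }
}

// Releases the lock by closing the descriptor and removing the file.
function releaseLock(lockPath: string, fd: number): void {
  fs.closeSync(fd);
  fs.unlinkSync(lockPath);
}
```

A caller would typically wrap the mutation in `try { ... } finally { releaseLock(...) }`. Stale-lock recovery after a crash is out of scope for this sketch.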

Cross-linked:
- README.md "What's new in v0.40.7.0" paragraph now has both the
  "Why it matters:" pointer (this doc) AND the "Walkthrough:"
  pointer (tutorial).
- docs/schema-author-tutorial.md opens with "Want the WHY before the
  HOW?" link to this doc.
- skills/schema-author/SKILL.md now has a "Tutorial + vision" section
  that points at both, with explicit guidance that agents should read
  the WHY doc before pitching schema authoring to a user.

177 lines. Privacy check clean. Typecheck clean. llms-full.txt
regenerated (545KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s.txt

The two new schema docs were ONLY linked from the v0.40.7.0 "What's new"
paragraph in README. Every future release pushes that paragraph further
down, so it becomes a steadily worse entry point.

Real discovery paths added:

1. README.md `## Capabilities` section — new "Agent-authored schema
   (v0.40.7.0)" bullet between "Brain consistency" and "## Integrations".
   Permanent home alongside Hybrid search, Self-wiring graph, Minions,
   43 skills, Eval framework. Includes the one-paragraph pitch + 3
   pointer links (vision / tutorial / agent skill).

2. README.md `## Docs` index — two new lines added at the top of the
   list (right after docs/INSTALL.md, before docs/architecture/):
   - docs/what-schemas-unlock.md with one-line description
   - docs/schema-author-tutorial.md with one-line description

3. scripts/llms-config.ts `Configuration` section — both docs added to
   the curated llms.txt entry list so the LLM-readable map points at
   them. Sits right after docs/GBRAIN_RECOMMENDED_SCHEMA.md (topical
   grouping). includeInFull defaults to true so they ride in the
   single-fetch llms-full.txt bundle.
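
A sketch of how the `includeInFull` default described in item 3 could behave. The `DocEntry` shape, the `pathsForFullBundle` helper, and the placeholder descriptions are assumptions for illustration; the actual entry format in scripts/llms-config.ts may differ.

```typescript
// Hypothetical entry shape for the curated llms.txt list.
interface DocEntry {
  path: string;
  description: string;
  includeInFull?: boolean; // omitted is treated as true
}

const configurationEntries: DocEntry[] = [
  { path: "docs/GBRAIN_RECOMMENDED_SCHEMA.md", description: "(placeholder description)" },
  // The two new docs sit right after the recommended-schema entry.
  { path: "docs/what-schemas-unlock.md", description: "(placeholder description)" },
  { path: "docs/schema-author-tutorial.md", description: "(placeholder description)" },
];

// Entries ride in the single-fetch bundle unless they opt out explicitly.
function pathsForFullBundle(entries: DocEntry[]): string[] {
  return entries.filter((e) => e.includeInFull ?? true).map((e) => e.path);
}

console.log(pathsForFullBundle(configurationEntries));
```

Defaulting to inclusion means a newly listed doc reaches llms-full.txt without anyone remembering to set a flag.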

Result: schema docs are now reachable from 5 entry points instead of 1:
  - README "What's new" paragraph (release-pinned, will age out)
  - README Capabilities bullet (permanent, top-of-funnel)
  - README Docs index (permanent, end-of-page reference)
  - llms.txt (LLM-readable curated map)
  - llms-full.txt (single-fetch bundle for agents)

Also caught 3 leftover Wintermute leaks in docs/what-schemas-unlock.md
that the privacy check flagged: agent-co-curates pattern now uses "your
OpenClaw"; `register-client wintermute` example renamed to
`register-client my-agent` per CLAUDE.md:550 privacy rule. Privacy
check clean. test/build-llms.test.ts 7/7 green. llms.txt 4314 → 5000
bytes, llms-full.txt 545KB → 572KB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan garrytan merged commit 3c1cc8a into master May 23, 2026
8 checks passed
mgunnin added a commit to mgunnin/gbrain that referenced this pull request May 28, 2026
* upstream/master: (22 commits)
  v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377)
  v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403)
  v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364)
  v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352)
  v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367)
  v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351)
  v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350)
  v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345)
  v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313)
  v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333)
  v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327)
  v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324)
  v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322)
  v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300)
  v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323)
  v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296)
  v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298)
  v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm (garrytan#1128)
  v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308)
  v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295)
  ...