feat(schema): cathedral v2 — add-type, remove-type, stats, sync + withConnectedEngine fix #1321
Critical bug fix:
withConnectedEngine passed {} to engine.connect() instead of the
resolved config. Every DB-backed verb (detect, suggest, review-candidates,
review-orphans) silently failed with 'No database URL'. Now passes the
resolved EngineConfig to both createEngine and engine.connect.
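The fix described above can be sketched as follows. This is a minimal illustration, not the project's code: the `EngineConfig` and `Engine` shapes, the `FakeEngine` stand-in, and the helper's parameter list are all assumptions, and everything is synchronous for brevity where a real engine would be async.

```typescript
// Hypothetical shapes: the real EngineConfig and engine interface are not shown in this PR text.
interface EngineConfig {
  database_url?: string;
}

interface Engine {
  connect(config: EngineConfig): void;
  disconnect(): void;
}

// Stand-in engine that, like the Postgres engine described above, needs the URL at connect time.
class FakeEngine implements Engine {
  connectedUrl: string | null = null;

  connect(config: EngineConfig): void {
    if (!config.database_url) {
      throw new Error("No database URL: database_url is missing from config.");
    }
    this.connectedUrl = config.database_url;
  }

  disconnect(): void {
    this.connectedUrl = null;
  }
}

// Resolve the config once and hand the same object to both the factory and connect().
function withConnectedEngine<T>(
  loadConfig: () => EngineConfig,
  createEngine: (config: EngineConfig) => Engine,
  fn: (engine: Engine) => T,
): T {
  const config = loadConfig();
  const engine = createEngine(config);
  engine.connect(config); // the buggy version called engine.connect({}) here
  try {
    return fn(engine);
  } finally {
    engine.disconnect(); // always release the connection, even if fn throws
  }
}
```

Passing the same object to both calls means an engine that reads the URL in its constructor and one that reads it in `connect()` behave identically.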
New commands:
add-type — append a page_type to the active pack (validates before write)
remove-type — remove a page_type from the active pack
stats — per-type page counts + typed-coverage from the DB
sync — backfill page.type for rows matching pack prefixes (dry-run default)
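To make the dry-run default of `sync` concrete, here is a small in-memory sketch. The `Page`, `PackType`, and `SyncReport` shapes and the `syncTypes` name are hypothetical, the first-matching-prefix rule is an assumption, and the real command works against database rows rather than an array.

```typescript
// Hypothetical in-memory shapes; the real command runs against the database.
interface Page {
  slug: string;
  type: string | null;
}

interface PackType {
  name: string;
  prefixes: string[];
}

interface SyncReport {
  dryRun: boolean;
  changes: { slug: string; type: string }[];
}

// Backfill `type` on untyped pages whose slug starts with a pack-declared prefix.
// Nothing is written unless `apply` is true (dry-run by default).
function syncTypes(pages: Page[], packTypes: PackType[], apply = false): SyncReport {
  const changes: { slug: string; type: string }[] = [];
  for (const page of pages) {
    if (page.type) continue; // already typed; null and "" both count as untyped
    // Assumption: the first pack type with a matching prefix wins.
    const match = packTypes.find((t) => t.prefixes.some((p) => page.slug.startsWith(p)));
    if (!match) continue;
    changes.push({ slug: page.slug, type: match.name });
    if (apply) page.type = match.name;
  }
  return { dryRun: !apply, changes };
}
```

Because only untyped pages are considered, running the apply path a second time reports no changes.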
Improvements:
lint — catches alias collisions, dangling enrichable refs,
link-verb inference refs to unknown types, frontmatter_links
pointing at undeclared types/verbs
graph — now shows link-verb edges between types, not just a flat list
help text — updated from stale v0.38 deferred list to full command reference
cli.ts — schema routes to its own rich printHelp() instead of the
generic self-help short-circuit
Tests: 10 new mutate tests (JSON + YAML round-trip, rejection guards).
All 7 existing CLI contract tests pass unchanged.
Hi @garrytan-agents — thanks for this work. Closing in favor of a successor wave on branch Why the rebuild instead of merging. Three production gaps emerged during eng review (
What this PR gets credit for (named in the v0.40.6.0 CHANGELOG and commit messages):
All of these survive into v0.40.6.0: hardened, tested, MCP-exposed, and skill-discoverable. Plan file: Schema Cathedral v3 plan (review-cleared, 13 phases, 20 implementation tasks, 5 cross-model tensions resolved). Thank you for the PR; the design conversation was load-bearing for the production rebuild.
… PR #1321 (#1327) * v0.40.6.0 Phase 1 foundations — pack-lock + mutate-audit + cache invalidation + lint rules + best-effort Six new primitives that Phase 2's withMutation skeleton (next commit) depends on. No consumers yet; all callers wire up in Phase 4. Foundations ship first per codex C1 phase-ordering finding from /plan-eng-review. 1.1 pack-lock.ts (18 cases) Atomic acquire via openSync(path, 'wx') = O_CREAT|O_EXCL. Kernel-level atomic, NO TOCTOU window. Codex C8 caught that page-lock.ts:79+96 has existsSync+writeFileSync (TOCTOU) — we deliberately do NOT copy it. Stale detection via TTL (60s default) + kill(pid, 0) liveness probe. TTL refresh every 10s while withPackLock(fn) runs so long DB-aware lint/stats on big brains don't go stale. --force = "steal stale lock" (NOT "skip locking"). Lock path per-pack so two packs never block. 1.2 mutate-audit.ts (13 cases) ISO-week JSONL at ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl. Privacy redacted per D20: type names → sha8, prefixes → first slug segment only. Matches candidate-audit.ts privacy posture. Both verbose surfaces gate on GBRAIN_SCHEMA_AUDIT_VERBOSE=1 (same env). Logs BOTH success AND failure events so Phase 9's schema_pack_writability doctor check has signal to read (closes codex C11). summarizeMutations() primitive shipped for cross-surface parity between doctor + future audit CLI. 1.3 registry.ts cache invalidation + stat-mtime TTL (10 cases) invalidatePackCache(name?) walks the extends-chain reverse-graph (every cached entry whose chain contains name is evicted). This is the codex C6 fix — pre-v0.40.6, editing a parent pack silently left children stale because cache identity was child-bytes-only. New per-name CacheEntry tracks the file-stat snapshot of every file in the extends chain. tryCachedPack(name) is the TTL-gated fast path: inside STAT_TTL_MS (1000ms default, env GBRAIN_PACK_STAT_TTL_MS) returns cached without statting. 
Outside the window: stats every file and cascade-invalidates on any mtime change (D11 cross-process detection). resolvePack reference-equality preserved on byte-identical re-build. ASCII state-machine diagram in file header (D9). 1.4 best-effort.ts (4 cases) loadActivePackBestEffort(ctx) returns ResolvedPack | null. Single source of truth for the 4 T1.5 wiring sites (Phase 8). null means "EMPTY FILTER" semantics, NOT "fall back to hardcoded defaults" — pack-load failure must be loud, per D4. Never throws. 1.5 lint-rules.ts (35 cases) 11 pure rule functions extracted from CLI handlers per codex C13/D16. 9 file-plane rules + 2 DB-aware rules (extractable_empty_corpus, mutation_count_anomaly). Phase 2 withMutation pre-write gate composes file-plane subset. runAllLintRules() returns {ok, errors, warnings} structured report ready for CLI + MCP. 1.6 query-cache-invalidator.ts (4 cases) invalidateQueryCache(engine, sourceId?) DELETEs query_cache rows so cached search results bound to old page types don't survive a schema mutation. Reuses SemanticQueryCache.clear() so we don't reinvent the PGLite+Postgres parity. Codex C9 fix. Tests: 84 new cases across 6 test files. All 153 schema-pack tests green. Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md Closes: half of T2-T7 from the plan's Implementation Tasks JSONL. Successor to: closed PR #1321 (community PR; author garrytan-agents). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.40.6.0 Phase 2 — mutate.ts withMutation skeleton + 11 primitives Builds on Phase 1 foundations (pack-lock, mutate-audit, lint-rules, cache invalidation, query-cache-invalidator). withMutation(packName, opts, mutator, op, ctx): 8-step skeleton wrapping every primitive. Atomic .tmp+fsync+rename. Per-pack file lock. Pre-write file-plane lint validation gate. Audit log on success AND failure. Pack cache + query cache invalidation hooks. ASCII state-machine diagram in file header per D9. 
11 primitives, each ~5-line wrapper around withMutation: add_type, remove_type (with codex C14 reference check), update_type add_alias, remove_alias, add_prefix, remove_prefix add_link_type (rejects fm_links refs on remove) remove_link_type, set_extractable, set_expert_routing Inline minimal JSON→YAML emitter so mutating a YAML pack stays YAML. The emitter's array-of-mappings nesting was tricky: the first key sits inline with the `- ` (e.g. `- name: person`), subsequent keys live at indent+1, and nested arrays inside the mapping keep their relative depth (the v0.40.6 emitter bug I fixed pre-commit: trim+prefix lost internal indent of nested arrays like path_prefixes). YAML round-trip: emitted YAML reparses cleanly through parseYamlMini. Comments and formatting NOT preserved (documented in plan; pin pack.json if you care about layout). Codex C14 reference check: removeType refuses if any other type's aliases/enrichable_types/link_types/frontmatter_links references the target. STILL_REFERENCED error names every reference for cleanup. Validation gate composes runFilePlaneLintRules from Phase 1.5 — a mutation that would create a dangling ref or prefix collision fails BEFORE the .tmp write (the invariant: pack file on disk is NEVER partial). Tests: 34 cases pinning every primitive + skeleton invariant. Bundled guard, codex C14, atomicity (crash-mid-write leaves original untouched, lock auto-released after mutator throw), YAML round-trip, validation gate firing on prefix collision. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.40.6.0 Phase 3 — stats + sync pure core functions Both ship as runStatsCore / runSyncCore pure functions so Phase 4 CLI handlers (next commit) and Phase 7 MCP ops (later) both compose without duplicating logic. Codex C13 / D16 prereq for the MCP exposure phase. stats.ts (17 cases): Multi-source aware: sourceIds[] (federated read) OR sourceId (single) OR neither (whole-brain aggregate). 
NULLIF(type, '') normalizes empty-string + NULL to one untyped bucket (pages.type is NOT NULL in the schema so empty string is the legacy "untyped" representation). Soft-delete exclusion. by_type sorted by count desc, ties by name asc. Empty-brain coverage:1.0 (vacuous truth, matches getBrainScore). Dead-prefix detection: pack-declared prefixes with zero matching pages surface as DeadPrefixHint[] (agent's drilldown signal for mis-declared paths). Best-effort: pack-load failure leaves pack_identity:null + dead_prefixes:[]. sync.ts (13 cases): D14 chunked UPDATE: 1000-row batches per prefix. Each batch: WITH win AS (SELECT id FROM pages WHERE untyped+prefix LIMIT $batch), upd AS (UPDATE ... WHERE id IN win RETURNING 1) SELECT COUNT(*). Loop until zero rows. Concurrent writers never block on the row-set for more than ~100ms per batch (vs the multi-second monolithic UPDATE shape PR #1321 had). Codex C5 write-side scoping: sourceId param directly, NOT sourceScopeOpts which is read-side and inherits OAuth federation reads. Phase 7 MCP op (schema_apply_mutations) enforces at dispatch. Dry-run by default: per-prefix probe returns would_apply + 10-slug sample (the drilldown signal). Apply path returns total_applied. Idempotency contract pinned: second apply finds zero matching rows. Soft-delete exclusion on both probe + update. Dead-prefix flag set when probe returns count=0. JSON envelope schema_version:1. Tests use canonical PGLite block per CLAUDE.md test-isolation rules. seedPage helper auto-seeds sources(id) row before FK insert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.40.6.0 Phase 4 — wire 14 new schema CLI verbs Thin handlers wrapping Phase 2's mutation primitives + Phase 3's stats/sync cores. CLI is the human surface; Phase 7 wires the same cores into MCP for agent use. 
New verbs: Authoring: add-type <name> --primitive P --prefix dir/ [--extractable] [--expert] [--alias A]* [--pack <name>] remove-type <name> [--pack <name>] update-type <name> [--extractable BOOL] [--expert BOOL] [--primitive P] [--pack <name>] add-alias <type> <alias> [--pack <name>] remove-alias <type> <alias> [--pack <name>] add-prefix <type> <prefix> [--pack <name>] remove-prefix <type> <prefix> [--pack <name>] add-link-type <name> [--inverse V] [--page-type T] [--target-type T] [--pack <name>] remove-link-type <name> [--pack <name>] set-extractable <type> BOOL [--pack <name>] set-expert-routing <type> BOOL [--pack <name>] Activation: reload [--pack <name>] Flush in-process cache; --pack scopes Discovery + repair: stats [--source <id>] Per-type counts + coverage + dead prefixes sync [--apply] [--source <id>] Backfill page.type (chunked UPDATE) cli.ts: schema added to CLI_ONLY_SELF_HELP so `gbrain schema --help` routes to printHelp() instead of the generic one-line stub. withConnectedEngine defensive fix retained from PR #1321: EngineConfig built once and passed to BOTH createEngine and engine.connect for future-proof against engine implementations that read URL at connect time. End-to-end agent journey verified: fork gbrain-base mine → use mine → add-type researcher --primitive entity --prefix people/researchers/ --extractable --expert → active (shows 23 page types) → stats (shows 100% coverage on empty brain, vacuous truth). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.40.6.0 Phase 5+6+7 — schema lint with rich rules + 9 new MCP ops This is the marquee commit: Wintermute and any other remote OAuth agent can now author + introspect schema packs over normal HTTPS MCP. Phases 5+6 collapse into Phase 7 because the new MCP ops compose Phase 1.5's lint rules and Phase 2/3's mutation/stats/sync cores directly — no extra extraction needed (D6 from /plan-eng-review). 
Phase 5: schema lint CLI wired to runAllLintRules from Phase 1.5 Replaces the prior 2-rule check (duplicate names + missing prefix) with the full 11-rule suite. New --with-db flag opts into the 2 DB-aware rules (extractable_empty_corpus, mutation_count_anomaly). JSON envelope shape stable. Exit code 1 on any error. Phase 7: 9 new MCP operations Read-scope (NOT localOnly — read scope is safe to expose remote): get_active_schema_pack — identity packet (pack name, sha8, counts). list_schema_packs — bundled + installed names. schema_stats — composes runStatsCore from Phase 3. schema_lint — composes runAllLintRules; --with-db is CLI-only (DB-aware rules need engine). schema_graph — JSON {nodes, edges} from link_types inference + frontmatter_links. schema_explain_type — settings for one declared type. schema_review_orphans — untyped pages drilldown. Admin-scope (NOT localOnly per D2 — Wintermute reaches via OAuth): schema_apply_mutations — BATCHED per D10. Single MCP tool taking a mutations[] array; composes all 11 mutate primitives. Atomic batch_id; outer withPackLock wraps the whole batch so no other writer can slip in mid-iteration. Partial-results returned on mid-batch failure for forensic agent debugging. Audit log records actor=mcp:<clientId8> (D20 privacy-redacted shape). reload_schema_pack — flush in-process cache + extends-chain cascade (codex C6 fix from Phase 1.3). withConnectedEngine defensive fix applied to schema.ts:withConnectedEngine (PR #1321 closed) — EngineConfig built once and passed to BOTH createEngine AND engine.connect for defense in depth. Test seams: - operationsByName lookup pinned for every new op. - All 9 ops have scope + localOnly declarations pinned to lock in the trust posture. - Batched mutation atomicity tested: partial-failure returns {error: mutation_failed, partial_results: [...]} with one batch_id across all results. 
- Audit log actor=mcp:<clientId.slice(0,8)> capture verified end-to-end (audit JSONL read back after the op handler runs). - Empty mutations[] rejected with invalid_request. - Unknown op surfaced via SchemaPackMutationError INVALID_RESULT. Coverage: 23 new cases for the 9 ops (operations-schema-pack.test.ts). All 255 schema-pack-related tests green. Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md Successor to: closed PR #1321. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v0.40.6.0 Phase 8 + 10 + 12 — T1.5 wiring, schema-author skill, ship Final wave commit. Brings the cathedral from "shipped but undiscoverable" to "shipped + agents find it + agents use it." Phase 8 (partial T1.5 wiring — agent-facing surfaces): - whoknows CLI (src/commands/whoknows.ts:340) consults the active pack via loadActivePackBestEffort + expertTypesFromPack. Pack-load failure → EMPTY filter (NOT hardcoded ['person', 'company'] defaults) per D4. A researcher type declared --expert in a custom pack now surfaces in `gbrain whoknows "ML"` results. Pre-v0.40.6 it silently never matched. - find_experts MCP op (src/core/operations.ts:2820) same wiring so OAuth clients (Wintermute etc.) inherit pack-aware expert routing over HTTP MCP, not just CLI. - facts/eligibility.ts and enrichment-service.ts union widening deferred to v0.40.7+ (filed in TODOS.md as 2 follow-up entries) — larger blast radius than fit this wave's context budget. Phase 10 (skill + RESOLVER + Convention — the discoverability layer): - skills/schema-author/SKILL.md — agent dispatcher for "evolve the schema pack." 36 trigger phrases route here. Explicit Non-goals section names brain-taxonomist (filing one page) and eiirp (schema-check during iteration) so agents pick the right surface. 7-phase workflow: brain → assess → propose → apply → sync → verify → commit. Lists every gbrain schema CLI verb + every MCP op the skill uses. 
brain_first: exempt frontmatter (this skill IS the brain-first path for schema authoring). - skills/conventions/schema-evolution.md — decision tree for "when to add a type vs alias vs prefix." <20 pages → don't pack-codify; 20-100 → alias or narrow prefix; 100+ → first-class type. Don'ts section + "when to remove a type" + "when to commit the pack" all answered from one place. - skills/RESOLVER.md entry with full functional-area dispatcher line (compressed routing pattern per v0.32.3 dispatcher convention). - schema-evolution.md added to the cross-cutting Conventions list. Phase 12 (ship bookkeeping): - VERSION → 0.40.6.0 - package.json → 0.40.6.0 - CHANGELOG.md entry with ELI10 lead per CLAUDE.md voice rules (250+ words explaining the wave in plain English before any file/function name appears), full "To take advantage of v0.40.6.0" paste-ready commands block, itemized changes by category, credit to @garrytan-agents (PR #1321 author). - TODOS.md gains 10 new follow-up entries grouped under "v0.40.6.0 Schema Cathedral v3 follow-ups (v0.40.7+)" covering: enrichment-service union widening, facts/eligibility wiring, 3 doctor checks, T16 + T16.1 evals, T19 federated closure, T20 extends merging, T21 YAML comments, T22 admin SPA, T23 schema:write scope, T24 multi-tenant federation. - llms-full.txt regenerated via bun run build:llms (CLAUDE.md edits trigger the test/build-llms.test.ts gate — required per repo discipline). Verification: - bun run typecheck clean. - Full agent journey smoke-tested end-to-end in Phase 4 commit (fork → use → add-type → active → stats — all green). - All 255+ schema-pack tests green from Phases 1-7. Total wave: 6 commits, ~5000 net LOC, 84 new tests, 21 design decisions captured. PR #1321 closed with successor pointer comment. 
Plan: ~/.claude/plans/system-instruction-you-are-working-recursive-thacker.md Co-Authored-By: garrytan-agents <garrytan-agents@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): rename Wintermute → 'your OpenClaw' + add schema-author skill conformance CI failures from PR #1327 first run: 1. check:privacy script flagged 4 'Wintermute' name leaks (CLAUDE.md:550 rule — never use the private OpenClaw fork name in public artifacts): - src/core/operations.ts:3816 → 'your OpenClaw and similar remote agents' - src/core/operations.ts:4015 → 'your OpenClaw, etc.' (in description) - src/core/operations.ts:4225 → 'your OpenClaw, etc.' (in comment) - test/operations-schema-pack.test.ts:325 → clientId 'remoteAgentClient12345678' (matching audit-actor regex updated: 'mcp:remoteAg' instead of 'mcp:wintermu') 2. skills/manifest.json missing schema-author entry. Added between brain-taxonomist and skillify per alphabetical-ish grouping. 3. skills/schema-author/SKILL.md missing 3 conformance sections per test/skills-conformance.test.ts: - ## Contract (inputs/outputs/side effects/idempotency/trust/atomicity) - ## Anti-Patterns (don't mutate bundled packs, don't add types for one-off directories, don't conflate filing vs. schema authoring, etc.) - ## Output Format (per-mutation JSON, per-batch JSON, stats JSON, sync dry-run JSON, human format, error envelope codes) The 3 sections were inserted ABOVE the existing 'Failure modes' section so the existing failure-mode bullets are still adjacent to the new error envelope codes in Output Format. Verified locally: - bun run check:privacy → clean - bun test test/skills-conformance.test.ts test/check-resolvable.test.ts test/check-resolvable-cli.test.ts test/regression-v0_22_4.test.ts → 286/286 pass - bun test test/operations-schema-pack.test.ts → 23/23 pass - bun run verify → clean (privacy + skill_brain_first + fuzz-purity + typecheck) llms.txt + llms-full.txt regenerated. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: v0.40.7.0 — schema cathedral v3 README + CLAUDE.md annotations Doc-debt cleanup from the v0.40.7.0 ship (Phase 12 had deferred these to fit context budget; /document-release surfaced the gap): - README.md: new "What's new in v0.40.7.0" lead paragraph above the v0.36.4.0 entry. ELI10 lead: "Your agents can now author your brain's schema pack themselves" + the agent journey + 14 CLI verbs + 9 MCP ops + schema-author skill boundary callouts. - CLAUDE.md: new "Schema Cathedral v3 (v0.40.7.0)" section between the thin-client routing cluster and the Commands section. 14-bullet Key Files cluster covering pack-lock / mutate-audit / registry / best-effort / lint-rules / query-cache-invalidator / mutate / stats / sync / schema.ts CLI / operations.ts MCP / whoknows T1.5 wiring / schema-author skill / schema-evolution convention. Each bullet references the design decisions (D2/D4/D6/D8/D9/D10/D11/D13/D14/D20) and codex findings (C5/C6/C8/C9/C13/C14) captured during /plan-eng-review. Closes the "CLAUDE.md has zero v0.40.7.0 mentions" doc debt. - llms-full.txt + llms.txt regenerated. Privacy check clean (no Wintermute leaks in the new prose — used "your OpenClaw" per CLAUDE.md:550 rule). test/build-llms.test.ts 7/7 green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: tutorial — Build your first schema pack (closes v0.40.7+ doc-debt) Closes the tutorial gap surfaced by /document-release's Diataxis coverage map. The schema-pack cathedral shipped with reference (CLAUDE.md cluster), how-to (SKILL.md 7-phase workflow), and explanation (conventions/ schema-evolution.md decision tree), but no tutorial — no concrete "your first schema mutation" walkthrough. 
docs/schema-author-tutorial.md ships exactly that: - 8 numbered steps, time-to-first-result < 3 (active pack visible by step 2) - Walks from `gbrain schema fork gbrain-base mine` through `add-type researcher` + `sync --apply` + proving the T1.5 wiring via `gbrain whoknows` surfacing the new type - Every step shows the exact command and expected output - Placeholder pages (alice-example, bob-example, charlie-example) so any brain can run the tutorial without affecting real content - "What you built" section recaps state on disk + active wiring - "Next steps" cover add-link-type, add-alias, lint --with-db, commit to source control, MCP path for agents - "Related docs" cross-links to reference (CLAUDE.md cluster) + how-to (SKILL.md workflow) + explanation (schema-evolution.md) Cross-linked: - README.md "What's new in v0.40.7.0" paragraph gets a "Walkthrough:" pointer at the end - skills/schema-author/SKILL.md gets a "## Tutorial" callout just above the workflow phases — agents that hit the skill via RESOLVER routing see the tutorial pointer first Closes the Diataxis quadrant matrix to full coverage: - Tutorial: ✅ docs/schema-author-tutorial.md (NEW) - How-to: ✅ skills/schema-author/SKILL.md workflow - Reference: ✅ CLAUDE.md cluster + gbrain schema --help - Explanation: ✅ skills/conventions/schema-evolution.md Privacy check clean. Typecheck clean. llms-full.txt regenerated (545KB). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: what-schemas-unlock — the WHY doc (7 use cases + structural argument) The schema-author tutorial walks through HOW to mutate a pack. This new doc explains WHY agents and users should care, with concrete killer use cases on real corpus shapes: 1. The 4000 invisible meetings — untyped pages skip every structural surface (whoknows, find_experts, recall, think). Adding a `meeting` type + sync flips them from invisible to queryable. Same content, completely different agent experience. 2. 
The founder ops brain — 4 type-adds + 4 link-types build a CRM-shaped query surface. `gbrain whoknows "Series A SaaS"` routes through investor + portco specifically; `graph-query` walks intro chains. Downstream of notes, not parallel to them. 3. The research brain — researcher / paper / lab / grant / dataset types + cites / authored / uses link verbs turn a reading-list-as- markdown into a queryable research graph. 4. The legal brain (or anything where claims have numbers) — typed `damages=5000000`, `filed_date=...` become comparable across pages of the same type. Generic note systems can't do this because they don't know which numbers belong to which type. 5. The team brain — each mounted brain has its own schema pack. Two engineers searching the same brain get DIFFERENT routing because their personal packs declare different expert types. 6. The agent-co-curates pattern — the NEW thing in v0.40.7.0. Agent watches your ingestion stream, runs `gbrain schema detect` periodically, proposes a new type when a pattern accumulates, applies it via batched MCP `schema_apply_mutations` after one approval. Brain learns. Audit log captures the agent's client_id as `actor: mcp:<clientId8>`. 7. Before-vs-after on real content — pick a corpus, note top-3 whoknows results, add the type via sync, re-run. The numerical delta IS the win. Then the structural argument: types matter at query time. Untyped content is invisible content. The schema is queryable AND mutable AND auditable — that's the production-system difference from "vibes-based knowledge management." Closes with the v0.40.7.0-specific list of what changed (withMutation skeleton, O_CREAT|O_EXCL atomic lock vs page-lock.ts TOCTOU pattern, privacy-redacted audit log, 9 MCP ops, T1.5 wiring, cross-process invalidation via stat-mtime TTL gate). Cross-linked: - README.md "What's new in v0.40.7.0" paragraph now has both the "Why it matters:" pointer (this doc) AND the "Walkthrough:" pointer (tutorial). 
- docs/schema-author-tutorial.md opens with "Want the WHY before the HOW?" link to this doc. - skills/schema-author/SKILL.md now has a "Tutorial + vision" section that points at both, with explicit guidance that agents should read the WHY doc before pitching schema authoring to a user. 177 lines. Privacy check clean. Typecheck clean. llms-full.txt regenerated (545KB). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: surface schema docs from README Capabilities + Docs index + llms.txt The two new schema docs were ONLY linked from the v0.40.7.0 "What's new" paragraph in README. That paragraph will get pushed down by every future release and become a worse and worse entry point. Real discovery paths added: 1. README.md `## Capabilities` section — new "Agent-authored schema (v0.40.7.0)" bullet between "Brain consistency" and "## Integrations". Permanent home alongside Hybrid search, Self-wiring graph, Minions, 43 skills, Eval framework. Includes the one-paragraph pitch + 3 pointer links (vision / tutorial / agent skill). 2. README.md `## Docs` index — two new lines added at the top of the list (right after docs/INSTALL.md, before docs/architecture/): - docs/what-schemas-unlock.md with one-line description - docs/schema-author-tutorial.md with one-line description 3. scripts/llms-config.ts `Configuration` section — both docs added to the curated llms.txt entry list so the LLM-readable map points at them. Sits right after docs/GBRAIN_RECOMMENDED_SCHEMA.md (topical grouping). includeInFull defaults to true so they ride in the single-fetch llms-full.txt bundle. 
Result: schema docs are now reachable from 5 entry points instead of 1: - README "What's new" paragraph (release-pinned, will age out) - README Capabilities bullet (permanent, top-of-funnel) - README Docs index (permanent, end-of-page reference) - llms.txt (LLM-readable curated map) - llms-full.txt (single-fetch bundle for agents) Also caught 3 leftover Wintermute leaks in docs/what-schemas-unlock.md that the privacy check flagged: agent-co-curates pattern now uses "your OpenClaw"; `register-client wintermute` example renamed to `register-client my-agent` per CLAUDE.md:550 privacy rule. Privacy check clean. test/build-llms.test.ts 7/7 green. llms.txt 4314 → 5000 bytes, llms-full.txt 545KB → 572KB. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: garrytan-agents <garrytan-agents@users.noreply.github.com>
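The Phase 1 commit message above describes pack-lock acquisition via `openSync(path, 'wx')`. A minimal sketch of that pattern with Node's `fs` module follows; the function names, the lock-file payload, and the demo path are assumptions, and the TTL refresh and PID liveness probe mentioned in the message are left out.

```typescript
import { closeSync, openSync, unlinkSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Try to take the lock. Returns true on success, false if the lock file already exists.
function tryAcquireLock(lockPath: string): boolean {
  try {
    // "wx" creates the file exclusively: creation and the existence check are one
    // atomic step, so there is no gap between "check" and "write".
    const fd = openSync(lockPath, "wx");
    try {
      // Hypothetical payload: record the holder so stale locks could be diagnosed.
      writeSync(fd, JSON.stringify({ pid: process.pid, acquiredAt: Date.now() }));
    } finally {
      closeSync(fd);
    }
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false; // someone else holds it
    throw err; // any other failure (permissions, missing directory) is a real error
  }
}

// Release the lock by removing the lock file.
function releaseLock(lockPath: string): void {
  unlinkSync(lockPath);
}

// Demo path in the OS temp directory; a per-pack lock would use one path per pack.
const demoLock = join(tmpdir(), `demo-pack-${process.pid}.lock`);
```

An `existsSync` check followed by `writeFileSync` would leave a window in which two processes both see no file and both write one, which is the TOCTOU pattern the commit message says it avoids.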
* upstream/master: (22 commits) v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377) v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403) v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364) v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352) v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367) v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351) v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350) v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345) v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313) v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333) v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327) v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324) v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322) v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300) v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323) v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296) v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298) v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm 
(garrytan#1128) v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308) v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295) ...
Problem
The `gbrain schema` CLI surface shipped in v0.38 with five read-only verbs (active, list, show, validate, use). v0.39 wired up discovery verbs (detect, suggest, review-candidates, review-orphans), but three issues remained:
Critical: `withConnectedEngine` passed `{}` to `engine.connect()`
Every DB-backed verb (detect, suggest, review-candidates, review-orphans) failed silently with "No database URL: database_url is missing from config." The bug: `withConnectedEngine` built a proper `EngineConfig` with `database_url` from `loadConfig()`, passed it to `createEngine()`, then threw it away and called `engine.connect({})`. Postgres engines need the URL at connect time. The resolved config was one line away but never forwarded.
No authoring verbs
Operators with custom schema packs had to hand-edit YAML/JSON to add or remove page types. No command existed to check how the DB's actual content mapped onto the pack (coverage, per-type counts), or to retroactively set `type` on existing pages that match pack prefixes but were ingested before the pack was created.
Stale help text + incomplete lint/graph
Help text still read "detect, suggest, init, fork, edit... land in v0.39" even though those commands had been wired up for two minor versions. `schema lint` only checked duplicate names and missing prefixes. `schema graph` printed a flat type list with no relationship edges.
Solution
Bug fix (1 line, high leverage)
Pass the resolved `EngineConfig` (with `database_url`) to both `createEngine()` and `engine.connect()`. All DB-backed verbs now work out of the box.
New commands
- `schema add-type <name> --primitive <p> --prefix <dir/>`; optional flags: `--extractable`, `--expert`, `--alias <a>` (repeatable), `--pack <name>`.
- `schema remove-type <name>`
- `schema stats`
- `schema sync [--apply]`
Improvements to existing commands
lint now catches:
- `enrichable_types` referencing types not in the pack
- `link_types` inference.page_type / inference.target_type referencing unknown types
- `frontmatter_links` referencing undeclared types or link verbs
graph now shows link-verb edges derived from `link_types` inference and `frontmatter_links`.
help: complete rewrite, categorized into Inspection / Activation / Authoring / Discovery+repair. All 22 verbs documented with flags.
cli.ts: `schema` now routes to its own rich `printHelp()` instead of the generic CLI self-help short-circuit.
File organization
- `src/core/schema-pack/mutate.ts`
- `src/core/schema-pack/stats.ts`
- `src/core/schema-pack/sync.ts`
- `src/core/schema-pack/index.ts`
- `src/commands/schema.ts`
Testing
17 tests total, 0 failures:
- 10 new mutate tests (`test/schema-pack-mutate.test.ts`): JSON round-trip, YAML round-trip, duplicate rejection, invalid primitive rejection, bundled pack read-only guard, unknown type removal, malformed slug rejection, missing prefix rejection, add→remove→add cycle
- 7 existing CLI contract tests (`test/schema-cli-contract.test.ts`): all pass, updated `NEW_VERBS` set
Production validation (357K-page brain):
- `schema stats`: 100% typed coverage, 357,560 pages, per-type breakdown correct
- `schema detect`: scanned 193K pages, generated candidate types
- `schema lint`: clean (3 pre-existing warnings for types without prefixes)
- `schema graph`: 10 edges derived from link_type inference
- `schema validate garry-brain`: still valid
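The typed-coverage figure that `schema stats` reports can be illustrated with a small pure function. This sketch follows the rules stated in the Phase 3 commit message (empty string and NULL share one untyped bucket, soft-deleted pages are excluded, `by_type` sorts by count descending with ties by name ascending, and an empty brain has coverage 1.0); the row shape and the `computeStats` name are hypothetical, and the real implementation aggregates in SQL.

```typescript
// Hypothetical row shape; the real stats core reads from the pages table.
interface PageRow {
  type: string | null;
  deleted?: boolean;
}

interface TypeCount {
  type: string;
  count: number;
}

interface Stats {
  total: number;
  untyped: number;
  coverage: number;
  by_type: TypeCount[];
}

function computeStats(rows: PageRow[]): Stats {
  const counts = new Map<string, number>();
  let total = 0;
  let untyped = 0;
  for (const row of rows) {
    if (row.deleted) continue; // soft-deleted pages are excluded
    total += 1;
    if (!row.type) {
      untyped += 1; // null and "" fall into the same untyped bucket
      continue;
    }
    counts.set(row.type, (counts.get(row.type) ?? 0) + 1);
  }
  // Sort by count descending, then by type name ascending for ties.
  const by_type = Array.from(counts, ([type, count]) => ({ type, count })).sort(
    (a, b) => b.count - a.count || (a.type < b.type ? -1 : a.type > b.type ? 1 : 0),
  );
  // An empty brain counts as fully covered (vacuous truth).
  const coverage = total === 0 ? 1.0 : (total - untyped) / total;
  return { total, untyped, coverage, by_type };
}
```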