v0.36.0.0 feat(skillpack): scaffold + reference + harvest (retire managed-block install)#1130
Merged
Pure file-copy primitive for scaffold (gbrain→host) and harvest (host→gbrain). Atomic-refusal contract: symlink-reject + canonical-path containment validate every item before any write. Used by both directions of the v0.33 loop. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
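The atomic-refusal contract described above (validate every item, then write) can be sketched roughly as follows. This is a minimal illustration, not the actual copyArtifacts implementation: the `CopyItem` shape and the `isContained`, `validateAll`, and `copyAllOrNothing` names are hypothetical, and it assumes Node-compatible `fs` and `path` modules.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical item shape; the real copyArtifacts types are not shown in this PR.
interface CopyItem {
  source: string; // absolute path to read from
  target: string; // absolute path to write to
}

// True when `candidate` is `root` itself or lies beneath it.
function isContained(root: string, candidate: string): boolean {
  const rel = path.relative(root, candidate);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Collect every violation up front so nothing is written if any item is bad.
function validateAll(items: CopyItem[], sourceRoot: string, targetRoot: string): string[] {
  const errors: string[] = [];
  const realSourceRoot = fs.realpathSync(sourceRoot);
  const realTargetRoot = fs.realpathSync(targetRoot);
  for (const item of items) {
    let stat: fs.Stats;
    try {
      stat = fs.lstatSync(item.source); // lstat does not follow symlinks
    } catch {
      errors.push(`missing source: ${item.source}`);
      continue;
    }
    if (stat.isSymbolicLink()) {
      errors.push(`symlink rejected: ${item.source}`);
      continue;
    }
    if (!isContained(realSourceRoot, fs.realpathSync(item.source))) {
      errors.push(`source escapes root: ${item.source}`);
    }
    // The target may not exist yet, so resolve it lexically against the canonical root.
    const canonicalTarget = path.resolve(realTargetRoot, path.relative(targetRoot, item.target));
    if (!isContained(realTargetRoot, canonicalTarget)) {
      errors.push(`target escapes root: ${item.target}`);
    }
  }
  return errors;
}

// Either every item is copied or the function throws before the first write.
function copyAllOrNothing(items: CopyItem[], sourceRoot: string, targetRoot: string): void {
  const errors = validateAll(items, sourceRoot, targetRoot);
  if (errors.length > 0) {
    throw new Error(`refusing to copy:\n${errors.join("\n")}`);
  }
  for (const item of items) {
    fs.mkdirSync(path.dirname(item.target), { recursive: true });
    fs.copyFileSync(item.source, item.target);
  }
}
```

The sketch deliberately separates validation from writing, so a single bad item refuses the whole batch. It does not cover every case a production primitive would need, for example a symlinked intermediate directory under the target root.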
New scaffold.ts replaces the managed-block installer. One-time additive copy into the user's repo via copyArtifacts; refuses to overwrite existing files (user owns them). Partial-state policy: copies missing paired sources even when the skill dir already exists. bundle.ts extended with loadSkillSources + enumerateScaffoldEntries — paired source files declared in each SKILL.md's frontmatter sources: array, not in openclaw.plugin.json. Single source of truth, co-located with the skill. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
reference is the read-only diff lens with an agent-readable framing line. Pure-JS unified-diff producer + parser + applier (no patch(1) dependency). Two-way merge with documented limitation: without scaffold-time base tracking, applied hunks align everything to gbrain. The agent dry-runs reference first, then decides. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
migrate-fence is the one-shot transition from the pre-v0.36 managed-block model. Strips begin/end markers and the cumulative-slugs receipt comment; preserves fence rows verbatim as user-owned routing during the transition to frontmatter discovery. Receipt-then-row fallback (F-CDX-8) covers stale/missing receipts. scrub-legacy-fence-rows is the opt-in cleanup after migrate-fence. Two-condition gate: removes a row only when skills/<slug>/ exists AND that skill's frontmatter declares non-empty triggers (proof frontmatter discovery covers it). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
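The two-condition gate for scrub-legacy-fence-rows reduces to a small predicate. The sketch below is illustrative only: `FenceRow`, `SkillInfo`, and the `lookup` callback are hypothetical stand-ins for whatever the real row parser and frontmatter loader return.

```typescript
// Hypothetical shapes; the real row parser and skill loader are not shown in this PR.
interface FenceRow {
  slug: string;
  raw: string; // the original row text, preserved verbatim when kept
}

interface SkillInfo {
  dirExists: boolean; // does skills/<slug>/ exist?
  triggers: string[]; // triggers declared in the skill's frontmatter
}

// A row is removable only when BOTH conditions hold.
function canScrubRow(info: SkillInfo): boolean {
  return info.dirExists && info.triggers.length > 0;
}

function scrubLegacyRows(
  rows: FenceRow[],
  lookup: (slug: string) => SkillInfo,
): { kept: FenceRow[]; removed: FenceRow[] } {
  const kept: FenceRow[] = [];
  const removed: FenceRow[] = [];
  for (const row of rows) {
    if (canScrubRow(lookup(row.slug))) {
      removed.push(row);
    } else {
      kept.push(row);
    }
  }
  return { kept, removed };
}
```

Under this reading, a row whose skill directory exists but whose frontmatter has no triggers stays in place, because frontmatter discovery would not yet route to it.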
The inverse loop: lift a proven skill from a host repo (~/git/wintermute, etc.) back into gbrain so other clients can scaffold it. --from <host-repo-root> is symmetric with scaffold's --workspace. Security: symlink rejection + canonical-path containment (mirrors validateUploadPath). Privacy: default-on linter scans harvested files against ~/.gbrain/harvest-private-patterns.txt plus built-in defaults (Wintermute, email, Slack channel patterns). Any match rolls back the copy and exits non-zero. --no-lint bypasses for the editorial workflow after a manual scrub. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
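A rough sketch of the lint-then-rollback flow is shown below. Only the `\bWintermute\b` pattern is taken from the PR text; the email and Slack-channel regexes, the one-regex-per-line pattern file format, and all function names are assumptions made for illustration.

```typescript
interface LintHit {
  file: string;
  line: number; // 1-based line number of the match
  pattern: string; // source of the regex that matched
}

const BUILT_IN_PATTERNS: RegExp[] = [
  /\bWintermute\b/, // named in the PR summary
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/, // assumed email shape
  /(^|\s)#[a-z][a-z0-9_-]{1,79}\b/, // assumed Slack channel shape
];

// Assumed file format: one regular expression per non-empty line.
function parsePatternFile(text: string): RegExp[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => new RegExp(line));
}

// Scan every line of every harvested file against every pattern.
function lintFiles(files: Record<string, string>, patterns: RegExp[]): LintHit[] {
  const hits: LintHit[] = [];
  for (const [file, content] of Object.entries(files)) {
    content.split(/\r?\n/).forEach((text, index) => {
      for (const pattern of patterns) {
        if (pattern.test(text)) {
          hits.push({ file, line: index + 1, pattern: pattern.source });
        }
      }
    });
  }
  return hits;
}

// Any hit rolls back the copy and yields a non-zero exit code, unless lint is bypassed.
function finishHarvest(hits: LintHit[], noLint: boolean, rollback: () => void): number {
  if (noLint || hits.length === 0) return 0;
  rollback();
  return 1;
}
```

The regexes are intentionally non-global, so repeated `test` calls do not carry `lastIndex` state between lines.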
autoDetectSkillsDir now walks up from cwd looking for any skills/ directory, ahead of the implicit ~/.openclaw/workspace fallback. cd ~/git/wintermute && gbrain skillpack scaffold ... finds wintermute automatically without requiring a RESOLVER.md/AGENTS.md to exist yet. R5 regression preserved: $OPENCLAW_WORKSPACE still wins when explicitly set. +5 test cases in test/repo-root.test.ts pin the new tier order and the R5 guard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
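The tier order described above can be sketched as a pure function. This is a simplified illustration with three tiers only; the real autoDetectSkillsDir has additional tiers, and the `DetectInput` shape, the injected `isDir` callback, and every source label other than `cwd_walk_up` are hypothetical.

```typescript
import * as path from "node:path";

interface DetectInput {
  cwd: string;
  home: string;
  env: Record<string, string | undefined>;
  isDir: (p: string) => boolean; // injected so the sketch needs no real filesystem
}

interface Detected {
  dir: string;
  source: "openclaw_workspace_env" | "cwd_walk_up" | "openclaw_workspace_default";
}

function autoDetectSkillsDirSketch(input: DetectInput): Detected | null {
  // Tier 1: an explicitly set $OPENCLAW_WORKSPACE always wins (the R5 guard).
  const explicit = input.env.OPENCLAW_WORKSPACE;
  if (explicit) {
    return { dir: path.join(explicit, "skills"), source: "openclaw_workspace_env" };
  }
  // Tier 2: walk up from cwd looking for any skills/ directory.
  let current = path.resolve(input.cwd);
  while (true) {
    const candidate = path.join(current, "skills");
    if (input.isDir(candidate)) {
      return { dir: candidate, source: "cwd_walk_up" };
    }
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  // Tier 3: the implicit ~/.openclaw/workspace fallback.
  const fallback = path.join(input.home, ".openclaw", "workspace", "skills");
  return input.isDir(fallback) ? { dir: fallback, source: "openclaw_workspace_default" } : null;
}
```

Injecting `isDir` keeps the tier order testable without touching disk, which is roughly what the new cases in test/repo-root.test.ts are described as pinning.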
… T10) skillpack.ts dispatcher rewritten for the v0.36 contract: scaffold, reference (+ --apply-clean-hunks), migrate-fence, scrub-legacy-fence-rows, harvest, plus the existing list / diff / check. install and uninstall are gone — both exit non-zero with a hint pointing at scaffold / migrate-fence. Clean break, no deprecated alias. skillpack-check gains --strict for CI gating. When invoked as the subcommand `gbrain skillpack check`, default is informational (exit 0 even with drift); --strict opts back into the cron-friendly exit-1-on-issues behavior. Top-level gbrain skillpack-check preserves its existing exit semantics for backwards compat. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
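The exit-code contract in this commit can be summarized in a few lines. The sketch below is an interpretation of the description, not the dispatcher itself: the function names, the hint wording, and the choice of exit code 1 for removed commands are assumptions.

```typescript
type Invocation = "subcommand" | "top-level";

// `gbrain skillpack check` is informational unless --strict is passed;
// top-level `gbrain skillpack-check` is assumed to keep exit-1-on-issues.
function checkExitCode(hasIssues: boolean, strict: boolean, invokedAs: Invocation): number {
  if (!hasIssues) return 0;
  if (invokedAs === "top-level") return 1;
  return strict ? 1 : 0;
}

// Removed commands fail with a pointer to their replacements (hypothetical wording).
function removedCommandResult(name: string): { exitCode: number; hint: string } | null {
  if (name !== "install" && name !== "uninstall") return null;
  return {
    exitCode: 1,
    hint: `'${name}' was removed in v0.36; see 'scaffold' or 'migrate-fence'`,
  };
}
```

A CI job would then gate on `gbrain skillpack check --strict`, while an interactive run of the same subcommand without the flag reports drift and still exits 0.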
…(T8) The companion editorial skill for the gbrain skillpack harvest CLI. Walks the genericization checklist (scrub fork names, generalize triggers, lift fork- specific conventions to references) before the CLI runs. Routing-eval fixtures use paraphrased intents to avoid the intent_copies_trigger lint. Wires the new slug into openclaw.plugin.json#skills, skills/manifest.json, and skills/RESOLVER.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Spawns gbrain as a subprocess against tempdir workspaces. Covers: scaffold first-run + re-run no-op, reference diff + --apply-clean-hunks, migrate-fence, scrub-legacy-fence-rows, harvest privacy-lint catch + --no-lint bypass, and the install removed-error path. No DATABASE_URL needed — skillpack is filesystem-only. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Skillpacks as scaffolding, not amber. v0.36 retires the managed-block install model. Six new subcommands replace install + uninstall: scaffold, reference (with --apply-clean-hunks), migrate-fence, scrub-legacy-fence-rows, harvest, plus the existing list / diff / check (check gains --strict for CI gating). Routing comes from each skill's frontmatter triggers — gbrain does not touch your RESOLVER.md or AGENTS.md. Companion editorial skill skillpack-harvest drives the genericization checklist; default-on privacy linter catches Wintermute / email / Slack references before they leak into gbrain core. New docs guide at docs/guides/skillpacks-as-scaffolding.md walks the model and the migration path for pre-v0.36 installs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a-v1 # Conflicts: # CHANGELOG.md # VERSION # package.json
…acing fork-name references
CI's check-privacy.sh and check-test-real-names.sh both flagged the literal fork name across the v0.36 skillpack diff. Two failure modes, two fixes:
1. **Meta-rule-enforcement files** added to both allow-lists. The harvest privacy linter's whole job is to catch the banned literal leaking into gbrain; its source has the regex pattern, its tests verify the linter fires by feeding it the banned string, and the skill markdown documents the substitution policy. Same exception status as check-privacy.sh and check-proposal-pii.sh themselves. Files allow-listed:
   - src/core/skillpack/harvest-lint.ts
   - test/skillpack-harvest-lint.test.ts
   - test/skillpack-harvest.test.ts
   - test/e2e/skillpack-flow.test.ts
   - skills/skillpack-harvest/SKILL.md
2. **User-facing references** swapped for canonical phrasing per CLAUDE.md's responsible-disclosure rule. README + new docs guide + 4 src docstrings + 1 test now say 'your OpenClaw' / 'host agent repo' / 'agentRepo' var name. Behavior unchanged; only documentation strings touched.
Verify gate (the script CI runs) passes locally: EXIT=0. Tests still pass: 60/60 across the affected files. llms-full.txt regenerated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sister fix to the test/repo-root.test.ts update in commit a31418e. The new v0.33 cwd_walk_up tier fires before repo_root when running from inside the gbrain repo — same skills/ dir matched, different source label. Behavior unchanged; the legacy repo_root tier is now functionally subsumed (kept in the type union for back-compat). CI shard 3 failure: test/check-resolvable-cli.test.ts:171. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The 24h and 72h exact-boundary tests scheduled last_sync_at relative to Date.now() at construction time, then let the check call Date.now() again internally. CI scheduler jitter between the two reads pushed ageMs past the strict > thresholds by microseconds, dropping the 72h-boundary case into the fail branch instead of warn. Fix: add an optional `opts.now` test seam to checkSyncFreshness. The two boundary tests now capture t0 once and pass it both to the timestamp constructor and to the check, making ageMs deterministically equal to the boundary. The non-boundary tests (4d, 30h, 2h, etc.) don't need pinning — they're comfortably away from the > comparison. CI shard 1 flake: test/doctor.test.ts:479. Locally 48/48 doctor tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
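The `opts.now` seam can be illustrated with a small stand-in. The real checkSyncFreshness signature and return type are not shown in this PR, so the function name, the `Freshness` union, and the assumption that exactly 24h is still "ok" are illustrative; the strict `>` comparisons and the 72h boundary landing on warn follow the description above.

```typescript
type Freshness = "ok" | "warn" | "fail";

const HOUR_MS = 60 * 60 * 1000;

// Stand-in for checkSyncFreshness: `opts.now` lets a test pin the clock.
function checkSyncFreshnessSketch(lastSyncAt: Date, opts: { now?: number } = {}): Freshness {
  const now = opts.now ?? Date.now(); // production path still reads the real clock
  const ageMs = now - lastSyncAt.getTime();
  if (ageMs > 72 * HOUR_MS) return "fail"; // strictly older than 72h
  if (ageMs > 24 * HOUR_MS) return "warn"; // strictly older than 24h
  return "ok";
}

// Boundary test pattern: capture t0 once and use it on both sides.
const t0 = Date.now();
const at72h = checkSyncFreshnessSketch(new Date(t0 - 72 * HOUR_MS), { now: t0 });
const at24h = checkSyncFreshnessSketch(new Date(t0 - 24 * HOUR_MS), { now: t0 });
```

Because both the timestamp and the check see the same `t0`, `ageMs` equals the boundary exactly and scheduler jitter can no longer push it across the strict comparison.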
…a-v1 # Conflicts: # CHANGELOG.md # VERSION # package.json
… CLI surface (DX review)
DX audit of the v0.36 scaffold model surfaced one structural gap and four
output gaps. When scaffolded files land on a downstream agent's disk, the
agent had no agent-facing manifest telling it what to do — no routing
contract, no upgrade flow, no two-way merge warning at the right surface.
Fixes:
1. **New shared dep: skills/_AGENT_README.md.** Lands on every scaffold +
migrate-fence alongside the existing _brain-filing-rules.md and
_output-rules.md. Short, agent-readable contract: walk *.SKILL.md
frontmatter triggers: for routing, gbrain is reference not law on
upgrade, no managed-block fence anymore, two-way merge has known
limitations. Single source of truth for the agent operating contract.
2. **scaffold stdout** prints a next-action hint pointing at the readme
(with absolute path) and the reference --all upgrade-sweep command.
3. **reference stdout** adds per-category decision policy:
- missing → scaffold again
- differs → was edit intentional? keep it. Accidental? patch by hand or
apply-clean-hunks after reading the two-way warning.
4. **reference --apply-clean-hunks** prints the two-way merge WARNING
BEFORE the apply (to stderr, survives stdout redirect). Spells out
that gbrain has no scaffold-time base and local edits in differing
sections WILL be aligned to gbrain. Skipped in --json mode for
machine consumers. On conflicts, prints how to inspect and patch.
5. **migrate-fence stdout** tells the agent its routing model just
changed (fence gone, walk frontmatter now) and points at
scrub-legacy-fence-rows as the eventual cleanup. References the new
_AGENT_README for fresh-install agents.
Smoke verified end-to-end: 16 files land (was 15, +1 for _AGENT_README),
hint prints with absolute path, readme lands on disk. Tests + verify gate
pass clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
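The stream split for `reference --apply-clean-hunks` (warning on stderr, suppressed for machine consumers) might look roughly like this. The `Streams` shape, the function name, and the message wording are hypothetical; only the routing rules come from the description above.

```typescript
interface Streams {
  stdout: string[];
  stderr: string[];
}

interface ApplyResult {
  applied: number;
  conflicts: number;
}

// Builds the output for an apply run without performing any I/O.
function applyCleanHunksOutput(opts: { json: boolean }, result: ApplyResult): Streams {
  const out: Streams = { stdout: [], stderr: [] };
  if (!opts.json) {
    // Human mode: the warning goes to stderr so it survives a stdout redirect.
    out.stderr.push(
      "WARNING: two-way merge. gbrain has no scaffold-time base; local edits in differing sections will be aligned to gbrain.",
    );
    out.stdout.push(`applied ${result.applied} hunk(s), ${result.conflicts} conflict(s)`);
  } else {
    // JSON mode: machine consumers get the payload only.
    out.stdout.push(JSON.stringify(result));
  }
  return out;
}
```

Returning the lines instead of printing them keeps the routing rule easy to assert in a unit test.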
…sion filter (DX deferred items)
Closes the last two DX gaps from the v0.36 audit:
1. **Post-upgrade reference sweep.** New `postUpgradeReferenceSweep` helper called at the end of `gbrain post-upgrade`. After migrations apply, auto-runs `reference --all` against the detected host workspace and prints a one-line-per-skill summary of drift. Five gates: GBRAIN_SKIP_REFERENCE_SWEEP env-var bypass, no detected workspace (silent), workspace IS gbrain repo (dev-mode silent), zero drift (silent), and pure-missing skills the host never scaffolded are filtered out as noise. All errors swallowed; never blocks post-upgrade. Helper accepts test-seam opts (gbrainRoot, targetWorkspace) for unit testability.
2. **`reference --all --since <version>`.** Filters the sweep to skills whose source actually changed in gbrain between <version> and HEAD, using a new `changedSlugsSinceVersion` helper in bundle.ts. Pure-JS git wrapper (spawnSync), no deps. Accepts bare '0.X.Y.Z' or 'v0.X.Y.Z' or commit SHA. Falls back loudly to full sweep when git can't resolve the ref (tarball install, missing tag).
Test coverage added, total +32 new test cases:
UNIT (15 cases):
- test/skillpack-changed-since-version.test.ts (9 cases): git-aware filter against a fixture git repo. Covers null on non-repo, null on bad tag, empty array on no changes, single + multi-slug drift (deduped + sorted), bare + v-prefix version forms, non-skills/ path filtering, SHA-prefix ref form.
- test/upgrade-reference-sweep.test.ts (6 cases): gate logic. Covers env-var bypass, zero drift, empty-host suppression, drift-detected output shape, dev-mode workspace==gbrain guard, error-swallowing contract.
E2E (8 new cases in test/e2e/skillpack-flow.test.ts):
- 10: scaffold lands skills/_AGENT_README.md
- 11: scaffold stdout prints the Next: hint
- 12: scaffold re-run (skipped-existing) suppresses the hint
- 13: reference stdout prints per-category decision policy
- 14: --apply-clean-hunks WARNING on stderr, not stdout
- 15: --apply-clean-hunks --json suppresses the WARNING (bug fix surfaced here: code originally printed unconditionally, now gated on !json)
- 16: migrate-fence stdout points at the new routing model
- 17: --since with a bad tag falls back to full sweep with warn
Local sweep: 579/579 pass across 18 affected test files, verify gate EXIT=0, llms regenerated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
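The five gates of the post-upgrade sweep can be read as a chain of early returns. The sketch below is one plausible interpretation: `gbrainRoot` and `targetWorkspace` are the test-seam names mentioned in the commit, while the `SkillDrift` shape, the summary line format, and the rule that any non-empty GBRAIN_SKIP_REFERENCE_SWEEP value counts as a bypass are assumptions.

```typescript
interface SkillDrift {
  slug: string;
  identical: number; // files present in the host and equal to gbrain
  differs: number; // files present in the host but different
  missing: number; // files gbrain ships that the host lacks
}

interface SweepInput {
  env: Record<string, string | undefined>;
  targetWorkspace: string | null;
  gbrainRoot: string;
  drift: SkillDrift[];
}

// Returns the lines to print; an empty array means "stay silent".
function sweepSummaryLines(input: SweepInput): string[] {
  if (input.env.GBRAIN_SKIP_REFERENCE_SWEEP) return []; // gate 1: env-var bypass
  if (input.targetWorkspace === null) return []; // gate 2: no detected workspace
  if (input.targetWorkspace === input.gbrainRoot) return []; // gate 3: dev mode
  return input.drift
    .filter((s) => s.differs + s.missing > 0) // gate 4: zero drift is silent
    .filter((s) => !(s.differs === 0 && s.identical === 0)) // gate 5: never-scaffolded skills are noise
    .map((s) => `${s.slug}: ${s.differs} differ, ${s.missing} missing`);
}

// All errors are swallowed so the sweep can never block post-upgrade.
function postUpgradeSweepSketch(load: () => SweepInput, print: (line: string) => void): void {
  try {
    for (const line of sweepSummaryLines(load())) print(line);
  } catch {
    // intentionally ignored
  }
}
```

Gate 5 treats a skill with nothing identical and nothing differing as one the host never scaffolded, so its missing files are not reported.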
…a-v1 # Conflicts: # CHANGELOG.md # VERSION # package.json
garrytan added a commit that referenced this pull request on May 18, 2026
Three open PRs were claiming v0.36.0.0 (#1130 skillpack, #1139 hindsight, #1136 this PR). Ship-aware queue allocator says this branch lands at v0.36.2.0.
Trio audit:
- VERSION 0.36.2.0
- package.json 0.36.2.0
- CHANGELOG ## [0.36.2.0] - 2026-05-17
Updates: VERSION, package.json, CHANGELOG header + body refs, README "New default in v0.36.2.0" announcement + credit line, skills/migrations/v0.36.0.0.md renamed to v0.36.2.0.md with frontmatter + body refs updated. llms-full.txt regenerated.
… MECE structure
The README had drifted into a changelog dumping ground. Four 'New in vX.Y' paragraphs competed for the lead, 16 version tags scattered through headings, the production-numbers hook (17,888 pages, 4,383 people) was six months stale, and skills were described in three places (Skills section, Commands section, inline marketing prose). Zero-based rewrite:
**Refreshed catalog** (surveyed live brain + live agent fork, broad strokes per CLAUDE.md privacy rules):
- ~100K total brain items (was 17,888 in the old README, 6x stale)
- ~16K people (was 4,383)
- ~5K companies (was 723)
- ~8K concepts, ~4K originals, ~3.5K daily notes
- ~31K media (30K tweets, 179 books, papers/films/games/interviews)
- 108 cron jobs running (was 21)
- 273 skills in the live agent fork (35 bundled + 238 user-built)
**Structure**: MECE, single source of truth per concept:
1. Hook + at-a-glance table (refreshed numbers)
2. Install (3 paths, terse)
3. What it does (5 capability areas; replaces 12 scattered sections)
4. Skills (categorized one-liners; 35 lines, was ~200)
5. How it works (one coherent flow; replaces 4 overlapping sections: Architecture, Knowledge Model, Knowledge Graph, Search, Why It Works)
6. Commands (terse cheatsheet; every command, one line each)
7. Docs (link map; points to docs/ for the heavy stuff)
8. Origin / Contributing / License
**Cut entirely** (moved or deleted):
- 4 'New in vX.Y' leads (→ CHANGELOG.md is the changelog)
- 16 (vX.Y) version tags in section headings
- Minions stats subsection (subsumed into hook + 'durable background work')
- Voice section (was 12 lines of brand prose)
- Engine Architecture detail (→ docs/architecture/)
- File Storage section (→ docs/guides/storage-tiering.md)
- Per-skill marketing prose (one-liner per skill in the table)
The README is no longer the changelog. Future releases append to CHANGELOG.md; the README only changes when a structural capability does. llms-full.txt regenerated.
Privacy check + verify gate pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two fixes in one:
1. **Markdown bug fix.** The OAuth 2.1 paragraph had `+ PKCE,` on a line
start (column 1), which GitHub-flavored markdown interprets as a list
marker — the line break before it broke the paragraph and rendered as
an orphan first line followed by a bullet. Rewrote the OAuth 2.1
capabilities as inline-comma-separated, escaped the `+` semantics.
Swept the whole file for the same bug class — no other instances.
2. **Maximum-sell mode for evals.** Surveyed every published benchmark
in both this repo and ~/git/gbrain-evals. Strongest evidence pulled
to the top:
- **97.60% R@5 on the public LongMemEval _s (500 questions).** No LLM
in the retrieval loop. $0.50 per 1000 queries. Beats MemPalace raw
by a point on the same dataset, beats every academic dense
retriever (Stella, Contriever, BM25). Mastra/Supermemory measure
a different metric (QA accuracy with LLM judge) — flagged honestly.
- **+31.4 points P@5 from the self-wiring knowledge graph** on
BrainBench v0.20.0 (240-page rich-prose corpus, 145 relational
gold queries). Separable, measured, load-bearing. Zero retrieval
regression across seven releases (v0.16 → v0.20).
New '## Benchmarks' section after Install:
- Public benchmark table with cross-system comparison
- In-house BrainBench scorecard with per-adapter Δ vs gbrain
- Source-swamp resistance result (93.3% top-1 vs 80% grep-only)
- Skill/prompt compression: 25KB → 13KB AGENTS.md, +13-17pp accuracy
across Opus 4.7 / Sonnet 4.6 / Haiku 4.5
- 'Run your own evals' subsection with copy-pasteable commands for
every eval surface (longmemeval, cross-modal, eval capture/replay,
BrainBench)
Tightened the lead's cost-comparison claim to what's defensible per
the underlying eval doc (MemPal LLM-rerank $0.001/q vs gbrain
$0.0005/q; dropped the overstated '6x' I'd written initially).
Privacy + verify gate + build-llms test all pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
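The markdown bug class from fix 1 (a `+` at the start of a wrapped line being read as a bullet marker) can be swept for with a small heuristic. This is an illustrative sketch, not the check that was actually run: the function name is hypothetical, and it only looks at the immediately preceding line, skipping fenced code blocks.

```typescript
// Returns 1-based line numbers where a "+ ..." line directly follows prose,
// which a markdown renderer would treat as the start of a bullet list.
function findAccidentalPlusBullets(markdown: string): number[] {
  const lines = markdown.split(/\r?\n/);
  const listItem = /^ {0,3}(?:[-+*]|\d{1,9}[.)])\s+\S/;
  const plusItem = /^ {0,3}\+\s+\S/;
  const fence = /^ {0,3}(?:```|~~~)/;
  const heading = /^ {0,3}#/;
  const suspects: number[] = [];
  let inFence = false;
  lines.forEach((line, index) => {
    if (fence.test(line)) {
      inFence = !inFence; // toggle on opening and closing fences
      return;
    }
    if (inFence) return;
    const prev = index > 0 ? lines[index - 1] ?? "" : "";
    const prevIsProse =
      prev.trim() !== "" && !listItem.test(prev) && !heading.test(prev) && !fence.test(prev);
    if (plusItem.test(line) && prevIsProse) {
      suspects.push(index + 1);
    }
  });
  return suspects;
}
```

A deliberate `+` list that follows a blank line, or that continues an existing list, is not flagged; only a `+` line that interrupts a paragraph is.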
…o 'Receipts on the evals'
Previous lead dumped metric acronyms (R@5, P@5, P@5 deltas, MemPalace, Stella, Contriever, BM25) before the reader knew what gbrain does. A 'somewhat technical' reader hits the wall of jargon and bounces. Rewritten:
**Lead (jargon-free, 3 paragraphs)**: describes the value in plain English, with two anchor numbers:
- 'right answer in top 5 results 97.6% of the time' (not 'R@5 97.60%')
- 'roughly 4x more relevant than plain vector RAG' (not '+31.4 pts P@5')
- 'better than every comparable system that doesn't pay for a language-model call on every retrieval' (the load-bearing honest framing, without naming the competitors mid-hook)
- ends with '[Receipts on the evals →]' linking down
**'## Benchmarks' renamed '## Receipts on the evals'** with a glossary at the top defining R@5, P@5, and 'no LLM in the loop' in one line each. Then the full tables: LongMemEval cross-system (with the metric-mismatch flag for Mastra/Supermemory), in-house BrainBench scorecard, source-swamp resistance, and prompt compression. The competitor names + metrics stay here where readers who want the receipts can find them, with the glossary so the acronyms don't tax cold readers.
Net: lead reads as 'here's what it does and the proof' instead of 'here are the benchmark numbers, figure out what they mean.'
Comparison facts unchanged. Privacy + verify gate + build-llms test all pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a-v1 # Conflicts: # CHANGELOG.md # VERSION # package.json # src/commands/doctor.ts # test/doctor.test.ts
Two specific edits from user feedback:
1. 'the standard public benchmark for AI memory systems' → 'LongMemEval'
(linked to the HuggingFace dataset). The benchmark has a name; use it.
2. 'Built by the President and CEO of Y Combinator to run his own AI
agents' (passive third-person) → 'I'm the President and CEO of Y
Combinator, and I use this 16 hours a day' (active first-person).
Carried the voice change through the rest of the README — the
downstream 'Garry's personal agent' line and the Origin section's
'Garry Tan needed... he'd ever drafted... so he built one' all flip
to first person ('my personal agent', 'I needed', 'I'd ever drafted',
'so I built one'). The README is now consistently first-person from
the author's voice instead of a hagiographic third-person framing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three deployment patterns documented:
1. Single GBrain server + thin MCP clients (recommended). Tailscale private networking, OAuth scope, source-scoped clients, exhaustive what-clients-can/cannot-do lists.
2. Local PGLite + GStack for per-worktree code search.
3. Federated repos (advanced): multiple servers indexing the same brain repo.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ceral query example
Self-eval as a cold reader surfaced four gaps blocking a 10/10 first read:
1. Lead never says WHAT it is technically: CLI? service? cloud? local? Added a "What it is, technically" block right after the hook: open-source MIT, Bun CLI + MCP server, local-first, data stays on disk, MCP-native.
2. Install path optimized for committed users not evaluators. The old "recommended" path (deploy OpenClaw on Render, 8GB RAM) blocked anyone trying gbrain for the first time. Reordered into 3 paths by commitment: 60-second standalone CLI first, MCP for Claude Code / Cursor second, full agentic install third.
3. No example output showing what success looks like. Added a real sample `gbrain query` invocation with the hybrid-search result format so a reader can feel the experience before they install.
4. Privacy / data-locality unaddressed in lead. Now stated up front: embedding calls only hit external APIs if you configure them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on May 18, 2026
Master shipped v0.36.0.0 (skillpack scaffold / reference / harvest; retired managed-block install, #1130), a naming overlap with this branch's slot. This branch's slot stays 0.36.1.0 (already higher); master's v0.36.0.0 entry preserved in CHANGELOG.
VERSION trio resolved: my 0.36.1.0 wins over master's 0.36.0.0 on VERSION, package.json, and CHANGELOG.md top entry. llms-full.txt regenerated.
All other files auto-merged cleanly (CLAUDE.md, README.md, skills/RESOLVER.md, etc).
Verification:
- bun run typecheck: green
- bun install: lockfile up to date
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on May 19, 2026
…1136) * feat(dims): OpenAI text-embedding-3 Matryoshka range validation (D13) dimsProviderOptions now fail-loud at the embed boundary when the configured embedding_dimensions is outside the model's native range (1..1536 for -small, 1..3072 for -large). Paste-ready fix hint in the AIConfigError.fix field. Closes the silent-HTTP-400 path that would have bit OpenAI-fallback users on v0.36.0.0 ZE-default installs. 16 new test cases in test/ai/dims-openai.test.ts pinning the contract across native-openai and openai-compatible adapter paths. * feat(ai): flip defaults to ZeroEntropy zembed-1 1280d + zerank-2 reranker Default embedding model is now zeroentropyai:zembed-1 at 1280d via Matryoshka. Real-corpus benchmark: 2.2x faster than OpenAI, 2.6x cheaper at regular pricing, wins 11/20 head-to-head queries. 1280 is the closest valid ZE Matryoshka step to the prior OpenAI 1536d default (valid set: 2560/1280/640/320/160/80/40). 1024 (Voyage's step) is NOT on ZE's list — pinned by AIConfigError fail-loud in dims.ts. balanced mode bundle now defaults reranker_enabled=true. zerank-2 reshuffles 60% of top-1 results in benchmarks. Missing-key fail-open contract in src/core/search/rerank.ts handles unauthenticated cases. Opt out with: gbrain config set search.reranker.enabled false Existing tests updated (gateway.test.ts, search-mode.test.ts) and a new test/balanced-reranker-default.test.ts (10 cases) pins the fail- open invariants. * feat(retrieval-upgrade): RetrievalUpgradePlanner + interactive prompt UX New src/core/retrieval-upgrade-planner.ts is the consolidated planner that computes the brain's pending retrieval-upgrade work (chunker bumps + ZE switch) in one pass and applies the schema transition + config updates atomically. Tagged-union ApplyResult enum (D15): 'applied' | 'skipped_already_ applied' | 'skipped_no_work' | 'declined' | 'planned' | 'failed'. No string-parsing reasons. 
Three config keys (D12): ze_switch_prompt_shown (UI state), ze_switch_requested (user intent), ze_switch_applied (work done). Plus ze_switch_previous_snapshot (JSON, full prior config for --undo per D16) and ze_switch_declined_at (90-day re-ask window). Schema transition (D18) is atomic: DROP indexes + ALTER COLUMN + CREATE INDEX inside a single engine.transaction(). HNSW recreation is part of the same transaction — no silent slow-search window. C3 eligibility logic: ze_switch_offered iff NOT on ZE + NOT declined recently + NOT applied + (legacy default OR >100 pages). C4 cost math: MAX(chunker_pending, dim_pending) not SUM — one re-embed pass invalidates both surfaces simultaneously. New src/core/retrieval-upgrade-prompt.ts wires the planner to a TTY-only interactive prompt with two-line cost split (D10) and privacy callout for the reranker flip. Tests: test/retrieval-upgrade-planner.test.ts (24 cases) pins the state machine. test/asymmetric-encoding-contract.test.ts (6 cases) pins D17: search read path uses gateway.embedQuery() not embed(), asserted via __setEmbedTransportForTests mock. * feat(cli): gbrain ze-switch — manual lever for the ZE switch New gbrain ze-switch CLI with --dry-run, --json, --resume, --force, --undo, --non-interactive, --confirm-reembed, --ignore-missing-key flags. Mirrors the upgrade prompt's UX symmetry: --undo presents a cost-warning before re-embedding back to the prior width. src/cli.ts: dispatch case + CLI_ONLY entry. ze-switch owns its own engine lifecycle (mirrors the doctor pattern). test/ze-switch-cli.test.ts (11 cases): --help, --dry-run, --json, --non-interactive, --ignore-missing-key, --resume, --undo, --confirm-reembed. Uses captureExit harness to test process.exit() paths without breaking the test process. * feat(doctor): ze_embedding_health + embedding_width_consistency checks Two new doctor checks (D-A5): ze_embedding_health: when embedding_model starts with zeroentropyai:, verify ZEROENTROPY_API_KEY is set (env or config). 
Paste-ready setup hint with the signup URL on failure. embedding_width_consistency: cross-check that the configured embedding_dimensions matches the actual vector(N) column width on content_chunks.embedding. Catches the half-applied switch state (schema migrated but config write crashed) with a paste-ready gbrain ze-switch --resume hint. Wired into runDoctor between reranker_health and the existing sync_freshness checks. Both checks gracefully no-op on non-ZE embedding configs. test/doctor-ze-checks.test.ts (8 cases) pins both checks across happy + missing-key + missing-config + drift paths. Uses withEnv() helper to clear ZEROENTROPY_API_KEY for the no-key path so tests are hermetic against contributor env state. test/e2e/v0_28_5-fix-wave.test.ts + test/openai-compat-multimodal.test.ts: updated to explicit-configure the gateway when the test depends on specific dims that diverge from the v0.36.0.0 default (1280d). * docs: README zero-based rewrite (884 -> 139 lines) + new docs files Strip 4 months of accreted "New in v0.X.Y" hero blocks and reorganize around what gbrain does today. 33 H2s -> 8. The Commands section (136 lines duplicating gbrain --help) moved out; the 6-table skills enumeration collapsed to a one-paragraph capability description with a link to skills/RESOLVER.md. Hero retains load-bearing facts: OpenClaw + Hermes credit, production numbers (17,888 pages / 4,383 people / 723 companies), BrainBench numbers (P@5 49.1% / R@5 97.9% / +31.4 lift), ZE comparison numbers, 30-min install claim. Adds one paragraph announcing the v0.36.0.0 ZE default with the explicit gbrain config set escape for OpenAI/Voyage users. New files: - docs/INSTALL.md: every install path consolidated (agent platform, CLI standalone, MCP server). Thin-client mode covered. - docs/architecture/RETRIEVAL.md: why the hybrid + graph stack works. BrainBench numbers, why each strategy alone fails, the source-aware ranking + intent classification + multi-query expansion story. 
- docs/ethos/ORIGIN.md: origin story lifted from the old README so the front door stays factual + concrete. test/readme-hero-anchors.test.ts (5 cases) is the D9 regression guard. Five load-bearing strings: OpenClaw, Hermes, ZE, production-numbers regex, P@5/R@5. Light anchors that let voice/ structure evolve but block accidental loss of headline facts. scripts/check-test-real-names.sh: allowlist entries for OpenClaw + Hermes literals in the anchor test (it explicitly asserts those strings appear in README). * chore: bump version and changelog (v0.36.0.0) ZeroEntropy as the new default for embedding (zembed-1 at 1280d via Matryoshka) and reranker (zerank-2 cross-encoder, on by default in balanced mode bundle). README zero-based rewrite (884 -> 139 lines). 3 new docs files. Two new doctor checks. New gbrain ze-switch CLI with --undo for symmetric reversibility. skills/migrations/v0.36.0.0.md tells the agent how to surface the retrieval-upgrade prompt post-upgrade. llms-full.txt regenerated via bun run build:llms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(docs): scrub Wintermute from RETRIEVAL.md per privacy rule * chore: rebump version 0.36.0.0 → 0.36.2.0 (queue collision) Three open PRs were claiming v0.36.0.0 (#1130 skillpack, #1139 hindsight, #1136 this PR). Ship-aware queue allocator says this branch lands at v0.36.2.0. Trio audit: VERSION 0.36.2.0 package.json 0.36.2.0 CHANGELOG ## [0.36.2.0] - 2026-05-17 Updates: VERSION, package.json, CHANGELOG header + body refs, README "New default in v0.36.2.0" announcement + credit line, skills/migrations/v0.36.0.0.md renamed to v0.36.2.0.md with frontmatter + body refs updated. llms-full.txt regenerated. * fix(test): pin gateway dim=1536 in cross-file-stateful PGLite tests CI shard 1 reported 10 failures across `query-cache.test.ts` (6) and `consolidate-valid-until.test.ts` (4). 
Both files hardcode 1536-dim vectors but rely on `PGLiteEngine.initSchema()` to size `vector(__EMBEDDING_DIMS__)` at the right width. Root cause: v0.36.2.0 flipped DEFAULT_EMBEDDING_DIMENSIONS from 1536 to 1280 (ZE Matryoshka step). The gateway module is process-singleton; when ANOTHER test file in the same shard's bun-test process configures the gateway before us, `pglite-engine.ts:216` reads `getEmbeddingDimensions() === 1280` and sizes the schema columns at vector(1280). The hardcoded 1536-dim INSERTs then fail with "expected 1280 dimensions, not 1536". Locally these tests pass in isolation because the gateway falls back through the try/catch at pglite-engine.ts:218 (1536 default). CI runs multiple test files in one process, so cross-file state poisons the schema width. Fix: explicit `resetGateway()` + `configureGateway({embedding_dimensions: 1536, ...})` at the top of `beforeAll`, plus `resetGateway()` in `afterAll`. Pins the schema width regardless of cross-file state. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
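The fail-loud dimension check described in the first two bullets of this squash commit can be sketched as below. The ranges (1..1536 for -small, 1..3072 for -large) and the ZeroEntropy step list come from the commit text; the class name, the model-id parsing, and the wording of the `fix` hint are hypothetical stand-ins for AIConfigError and dimsProviderOptions.

```typescript
// Stand-in for AIConfigError; only the `fix` field is mentioned in the commit.
class AIConfigErrorSketch extends Error {
  fix: string;
  constructor(message: string, fix: string) {
    super(message);
    this.name = "AIConfigErrorSketch";
    this.fix = fix;
  }
}

// Valid Matryoshka steps for zembed-1, as listed in the commit message.
const ZE_STEPS: number[] = [2560, 1280, 640, 320, 160, 80, 40];

// Native upper bounds for the OpenAI text-embedding-3 family.
const OPENAI_MAX_DIMS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
};

// Throws before any embed call when the configured width cannot work.
function validateEmbeddingDims(model: string, dims: number): void {
  // Assumes ids look like "provider:model"; a bare model name is accepted too.
  const name = model.includes(":") ? model.slice(model.indexOf(":") + 1) : model;
  if (name === "zembed-1" && !ZE_STEPS.includes(dims)) {
    throw new AIConfigErrorSketch(
      `${dims} is not a valid zembed-1 width`,
      `set embedding_dimensions to one of: ${ZE_STEPS.join(", ")}`,
    );
  }
  const max = OPENAI_MAX_DIMS[name];
  if (max !== undefined && (!Number.isInteger(dims) || dims < 1 || dims > max)) {
    throw new AIConfigErrorSketch(
      `${dims} is outside the native range 1..${max} for ${name}`,
      `set embedding_dimensions to a value between 1 and ${max}`,
    );
  }
}
```

Validating at the embed boundary turns what the commit calls a silent HTTP 400 into an immediate error with a paste-ready hint. Models the sketch does not recognize pass through unchecked.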
brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request on May 29, 2026
* upstream/master:
  - v0.37.0.0 feat(skillpack): registry cathedral - third-party publish + install + 10/10 quality bar (garrytan#1208)
  - v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)
  - v0.36.5.0 feat: secure DATABASE_URL access for shell jobs (inherit: ["database_url"]) (garrytan#1192)
  - v0.36.4.0 feat: brain-health-100 - autonomous remediation via doctor --remediate + Minions (garrytan#1193)
  - fix(docs): comprehensive drift audit - contradictions, broken links, stale refs (garrytan#1201)
  - v0.36.3.0 feat: dynamic embedding column selection for search (garrytan#1164)
  - v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite (garrytan#1136)
  - v0.36.1.1 fix-wave: community PR triage + 28 atomic fixes (garrytan#1182)
  - v0.36.1.0 Hindsight calibration wave: brain learns how you tend to be wrong (garrytan#1139)
  - v0.36.0.0 feat(skillpack): scaffold + reference + harvest (retire managed-block install) (garrytan#1130)
  - v0.35.8.0 feat(cycle): phantom-page redirect inside extract_facts (garrytan#1138)
  - v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4) (garrytan#1131)
  - v0.35.6.0 feat(search): floor-ratio gate for metadata boost stages (closes garrytan#1091) (garrytan#1129)
  - v0.35.5.1 fix(doctor): stop counting clean supervisor exits as crashes (garrytan#1108)
  - v0.35.5.0 fix wave: bootstrap + orphans + think MCP + worktree + walker (garrytan#1111)
  - v0.35.4.0 fix(doctor,entities): supervisor crash classification + bare-name resolver + 58x perf + stub guard observability (garrytan#1085)
  - v0.35.3.1 feat(eval): temporal-aware contradiction probe + verdict enum (garrytan#1052)
  - v0.35.3.0 fix wave: extract_facts items + git --no-recurse-submodules placement (garrytan#1053)
# Conflicts:
# src/core/postgres-engine.ts
# test/schema-bootstrap-coverage.test.ts
Summary
Skillpacks are scaffolding now, not amber. Scaffold once, own the files, fork freely.
v0.36 retires the managed-block install model that shipped in v0.19 and accumulated machinery through v0.32.
`install` and `uninstall` are gone, replaced by six new subcommands that treat skills as first-class code in your agent repo, not vendor packages.
New CLI surface:
- `scaffold <name>`: one-time, additive copy into your repo (refuses to overwrite). Includes paired source files declared in each SKILL.md's frontmatter `sources:` array. The partial-state policy fills in missing paired sources even when the skill dir already exists.
- `reference <name>`: read-only diff lens with an agent-readable framing line. `--apply-clean-hunks` runs a two-way auto-apply via a pure-JS unified-diff parser + applier (documented two-way limitation: no scaffold-time base tracking).
- `migrate-fence`: one-shot strip of the legacy `<!-- gbrain:skillpack:begin -->` fence. Reads the cumulative-slugs receipt and falls back to row parsing when the receipt is stale. Preserves rows verbatim as user-owned routing during the transition.
- `scrub-legacy-fence-rows`: opt-in cleanup after `migrate-fence`. Two-condition gate: removes a row only when `skills/<slug>/` exists AND the skill's frontmatter declares non-empty `triggers:`.
- `harvest <slug> --from <host-repo-root>`: the inverse loop. Lift a proven skill from `~/git/wintermute` (or any host repo) back into gbrain. Symlink-reject + canonical-path containment + default-on privacy linter (catches `\bWintermute\b`, email, Slack channel patterns) with rollback on match.
- `check --strict`: exit non-zero on bundle drift for CI gating.
Companion editorial skill `skills/skillpack-harvest/SKILL.md` drives the genericization checklist (scrub fork names, generalize triggers, lift fork-specific conventions to references) before running the harvest CLI.
`autoDetectSkillsDir` gains a `cwd_walk_up` tier ahead of `~/.openclaw/workspace`. `cd ~/git/wintermute && gbrain skillpack scaffold ...` now auto-detects wintermute without needing a `RESOLVER.md`. R5 regression preserves `$OPENCLAW_WORKSPACE` precedence.
Subtractive surface gone: managed block, cumulative-slugs receipt, content-hash gates, lockfile, prune semantics. ~600 LOC of mechanism deleted from `installer.ts` (kept around in v0.36 to back the informational `diff` command; slated for v0.37 cleanup).
11 bisectable commits.
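The two-condition gate described for `scrub-legacy-fence-rows` can be sketched as a small pure function. This is an illustration under stated assumptions, not the PR's code: `SkillState`, `shouldScrubRow`, and `partitionFenceRows` are hypothetical names, and the real command presumably reads the skill directory and the SKILL.md frontmatter from disk rather than from an in-memory map.

```typescript
// Hypothetical view of what the gate needs to know about one skill.
interface SkillState {
  dirExists: boolean; // does skills/<slug>/ exist in the host repo?
  triggers: string[]; // the frontmatter `triggers:` array (empty when undeclared)
}

// A legacy fence row is removable only when BOTH conditions hold:
// the skill directory exists AND its frontmatter declares non-empty triggers.
function shouldScrubRow(slug: string, skills: Map<string, SkillState>): boolean {
  const state = skills.get(slug);
  if (state === undefined) return false; // unknown skill: keep the row
  return state.dirExists && state.triggers.length > 0;
}

// Split fence rows (identified by slug here) into kept and removed, preserving order.
function partitionFenceRows(
  slugs: string[],
  skills: Map<string, SkillState>,
): { kept: string[]; removed: string[] } {
  const kept: string[] = [];
  const removed: string[] = [];
  for (const slug of slugs) {
    if (shouldScrubRow(slug, skills)) {
      removed.push(slug);
    } else {
      kept.push(slug);
    }
  }
  return { kept, removed };
}

// Illustrative data: "alpha" passes both conditions, "beta" has no triggers,
// and "gamma" has no skill directory at all.
const skills = new Map<string, SkillState>([
  ["alpha", { dirExists: true, triggers: ["summarize meeting"] }],
  ["beta", { dirExists: true, triggers: [] }],
]);
const scrubResult = partitionFenceRows(["alpha", "beta", "gamma"], skills);

console.log(scrubResult);
```

Keeping the row whenever either condition fails matches the stated intent: a row is dropped only once frontmatter discovery provably covers the same skill.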
Test Coverage
Pre-Landing Review
Plan was reviewed via `/plan-eng-review` with codex outside-voice during plan mode (14 codex findings, all addressed in plan or implementation). All 16 implementation tasks complete; design surface matches the locked plan. No new findings in code review.
Eval Results
No prompt-related files changed — evals skipped.
Plan Completion
All 16 tasks from `~/.claude/plans/system-instruction-you-are-working-glittery-crab.md`:
Verification Results
No dev server flow needed: skillpack is filesystem-only. Verified via E2E real-subprocess tests (`test/e2e/skillpack-flow.test.ts`, 9 cases covering scaffold first-run + re-run no-op, reference diff, `reference --apply-clean-hunks`, migrate-fence, scrub-legacy-fence-rows, harvest privacy-lint catch, `harvest --no-lint` bypass, install removed-error).
TODOS
All three TODOs originally deferred during plan-eng-review were folded into this PR as bisect commits per user direction:
- `scrub-legacy-fence-rows` command
- `reference --apply-clean-hunks`
Documentation
- `docs/guides/skillpacks-as-scaffolding.md` (model + workflow)
Test plan
`bunx tsc --noEmit`)
🤖 Generated with Claude Code