v0.41.31.0 feat(embed): delta-aware sync --all cost gate + real stale-embedding semantics #1632
Merged
…dcoded OpenAI
The sync --all cost gate computed spend from a hardcoded
EMBEDDING_COST_PER_1K_TOKENS = 0.00013 (OpenAI text-embedding-3-large) and
labeled the preview with the back-compat EMBEDDING_MODEL constant, regardless
of the actually-configured embedding model. A brain running a cheaper model
(e.g. zeroentropyai:zembed-1 @ $0.05/Mtok) saw a preview that named the wrong
provider and over-stated spend ~2.6x ($337 vs $130 on a 2.6B-token corpus).
estimateEmbeddingCostUsd now resolves the live model via the gateway and
prices it through embedding-pricing.ts (the existing per-provider:model
table), falling back to the OpenAI rate only when the gateway is unconfigured
(unit-test context) or the model is unknown. sync.ts surfaces the real model
name in the preview message and JSON.
Regression test pins model-aware pricing: openai 3-large vs zembed-1 must
produce materially different previews; collapsing both to the OpenAI number
fails the assertion.
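The pricing change described above can be illustrated with a minimal sketch. It is not the code in embedding-pricing.ts: the table shape, the function name `previewEmbeddingCost`, and the `"unknown"` label are hypothetical, and only the two rates quoted in the commit message are used.

```typescript
// Hypothetical price table keyed by "provider:model", in USD per million tokens.
// Only the two rates quoted in the commit message are included.
const PRICE_PER_MTOK_USD = new Map<string, number>([
  ["openai:text-embedding-3-large", 0.13], // 0.00013 USD per 1K tokens
  ["zeroentropyai:zembed-1", 0.05],
]);

// Rate substituted when no model is configured or the model is not in the table.
const FALLBACK_PRICE_PER_MTOK_USD = 0.13;

interface CostPreview {
  model: string;         // name surfaced in the preview ("unknown" if none configured)
  usd: number;           // estimated spend
  usedFallback: boolean; // true when the OpenAI rate was substituted
}

function previewEmbeddingCost(tokens: number, configuredModel?: string): CostPreview {
  const price =
    configuredModel === undefined ? undefined : PRICE_PER_MTOK_USD.get(configuredModel);
  const rate = price ?? FALLBACK_PRICE_PER_MTOK_USD;
  return {
    model: configuredModel ?? "unknown",
    usd: (tokens / 1_000_000) * rate,
    usedFallback: price === undefined,
  };
}

// Same corpus, two models: the previews must differ materially.
const corpusTokens = 2_600_000_000;
const openaiPreview = previewEmbeddingCost(corpusTokens, "openai:text-embedding-3-large");
const zembedPreview = previewEmbeddingCost(corpusTokens, "zeroentropyai:zembed-1");
console.log(openaiPreview.usd.toFixed(2), zembedPreview.usd.toFixed(2));
```

With exactly 2.6B tokens the two rates give about $338 and $130, close to the figures quoted in the commit message and the same ~2.6x ratio.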
…elta-aware inline gate
Under federated_v2 (default), sync --all DEFERS embedding to per-source
embed-backfill jobs that already cap spend at $25/source/24h. The v0.20
cost gate predated that cap and fired ConfirmationRequired + exit 2 on
EVERY non-TTY sync --all without --yes, regardless of cost — blocking
nightly crons over already-synced corpora and forcing permanent --yes.
The gate is now mode-aware:
- Deferred embed (v2 default): print an FYI deferred notice (cap-aware,
"not charged by this sync") + the stale-chunk backlog estimate, and
NEVER exit 2. The backfill cap is the real money gate.
- Inline embed (v2 off, or --serial without --no-embed): keep the
blocking gate, but estimate the actual delta — full-tree ceiling for
changed sources (unchanged sources contribute 0 via the same git +
chunker_version "do work?" gate doctor/sync use) + stale backlog — and
block only when it exceeds the new configurable floor
sync.cost_gate_min_usd (default $0.50).
New pure helpers in embedding.ts (willEmbedSynchronously, shouldBlockSync)
keep the decision logic hermetically testable. New engine method
sumStaleChunkChars (both engines) prices the embedding backlog via
estimateCostFromChars. estimateSyncAllCost's per-source walk extracted to
estimateSourceTreeTokens (reused by the inline estimator).
Regressions pinned: R-1 deferred non-TTY never exit 2 (headline), R-2
inline above-floor still exit 2 (protection), plus the willEmbedSynchronously
/ shouldBlockSync matrix and sumStaleChunkChars engine + scope + embed_skip
coverage.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
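As an illustration of the mode-aware decision in the commit above, here is a minimal sketch of the two pure helpers. The names match the commit message, but the signatures, the `SyncOptions` shape, and the treatment of `--no-embed` are assumptions; it models the non-TTY case only and is not the code in embedding.ts.

```typescript
// Hypothetical option shape mirroring the flags named in the commit message.
interface SyncOptions {
  federatedV2: boolean; // federated_v2 (on by default)
  serial: boolean;      // --serial
  noEmbed: boolean;     // --no-embed
  yes: boolean;         // --yes
}

// Inline embedding happens when v2 is off, or with --serial.
// Assumption: --no-embed disables inline embedding in every mode.
function willEmbedSynchronously(opts: SyncOptions): boolean {
  if (opts.noEmbed) return false;
  return !opts.federatedV2 || opts.serial;
}

// Block (ConfirmationRequired + exit 2) only for inline embedding, without
// --yes, when the estimate exceeds the sync.cost_gate_min_usd floor.
function shouldBlockSync(opts: SyncOptions, estimateUsd: number, floorUsd = 0.5): boolean {
  if (!willEmbedSynchronously(opts)) return false; // deferred: backfill cap is the money gate
  if (opts.yes) return false;
  return estimateUsd > floorUsd;
}

const deferredCron: SyncOptions = { federatedV2: true, serial: false, noEmbed: false, yes: false };
const inlineCron: SyncOptions = { federatedV2: false, serial: false, noEmbed: false, yes: false };
console.log(shouldBlockSync(deferredCron, 1000)); // R-1: deferred never blocks
console.log(shouldBlockSync(inlineCron, 0.51));   // R-2: inline above the floor blocks
```

Keeping both helpers free of I/O is what makes the R-1/R-2 matrix testable without a database or a git checkout.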
…ation v108)
Pre-v0.41.30 "stale" meant only `embedding IS NULL`, so swapping the
embedding model or dimensions left the whole corpus silently embedded under
the OLD model — `embed --stale` ignored it and search quality quietly
degraded.
New `pages.embedding_signature` (TEXT, migration v108) stamps the embedding
provenance (`<provider:model>:<dims>`) whenever a page's chunks are embedded.
A later model/dims swap makes the stored signature differ from the current
one, which the embed paths now detect and re-embed.
GRANDFATHER (critical): the stale predicate is
`embedding IS NULL OR (embedding_signature IS NOT NULL AND <> $current)`
so a NULL signature is NEVER stale. After the migration every existing page
has NULL → none flagged → the next `embed --stale` does NOT re-embed the
whole corpus. Signatures are stamped going forward only.
Surface:
- countStaleChunks / sumStaleChunkChars gain an optional `signature` opt
that widens staleness (read-only; used by the dry-run preview + the
sync cost preview, which is now signature-aware).
- invalidateStaleSignatureEmbeddings(signature, sourceId?) NULLs the
embeddings of signature-mismatched pages so the EXISTING NULL-embedding
cursor (listStaleChunks, untouched) re-embeds them — keeps the keyset
pagination logic intact.
- setPageEmbeddingSignature stamps after a page's chunks land.
- Both embed loops wired: `gbrain embed --stale`/`--all` (embed.ts) and the
embed-backfill minion (embed-stale.ts) invalidate-then-stamp.
Migration v108 + bootstrap probe (both engines) + REQUIRED_BOOTSTRAP_COVERAGE
entry. Pinned by test/embedding-signature-stale.test.ts (R-4 grandfather,
mismatch detection, matching no-op, scoped invalidate, stamp) + the
bootstrap-coverage gate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
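The grandfather rule in the commit above is easy to get backwards, so here is a minimal in-memory sketch of the stale predicate. The real check is a SQL predicate; the `PageRow` shape and function names below are hypothetical.

```typescript
// Builds the provenance string in the <provider:model>:<dims> format.
function embeddingSignature(providerModel: string, dims: number): string {
  return `${providerModel}:${dims}`;
}

// Hypothetical row shape: whether a vector exists, plus the stamped signature.
interface PageRow {
  hasEmbedding: boolean;
  signature: string | null;
}

// Mirrors: embedding IS NULL OR (embedding_signature IS NOT NULL AND <> $current)
function isStale(row: PageRow, current: string): boolean {
  if (!row.hasEmbedding) return true;
  // A NULL signature is grandfathered: it never counts as stale.
  return row.signature !== null && row.signature !== current;
}

const current = embeddingSignature("zeroentropyai:zembed-1", 1280);
const rows: PageRow[] = [
  { hasEmbedding: false, signature: null },                                // never embedded
  { hasEmbedding: true, signature: null },                                 // pre-migration page
  { hasEmbedding: true, signature: "openai:text-embedding-3-large:1536" }, // old model
  { hasEmbedding: true, signature: current },                              // up to date
];
console.log(rows.map((r) => isStale(r, current)));
```

Only the first and third rows are stale; the pre-migration page with a NULL signature is left alone, which is why the upgrade does not re-embed the whole corpus.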
…rred notice
Under federated_v2, `sync --all` exits 0 and embedding lags behind in
embed-backfill jobs (subject to cooldown + the per-source 24h cap). Pre-fix an
operator had no signal those jobs were queued or lagging: the sync looked
"done" while embeddings trickled in later.
`gbrain sources status` now shows a BACKFILL column per source
(active(N)/queued(N)/idle) plus the last completion timestamp, read from
minion_jobs. The deferred-sync notice appends "N backfill job(s) queued" so a
cron operator sees work is enqueued, not lost. Both reads are best-effort: a
brain that never ran a worker (no minion_jobs table) reports idle/0 instead of
crashing the dashboard.
SyncStatusReportSource gains backfill_queued / backfill_active /
backfill_last_completed_at (additive; JSON envelope schema_version unchanged).
Pinned by a new case in test/e2e/sync-status-pglite.test.ts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
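A minimal sketch of how the BACKFILL cell and its best-effort read could behave. The function names, the `BackfillCounts` shape, and the choice to show active ahead of queued are assumptions for illustration, not the project's rendering code.

```typescript
interface BackfillCounts {
  active: number;
  queued: number;
}

// Best-effort read: any failure (e.g. no minion_jobs table yet) degrades to zero counts.
function readCountsBestEffort(read: () => BackfillCounts): BackfillCounts {
  try {
    return read();
  } catch {
    return { active: 0, queued: 0 };
  }
}

// Renders active(N) / queued(N) / idle.
// Assumption: active takes precedence when both counts are non-zero.
function backfillLabel(counts: BackfillCounts): string {
  if (counts.active > 0) return `active(${counts.active})`;
  if (counts.queued > 0) return `queued(${counts.queued})`;
  return "idle";
}

// A brain that never ran a worker: the read throws, the dashboard still renders.
const neverRanWorker = readCountsBestEffort(() => {
  throw new Error('relation "minion_jobs" does not exist');
});
console.log(backfillLabel(neverRanWorker));
console.log(backfillLabel({ active: 0, queued: 3 }));
```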
…e EXPECTED_PHASES
Commit 3 made embed.ts import currentEmbeddingSignature from embedding.ts.
Four tests mock.module the whole embedding.ts and omitted the new export, so
embed.ts (imported transitively) failed at load with "Export named
'currentEmbeddingSignature' not found". Add the export to each mock:
embed.serial.test.ts, e2e/cycle.test.ts, e2e/dream.test.ts,
e2e/dream-cycle-phase-order-pglite.test.ts.
Also sync the stale EXPECTED_PHASES in dream-cycle-phase-order-pglite.test.ts
to match cycle.ts ALL_PHASES: extract_atoms, synthesize_concepts, and
conversation_facts_backfill drifted in after the test was last touched
(v0.41.0.0) and were never added, so both phase-order assertions were failing
on the branch before this wave (confirmed against 0906ab0). The dry-run cycle
emits all 20 phases, so mirroring the constant makes both assertions pass.
Pre-existing, unrelated: cycle.test.ts / dream.test.ts have 5 runCycle
failures via direct `bun test` (the conversation_facts_backfill phase uses the
module-singleton getConnection), present identically at 0906ab0 and not
touched by this wave.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…iring
Ship-workflow coverage audit flagged two gaps:
- R-3 (mandatory regression) had no dedicated test: the inline
unchanged-source short-circuit requires git-unchanged AND chunker_version
match, but nothing pinned the chunker half. Add a case where git is unchanged
(HEAD==last_commit, clean) but chunker_version is stale → estimate still
fires (exit 2), plus a control where chunker matches → short-circuits to $0
(no block).
- The embed loops' setPageEmbeddingSignature call-site was only kept green by
the mock, never asserted. Add a test that runs `embed --all` and asserts the
stamp fires once per page with the current signature.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
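The R-3 case above comes down to a two-part "do work?" check. The following sketch shows the idea under stated assumptions: `SourceState`, `isUnchanged`, and `estimateNewContentTokens` are hypothetical names, and the full-tree token count stands in for the per-source ceiling the estimator uses.

```typescript
// Hypothetical per-source state used by the inline estimator.
interface SourceState {
  head: string;           // current git HEAD
  lastCommit: string;     // commit recorded at the last sync
  cleanTree: boolean;     // no uncommitted changes
  chunkerVersion: number; // chunker_version stored for the source
  fullTreeTokens: number; // full-tree ceiling if the source must be re-processed
}

// A source is free only when git is unchanged AND the chunker version is current.
function isUnchanged(s: SourceState, currentChunkerVersion: number): boolean {
  const gitUnchanged = s.head === s.lastCommit && s.cleanTree;
  return gitUnchanged && s.chunkerVersion === currentChunkerVersion;
}

// Unchanged sources contribute 0; changed sources contribute their full-tree ceiling.
function estimateNewContentTokens(sources: SourceState[], currentChunkerVersion: number): number {
  return sources
    .filter((s) => !isUnchanged(s, currentChunkerVersion))
    .reduce((sum, s) => sum + s.fullTreeTokens, 0);
}

const base = { head: "abc123", lastCommit: "abc123", cleanTree: true, fullTreeTokens: 50_000 };
const staleChunker: SourceState = { ...base, chunkerVersion: 3 };
const upToDate: SourceState = { ...base, chunkerVersion: 4 };
console.log(estimateNewContentTokens([staleChunker], 4)); // R-3: git-unchanged is not free
console.log(estimateNewContentTokens([upToDate], 4));     // control: short-circuits to 0
```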
… paths (F1); inline cost gate counts new-content only (F2)
Adversarial review caught that the stale-detection feature was inert for
non-federated/inline brains: the embed-write paths that DON'T go through
embed.ts/embed-stale.ts never stamped pages.embedding_signature.
F1 — stamp at the remaining write sites:
- embedPage (gbrain embed <slug> + sync's post-import runEmbedCore({slugs}))
- importFromContent markdown branch (inline import/sync embed + gbrain import)
- importCodeFile (only when EVERY chunk was freshly embedded this call —
reuse-by-hash carries old-model vectors, so a mixed page stays unstamped
rather than falsely marked current)
Without this, inline-synced pages kept NULL signatures → grandfathered → never
re-embedded on a model/dims swap. Now all embed-write paths stamp.
F2 — coupled regression the F1 fix would otherwise introduce: the inline cost
gate added the stale backlog (NULL + signature drift) into the BLOCKING cost,
but `gbrain sync` inline only embeds new/changed content — the backlog is
`gbrain embed --stale`'s job. Once F1 gives inline brains real signatures, a
model swap would inflate the inline gate and block the next cron for cost the
sync never incurs. Inline blocking cost is now new-content only; the stale
backlog is shown informationally ("pending gbrain embed --stale"). Deferred
path keeps the signature-aware backlog FYI (the backfill does clear it).
Pinned by test/import-signature-stamp.serial.test.ts (inline stamp + --no-embed
NULL) and the existing R-2/R-3 inline-gate tests (still exit 2 on new-content).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…; guard partial-page signature stamping (P0b)
Codex adversarial review caught two issues in the v0.41.30 wave:
P0a: `gbrain sources status` routes through computeAllSourceMetrics
(source-health.ts), not the buildSyncStatusReport helper where the BACKFILL
column was added, so the CLI never showed it. Add per-source embed-backfill
active/queued counts to computeAllSourceMetrics (one extra FILTER on the
existing minion_jobs query) and render a BACKFILL column in `sources status`.
The deferred-sync notice's queued-job count (live sync path) already worked.
P0b: embedPage / embedAllStale / embed-stale stamped embedding_signature
unconditionally after embedding only the STALE subset of a page's chunks. A
partially-embedded page (some chunks preserved from a prior embed under
unknown/old provenance) would be falsely marked current, hiding the old
vectors from future stale detection. Now stamp only when EVERY chunk of the
page was (re)embedded this pass (toEmbed === chunks / stale === existing).
importFromContent embeds the full chunk set so it stays unconditional;
importCodeFile already had the equivalent guard. `gbrain embed --all` fully
re-embeds and stamps mixed pages.
Accepted as documented limitations (not fixed): the inline cost gate can
over-estimate a >100-file `--serial` sync that performSync will defer
(non-default mode, conservative-high bias), and model-swap invalidation NULLs
drifted vectors before re-embed (a deliberate, rare operation).
Pinned by a new backfill-counts case in test/source-health.test.ts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
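The P0b guard above can be sketched as a single predicate. This is an illustration only: `ChunkRef`, `shouldStampSignature`, and the decision to leave an empty page unstamped are assumptions, not the code in embed.ts or embed-stale.ts.

```typescript
// Hypothetical minimal view of a page's chunks.
interface ChunkRef {
  id: number;
}

// Stamp the page only when every one of its chunks was (re)embedded in this pass.
// Assumption: a page with no chunks is left unstamped.
function shouldStampSignature(pageChunks: ChunkRef[], embeddedThisPass: Set<number>): boolean {
  if (pageChunks.length === 0) return false;
  return pageChunks.every((chunk) => embeddedThisPass.has(chunk.id));
}

const page: ChunkRef[] = [{ id: 1 }, { id: 2 }, { id: 3 }];

// Only the stale subset was embedded: chunks 1 and 2 keep vectors of unknown provenance.
console.log(shouldStampSignature(page, new Set([3])));

// Full re-embed (e.g. embed --all): every chunk carries the current model's vector.
console.log(shouldStampSignature(page, new Set([1, 2, 3])));
```

Leaving a mixed page unstamped keeps its signature NULL or old, so a later full re-embed is what finally marks it current.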
Refresh CLAUDE.md Key Files + Commands for the embedding cost-model +
stale-semantics wave: model-aware cost helpers in embedding.ts
(currentEmbeddingPricePerMTok / currentEmbeddingSignature /
willEmbedSynchronously / shouldBlockSync), the embedding-signature
stale-detection engine quartet (sumStaleChunkChars /
setPageEmbeddingSignature / invalidateStaleSignatureEmbeddings + widened
countStaleChunks), migration v108, signature stamping across embed.ts /
import-file.ts, the mode-aware sync --all cost gate + sync.cost_gate_min_usd
config key, and the sources status BACKFILL column.
Add a same-dimension-swap auto-reembed note to docs/embedding-migrations.md.
Regenerate llms-full.txt.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mechanical version-string sweep across VERSION, package.json, CHANGELOG,
CLAUDE.md, docs, source/test comments, and regenerated llms bundles. No logic
change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cost-model-fix
# Conflicts:
# CHANGELOG.md
# CLAUDE.md
# VERSION
# llms-full.txt
# package.json
…5 contamination)
CI shard 5 failed deterministically with `expected 1280 dimensions, not 1536`.
Root cause: cosine-rescore-column.test.ts hardcodes 1536-dim `embedding`
vectors and asserts length 1536, but its beforeAll ran `initSchema()` with no
gateway config. initSchema sizes the `embedding` column from
getEmbeddingDimensions(), whose default is 1280 (zeroentropyai:zembed-1). The
test only passed by inheriting a leaked 1536 gateway config from an earlier
test (or, locally, from ~/.gbrain). When the v0.41.31 merge shifted the
weight-aware shard bin-packing, the file order changed so the 1280 default won
in CI → vector(1280) column → 1536 insert rejected. (Passed locally because
the dev machine's ~/.gbrain resolves 1536.)
Fix: configureGateway({ openai:text-embedding-3-large, 1536 }) in beforeAll
BEFORE connect/initSchema so the column is deterministically vector(1536)
regardless of ambient/leaked state, and resetGateway() in afterAll for
hygiene. Proven: under a forced-1280 gateway preload the old test reproduces
the exact CI error and the fixed test passes (4/4).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mgunnin added a commit to mgunnin/gbrain that referenced this pull request Jun 3, 2026
* upstream/master:
  v0.41.36.0 feat(mcp): publish agent skills (list_skills / get_skill) for thin clients (garrytan#1661)
  v0.41.35.0 feat(guardrails): vendor-neutral content guardrail seams (supersedes garrytan#1652) (garrytan#1660)
  v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence (garrytan#1657)
  v0.41.33.0 feat(search): intent-aware adaptive return-sizing + agent-facing query param (garrytan#1640)
  v0.41.32.0 fix(staleness): commit-relative sync staleness (supersedes garrytan#1623) (garrytan#1656)
  v0.41.31.0 feat(embed): delta-aware sync --all cost gate + real stale-embedding semantics (garrytan#1632)
  v0.41.30.0 fix(brainstorm/lsd): --save writes the advertised .md file via canonical ingestion path (garrytan#1655)
# Conflicts:
# src/core/operations.ts
Summary
Three things, shipped as bisectable commits on top of the v0.41.29 base.
1. `gbrain sync --all` stops blocking crons. The pre-existing cost gate emitted `ConfirmationRequired` + `exit 2` on every non-TTY run without `--yes`, based on a whole-corpus estimate, so a nightly cron over an already-synced brain failed every night. The gate is now mode-aware: when embedding is deferred to backfill jobs (the federated_v2 default), it prints an informational notice and never exits 2 (the backfill's own $25/source/24h cap is the real money gate). The blocking confirmation fires only when sync embeds inline, and only when the new-content estimate exceeds a configurable floor, `sync.cost_gate_min_usd` (default $0.50). Unchanged sources (git HEAD == last_commit, clean tree, current chunker) contribute 0.
2. Real stale-embedding semantics (migration v108). New `pages.embedding_signature` column records `<provider:model>:<dims>` at every embed-write site. Swap your embedding model and `gbrain embed --stale` now finds and re-embeds the drifted pages. NULL signature is grandfathered (never stale) so upgrading does not mass-re-embed the corpus.
3. Embed-backfill visibility. `gbrain sources status` gains a BACKFILL column (active/queued/idle per source) and the deferred sync notice appends a queued-job count, so a cron operator can see deferred embedding work.
Commit walk (`git log master..HEAD`): model-aware cost rate → mode-aware/delta-aware gate → stale semantics + v108 → backfill visibility → test mock fixes → R-3 + stamp-wiring tests → F1/F2 adversarial fix → version bump → P0a/P0b adversarial fix → docs.
Test Coverage
Coverage audit: 13/17 paths at first pass (76%), then the two flagged gaps closed — R-3 (a mandatory regression) and the embed-signature stamp-call wiring now have dedicated tests. New/extended test files:
- `sync-cost-preview` (shouldBlockSync/willEmbedSynchronously matrix incl. R-1/R-2)
- `sum-stale-chunk-chars`
- `embedding-signature-stale` (R-4 grandfather + mismatch + scoped invalidate + stamp)
- `sync-cost-gate.serial` (R-1/R-2/R-3 + control)
- `import-signature-stamp.serial` (inline stamp)
- `source-health` (BACKFILL counts)
- `embed.serial` (stamp-call assertion)
- `e2e/sync-status-pglite` (BACKFILL)
4 mandatory regressions pinned: R-1 deferred non-TTY never exit 2 · R-2 inline above-floor still exit 2 · R-3 chunker drift still estimates (git-unchanged ≠ free) · R-4 migration never mass-marks-stale (NULL grandfathered).
Pre-Landing & Adversarial Review
Two adversarial rounds (Claude subagent + Codex), both found real bugs that are now fixed:
- … embed/--all/--stale, embed-backfill, `sync` inline import, `gbrain import`) now stamp.
- … `sources status` doesn't call. Fixed: wired into the real `computeAllSourceMetrics` path.
Accepted as documented limitations (per-page signature tradeoff): P1b (a >100-file `--serial` sync can over-estimate; non-default mode, conservative-high bias) and P1a (model-swap invalidate NULLs drifted vectors before re-embed; deliberate rare op).
Plan Completion
All 6 implementation tasks (T1-T6) DONE; T4 signature format is a documented CHANGED (model:dims, with chunker drift tracked separately via the existing `pages.chunker_version`). Plan + reviews at `~/.claude/plans/system-instruction-you-are-working-humming-kettle.md`.
Migration
v108 `pages_embedding_signature`: additive nullable TEXT column, metadata-only on both engines, bootstrap probe added to both engines + `REQUIRED_BOOTSTRAP_COVERAGE`. Verified on real Postgres (schema-drift parity) + PGLite.
Documentation
Synced for v0.41.31.0: CLAUDE.md Key Files + Commands annotations, a "same-dimension model swaps (automatic)" section in docs/embedding-migrations.md, and regenerated llms-full.txt. CHANGELOG written. Trio audit clean: VERSION = package.json = CHANGELOG = 0.41.31.0.
Test plan
- `bun run verify`: 29/29 checks green
- `bun run typecheck`: clean
🤖 Generated with Claude Code