
v0.42.32.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (#1939)#1956

Merged · garrytan merged 11 commits into master from garrytan/fix-issue-1939 · Jun 8, 2026

garrytan (Owner) commented Jun 8, 2026

Summary

Fixes #1939 — a single un-parseable note could silently stop a brain from indexing anything new.

The bug. YAML frontmatter title: 2024-06-01 parses to a Date (and title: 1458 to a number). The importer called .toLowerCase() on it and threw; that throw landed the file in the failure set, which blocked the sync bookmark from advancing. Every later gbrain sync re-walked the whole repo, never reached HEAD, and quietly stopped indexing new commits — committed pages returned page_not_found with no surfaced error.
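
To make the failure concrete, here is a minimal sketch that simulates what a YAML parser hands back for `title: 2024-06-01` and `title: 1458`, and a coercion in the spirit of the `coerceFrontmatterString` helper this PR names. Only the helper's name and its Date/String() behavior come from the PR. The function body, `unsafeLower`, `safeLower`, and the handling of missing values and invalid dates are assumptions for illustration, not the code in src/core/markdown.ts.

```typescript
// Simulated parse results: YAML turns `2024-06-01` into a Date and `1458` into a number.
type Frontmatter = Record<string, unknown>;

const dated: Frontmatter = { title: new Date("2024-06-01") };
const numbered: Frontmatter = { title: 1458 };

// The old pattern: the cast satisfies the compiler but changes nothing at runtime.
function unsafeLower(fm: Frontmatter): string {
  return (fm["title"] as string).toLowerCase(); // TypeError for a Date or a number
}

// Hypothetical body for the coercion the PR describes.
function coerceFrontmatterString(value: unknown): string | undefined {
  // Assumption: an absent value stays absent instead of becoming "undefined".
  if (value === undefined || value === null) return undefined;
  if (typeof value === "string") return value;
  if (value instanceof Date) {
    // toISOString() always reports UTC, so the result does not depend on the
    // machine's time zone. It throws on an invalid Date, hence the fallback
    // to String() (assumption).
    return Number.isNaN(value.getTime())
      ? String(value)
      : value.toISOString().slice(0, 10);
  }
  return String(value);
}

// Safe replacement for the old call site.
function safeLower(fm: Frontmatter): string {
  return (coerceFrontmatterString(fm["title"]) ?? "").toLowerCase();
}
```

With these definitions, `unsafeLower(dated)` throws, while `safeLower(dated)` and `safeLower(numbered)` return plain strings.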

Grouped changes:

  • Frontmatter coercion (fix(import)): src/core/markdown.ts coerces title/slug/type at the parse chokepoint via coerceFrontmatterString (a YAML Date → UTC ISO date such as 2024-06-01, deterministic across machines; everything else goes through String()). assessContentSanity self-protects too, so a non-string title can never throw again.
  • Bounded auto-skip failure ledger (feat(sync)): new src/core/sync-failure-ledger.ts. A file that fails N consecutive syncs (GBRAIN_SYNC_AUTOSKIP_AFTER, default 3, 0 disables) is recorded and skipped so one poison file can't wedge all indexing forever. Fresh failures still fail closed; a <head> history-rewrite sentinel hard-blocks even with --skip-failed. The ledger is keyed per source, clears a path on success (so attempts are truly consecutive), orders advance before ack for crash atomicity, writes under a cross-process lock with atomic temp-rename, and normalizes legacy rows. One shared applySyncFailureGate drives both the incremental and full-sync gates; import.ts defers the bookmark to it via managedBookmark.
  • Doctor severity unification (fix(doctor)): local buildChecks and remote doctorReportRemote both decide sync_failures severity through decideSyncFailureSeverity, so a stuck bookmark surfaces identically on either surface; auto-skipped pages stay WARN-visible, and a real blocking failure past the staleness window escalates to FAIL.
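
The auto-skip valve in the second item can be pictured as a small in-memory state machine. The sketch below is illustrative only: the `FailureLedger` class, `parseAutoSkipAfter`, the key encoding, and the fallback for an unparseable threshold are all hypothetical, and the real src/core/sync-failure-ledger.ts persists to disk under a lock. What it takes from the PR is the behavior: a default threshold of 3, 0 disabling auto-skip, (source, path) keying, success clearing a path, and the `<head>` sentinel never auto-skipping.

```typescript
type GateAction = "block" | "auto_skip";

// Path the PR uses for history-rewrite failures; it must never auto-skip.
const HEAD_SENTINEL = "<head>";

// Hypothetical parser for GBRAIN_SYNC_AUTOSKIP_AFTER.
// Assumption: unset or invalid input falls back to the default of 3.
function parseAutoSkipAfter(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 3;
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 ? n : 3;
}

class FailureLedger {
  private readonly attempts = new Map<string, number>();
  private readonly autoSkipAfter: number;

  constructor(autoSkipAfter: number) {
    this.autoSkipAfter = autoSkipAfter;
  }

  // Keying on (sourceId, path) keeps failures from merging across sources.
  private key(sourceId: string, path: string): string {
    return JSON.stringify([sourceId, path]);
  }

  // Count one more consecutive failure and decide what the gate should do.
  recordFailure(sourceId: string, path: string): GateAction {
    const k = this.key(sourceId, path);
    const n = (this.attempts.get(k) ?? 0) + 1;
    this.attempts.set(k, n);
    if (path === HEAD_SENTINEL) return "block"; // sentinel always hard-blocks
    if (this.autoSkipAfter === 0) return "block"; // 0 disables auto-skip
    return n >= this.autoSkipAfter ? "auto_skip" : "block";
  }

  // A successful import resets the streak, so attempts stay consecutive.
  recordSuccess(sourceId: string, path: string): void {
    this.attempts.delete(this.key(sourceId, path));
  }
}
```

With a threshold of 3, the first two failures of a path block the sync and the third auto-skips it. A success in between resets the count.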

Test Coverage

New test/sync-failure-ledger.serial.test.ts covers the ledger end-to-end: multi-source isolation, success-clears-attempts, sentinel never-skips, legacy normalization + dup-collapse, severity boundaries (open vs auto_skipped), gate-action branch table, advance-before-ack atomicity (throwing advance marks nothing), and lock/atomic-write/stale-break. test/markdown.test.ts + test/content-sanity.test.ts cover non-string title/slug/type coercion. test/e2e/sync.test.ts adds the critical regression (date/number-titled page imports, bookmark advances, get returns it), the valve (blocks then auto-skips on the Nth sync), and self-heal (deleting a failed file clears its ledger row).

  • Targeted suites: 218 pass / 0 fail (bun test, dedicated DB).
  • bun run typecheck: clean. bun run verify: 30/30 green.

Pre-Landing Review

Adversarial review (Claude subagent + Codex). Findings addressed:

  • [FIXED] Stale ledger row on file deletion: a parse-failed file that was later deleted or renamed away left a permanent open row that aged doctor to FAIL forever. Removed paths (deleted, renamed-from, and the "gone from disk" forward-delete skip) are now treated as resolved, so the ledger self-heals.
  • [FIXED] Severity counted auto-skipped toward FAIL: the count-based FAIL now keys off OPEN (blocking) failures only; auto-skipped rows stay WARN-visible regardless of count, matching the state-machine contract.
  • [ACCEPTED] Best-effort lock fallback: withLedgerLock proceeds without the lock after a 5s timeout (so it never deadlocks a sync), with a 30s stale-break; a documented tradeoff of "never wedge a sync" over "never lose a write."
  • [ACCEPTED] Attempts can tick on repeated advance() infra failures: this is bounded (the file auto-skips anyway once advance succeeds), and a flapping DB during advance means sync is already degraded.
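
The severity rule from the second finding can be sketched as a pure function. This is a hedged illustration, not the PR's `decideSyncFailureSeverity`: the row shape, the `firstFailedAt` field, the option names, and every threshold are hypothetical. It only encodes what the PR states: open failures escalate to FAIL by age or by count, and auto-skipped rows stay at WARN regardless of count.

```typescript
type RowState = "open" | "auto_skipped";
type Severity = "ok" | "warn" | "fail";

// Hypothetical ledger row; the real schema is not shown in this PR.
interface LedgerRow {
  state: RowState;
  firstFailedAt: string; // ISO timestamp
}

// Hypothetical knobs; the PR names a staleness window and a count threshold
// but does not give their values.
interface SeverityOptions {
  nowMs: number;
  failAfterMs: number;
  failOpenCount: number;
}

function decideSeveritySketch(rows: LedgerRow[], opts: SeverityOptions): Severity {
  if (rows.length === 0) return "ok"; // assumption: no rows means a healthy check
  // Only OPEN (blocking) rows may escalate; auto-skipped rows stay visible as WARN.
  const open = rows.filter((r) => r.state === "open");
  if (open.length >= opts.failOpenCount) return "fail";
  const tooOld = open.some((r) => {
    const t = Date.parse(r.firstFailedAt);
    return !Number.isNaN(t) && opts.nowMs - t > opts.failAfterMs;
  });
  return tooOld ? "fail" : "warn";
}
```

Because the function reads only its arguments, a local check and a remote report can both call the same rule and agree on the result.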

Eval Results

No prompt-related files changed — evals skipped.

Plan Completion

Plan fully implemented (Parts 1–3 + all planned tests). Reviewed via /plan-eng-review (CLEAR) and /codex outside-voice (9 findings, all folded into the design or addressed).

TODOS

Updated the long-standing "source-scope the sync-failures log so --skip-failed works under --parallel" TODO: this PR landed the source-scoping infrastructure + concurrency lock; the only remaining work (a P3 follow-up) is lifting the v0.40.3.0 --skip-failed + --parallel>1 interim guard once a parallel-ack determinism test is added.

Documentation

Docs updated to cover the v0.42.32.0 sync-failure ledger (#1939): docs/architecture/KEY_FILES.md (new sync-failure-ledger.ts entry; updated sync.ts/doctor.ts/markdown.ts/import.ts), docs/guides/live-sync.md (auto-skip "Tricky Spots" item), and regenerated llms-full.txt. Guards green (build:llms, build-llms.test.ts 12/12, check-key-files-current-state.sh).

Test plan

  • Targeted unit + e2e suites pass (218 tests, dedicated DB)
  • bun run typecheck clean
  • bun run verify 30/30 green

Note: full sharded bun run test runs in CI; locally it contends with concurrent workspaces sharing one machine + DB, so verification was done via typecheck + verify + targeted/e2e suites on a dedicated DB.

🤖 Generated with Claude Code

garrytan and others added 9 commits June 7, 2026 08:44
YAML `title: 2024-06-01` parses to a Date and `title: 1458` to a number;
the old `(frontmatter.X as string)` cast was a compile-time lie, so
downstream `.toLowerCase()` threw and (via the importer failure gate)
could wedge sync indefinitely. parseMarkdown now coerces via
coerceFrontmatterString (Date -> UTC ISO date, deterministic), and the
pure assessContentSanity self-protects against a non-string title.
… indexing (#1939)

New src/core/sync-failure-ledger.ts owns the failure store + a crash-safe,
multi-source, concurrent bounded auto-skip valve. A file that fails N
consecutive syncs (GBRAIN_SYNC_AUTOSKIP_AFTER, default 3) auto-skips so it
can't freeze all indexing forever, while fresh failures still fail-closed
and a `<head>` history-rewrite sentinel hard-blocks even with --skip-failed.

- (source_id, path) keying — failures never merge across sources
- success clears a path so attempts are truly consecutive
- advance-before-ack ordering (a crash can't mark a file skipped while wedged)
- shared applySyncFailureGate used by BOTH the incremental and full-sync gates
- legacy-row normalization + duplicate collapse on load
- cross-process lock + atomic temp-rename, age-based stale-lock break

sync.ts re-exports the ledger for existing callers; import.ts records
source-scoped and defers the bookmark to the gate under managedBookmark.
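
The advance-before-ack ordering named in the list above can be illustrated with a minimal sketch. The `GateDeps` interface and `advanceThenAck` are hypothetical names, and the callbacks are synchronous here only to keep the example short. The point is the order: the bookmark moves first, so a failed advance leaves no file marked as skipped.

```typescript
// Hypothetical dependencies of the gate; the real signatures are not shown in this PR.
interface GateDeps {
  advanceBookmark: () => void; // moves the sync bookmark forward; may throw
  markAutoSkipped: (paths: readonly string[]) => void; // the "ack" written to the ledger
}

// Advance first, ack second. If advanceBookmark throws, markAutoSkipped never
// runs, so no file is recorded as skipped while the bookmark is still stuck.
function advanceThenAck(paths: readonly string[], deps: GateDeps): void {
  deps.advanceBookmark();
  deps.markAutoSkipped(paths);
}

// Usage with an in-memory stand-in for the ledger.
const marked: string[] = [];
try {
  advanceThenAck(["notes/poison.md"], {
    advanceBookmark: () => {
      throw new Error("simulated infra failure");
    },
    markAutoSkipped: (paths) => {
      marked.push(...paths);
    },
  });
} catch {
  // The failure propagates; `marked` is still empty at this point.
}
```

Reversing the two calls would allow a crash between them to leave a file marked as skipped while the bookmark had not moved.
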
…urfaces (#1939)

Local buildChecks and remote doctorReportRemote now both route through
decideSyncFailureSeverity, so a stuck bookmark escalates WARN -> FAIL
consistently (oldest-open age > fail cadence, or large unresolved count),
auto-skipped pages stay visible (WARN, not hidden), and the
acknowledged/acknowledged_at field-split that caused drift is gone. The
remote surface stays subprocess-free (file read + Date.parse only).
…1939)

- #1: a parse-failed file that is later deleted/renamed-away no longer leaves
  a permanent open ledger row. Removed paths (filtered.deleted, renamed-from,
  and the "gone from disk" forward-delete skip branch) are treated as resolved
  so the ledger self-heals instead of aging doctor to a stuck FAIL.
- #3: decideSyncFailureSeverity escalates to FAIL on OPEN (blocking) failures
  only — auto_skipped rows already advanced the bookmark, so they stay
  WARN-visible regardless of count, matching the state-machine contract.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
KEY_FILES.md: new src/core/sync-failure-ledger.ts entry (bounded auto-skip
state machine, decideGateAction/decideSyncFailureSeverity/applySyncFailureGate,
GBRAIN_SYNC_AUTOSKIP_AFTER); update sync.ts (failure store moved to ledger,
re-exported), doctor.ts (sync_failures severity via shared rule on both
surfaces), markdown.ts (coerceFrontmatterString), import.ts (managedBookmark).
live-sync.md: poison-file auto-skip tricky-spot. Regenerated llms-full.txt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
garrytan changed the title from "v0.42.30.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (#1939)" to "v0.42.31.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (#1939)" on Jun 8, 2026
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
garrytan changed the title from "v0.42.31.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (#1939)" to "v0.42.32.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (#1939)" on Jun 8, 2026
…1939

# Conflicts:
#	CHANGELOG.md
#	VERSION
#	docs/architecture/KEY_FILES.md
#	package.json
garrytan merged commit 5a06af5 into master on Jun 8, 2026 (21 checks passed)
mgunnin added a commit to mgunnin/gbrain that referenced this pull request Jun 8, 2026
* upstream/master:
  v0.42.32.0 fix(sync): coerce non-string frontmatter titles + bounded auto-skip failure ledger (garrytan#1939) (garrytan#1956)
  v0.42.31.0 feat(links): open link_source provenance + link-add/link-rm/link-sources (garrytan#1941) (garrytan#1957)
  feat(skills): add idea-lineage (garrytan#1830)

# Conflicts:
#	src/core/content-sanity.ts
#	src/core/markdown.ts

Development

Successfully merging this pull request may close these issues.

Sync bookmark permanently stuck when importer throws on non-string frontmatter title (content-sanity.ts:379) — silent indexing outage
