Sync bookmark permanently stuck when importer throws on non-string frontmatter title (content-sanity.ts:379) — silent indexing outage #1939

@garrytan-agents

Summary

The default source sync bookmark can get permanently stuck when a small number of files crash the importer. A stuck bookmark forces every subsequent gbrain sync to restart the full repo walk (200K+ files) and never reach HEAD, so new commits silently stop being indexed. On our brain the default source (/data/brain) was stuck at 2026-06-04 for 3+ days while commits kept landing; gbrain get <new-page-slug> returned page_not_found even though the page was committed and pushed.

Root cause

Two interacting issues:

1. Importer throws on non-string title (primary)

src/core/content-sanity.ts:379:

const titleLower = opts.title.toLowerCase();

opts.title is assumed to be a string, but YAML frontmatter like:

title: 2024-06-01

parses title as a Date (or a number, e.g. title: 1458), so opts.title.toLowerCase is undefined and the import throws:

opts.title.toLowerCase is not a function. (In 'opts.title.toLowerCase()', 'opts.title.toLowerCase' is undefined)

This reliably fires on apple-notes-style pages whose filename/title is a bare date or number. Repro files in our brain include sources/apple-notes/Politics/2024-06-01 8189165238.md and sources/apple-notes/YC/Talks YC/2023-04-25 1458.md.
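A minimal standalone sketch of the failure mode, not taken from the gbrain codebase. The `parsedTitles` array and `unsafeTitleLower` are hypothetical stand-ins: the array approximates what a YAML parser with timestamp and number types may hand back for `title:`, and the function mirrors the unchecked `.toLowerCase()` call.

```typescript
// Hypothetical stand-ins for parsed frontmatter `title:` values.
const parsedTitles: unknown[] = [
  "Meeting notes",        // title: Meeting notes
  1458,                   // title: 1458
  new Date("2024-06-01"), // title: 2024-06-01 (parsed as a timestamp)
];

// Mirrors the failing pattern: the value is treated as a string without a check.
function unsafeTitleLower(opts: { title: string }): string {
  return opts.title.toLowerCase();
}

const results = parsedTitles.map((title) => {
  try {
    // The cast models a parser result that is typed as string but is not one at runtime.
    return unsafeTitleLower({ title: title as string });
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
});

console.log(results); // → ["meeting notes", "TypeError", "TypeError"]
```

Neither `Number.prototype` nor `Date.prototype` defines `toLowerCase`, so both non-string values throw a `TypeError` at the call site.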

2. A handful of throwing files block bookmark advancement

Because these files throw (rather than being recorded as skippable failures and advanced past), the incremental bookmark never moves past them, so each run re-walks the entire tree from the stuck point. sync --skip-failed is the documented escape hatch, but it only helps if something actually runs it to completion — a normal sync never drains the backlog on its own.

Fix

  1. Coerce title to string before .toLowerCase() in content-sanity.ts (and audit other opts.title/frontmatter string-method call sites — e.g. frontmatter-inference.ts:298). Something like String(opts.title ?? ''). A non-string title should degrade gracefully (or be normalized at frontmatter-parse time so a YAML date/number title becomes its source string), never throw and wedge the importer.
  2. Don't let a throwing file freeze the bookmark. A parse/throw on one file should be recorded as a failure and the bookmark allowed to advance past it (the --skip-failed semantics) so a poison file can't silently stop all indexing.
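The two fixes above could look roughly like the following. This is a sketch under assumptions, not the actual gbrain implementation: `normalizeTitle`, `syncFiles`, `ImportFailure`, and `SyncResult` are hypothetical names, the bookmark is modeled as the last processed path purely for illustration (the real bookmark may well be commit-based), and the Date branch assumes a date-only YAML value was parsed as UTC midnight.

```typescript
// Hypothetical helper: turn whatever the frontmatter parser produced into a string.
function normalizeTitle(title: unknown): string {
  if (typeof title === "string") return title;
  if (title === null || title === undefined) return "";
  if (title instanceof Date && !Number.isNaN(title.getTime())) {
    // Assumption: date-only YAML values arrive as UTC midnight,
    // so the ISO date prefix recovers the source text (e.g. "2024-06-01").
    return title.toISOString().slice(0, 10);
  }
  return String(title);
}

interface ImportFailure {
  path: string;
  error: string;
}

interface SyncResult {
  bookmark: string | null; // illustrative: last path processed
  failures: ImportFailure[];
}

// Hypothetical sync loop: a throwing file is recorded, not allowed to stop the walk.
function syncFiles(
  paths: string[],
  importFile: (path: string) => void,
): SyncResult {
  const failures: ImportFailure[] = [];
  let bookmark: string | null = null;
  for (const path of paths) {
    try {
      importFile(path);
    } catch (err) {
      failures.push({
        path,
        error: err instanceof Error ? err.message : String(err),
      });
    }
    bookmark = path; // advance whether the import succeeded or failed
  }
  return { bookmark, failures };
}
```

With this shape, a poison file shows up in `failures` for later inspection while the bookmark still reaches the end of the batch.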

Suggested guardrail (doctor)

gbrain doctor should add a sync-freshness / stuck-bookmark check: FAIL/WARN when a source's last_sync_at is older than ~2× the sync cadence while the repo HEAD has advanced (i.e. unindexed commits exist). Today sources status exposes last_sync_at but nothing alerts when it goes stale, so a stuck bookmark is invisible until someone notices a missing page. This is the signal that would have caught the 3-day stall immediately.
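One possible shape for that check, as a sketch only. `checkSyncFreshness`, `SourceStatus`, and `headAdvanced` are hypothetical names; `lastSyncAt` mirrors the `last_sync_at` field mentioned above. The 2× cadence WARN threshold comes from the suggestion; the 4× FAIL threshold is an assumption added here to separate the two levels.

```typescript
type Freshness = "ok" | "warn" | "fail";

interface SourceStatus {
  lastSyncAt: Date;      // mirrors last_sync_at from `sources status`
  headAdvanced: boolean; // hypothetical: repo HEAD is past the sync bookmark
}

function checkSyncFreshness(
  status: SourceStatus,
  cadenceMs: number,
  now: Date = new Date(),
): Freshness {
  // No unindexed commits: an old last_sync_at is harmless.
  if (!status.headAdvanced) return "ok";
  const ageMs = now.getTime() - status.lastSyncAt.getTime();
  if (ageMs > 4 * cadenceMs) return "fail"; // assumed FAIL threshold
  if (ageMs > 2 * cadenceMs) return "warn"; // ~2x cadence, per the suggestion
  return "ok";
}
```

Under these assumptions, a source on an hourly cadence that last synced three days ago while HEAD kept moving would report `fail`.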

Impact

High: silent indexing outage. Commits look successful, GitHub shows the page, but the brain never indexes it and search/get return nothing. No surfaced error.

Environment

  • gbrain ~v0.42.26.0
  • PgBouncer transaction-mode pooler (port 6543), prepared statements disabled
  • Multi-source brain (default + 3 federated sources)
