Skip to content

doctor + autopilot: child-table orphan detection (content_chunks, page_versions, etc.) not surfaced #1063

@vincedk-alt

Description

@vincedk-alt

Summary

gbrain dream's orphans phase only detects orphan PAGES (pages with no inbound links). It does NOT detect orphan CHILD ROWS in FK-bound tables like content_chunks, page_versions, tags, takes, etc. Bulk page deletions can leave orphans in these tables that persist indefinitely.

Discovered after a bulk source removal on v0.35.0.0 left 234 orphan content_chunks rows. All FK constraints to pages.id are correctly declared as ON DELETE CASCADE (verified via pg_constraint against the live PGLite DB), so the orphans must have come from one of:

A. A historical schema version where the FK was missing (and never retro-cascaded)
B. A bulk-delete code path that bypassed cascade (manual SQL UPDATE without follow-up CASCADE)
C. A race condition during a large CASCADE batch

Regardless of origin, the existing orphans phase doesn't catch them. The find_orphans MCP op also only scans pages.

Repro / Detection

bun -e "
import { PGlite } from '@electric-sql/pglite';
import { vector } from '@electric-sql/pglite/vector';
const db = new PGlite('/Users/<you>/.gbrain/brain.pglite', { extensions: { vector } });
await db.waitReady;
for (const tbl of ['content_chunks', 'page_versions', 'tags', 'takes', 'links', 'raw_data', 'files', 'timeline_entries']) {
  const col = tbl === 'links' ? 'from_page_id' : 'page_id';
  const r = await db.query(\`SELECT COUNT(*) AS c FROM \${tbl} WHERE \${col} IS NOT NULL AND \${col} NOT IN (SELECT id FROM pages)\`);
  console.log(\`\${tbl}: \${r.rows[0].c} orphans\`);
}
"

If any table reports non-zero orphans, the existing orphans phase didn't catch them.

Suggested fix shapes

A. Extend the existing orphans phase to scan all FK-to-pages tables. Could add to runPhaseOrphans in src/core/cycle.ts; would surface in autopilot output naturally.

B. New find_child_orphans MCP op (admin-scoped, localOnly) returning a per-table breakdown. Operators can run on-demand without waiting for the next dream cycle.

C. New doctor check (child_table_orphans) — fast, sampled, surfaces non-zero counts with paste-ready cleanup SQL. Cheapest to implement; matches the existing doctor health-check pattern.

I'd vote (C) — it's the smallest blast radius, gives operators immediate signal, and doesn't change autopilot semantics.

Context

  • Live FK constraints are correct (pg_get_constraintdef shows ON DELETE CASCADE on every page-referencing table including content_chunks).
  • 234 orphans were cleaned with DELETE FROM content_chunks WHERE page_id NOT IN (SELECT id FROM pages) — manual operator intervention.
  • Pre-cleanup orphan check could have been auto-fired by the dream cycle but wasn't.

Severity: medium. Doesn't break anything visible (orphans don't poison search since the join eliminates them), but they bloat storage and embedding-index size proportionally to historical deletes. On a brain with months of churn, this can be hundreds of MB of dead vectors.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions