
apply-migrations cannot bootstrap brains older than v17: init --migrate-only references pages.source_id before v0.18.0 has added the column #451

@MFHC11

Suggested labels

  • bug
  • migrations
  • bootstrap
  • priority/high (blocks all upgrades from <v17 schemas)

Summary

A brain on schema version 4 cannot be migrated to schema version 24 using
gbrain apply-migrations. The first migration in the queue (v0.11.0) calls
gbrain init --migrate-only as its Phase A. The current src/schema.sql
unconditionally references pages.source_id (added by migration v0.18.0,
which hasn't run yet on a v4 brain), so Phase A fails with
column "source_id" does not exist and the entire migration ladder is
blocked. This is a chicken-and-egg problem between the latest-schema
bootstrap and the per-version migration ladder.

Anyone with a brain that pre-dates v17 (i.e. shipped before multi-source
support landed in v0.18.0) is stuck on their current schema until this is
fixed.

Environment

  • gbrain version: 0.20.4 (CLI binary)
  • DB: Supabase Postgres (pgvector + pg_trgm + pgcrypto), pgbouncer on :6543
  • Starting schema_version: 4
  • Target schema_version: 24
  • Pages: 564 (intact, no data loss)
  • macOS, bun installation

Reproduction

On any brain with schema_version < 17:

gbrain --version       # 0.20.4
gbrain doctor          # reports schema_version=4, latest=24, queue_health warns "minion_jobs does not exist"
gbrain apply-migrations --dry-run   # lists 9 migrations from v0.11.0 → v0.18.1
gbrain apply-migrations --yes       # fails on the first migration

Outcome: every migration after v0.11.0 stays pending; v0.11.0 is marked
partial. Re-running produces the same failure.

Full failure log

=== Applying migration v0.11.0: GBrain Minions — durable background agents ===
NOTICE: extension "vector" already exists, skipping
NOTICE: extension "pg_trgm" already exists, skipping
NOTICE: extension "pgcrypto" already exists, skipping
NOTICE: relation "pages" already exists, skipping
NOTICE: relation "idx_pages_type" already exists, skipping
NOTICE: relation "idx_pages_frontmatter" already exists, skipping
NOTICE: relation "idx_pages_trgm" already exists, skipping
column "source_id" does not exist
Phase A (schema) failed: Command failed: gbrain init --migrate-only. Aborting; re-run after fixing.
Migration v0.11.0 reported status=failed.

apply-migrations --list after the failure:

Status   Version   Headline
-------  --------  -----------------------------------------
partial  0.11.0    GBrain Minions — durable background agents
pending  0.12.0    Knowledge Graph wires itself
pending  0.12.2    Postgres frontmatter queries — JSONB double-encode bug fixed
pending  0.13.0    Frontmatter becomes a graph
pending  0.13.1    BrainWriter integrity + grandfather protection
pending  0.14.0    Shell jobs + autopilot cooperative handler
pending  0.16.0    Durable LLM agents
pending  0.18.0    Multi-source brains
pending  0.18.1    Row Level Security hardened on all public tables

doctor --fast after the failure now reports a new failing check:

[FAIL] minions_migration: MINIONS HALF-INSTALLED (partial migration: 0.11.0)

Root cause

src/schema.sql (gbrain v0.20.4) creates an index on pages.source_id
before the column has been introduced by the per-version migration ladder:

-- src/schema.sql:48
-- v0.18.0 (Step 2): pages.source_id scopes each row to a sources(id) row.
CREATE TABLE IF NOT EXISTS pages (
  id            SERIAL PRIMARY KEY,
  source_id     TEXT    NOT NULL DEFAULT 'default'
                REFERENCES sources(id) ON DELETE CASCADE,
  ...
);

-- src/schema.sql:74
CREATE INDEX IF NOT EXISTS idx_pages_source_id ON pages(source_id);

CREATE TABLE IF NOT EXISTS pages is a no-op on an existing pre-v17 brain
because pages already exists, so the source_id column in that definition is
never added. CREATE INDEX IF NOT EXISTS idx_pages_source_id ON pages(source_id)
then runs unconditionally against a table that lacks the column, and fails.

The column is only added by the migration whose name is
pages_source_id_composite_unique:

// src/core/migrate.ts:605
{
  name: 'pages_source_id_composite_unique',
  // v0.18.0 Step 2 (Lane B) — adds pages.source_id. Engine-split after
  ...
  // src/core/migrate.ts:631
  ALTER TABLE pages ADD COLUMN IF NOT EXISTS source_id TEXT
    NOT NULL DEFAULT 'default'
    REFERENCES sources(id) ON DELETE CASCADE;
  CREATE INDEX IF NOT EXISTS idx_pages_source_id ON pages(source_id);
}

Because every per-version migration calls gbrain init --migrate-only as
its Phase A, the failure occurs on the first migration in the queue
regardless of which migration logically introduced the column. No migration
in the ladder ever reaches its own Phase B/C/etc.
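
The interaction described above can be reproduced with a toy in-memory model. This is only a sketch, not gbrain code: the catalog is a plain map of table name to column names, and every identifier (`apply`, `runLadder`, the `complete` status label) is hypothetical. It assumes, as the log suggests, that a Phase A failure marks the current migration partial and leaves the rest pending.

```typescript
// Toy model of the bootstrap-vs-ladder failure. Not gbrain code: all names
// are hypothetical and the "catalog" is a map of table name -> column names.
type Catalog = Map<string, Set<string>>;

type Statement =
  | { kind: "createTableIfNotExists"; table: string; columns: string[] }
  | { kind: "addColumnIfNotExists"; table: string; column: string }
  | { kind: "createIndexIfNotExists"; table: string; column: string };

function apply(catalog: Catalog, stmt: Statement): void {
  if (stmt.kind === "createTableIfNotExists") {
    // IF NOT EXISTS looks only at the table name, so an existing table
    // keeps its old column set.
    if (!catalog.has(stmt.table)) catalog.set(stmt.table, new Set(stmt.columns));
    return;
  }
  const columns = catalog.get(stmt.table);
  if (!columns) throw new Error(`relation "${stmt.table}" does not exist`);
  if (stmt.kind === "addColumnIfNotExists") {
    columns.add(stmt.column);
    return;
  }
  // createIndexIfNotExists: the indexed column must already exist.
  if (!columns.has(stmt.column)) {
    throw new Error(`column "${stmt.column}" does not exist`);
  }
}

// Stand-in for the latest schema.sql that Phase A replays.
const latestSchema: Statement[] = [
  { kind: "createTableIfNotExists", table: "pages", columns: ["id", "source_id"] },
  { kind: "createIndexIfNotExists", table: "pages", column: "source_id" },
];

// Stand-in for the ladder: every step runs Phase A first, then its own work.
interface Migration { version: string; laterPhases: Statement[]; }
const ladder: Migration[] = [
  { version: "0.11.0", laterPhases: [] },
  { version: "0.18.0", laterPhases: [
    { kind: "addColumnIfNotExists", table: "pages", column: "source_id" },
  ] },
];

type Status = "complete" | "partial" | "pending";

function runLadder(catalog: Catalog): Map<string, Status> {
  const status = new Map<string, Status>();
  for (const m of ladder) status.set(m.version, "pending");
  for (const m of ladder) {
    try {
      for (const s of latestSchema) apply(catalog, s); // Phase A
      for (const s of m.laterPhases) apply(catalog, s);
      status.set(m.version, "complete");
    } catch {
      status.set(m.version, "partial");
      break; // later migrations never get a chance to run
    }
  }
  return status;
}

const oldBrain = runLadder(new Map([["pages", new Set(["id"])]]));
const freshBrain = runLadder(new Map());
console.log(oldBrain.get("0.11.0"), oldBrain.get("0.18.0"));     // partial pending
console.log(freshBrain.get("0.11.0"), freshBrain.get("0.18.0")); // complete complete
```

In this model the step that would add the column sits behind a Phase A that already requires it, so the old brain never gets past the first rung, while a fresh database passes because CREATE TABLE builds the latest shape directly.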

The same pattern affects files.source_id and files.page_id
(src/schema.sql:255-282) — same bootstrap-vs-ladder mismatch, will also
trip pre-v17 brains.

Suggested fix (any one of these)

  1. Make init --migrate-only schema-version-aware. Have it skip or
    conditionalise the source_id-dependent statements until the
    pages_source_id_composite_unique migration has been applied (or run a
    pre-flight check that adds the column if schema_version < 17).
  2. Wrap the source_id index creation in a DO block that only runs when
    pages.source_id exists. e.g.:
    DO $$
    BEGIN
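      -- Restrict the lookup to the current schema so a same-named table
      -- elsewhere cannot satisfy the check.
      IF EXISTS (SELECT 1 FROM information_schema.columns
                 WHERE table_schema = current_schema()
                   AND table_name='pages' AND column_name='source_id') THEN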
        CREATE INDEX IF NOT EXISTS idx_pages_source_id ON pages(source_id);
      END IF;
    END $$;
    Same pattern for files indexes that depend on source_id / page_id.
  3. Hoist the source_id ADD COLUMN statements out of the per-version
    migration and into init --migrate-only itself, so the column always
    exists by the time any per-version migration's Phase A runs. Then the
    per-version migration becomes a no-op for that step on already-bootstrapped
    brains.

Option 2 is the smallest, lowest-risk patch and matches the
IF EXISTS / IF NOT EXISTS defensiveness already used elsewhere in
schema.sql. Option 3 is the most architecturally clean.
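
For comparison, option 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: `SchemaStatement`, `minExistingVersion`, and `selectBootstrapStatements` are hypothetical names, and gbrain's real init code may be organised quite differently. It assumes the bootstrap can read the current schema_version before replaying statements, and that a fresh database is represented as `null`.

```typescript
// Sketch of option 1 (schema-version-aware bootstrap). Hypothetical names
// throughout; this is not gbrain's actual init code.
interface SchemaStatement {
  sql: string;
  // Lowest schema_version at which an EXISTING brain can run this statement.
  // Omitted when the statement is safe on every version.
  minExistingVersion?: number;
}

// currentVersion is null for a fresh database: CREATE TABLE then builds the
// latest table shape, so every statement is safe to run.
function selectBootstrapStatements(
  statements: SchemaStatement[],
  currentVersion: number | null,
): string[] {
  return statements
    .filter((s) =>
      currentVersion === null ||
      s.minExistingVersion === undefined ||
      currentVersion >= s.minExistingVersion)
    .map((s) => s.sql);
}

// From the issue: pages.source_id is absent on brains below schema v17.
const SOURCE_ID_VERSION = 17;

const bootstrapStatements: SchemaStatement[] = [
  { sql: "CREATE TABLE IF NOT EXISTS pages (...)" },
  {
    sql: "CREATE INDEX IF NOT EXISTS idx_pages_source_id ON pages(source_id)",
    minExistingVersion: SOURCE_ID_VERSION,
  },
];

const forV4 = selectBootstrapStatements(bootstrapStatements, 4);
const forV17 = selectBootstrapStatements(bootstrapStatements, 17);
const forFresh = selectBootstrapStatements(bootstrapStatements, null);
console.log(forV4.length, forV17.length, forFresh.length); // 1 2 2
```

Deferring the index on a v4 brain loses nothing, because the pages_source_id_composite_unique migration quoted earlier issues the same CREATE INDEX IF NOT EXISTS right after it adds the column.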

Workaround

The brain functions normally on schema v4: query, sync, embed, frontmatter
search, and existing skills all work. Integration recipes that require
v24 features (minion_jobs, durable LLM agents, multi-source, auto-extracted
graph edges) will need to wait until the migration ladder can run cleanly.

For users who can't wait, a manual pre-add of the missing columns might
unblock the ladder, but I have not attempted this and don't recommend it
without upstream sign-off:

-- UNTESTED — not endorsed by upstream
ALTER TABLE pages ADD COLUMN IF NOT EXISTS source_id TEXT NOT NULL DEFAULT 'default';
ALTER TABLE files ADD COLUMN IF NOT EXISTS source_id TEXT NOT NULL DEFAULT 'default';
ALTER TABLE files ADD COLUMN IF NOT EXISTS page_id INTEGER REFERENCES pages(id) ON DELETE CASCADE;

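This omits the FK constraint to sources(id) that the migration would
normally add. Because the pages migration uses ADD COLUMN IF NOT EXISTS with
the REFERENCES clause inline, it would skip the pre-added column and never
attach the constraint. Risky on production data: file the bug, wait for a
clean fix.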

Attached evidence (paste these as gist links or files when filing)

  • pre-doctor.json — full pre-migration health output
  • migrations.log — first attempt
  • migrations-retry-1.log — second attempt (identical failure)
  • migrations-list.txt — partial/pending state after failure
  • post-failed-migration-doctor.txt — doctor output showing
    MINIONS HALF-INSTALLED
  • version-before.txt — confirms binary 0.20.4

All are available in the reporter's local
~/gbrain-migration-2026-04-25/ directory.

Severity

High — blocks all schema upgrades from any brain that pre-dates v17. Anyone
in that cohort cannot use post-v0.11 features (minion supervisor,
auto-graph extraction, durable LLM agents, multi-source, RLS
