fix: exclude channel_database from migration 021 to prevent PostgreSQL boot-loop #3640

Merged: Yeraze merged 1 commit into main from claude/great-dijkstra-7p46ub on Jun 22, 2026

Yeraze (Owner) commented Jun 22, 2026

Summary

Fixes #3639: PostgreSQL users hit a boot-loop with the error "tables can have at most 1600 columns" after many restarts.

Root Cause

PostgreSQL runs all migrations on every startup (no per-migration tracking, unlike SQLite which uses settingsKey). This is intentional — each migration is supposed to be idempotent via IF NOT EXISTS guards.

However, there's a dangerous interaction between two migrations:

  • Migration 021 adds sourceId to channel_database with ADD COLUMN IF NOT EXISTS
  • Migration 063 drops sourceId from channel_database with DROP COLUMN IF EXISTS

On every boot, the cycle runs: add → drop → add → drop → …

PostgreSQL counts dropped column "tombstones" in pg_attribute toward its hard 1600-column limit per table, and never reclaims those slots automatically. After ~1600 boots, any ALTER TABLE channel_database ADD COLUMN fails with the error the reporter sees.
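
The add/drop cycle can be reproduced with a toy model. The sketch below is a simplified simulation, not PostgreSQL's catalog code: `SimulatedTable`, `bootsUntilFailure`, and the starting column list are assumptions made for illustration. It models only the one rule that matters here, namely that a dropped column keeps its attribute slot.

```typescript
// Toy model of PostgreSQL attribute-slot allocation (illustrative only).
const MAX_COLUMNS = 1600; // PostgreSQL's hard per-table limit

interface AttributeSlot {
  name: string;
  dropped: boolean; // plays the role of pg_attribute.attisdropped
}

class SimulatedTable {
  private slots: AttributeSlot[] = [];

  constructor(initialColumns: string[]) {
    for (const name of initialColumns) {
      this.addColumnIfNotExists(name);
    }
  }

  // ADD COLUMN IF NOT EXISTS: a no-op only while a live column has this name.
  addColumnIfNotExists(name: string): void {
    const exists = this.slots.some((s) => !s.dropped && s.name === name);
    if (exists) return;
    // Dropped slots are never reused, so a new column always needs a new slot.
    if (this.slots.length + 1 > MAX_COLUMNS) {
      throw new Error("tables can have at most 1600 columns");
    }
    this.slots.push({ name: name, dropped: false });
  }

  // DROP COLUMN IF EXISTS: marks the slot dead but keeps it allocated.
  dropColumnIfExists(name: string): void {
    for (const slot of this.slots) {
      if (!slot.dropped && slot.name === name) slot.dropped = true;
    }
  }

  tombstoneCount(): number {
    return this.slots.filter((s) => s.dropped).length;
  }
}

// Counts how many boots succeed before ADD COLUMN starts failing.
function bootsUntilFailure(initialColumns: string[]): number {
  const table = new SimulatedTable(initialColumns);
  let boots = 0;
  for (;;) {
    try {
      table.addColumnIfNotExists("sourceId"); // migration 021
      table.dropColumnIfExists("sourceId"); // migration 063
      boots += 1;
    } catch (err) {
      return boots;
    }
  }
}
```

In this model a table that starts with 10 columns survives 1590 boots. On a real database the number depends on how many slots the table has already used.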

Fix

Remove channel_database from migration 021's DATA_TABLES array.

channel_database is global-by-design: channelDecryptionService queries all rows regardless of source. Migration 063 already handles cleanup of any sourceId that was added by prior runs of migration 021 (DROP COLUMN IF EXISTS is a safe no-op if the column isn't there).

The net result: migration 021 no longer touches channel_database, the add/drop cycle stops, and the tombstone count stops growing.
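
As a rough sketch of the change, the snippet below builds the PostgreSQL statements from a table list that no longer contains channel_database. The table names, the `GLOBAL_TABLES` guard, the `TEXT` column type, and the `buildAddSourceIdStatements` helper are all hypothetical. The real migration file is not reproduced here, and the actual fix simply deletes one array entry.

```typescript
// Hypothetical table names: the real DATA_TABLES entries are not shown in this PR.
const DATA_TABLES: string[] = ["nodes", "messages", "telemetry"];

// Tables that are global by design and must never receive a sourceId column.
// (Assumed guard for illustration; the actual fix only removes the array entry.)
const GLOBAL_TABLES: string[] = ["channel_database"];

function buildAddSourceIdStatements(tables: string[]): string[] {
  return tables.map((table) => {
    if (GLOBAL_TABLES.indexOf(table) !== -1) {
      throw new Error(`${table} is global and must not get a sourceId column`);
    }
    // The column type TEXT is an assumption.
    return `ALTER TABLE ${table} ADD COLUMN IF NOT EXISTS "sourceId" TEXT`;
  });
}

const statements = buildAddSourceIdStatements(DATA_TABLES);
```

A guard like this would make a future re-addition of channel_database fail loudly instead of silently restarting the add/drop cycle.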

Impact on existing affected users

Users who have already hit the limit can simply update to this version and restart — migration 021 will no longer attempt ADD COLUMN on channel_database, so the 1600-column error won't fire. Their table will still have many dead tombstone slots, but since no migration adds further columns to channel_database, this is harmless at runtime.
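
For anyone who wants to see how many slots a table has already consumed, the sketch below pairs a catalog query against pg_attribute with a small summarizer. `ATTRIBUTE_SLOTS_SQL`, `AttributeRow`, and `summarizeSlots` are hypothetical names, and the rows are assumed to come from whatever PostgreSQL client is in use; nothing here is part of this PR.

```typescript
// Catalog query: one row per user-column slot the table has ever allocated.
// attnum > 0 skips system columns; $1 is the table name, e.g. "channel_database".
const ATTRIBUTE_SLOTS_SQL = `
  SELECT attnum, attname, attisdropped
  FROM pg_attribute
  WHERE attrelid = $1::regclass
    AND attnum > 0
  ORDER BY attnum`;

const MAX_TABLE_COLUMNS = 1600; // PostgreSQL's hard per-table limit

interface AttributeRow {
  attnum: number;
  attname: string;
  attisdropped: boolean;
}

interface SlotSummary {
  live: number; // columns that still exist
  dropped: number; // tombstones left by DROP COLUMN
  highestAttnum: number; // highest slot number handed out so far
  remaining: number; // slots left before ADD COLUMN fails
}

function summarizeSlots(rows: AttributeRow[]): SlotSummary {
  let live = 0;
  let dropped = 0;
  let highestAttnum = 0;
  for (const row of rows) {
    if (row.attisdropped) {
      dropped += 1;
    } else {
      live += 1;
    }
    if (row.attnum > highestAttnum) highestAttnum = row.attnum;
  }
  return {
    live: live,
    dropped: dropped,
    highestAttnum: highestAttnum,
    remaining: MAX_TABLE_COLUMNS - highestAttnum,
  };
}
```

A `remaining` value of 0 corresponds to the state in which any further ADD COLUMN on that table fails.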

🤖 Generated with Claude Code

https://claude.ai/code/session_01Cu5UZv6q58JhV62cWjM6KG


…t PostgreSQL boot-loop

PostgreSQL runs all migrations on every startup (no per-migration tracking).
Migration 021 added sourceId to channel_database; migration 063 drops it.
The ADD/DROP cycle each boot consumed a PostgreSQL pg_attribute tombstone slot.
After ~1600 boots the table hit PostgreSQL's hard 1600-column limit, causing a
crash on startup with "tables can have at most 1600 columns".

Fix: remove channel_database from migration 021's DATA_TABLES array.
Migration 063 already handles cleanup of any previously added sourceId column.
channel_database is global-by-design (migration 063 explains why).

Fixes: #3639

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Cu5UZv6q58JhV62cWjM6KG

github-actions Bot commented Jun 22, 2026

Claude finished @Yeraze's task


PR Review

  • Gather context
  • Review migration 021 changes
  • Review migration 063 for relationship
  • Check test coverage
  • Post final review

Code Review

Overall: LGTM — the fix is correct and well-reasoned.

What the fix does

Removes channel_database from the DATA_TABLES array in src/server/migrations/021_add_source_id_columns.ts. This breaks the add/drop cycle where:

  • Migration 021 (every boot on PG): ALTER TABLE channel_database ADD COLUMN IF NOT EXISTS "sourceId"
  • Migration 063 (every boot on PG): ALTER TABLE channel_database DROP COLUMN IF EXISTS "sourceId"

PostgreSQL's dead column tombstones accumulate in pg_attribute and never get reclaimed, eventually hitting the hard 1600-column limit.

Correctness ✅

The fix is correct on all three backends:

  • PostgreSQL: Migration 021 no longer adds the column → migration 063's DROP COLUMN IF EXISTS becomes a permanent no-op → tombstone accumulation stops.
  • SQLite: Migration 021 uses a try/catch for duplicate column (already idempotent). Migration 063 catches no such column. Both are safe no-ops after the fix.
  • MySQL: Migration 021 checks information_schema.COLUMNS before adding — already idempotent. Migration 063 does the same before dropping. Both are harmless no-ops.
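
The SQLite behaviour described above follows a common pattern: attempt the ALTER TABLE and swallow only the "duplicate column name" error. A minimal sketch, assuming an injected `exec` callback in place of a real database handle; `Exec` and `addColumnIgnoringDuplicate` are hypothetical names, the `TEXT` type is assumed, and the project's own migration code may differ.

```typescript
// Callback that runs one SQL statement and throws on failure.
type Exec = (sql: string) => void;

// Returns true if the column was added, false if it already existed.
// Any error other than SQLite's "duplicate column name" is rethrown.
function addColumnIgnoringDuplicate(
  exec: Exec,
  table: string,
  column: string,
  type: string
): boolean {
  try {
    exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${type}`);
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (message.indexOf("duplicate column name") !== -1) {
      return false; // already present: treat as a no-op
    }
    throw err;
  }
}
```

Because the duplicate is detected before any column is created, a repeated run has no side effects. That is why SQLite never accumulated the tombstones that PostgreSQL did.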

Comment quality ✅

The new block comment at the top of 021_add_source_id_columns.ts (lines 11–18) clearly explains:

  • Why channel_database is excluded
  • The cross-migration dependency (021 ↔ 063)
  • The specific PostgreSQL failure mode
  • The issue tracker link

This is valuable documentation for future maintainers.

Migration 063 compatibility ✅

Migration 063's DROP COLUMN IF EXISTS (PostgreSQL) and information_schema check (MySQL) mean it's already a safe no-op when the column is absent — no changes needed there.

Impact on affected users ✅

The PR description correctly notes that users who have already hit the limit will recover on restart — migration 021 stops attempting the add, so the 1600-column error won't fire again. Existing tombstones are harmless since no migration adds further columns to channel_database.

Minor observations (non-blocking)

  1. No test update needed: The migration test file (src/db/migrations.test.ts) tests migration counts and structure, not the specific table lists inside each migration. No changes required there.

  2. SQLite was never affected: The SQLite path catches duplicate column errors, so it was idempotent before this fix too. The fix is still correct to make the behavior consistent and documented.

  3. The DATA_TABLES comment at lines 7–9 in the file header still accurately lists the 8 remaining tables; channel_database was never listed there, since those lines describe the final intended state.

No issues found. The fix is minimal, targeted, well-documented, and correct across all three database backends.

@Yeraze Yeraze merged commit dbfc01a into main Jun 22, 2026
19 checks passed
@Yeraze Yeraze deleted the claude/great-dijkstra-7p46ub branch June 22, 2026 11:54
Yeraze added a commit that referenced this pull request Jun 22, 2026
…3657) (#3658)

* fix(migration): guard 033 channel_database backfill on column existence (#3657)

Regression introduced by #3640 (v4.11.4): removing channel_database from
migration 021's sourceId-add left migration 033's
`UPDATE channel_database SET sourceId ...` referencing a column that no longer
exists. On PostgreSQL — which re-runs every migration on every boot — and on
fresh installs, 033 then crashed during DB init with
`column "sourceId" does not exist`, so the app couldn't start.

The crash only triggered when at least one source was registered (the backfill
is gated on `sources.length > 0`); the CI fixtures have no sources, which is why
it slipped through.

Fix: guard the channel_database backfill on the legacy `sourceId` column still
existing (PRAGMA table_info on SQLite, information_schema on PG/MySQL).
channel_database is global-by-design (063 drops sourceId), so on current
databases the column is absent and the backfill is correctly a no-op; on a
mid-upgrade DB where the column is still present it still runs.
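
The guard described above might look roughly like the sketch below. It is written as pure helpers so it stays independent of any database driver: `buildColumnProbe`, `columnExists`, and `shouldBackfillChannelDatabase` are hypothetical names, and the exact queries in migration 033 may differ.

```typescript
type Backend = "sqlite" | "postgres" | "mysql";

interface ProbeQuery {
  sql: string;
  params: string[];
}

// Builds the query whose result rows reveal whether `column` exists on `table`.
// The table name is interpolated for SQLite, so pass only trusted identifiers.
function buildColumnProbe(backend: Backend, table: string, column: string): ProbeQuery {
  switch (backend) {
    case "sqlite":
      // PRAGMA table_info returns one row per column, each with a `name` field.
      return { sql: `PRAGMA table_info(${table})`, params: [] };
    case "postgres":
      return {
        sql:
          "SELECT column_name AS name FROM information_schema.columns " +
          "WHERE table_schema = current_schema() AND table_name = $1 AND column_name = $2",
        params: [table, column],
      };
    case "mysql":
      return {
        sql:
          "SELECT COLUMN_NAME AS name FROM information_schema.COLUMNS " +
          "WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = ? AND COLUMN_NAME = ?",
        params: [table, column],
      };
  }
}

// Works for all three probes: each yields rows carrying a `name` field.
function columnExists(rows: Array<{ name: string }>, column: string): boolean {
  return rows.some((row) => row.name === column);
}

// The backfill runs only if sources exist AND the legacy column is still there.
function shouldBackfillChannelDatabase(sourceCount: number, hasSourceId: boolean): boolean {
  return sourceCount > 0 && hasSourceId;
}
```

With one source registered and no sourceId column, `shouldBackfillChannelDatabase` returns false, which is the #3657 scenario that previously crashed.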

Adds 033_per_source_permissions.test.ts: reproduces #3657 (source present +
channel_database without sourceId → no throw), confirms back-compat backfill
when the column is present, and the no-sources case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4

* release: v4.11.5 (hotfix for #3657)

Bumps all five version files 4.11.4 -> 4.11.5. v4.11.4 was retracted because
migration 033 crashed Postgres/fresh-install startup (#3657); this release
carries the fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011JEaCGwY9Wz8jeV4e22GW4

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

[BUG] bootloop: ❌ Failed to migrate auto-responder triggers: error: tables can have at most 1600 columns