Skip to content

P1.13 FK constraints on users(id) lack ON DELETE CASCADE in 32 places (blocker of #376) #413

@jayzalowitz

Description

@jayzalowitz

Parent: #361
Blocks: #376

Context

While verifying the cascade scope for #376 (P1.6 delete-my-data endpoint), the migration audit surfaced a structural gap: the FK references on users(id) are inconsistent across the schema. Migrations 015 onward (plus the migrations that came in 027+) correctly use ON DELETE CASCADE. Migrations 002-013 and the core schema.sql body do NOT — they use bare REFERENCES users(id). This means a naive DELETE FROM users WHERE id = ? against the current schema fails with a foreign-key violation on the first child row.

This is a blocker for #376. Either #376 has to enumerate every table and DELETE in order (fragile — every new table added without CASCADE silently breaks #376), or this issue lands first and the cascade walk becomes a one-line operation.

Root Cause Analysis

The schema grew organically. Early migrations (002-013) and the original schema.sql predate the project's adoption of ON DELETE CASCADE as the default for per-user tables. Each migration was authored independently; nobody enforced "if your table has a user_id, it needs ON DELETE CASCADE." Migration 015 (ai-provider-settings.sql) was the first to explicitly cascade, and migrations from then on followed the pattern — but the old refs were never backfilled.

CockroachDB does not retroactively cascade. You have to ALTER TABLE ... DROP CONSTRAINT and ADD CONSTRAINT ... ON DELETE CASCADE. Cheap, but it's 32 separate constraints across 13 files.

Verified Current State

Verified 2026-05-25 against jayzalowitz/launch-skytwin-demo @ 8bb5379.

Refs WITHOUT ON DELETE CASCADE (32 total)

File Lines Table(s)
packages/db/src/schemas/schema.sql 21, 36, 82, 101, 153, 169, 264 7 tables in the core schema (twin_profiles, etc. — full list to enumerate during fix)
packages/db/src/migrations/002-oauth-tokens.sql 5, 18 oauth_tokens, oauth_states
packages/db/src/migrations/003-behavioral-patterns.sql 5, 21 behavioral_patterns, pattern_observations
packages/db/src/migrations/004-eval-history.sql 6, 21 eval_runs, eval_decisions
packages/db/src/migrations/005-scope-expansion.sql 12, 26, 42, 53, 64, 76 6 tables (cross_domain_traits, etc.)
packages/db/src/migrations/006-trust-tier-audit.sql 6 trust_tier_audit
packages/db/src/migrations/008-spend-tracking.sql 6 spend_records
packages/db/src/migrations/009-domain-autonomy-escalation.sql 7, 20 domain_autonomy, domain_autonomy_audit
packages/db/src/migrations/011-sessions.sql 4 sessions
packages/db/src/migrations/012-mempalace.sql 10, 40, 59, 72, 86, 101, 119, 143 8 mempalace tables

Refs WITH ON DELETE CASCADE (correct, do not touch)

Migrations 015, 027, 028, 030, 031, 032, 035, 036, 037, 038, 039, 040, 048, 049, 050, 059 — all consistently cascade.

What DELETE FROM users WHERE id = ? does today

CockroachDB raises FK violation on the first child row from any of the 32 no-cascade refs above. The delete is rejected; no rows touched. The error is structurally correct (the FK constraint prevents data orphan) but operationally fatal for #376.

What's Working Well (Do Not Touch)

  • Migrations 015+ are correctly modeled. Don't change them; don't "re-cascade" what's already cascading.
  • The mempalace tables in 012-mempalace.sql are a feature-flagged backend (MEMORY_BACKEND=mempalace) — cascading them is still right, just verify the migration doesn't break the in-flight feature flag.
  • The sessions table cascade behavior: sessions ARE per-user state and SHOULD cascade. Don't add an exception.

Proposed Change

A single migration 060-cascade-cleanup.sql that drops and re-adds every no-cascade user_id FK constraint with ON DELETE CASCADE.

Pattern per constraint:

-- Example for migrations/002-oauth-tokens.sql line 5
ALTER TABLE oauth_tokens DROP CONSTRAINT IF EXISTS oauth_tokens_user_id_fkey;
ALTER TABLE oauth_tokens ADD CONSTRAINT oauth_tokens_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE;

The constraint name will vary per table — find them with:

SELECT conname, conrelid::regclass AS table_name
FROM pg_constraint
WHERE confrelid = 'users'::regclass AND contype = 'f';

Repeat for all 32 tables. The migration is idempotent (the IF EXISTS makes re-runs safe).

After the migration lands, DELETE FROM users WHERE id = ? succeeds with a single statement; #376 becomes trivial.

Implementation Details

The migration generator should:

  1. Query pg_constraint to enumerate every FK on users(id).
  2. For each, check whether confdeltype = 'c' (cascade); skip if already cascading.
  3. Drop + re-add with cascade.

In practice this is a one-shot script — author writes the migration once, by hand or generator, and commits. Migration runs once per environment.

CockroachDB note: dropping and re-adding the constraint is online (no table lock) for tables under a few million rows, which all of these are.

Acceptance Criteria

  1. Migration 060-cascade-cleanup.sql exists and is idempotent (re-running is a no-op).
  2. After the migration, all 32 FK constraints on users(id) have ON DELETE CASCADE.
  3. DELETE FROM users WHERE id = '<test-uuid>' (with seeded data across all tables) succeeds without FK violation.
  4. Existing test suite passes; no regression on inserts/reads.
  5. New integration test: create a user → write rows to every per-user table → DELETE the user → assert zero rows remain in any per-user table.
  6. Verification query in the issue body (SELECT conname FROM pg_constraint WHERE confdeltype != 'c' AND confrelid = 'users'::regclass) returns zero rows after migration.

Testing Plan

Layer What Count
Unit Migration's SQL is syntactically valid (parse with pg / cockroach client) +1
Integration Seed all per-user tables → DELETE user → verify cascade clears every table +1
Verification query Post-migration, no FK on users(id) lacks cascade +1

The integration test is the load-bearing one. It also serves as #376's regression test.

Rollback Plan

Rolling back the migration means re-adding the no-cascade constraints. The rollback SQL mirrors the up migration:

ALTER TABLE oauth_tokens DROP CONSTRAINT oauth_tokens_user_id_fkey;
ALTER TABLE oauth_tokens ADD CONSTRAINT oauth_tokens_user_id_fkey
  FOREIGN KEY (user_id) REFERENCES users(id);  -- no CASCADE

Ship the rollback SQL as 060-cascade-cleanup.down.sql for any in-flight migration tooling that supports down-migrations.

Risk of rollback: if any user has already been deleted via #376's endpoint after this migration lands, the rolled-back state is invalid (deleted users would have left orphaned rows because the cascade was active during delete). In practice this means: once any production user has been deleted, this migration cannot be rolled back without restoring from backup. Document this in the migration's header comment.

Effort Estimate

Component Hours
Author the migration (enumerate constraints + write the cascade ALTERs) 1.5h
Integration test that seeds + deletes 2h
Verification query + readme update 0.5h
Code review + safe rollout 1h

Total: ~5h. Ships in the same PR series as #376 — this migration lands first, #376 lands second.

Files Reference

File Change
packages/db/src/migrations/060-cascade-cleanup.sql (new) The migration
packages/db/src/migrations/060-cascade-cleanup.down.sql (new) Rollback SQL
packages/db/__tests__/cascade-cleanup.test.ts (new) Integration test
packages/db/README.md (or migration header) Document the rollback risk

Out of Scope

  • The DELETE endpoint itself — that's P1.6 No "delete my data" endpoint exists #376.
  • Adding new cascade rules to non-user FKs (e.g., decision_id cascades). Future audit.
  • Changing the migration tooling. Use whatever the project uses today.
  • Backfilling cascade to tables in future migrations — those should be authored correctly to begin with (consider a lint rule: any new REFERENCES users(id) without ON DELETE CASCADE fails CI).

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingpriority/P1Must-fix before launchreleaseRelease / packaging

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions