Skip to content

πŸ€” Discussion β€” SQLite: disabling foreign keys for migration/import?Β #2471

@geelen

Description

@geelen

We have a couple of related issues:

1. D1 export produces unimportable files

D1 Export follows SQL dump, which outputs SQL in the following order for a DB with two tables, A and B:

  1. pragma defer_foreign_keys=TRUE
  2. Table definition A
  3. Inserts for row A
  4. Table definition B
  5. Inserts for row B

If table A has a column that references B, the INSERT on step 3 will fail with Parse error: no such table: main.B. If we reorder 3 and 4, then defer_foreign_keys does what you'd expect and allows the rows in A to be added with no corresponding rows in B to reference (without defer_foreign_keys, step 3 would fail with FOREIGN KEY constraint failed).

This appears to be an unavoidable difference between SQLite's default behaviour of setting PRAGMA foreign_keys=OFF; at the beginning of DB exports, and D1/workerd using pragma defer_foreign_keys=TRUE. From the SQLite docs, these are "DML errors" and are separate to the section on immediate/deferred constraints:

If the database schema contains foreign key errors that require looking at more than one table definition to identify, then those errors are not detected when the tables are created. Instead, such errors prevent the application from preparing SQL statements that modify the content of the child or parent tables in ways that use the foreign keys. Errors reported when content is changed are "DML errors" and errors reported when the schema is changed are "DDL errors". So, in other words, misconfigured foreign key constraints that require looking at both the child and parent are DML errors. The English language error message for foreign key DML errors is usually "foreign key mismatch" but can also be "no such table" if the parent table does not exist. Foreign key DML errors are reported if:

β€’ The parent table does not exist, or
β€’ The parent key columns named in the foreign key constraint do not exist, or
β€’ The parent key columns named in the foreign key constraint are not the primary key of the parent table and are not subject to a unique constraint using collating sequence specified in the CREATE TABLE, or
β€’ The child table references the primary key of the parent without specifying the primary key columns and the number of primary key columns in the parent do not match the number of child key columns.

The easy solution would be to reorder the output of d1 export so that all tables are generated first, then all INSERT statements. That should make all foreign key violations deferrable, as there's no way for a dump to contain a DML error. But that doesn't solve the related issues.

2. Dropping tables in migrations cannot prevent ON DELETE CASCADE (workerd-issue prisma-issue)

Some table altering tasks in SQLite cannot be achieved without replacing it with a new table (by dropping - recreating the table) (like adding foreign keys, changing primary keys, updating column type). Dropping a table which has column referenced by other tables with ON DELETE CASCADE will delete all records of the child tables.

This is also an issue with pragma defer_foreign_keys=TRUE not being a substitute for PRAGMA foreign_keys=OFF;. Tools like prisma, aware of SQLite's limitations, generate migrations like the following to migrate between table schemas (from this reproduction, thank you @hrueger!):

-- Initial schema
CREATE TABLE "User" (
    "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    "email" TEXT NOT NULL
);
CREATE TABLE "Post" (
    "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    "title" TEXT NOT NULL,
    "authorId" INTEGER,
    CONSTRAINT "Post_authorId_fkey" FOREIGN KEY ("authorId") REFERENCES "User" ("id") ON DELETE SET NULL ON UPDATE CASCADE
);
-- Generated migration for adding 'name' to 'User' table:
PRAGMA defer_foreign_keys=ON;
PRAGMA foreign_keys=OFF;
CREATE TABLE "new_User" (
    "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    "email" TEXT NOT NULL,
    "name" TEXT NOT NULL DEFAULT 'Anonymous'
);
INSERT INTO "new_User" ("email", "id") SELECT "email", "id" FROM "User";
DROP TABLE "User";
ALTER TABLE "new_User" RENAME TO "User";
CREATE UNIQUE INDEX "User_email_key" ON "User"("email");
PRAGMA foreign_keys=ON;
PRAGMA defer_foreign_keys=OFF;

It's not clear to me how much of the above is D1-specific (the absence of a BEGIN TRANSACTION is surely specific to us, for example), but since D1 always runs queries within transactions, the PRAGMA foreign_keys=OFF; statement has no effect (which is why we use defer_foreign_keys in the first place). The result is that D1 runs all ON DELETE clauses when you drop the old table, resulting in lost data.

3. No support for D1 export for databases containing FTS5 full-text search tables (workers-sdk issue)

We only support one kind of virtual table, FTS5, for full text search. But, when porting SQLite's .dump command we found that it would generate SQL requiring writable_schema, a pragma we don't support.

This input SQL:

CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL, content TEXT NOT NULL);
CREATE VIRTUAL TABLE documents_fts USING FtS5(id, title, content, tokenize = porter);

Generates the following SQL using SQLite's native .dump command:

PRAGMA writable_schema=ON;
INSERT INTO sqlite_schema(type,name,tbl_name,rootpage,sql)VALUES('table','documents_fts','documents_fts',0,'CREATE VIRTUAL TABLE documents_fts USING FtS5(id, title, content, tokenize = porter)');
CREATE TABLE IF NOT EXISTS 'documents_fts_data'(id INTEGER PRIMARY KEY, block BLOB);
INSERT INTO documents_fts_data VALUES(1,X'');
INSERT INTO documents_fts_data VALUES(10,X'00000000000000');
CREATE TABLE IF NOT EXISTS 'documents_fts_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID;
CREATE TABLE IF NOT EXISTS 'documents_fts_content'(id INTEGER PRIMARY KEY, c0, c1, c2);
CREATE TABLE IF NOT EXISTS 'documents_fts_docsize'(id INTEGER PRIMARY KEY, sz BLOB);
CREATE TABLE IF NOT EXISTS 'documents_fts_config'(k PRIMARY KEY, v) WITHOUT ROWID;
INSERT INTO documents_fts_config VALUES('version',4);
PRAGMA writable_schema=OFF;

We chose to disallow D1 exports containing virtual tables until we could find a resolution, as we didn't want to allow writable_schema (it requires SQLITE_DBCONFIG_DEFENSIVE to be disabled). Our best option was to skip the create table statements for the generated tables (documents_fts_data, documents_fts_idx, documents_fts_content documents_fts_docsize and documents_fts_config) and instead of the INSERT INTO sqlite_schema call, instead recreate the CREATE VIRTUAL TABLE that created it. The result won't necessarily be identical, but should be functionally equivalent for normal use. But maybe we should explore alternatives.

Proposal: SQL unsafe mode for D1 migrations

The lack of PRAGMA foreign_keys=OFF stems from the fact that all workerd queries execute within a transaction, which is implicitly created whenever a SQL query includes a write statement. If we had an API that let us execute a pragma without this logic, we could support importing from unmodified SQL dumps.

I propose the following api, sql.unsafeExec, which instead of implicitly creating a transaction, requires the user to submit SQL with multiple statements:

sql.execUnsafe(`
  PRAGMA ...;
  BEGIN;
  ...
  COMMIT;
`)

unsafeExec enforces these constraints:

  • Within an .execUnsafe block, a different authorizer is invoked, allowing BEGIN/COMMIT/ROLLBACK, as well as SAVEPOINT/RELEASE
  • Any write statement outside of a transaction result in an error
  • Multiple sequential transactions in a single call are permitted

This doesn't give us writable_schema without disabling SQLITE_DBCONFIG_DEFENSIVE (which I don't know what the implications of that would be, so I'm not suggesting it here), but we could wire up D1 to call execUnsafe on migrations/import and allow tools like Prisma to manage the schema without needing so much special treatment for D1.

This solves problem 1 and 2, and gives us a potential way to solve problem 3 in the future. Thoughts?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions