-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql, backfill: rolling back a schema change that drops a column leads to restored column with missing data #46541
Description
If a schema change involves dropping a column, and the column doesn't have a default value (or anything that would require a backfill if the column were being added for the first time), rolling back the schema change for any reason can produce a restored column with missing data.
Here's an example, on 19.2.2, where we attempt to drop a column and add a unique index in the same transaction, and adding the unique index fails:
root@127.0.0.1:54369/movr> create table t (a int, b int); insert into t values (1, 1), (2, 1);
INSERT 2
Time: 13.643ms
root@127.0.0.1:54369/movr> begin; alter table t drop column a; create unique index on t(b); commit;
pq: transaction committed but schema change aborted with error: (23505): duplicate key value (b)=(1) violates unique constraint "t_b_key"
HINT: Some of the non-DDL statements may have committed successfully, but some of the DDL statement(s) failed.
Manual inspection may be required to determine the actual state of the database.
--
See: https://github.com/cockroachdb/cockroach/issues/42061
root@127.0.0.1:54369/movr> select * from t;
b | a
+---+------+
1 | NULL
1 | NULL
(2 rows)
Time: 2.192ms
I haven't directly verified that it's possible to only have some (or none) of the values in the column be nulled out, instead of all of them, but it should be possible in theory if we cancel the job before the backfill to drop the column is finished.
Ultimately, this is because we treat the rollback of a dropped column identically to adding the column in the first place. Maybe a relatively easy improvement would be to run a backfill that deletes all the values whenever we're doing a rollback that is initiated in the delete-only state, to at least avoid having a partially deleted column.
This is a problem that's existed for a long time (at least since 19.1, and maybe for as long as the column backfiller has existed), but I couldn't find an existing issue where it's discussed.
Jira issue: CRDB-5083
Epic CRDB-104