Skip to content

sql: rollback to savepoint after schema change can lead to deadlock #24885

@andreimatei

Description

@andreimatei

The following deadlocks:

begin; savepoint cockroach_restart; create table t(x int); rollback to savepoint cockroach_restart;
select * from t;

The reason why it deadlocks is that the create left an intent on something (probably a table name record); the intent will never commit because txn's epoch is incremented by the rollback to savepoint, but it still exists. The select then tries to resolve that name in order to get a lease on that table - and that name resolution process uses a different txn which gets blocked on the intent.
Funny enough, if you do another create after the rollback, that works - probably because all reads/writes are done using the main txn in that case.

Found while investigating #24785

One way to fix this is for transactions that perform schema changes to employ a policy that says that they don't get to keep intents from old epochs - or rather, whenever we would bump the epoch, we'd create a new txn instead. This would mean that they don't get to benefit from the "reservation" effect of leaving intents around, but who cares. The moral justification is that these transactions touched "special" keys.
Another way is to do all lease acquisitions and name resolution and such by using the transaction running the statement that caused the lease. But I don't like that at all; it goes against divorcing the table descriptor caching layer from SQL transactions which I think is generally a great thing.

Opinions?

cc @vivekmenezes @bdarnell

Metadata

Metadata

Assignees

Labels

A-schema-changesC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.S-2Medium-high impact: many users impacted, risks of availability and difficult-to-fix data errors

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions