kv: child transactions #54633
Description
Is your feature request related to a problem? Please describe.
KV transactions provide serializable isolation for a series of KV operations. Importantly, the KV layer also provides deadlock detection to ensure that the system remains live. This deadlock detection system assumes that each transaction corresponds to a single thread of execution which depends on the liveness of another transaction to make progress. We've observed a variety of deadlocks related to descriptor lease acquisition and query planning which invalidated that assumption (#24885, #46224, #46447). The solutions (#46170, #46384, #46588) to these deadlocks were to run the "child" operation at PRIORITY HIGH in the hope that the transactions holding the intents on which the operation blocked were not themselves running at that priority. Unfortunately, this was really just a band-aid: if users did run their schema change operations at PRIORITY HIGH, we'd still have those deadlocks.
Another related problem with the current transactional model is that it does not provide a mechanism to run a secondary transaction on the thread of execution of another transaction when that secondary transaction might encounter intents (of the parent transaction or of another transaction which might be blocked on the parent). This is a critical component of the proposed scheme for providing transactional schema changes. It is important to be able to make changes to the database in order to coordinate with concurrent transactions, while allowing the parent to avoid interacting with the changing keys so that it retains its serializable view of the database.
Describe the solution you'd like
The proposal is to create a new type of transaction, a child transaction, to join the two existing types: root transactions and leaf transactions. A child transaction is an independent read-write transaction that refers to a parent transaction at a given sequence and snapshot. For the purpose of deadlock detection, the parent will be seen as being blocked on the child (more details below). The child transaction will read the parent's written values as though it were the parent transaction.
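To illustrate why treating the parent as blocked on the child matters, here is a minimal sketch using an in-memory wait-for graph. The `waitsFor` type, the `hasCycle` helper, and the transaction names are hypothetical and exist only for this example; the real KV deadlock detection is distributed and is not implemented this way. In the scenario, the child blocks on an intent of another transaction, which in turn blocks on an intent of the parent. The cycle only becomes visible once the parent-to-child edge is recorded.

```go
package main

import "fmt"

// waitsFor is a hypothetical in-memory wait-for graph: each key is a
// transaction name, and the value lists the transactions it waits on.
type waitsFor map[string][]string

// hasCycle reports whether a cycle is reachable from start, using a
// depth-first search that tracks the nodes on the current path.
func (g waitsFor) hasCycle(start string) bool {
	onPath := map[string]bool{}
	done := map[string]bool{}
	var visit func(n string) bool
	visit = func(n string) bool {
		if onPath[n] {
			return true // reached a node already on the current path
		}
		if done[n] {
			return false // fully explored earlier, no cycle through it
		}
		onPath[n] = true
		for _, m := range g[n] {
			if visit(m) {
				return true
			}
		}
		onPath[n] = false
		done[n] = true
		return false
	}
	return visit(start)
}

func main() {
	// The child waits on "other", which waits on the parent's intent.
	g := waitsFor{
		"child": {"other"},
		"other": {"parent"},
	}
	fmt.Println("without parent->child edge:", g.hasCycle("child"))

	// Record that the parent is blocked on its child, as proposed.
	g["parent"] = []string{"child"}
	fmt.Println("with parent->child edge:", g.hasCycle("child"))
}
```

Without the extra edge the graph is a simple chain and nothing is reported, which mirrors the undetected deadlocks described in the problem statement.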
Implementation details
The child transaction will have a new field in its enginepb.TxnMeta. Child transactions may create their own child transactions. In that case, TxnMeta.Parent.Parent will be non-nil. This effectively forms a linked list of parents. We do not expect the nesting to get very deep.
// Parent, if non-nil, represents the parent of this transaction ...
Parent *TxnMeta
When the MVCC code encounters an intent it will check if it refers to a parent (recursively).
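The ancestor check could look roughly like the sketch below. It uses a pared-down, hypothetical `TxnMeta` with a string `ID` standing in for the real transaction ID, and a hypothetical helper `isSelfOrAncestor`; neither reflects the actual enginepb or MVCC code. It walks the `Parent` chain iteratively rather than recursively, and it ignores the sequence and snapshot at which the child refers to its parent.

```go
package main

import "fmt"

// TxnMeta is a simplified, hypothetical stand-in for enginepb.TxnMeta.
type TxnMeta struct {
	ID string // stand-in for the real transaction ID

	// Parent, if non-nil, represents the parent of this transaction.
	Parent *TxnMeta
}

// isSelfOrAncestor reports whether the intent's transaction ID belongs
// to the reader itself or to any transaction in its chain of parents.
func isSelfOrAncestor(reader *TxnMeta, intentTxnID string) bool {
	for t := reader; t != nil; t = t.Parent {
		if t.ID == intentTxnID {
			return true
		}
	}
	return false
}

func main() {
	root := &TxnMeta{ID: "root"}
	child := &TxnMeta{ID: "child", Parent: root}
	grandchild := &TxnMeta{ID: "grandchild", Parent: child}

	fmt.Println(isSelfOrAncestor(grandchild, "root")) // ancestor's intent
	fmt.Println(isSelfOrAncestor(root, "child"))      // a child is not an ancestor
	fmt.Println(isSelfOrAncestor(child, "other"))     // unrelated transaction
}
```

Because the depth is expected to stay small, a linear walk over the chain should be cheap for each intent encountered.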
Deadlock breaking during lease acquisition
A goal of this work should be to clean up the hacks introduced in #46170 and #46384, which use PRIORITY HIGH to break deadlocks caused by lease acquisition blocking on an earlier epoch or a rolled-back sequence number. Unfortunately, this isn't that simple! The problem is that today we use a singleflight group to coalesce lease acquisition, and we have no way to represent the dependency on that singleflight group in our distributed deadlock detection.
Jira issue: CRDB-3733