storage/concurrency: prep lock-table for 20.1 release #44976
Description
Release 20.1 will contain a new approach to kv-level transaction locking. Instead of storing all locks inline in the MVCC keyspace and finding locks during request evaluation, locks will now be maintained in a separate lock-table. While replicated locks (intents) will not be pulled into a persistent lock-table keyspace during this release, the lock-table structure will track these locks, as well as unreplicated (best-effort) locks which enable SELECT FOR UPDATE (see #40205). Additionally, the lock-table structure will lead to significant improvements in transaction queuing. The lock-table's lock wait-queues will replace the current contentionQueue, leading to improved fairness between transactions and reduced latency now that the lock wait-queues react directly to intent resolution. For more on this change, see #41720.
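To make the idea concrete, here is a minimal Go sketch of a lock-table that tracks both replicated and unreplicated locks and keeps a FIFO wait-queue per key. It is an illustration only, not the storage/concurrency implementation: the names `Durability`, `request`, `lockState`, `lockTable`, `Acquire`, and `Release` are hypothetical, keys are plain strings rather than spans, and the lock is handed directly to the next waiter instead of going through reservations.

```go
package main

import "fmt"

// Durability is a hypothetical stand-in for the lock durability modes.
type Durability int

const (
	Replicated   Durability = iota // an intent, also written to the MVCC keyspace
	Unreplicated                   // best-effort, tracked in memory only
)

// request identifies a transaction and the durability of the lock it wants.
type request struct {
	txn string
	dur Durability
}

// lockState tracks the holder of one key and the requests queued behind it.
type lockState struct {
	holder  request
	waiters []request // FIFO: waiters[0] has waited the longest
}

// lockTable maps keys to their lock state. A real lock-table would be
// ordered by key and safe for concurrent use; this sketch is neither.
type lockTable struct {
	locks map[string]*lockState
}

func newLockTable() *lockTable {
	return &lockTable{locks: make(map[string]*lockState)}
}

// Acquire grants the lock if the key is free or already held by the same
// transaction. Otherwise it queues the request and returns false.
func (lt *lockTable) Acquire(key string, req request) bool {
	ls, ok := lt.locks[key]
	if !ok {
		lt.locks[key] = &lockState{holder: req}
		return true
	}
	if ls.holder.txn == req.txn {
		return true
	}
	for _, w := range ls.waiters {
		if w.txn == req.txn {
			return false // already queued
		}
	}
	ls.waiters = append(ls.waiters, req)
	return false
}

// Release drops the lock held by txn and hands it to the longest-waiting
// request. It returns the new holder's transaction, or "" if none.
func (lt *lockTable) Release(key, txn string) string {
	ls, ok := lt.locks[key]
	if !ok || ls.holder.txn != txn {
		return ""
	}
	if len(ls.waiters) == 0 {
		delete(lt.locks, key)
		return ""
	}
	ls.holder, ls.waiters = ls.waiters[0], ls.waiters[1:]
	return ls.holder.txn
}

func main() {
	lt := newLockTable()
	fmt.Println(lt.Acquire("k", request{txn: "txnA", dur: Replicated}))   // true
	fmt.Println(lt.Acquire("k", request{txn: "txnB", dur: Unreplicated})) // false
	fmt.Println(lt.Acquire("k", request{txn: "txnC", dur: Unreplicated})) // false
	fmt.Println(lt.Release("k", "txnA"))                                  // txnB
}
```

Because waiters are released in arrival order as soon as the holder releases, the queue reacts directly to lock release rather than relying on waiters to poll, which is the fairness and latency property the description attributes to the lock wait-queues.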
The rest of this issue enumerates the remaining work items for the lock-table and storage/concurrency package that should get into this release. It does not include any work items that are specific to SELECT FOR UPDATE's use of the storage/concurrency package.
Must Have
- pull Store-level concurrency retry loop under Replica: @nvanbenschoten storage: pull Store-level concurrency retry loop under Replica, clean up req validation #43138
- stop modifying requests and returning responses from TxnWaitQueue: @nvanbenschoten storage: stop modifying requests and returning responses from TxnWaitQueue #43383
- prep LocalResult struct to carry LockUpdate and LockAcquisition events to feed into lock-table: @nvanbenschoten storage/batcheval/result: perform various cleanup on LocalResult struct #43703
- define lock Strength and Durability modes: @nvanbenschoten storage/concurrency/lock: define lock Strength and Durability modes #44054
- create lock-table and lock wait-queue structures: @sumeerbhola storage/concurrency: lock table for tracking replicated and unreplicated locks and sequencing requests #43740
- generate type-safe specializations of interval B-tree: @nvanbenschoten util/interval: generate type-safe specializations of interval B-tree #43850
- define concurrency control interfaces: @nvanbenschoten storage/concurrency: define concurrency control interfaces #44787
- randomized test for lockTable with concurrency: @sumeerbhola storage/concurrency: randomized test for lockTable with concurrency #44791
- implement lockTableWaiter interface: @nvanbenschoten storage/concurrency: implement lockTableWaiter #44885
- micro-benchmark lockTable: @sumeerbhola storage/concurrency: benchmark for lockTable #44964
- implement concurrency.Manager interface: @nvanbenschoten storage/concurrency: implement concurrency Manager #45062
- don't push reservation holders, only lock holders: @nvanbenschoten storage/concurrency: don't push reservation transactions #45420
- hook concurrency.Manager structure into storage package: @nvanbenschoten storage: integrate Concurrency Manager into Replica request path #45482
- push reservation holders to detect deadlocks: @nvanbenschoten storage/concurrency: push reservation holders to detect deadlocks #45567
- replace btree implementation in lockTable with specialized copy-on-write btree: @sumeerbhola storage/concurrency: switch lockTableImpl to use the specialized #45276
- place memory limits on lockTable and degrade gracefully when limits are hit
- add testing for lock upgrades (i.e. with epoch bumps and ignored seq nums): @sumeerbhola kvserver/concurrency: misc lock table tests to cover functionality #45769
- support non-transactional writes in lock-table: @sumeerbhola storage/concurrency: add support for non-transactional writes to #44975
- allow ReadOnly and ReadWrite spans passed to lockTable to overlap: @sumeerbhola storage/concurrency: changes to lock table to handle requests that #45124
- make waitElsewhere state be used only for a replicated lock that is held: @sumeerbhola Locktable improvements #45040, storage/concurrency: test and bugfix for clearing the locks when the #45080
- change write timestamp declared by ResolveIntent{Range} requests to txn.MinTimestamp: @nvanbenschoten storage/batcheval: declare intent resolution at txn MinTimestamp #45232
- update waitingState.held on each waiter when locks are acquired and released: @sumeerbhola Locktable improvements #45040
- eliminate the doneWaiting slice that is used as a return value for many lockState methods: @sumeerbhola Locktable improvements #45040
- clear queued waiters when part of same txn as discovered lock: @nvanbenschoten storage/concurrency: release reqs from same txn as discovered lock #45601
- optimize memory allocations in lockTable: @nvanbenschoten storage/concurrency: pool concurrency manager memory allocations #45633
- stabilize kv/contention/nodes=4: @nvanbenschoten
- benchmark cockroach with and without concurrency.Manager integration using kv, sysbench, and tpcc to detect regression: @nvanbenschoten
- disable lock-table when not leaseholder: @nvanbenschoten kvserver/concurrency: disable lock-table when not leaseholder #45830
Nice To Have
- use txn priorities when ordering requests to avoid a higher-priority request waiting on a lower-priority reservation holder @sumeerbhola
- benchmark comparison between lazy waiting (current) and eager waiting in contended workloads @sumeerbhola
- re-introduce invariant that reservation holder is always the request that acquires locks
- distinguish in lockTableGuardImpl between waiting on lock holder and waiting on queued requests: storage/concurrency: implement lockTableWaiter #44885
- add visibility into state of lockTable through metrics and/or debug pages: kv/concurrency: expose lock table counters, introduce timeseries metrics #67350
- simplify state transitions in acquireLock by not removing from lock holder's lockTableGuardImpl.mu.locks
- eliminate waiting details except when waiting on lock holder, and do related code simplification @sumeerbhola
Feel free to modify. Please check off when changes merge.