kv: Scans with limit scan too much of lockTable #49973
Description
This is analogous to #9521, but for the lockTable.
Now that the lockTable is scanned ahead of evaluation, limited scans observe locks over their entire key range instead of just up to the point where they hit their limit. This can artificially increase contention in much the same way it does in #9521. The difference is that in #9521, limited scans will wait excessively long for artificially contending operations to complete. In this issue, limited scans will wait excessively long for the entire transaction that issued the artificially contending operations to complete.
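To make the over-scanning concrete, here is a minimal sketch that compares the locks a limited scan observes over its full requested span with the locks it would observe if it only looked up to the point where it hit its limit. The helper names (`lockedKeysInSpan`, `limitedScanEnd`) and the use of plain sorted string slices are assumptions made for illustration; they are not CockroachDB APIs.

```go
package main

import "sort"

// lockedKeysInSpan returns the locked keys that fall in [start, end).
// Hypothetical helper: stands in for a lockTable scan over a span.
func lockedKeysInSpan(locks []string, start, end string) []string {
	var out []string
	for _, k := range locks {
		if k >= start && k < end {
			out = append(out, k)
		}
	}
	return out
}

// limitedScanEnd returns the exclusive end of the key range a scan over
// [start, end) actually needs once it has read `limit` keys.
// `keys` must be sorted. If the limit is never reached, the full end
// bound is returned.
func limitedScanEnd(keys []string, start, end string, limit int) string {
	n := 0
	for i := sort.SearchStrings(keys, start); i < len(keys) && keys[i] < end; i++ {
		n++
		if n == limit {
			// The smallest key strictly greater than the last key read.
			return keys[i] + "\x00"
		}
	}
	return end
}
```

With keys `a` through `e`, a lock on `d`, and a scan over `[a, f)` with a limit of 2, `lockedKeysInSpan(locks, "a", "f")` reports the lock on `d`, while `lockedKeysInSpan(locks, "a", limitedScanEnd(keys, "a", "f", 2))` reports none. The first result is the artificial contention described above.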
The design of the lockTable took this into consideration, though we haven't implemented a solution for this issue yet. Namely, the lockTable is built using an immutable btree which supports O(1) snapshots in the same way that the latchManager does. The point of this is 1) to be able to scan the lockTable incrementally once we introduce a lockAwareIterator and 2) to potentially do exactly what we're doing in #33373 and push that prototype over the finish line. #33373 evaluates limited scans optimistically, without latching, and only checks for conflicts after the fact, once it has determined the full bounds of the scan. The ability to snapshot the lockTable means that we could do the same thing there.
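As a rough illustration of why an immutable tree gives O(1) snapshots, the sketch below uses a path-copying binary search tree. This is a deliberate simplification and an assumption on my part: it is not a btree, it is unbalanced, and it is not how the lockTable or latchManager is implemented. All names (`node`, `lockSet`, `Snapshot`, `Keys`) are hypothetical.

```go
package main

// node is an immutable tree node; existing nodes are never modified.
type node struct {
	key         string
	left, right *node
}

// insert returns a new root that shares every untouched subtree with
// the old one (path copying), leaving the old root fully usable.
func insert(n *node, key string) *node {
	if n == nil {
		return &node{key: key}
	}
	cp := *n
	switch {
	case key < n.key:
		cp.left = insert(n.left, key)
	case key > n.key:
		cp.right = insert(n.right, key)
	}
	return &cp
}

// lockSet is a hypothetical stand-in for a set of locked keys.
type lockSet struct{ root *node }

// Add records a lock on key by swapping in a new root.
func (s *lockSet) Add(key string) { s.root = insert(s.root, key) }

// Snapshot is O(1): it only copies the root pointer.
func (s *lockSet) Snapshot() *lockSet { return &lockSet{root: s.root} }

// scan visits keys in [start, end) in order and stops early when
// visit returns false, which is what allows incremental scanning.
func scan(n *node, start, end string, visit func(string) bool) bool {
	if n == nil {
		return true
	}
	if start < n.key && !scan(n.left, start, end, visit) {
		return false
	}
	if n.key >= start && n.key < end && !visit(n.key) {
		return false
	}
	if n.key < end {
		return scan(n.right, start, end, visit)
	}
	return true
}

// Keys collects all locked keys in [start, end).
func (s *lockSet) Keys(start, end string) []string {
	var out []string
	scan(s.root, start, end, func(k string) bool {
		out = append(out, k)
		return true
	})
	return out
}
```

A snapshot taken before a later `Add` keeps returning the older set of keys, so an optimistically evaluated scan could check the snapshot against its final bounds without blocking concurrent writers to the live structure.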
The saving grace for v20.1 is that most intents don't actually end up in the lockTable because replicated locks are only pulled into the lockTable when they are discovered in the MVCC keyspace. This means that in most cases, this is a non-issue under low to moderate contention. However, it can become an issue under either high contention or when unreplicated locks are used heavily. A patch I'm about to push will mitigate the latter concern.