kv: follower read that hits intent may get stuck in endless loop #46148
Description
@dt saw on one of his clusters that a historical read issued by a samplerProcessor was stuck in an indefinite loop: it hit an intent during evaluation, did not push the intent's txn, and retried evaluation. Some debugging revealed that this endless loop was taking place on a follower replica.
Follower replicas do not have their lockTable enabled, which means that they do not add discovered locks to the table. As a result, the intents were ignored during each "concurrency retry" and never pushed. Because the request was performing a follower read, nothing prevented it from evaluating again on its next iteration, so the loop never terminated. This is a counter-example to the note in:
cockroach/pkg/kv/kvserver/concurrency/testdata/lock_table/disable
Lines 39 to 42 in a06be62

    # NOTE: this won't end up in an infinite loop of scanning a disabled
    # lock-table and discovering but ignoring the same lock in practice
    # because the second pass through evaluation is likely to hit a
    # NotLeaseholderError, bouncing the request back to the client.
We need to fix this by either pushing immediately when an intent is discovered on a replica without an enabled lockTable or returning a NotLeaseholderError in this situation.
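To make the failure mode and the two candidate fixes concrete, here is a toy model of the evaluate/retry loop. It is only a sketch under stated assumptions: `toyReplica`, `policy`, `followerRead`, and both error values are hypothetical names invented for illustration and are not the kvserver or concurrency package API. The retry budget exists only so the sketch terminates; the real loop had no such bound.

```go
package main

import (
	"errors"
	"fmt"
)

// policy is a hypothetical knob naming how a replica reacts when evaluation
// discovers an intent while its lock table is disabled.
type policy int

const (
	ignoreIntent          policy = iota // behavior described in this issue
	pushImmediately                     // candidate fix 1
	redirectToLeaseholder               // candidate fix 2
)

// Toy stand-ins; neither is a real CockroachDB error type.
var (
	errNotLeaseholder = errors.New("not leaseholder (toy stand-in for NotLeaseholderError)")
	errStuck          = errors.New("retry budget exhausted: request would loop forever")
)

// toyReplica is a toy model of a replica, not the kvserver Replica.
type toyReplica struct {
	lockTableEnabled bool
	intentPresent    bool
}

// evaluate reports whether the read ran into an unresolved intent.
func (r *toyReplica) evaluate() bool { return r.intentPresent }

// push models a successful push of the intent's txn, which resolves the intent.
func (r *toyReplica) push() { r.intentPresent = false }

// followerRead runs the evaluate/retry loop and returns the number of
// evaluation attempts made, plus an error if the read did not succeed.
func followerRead(r *toyReplica, p policy, maxAttempts int) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if !r.evaluate() {
			return attempt, nil // no intent in the way: the read succeeds
		}
		if r.lockTableEnabled {
			// With an enabled lock table the discovered lock is tracked and
			// the request pushes before its next evaluation.
			r.push()
			continue
		}
		switch p {
		case pushImmediately:
			r.push()
		case redirectToLeaseholder:
			return attempt, errNotLeaseholder
		case ignoreIntent:
			// The discovered lock is dropped, so nothing changes before the retry.
		}
	}
	return maxAttempts, errStuck
}

func main() {
	for _, p := range []policy{ignoreIntent, pushImmediately, redirectToLeaseholder} {
		r := &toyReplica{lockTableEnabled: false, intentPresent: true}
		attempts, err := followerRead(r, p, 5)
		fmt.Printf("policy=%d attempts=%d err=%v\n", p, attempts, err)
	}
}
```

In this model, `ignoreIntent` exhausts the retry budget, `pushImmediately` succeeds on the second evaluation, and `redirectToLeaseholder` bounces the request after the first one. Which option is preferable in the real code depends on considerations the toy does not capture, such as whether a follower should be issuing pushes at all.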