kv: release latches before evaluation of intent resolution

This is related to https://github.com/cockroachdb/cockroach/issues/66485, which covers read-only request evaluation.
The high-level idea is covered in https://github.com/cockroachdb/cockroach/issues/41720, but that issue is very broad.

Releasing latches would allow read requests (which acquire latches, even though they release them early) to evaluate concurrently with the intent resolution. It would also allow a write to evaluate concurrently with the intent resolution and write another intent even though the previous one is still present in the storage engine. Additionally, based on recent incidents, we think that keeping background intent resolution from interfering with normal reads/writes would improve the stability of the system.

This relies on the separated lock table since that adds the possibility of multiple intents for the same key (when all but one are known to be committed or aborted), and adds code-level structures for interleaving intent state into the MVCC key space in the engine.

The correctness issue with releasing latches during intent resolution is that we can’t let the lockTable data-structure and the source of truth in the engine go out-of-sync in arbitrary ways. The lockTable data-structure contains a subset of the state in the engine, and this subset property is necessary for correctness. Today the lockTable is updated to remove an intent only after the intent is removed from the engine, and latches are held for the whole duration, so no other request can observe the gap between the two steps. Request evaluation also adds discovered locks to the lockTable data-structure. This preserves the subset property because the request evaluation is holding latches. The one exception is optimistic scans, which evaluate without holding latches, but they are careful not to add discovered locks if they were racing with intent resolution.
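To make the subset property concrete, here is a minimal toy sketch in Go. It is not CockroachDB code: `lockKey`, `lockSet`, `subsetOf`, and `staleDiscoveryBreaksSubset` are hypothetical names introduced only for illustration, and the engine and lockTable are modeled as plain in-memory sets. It shows how a reader that saw an intent before resolution, and adds it as a discovered lock after resolution, would leave the lockTable holding state the engine no longer has.

```go
package main

import "fmt"

// lockKey identifies an intent by key and transaction (hypothetical type).
type lockKey struct {
	Key string
	Txn string
}

// lockSet is a toy model of a set of intents or locks (hypothetical type).
type lockSet map[lockKey]struct{}

// subsetOf reports whether every lock tracked in lockTable is also
// present in the engine, i.e. whether the subset property holds.
func subsetOf(lockTable, engine lockSet) bool {
	for k := range lockTable {
		if _, ok := engine[k]; !ok {
			return false
		}
	}
	return true
}

// staleDiscoveryBreaksSubset replays a race that latches normally prevent
// and returns whether the subset property holds before and after it.
func staleDiscoveryBreaksSubset() (before, after bool) {
	intent := lockKey{Key: "a", Txn: "t1"}
	engine := lockSet{intent: struct{}{}}
	lockTable := lockSet{}

	// A reader scans the engine without latches and sees the intent.
	seen := intent
	before = subsetOf(lockTable, engine)

	// Intent resolution removes the intent from the engine first,
	// then from the lockTable (a no-op here, it was never tracked).
	delete(engine, intent)
	delete(lockTable, intent)

	// The reader now adds its stale observation as a discovered lock.
	lockTable[seen] = struct{}{}
	after = subsetOf(lockTable, engine)
	return before, after
}

func main() {
	before, after := staleDiscoveryBreaksSubset()
	fmt.Println("subset holds before race:", before)
	fmt.Println("subset holds after race:", after)
}
```

In this toy model the property holds before the race and fails after it, which is the kind of divergence that holding latches (or declining to add discovered locks when racing with resolution) is meant to rule out.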

Additionally, for performance, we don’t want many read requests trying to resolve locks that a concurrent intent resolution is already resolving.

One approach to maintain correctness is to limit the cases where intent resolution releases latches before evaluation. The simple case of intent resolution (see the fast path in `MVCCResolveWriteIntent`) is when the txn is committing and there are no timestamp or epoch changes and no rolled-back savepoints. In this case, requests evaluating concurrently with the intent resolution can quite easily behave as if the intent resolution has already happened (with an enhancement to `intentInterleavingIter`). The intent resolution would first mark the txn as committed in the `finalizedTxnCache` (and pin this txn in that cache), then release latches and evaluate. Any subsequent readers (and writers) would use the presence of the txn in the `finalizedTxnCache` to interleave this txn’s intents as committed writes. This eliminates the possibility that these readers/writers would try to add these intents as discovered locks or try to do intent resolution themselves.
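The proposal above can be sketched roughly as follows. This is a simplified illustration, not the actual implementation: `ResolveRequest`, `canReleaseLatchesEarly`, `finalizedCache`, and `readIntent` are hypothetical names, the cache is a toy stand-in for the `finalizedTxnCache` with a pin count, and the real change would live in `intentInterleavingIter` and the intent resolution path.

```go
package main

import "fmt"

// TxnID is a hypothetical stand-in for a transaction identifier.
type TxnID string

// Intent is a simplified, hypothetical view of a provisional write.
type Intent struct {
	Key   string
	Txn   TxnID
	Value string
}

// ResolveRequest is a hypothetical summary of one intent resolution.
type ResolveRequest struct {
	Txn                     TxnID
	Committed               bool
	TimestampChanged        bool
	EpochChanged            bool
	HasRolledBackSavepoints bool
}

// canReleaseLatchesEarly reports whether the resolution is the simple case:
// a committing txn with no timestamp or epoch change and no rolled-back
// savepoints.
func canReleaseLatchesEarly(r ResolveRequest) bool {
	return r.Committed && !r.TimestampChanged && !r.EpochChanged &&
		!r.HasRolledBackSavepoints
}

// finalizedCache is a toy model of a finalized-txn cache with pinning.
type finalizedCache struct {
	pinned map[TxnID]int
}

func newFinalizedCache() *finalizedCache {
	return &finalizedCache{pinned: map[TxnID]int{}}
}

// pin marks the txn as committed and keeps it from being evicted.
func (c *finalizedCache) pin(id TxnID) { c.pinned[id]++ }

// unpin drops one pin; the entry is removed when no pins remain.
func (c *finalizedCache) unpin(id TxnID) {
	if c.pinned[id] > 1 {
		c.pinned[id]--
	} else {
		delete(c.pinned, id)
	}
}

func (c *finalizedCache) isCommitted(id TxnID) bool { return c.pinned[id] > 0 }

// readIntent models what a concurrent reader does on meeting an intent:
// if the txn is pinned as committed, the intent is treated as a committed
// write; otherwise the reader must handle it as a lock.
func readIntent(c *finalizedCache, in Intent) (value string, mustHandleLock bool) {
	if c.isCommitted(in.Txn) {
		return in.Value, false
	}
	return "", true
}

func main() {
	cache := newFinalizedCache()
	req := ResolveRequest{Txn: "t1", Committed: true}
	in := Intent{Key: "a", Txn: "t1", Value: "v1"}

	if canReleaseLatchesEarly(req) {
		cache.pin(req.Txn) // 1. mark committed and pin
		// 2. release latches here, then evaluate the resolution
		v, lock := readIntent(cache, in) // a concurrent reader
		fmt.Println(v, lock)
		cache.unpin(req.Txn) // 3. resolution finished
	}
}
```

The pin in this sketch stands for the requirement that the txn stays in the cache for as long as its intents may still be physically present in the engine; when and how the real cache would unpin is not specified in the issue and is an assumption here.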

Jira issue: CRDB-8269

kv: release latches before evaluation of intent resolution #66867