kv: pessimistic-mode, replicated read locks to enable large, long running transactions #52768
Is your feature request related to a problem? Please describe.
We've increasingly seen issues with unlimited retries of large, long-running transactions (see #51294, #44645). In some cases where there is no contention whatsoever, the solution to those problems has been to increase the kv.transaction.max_refresh_spans_bytes cluster setting. The work in 20.2 (#46275) to compress those spans should help in these zero-contention cases; however, it may introduce false dependencies.
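To make the false-dependency risk concrete, here is a minimal Go sketch of span condensing: once too many point reads are tracked, they collapse into a single covering span, which can cover keys that were never actually read. All names and the key format here are illustrative, not CockroachDB's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// condense collapses a set of point-read keys into one covering span once
// their count exceeds maxSpans, loosely mimicking refresh-span compression.
// The merged span may cover keys that were never read, so a concurrent
// write to such a key becomes a false dependency that fails the refresh.
func condense(keys []string, maxSpans int) (start, end string, condensed bool) {
	if len(keys) <= maxSpans {
		return "", "", false
	}
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	return sorted[0], sorted[len(sorted)-1], true
}

func main() {
	// Only /t/1/a, /t/1/c, and /t/1/e were read, but the condensed span
	// [/t/1/a, /t/1/e] also covers /t/1/b and /t/1/d.
	start, end, ok := condense([]string{"/t/1/a", "/t/1/e", "/t/1/c"}, 2)
	fmt.Println(ok, start, end)
}
```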
We've also noted that operations which have laid down intents over keys they read do not need to refresh those reads. However, this fact has not been actionable (and thus there is no code to subtract write keys from the refresh spans) because SQL reads have always been performed as scans rather than gets. This is changing soon as @helenmhe works on #46758, with a WIP at #52511.
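A minimal sketch of the subtraction described above, assuming reads are performed as gets over point keys: any key the transaction has already written, and so holds an intent on, can be dropped from the set that needs refreshing. The helper name and key format are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// subtractWrites removes point-read keys that the transaction has already
// written (and so holds intents on) from the set of keys that would need
// a read refresh when the transaction's timestamp is pushed.
func subtractWrites(readKeys, writtenKeys []string) []string {
	written := make(map[string]bool, len(writtenKeys))
	for _, k := range writtenKeys {
		written[k] = true
	}
	var needRefresh []string
	for _, k := range readKeys {
		if !written[k] {
			needRefresh = append(needRefresh, k)
		}
	}
	sort.Strings(needRefresh)
	return needRefresh
}

func main() {
	reads := []string{"/t/1/a", "/t/1/b", "/t/1/c"}
	writes := []string{"/t/1/b"} // intent already protects this read
	fmt.Println(subtractWrites(reads, writes))
}
```

Note that this only works for point reads; a scan's span cannot simply subtract the written keys inside it, which is why the gets work in #46758 makes this actionable.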
I suspect that the above will be rather helpful for unbounded DELETEs that read off of a secondary index, where there are no writes in the range being deleted but writes are interspersed in the primary index. In that case, the compression introduced in #46275 will prove problematic.
As we move towards an implementation of transactional schema changes, we are going to introduce operations which will, by their very nature, have their timestamps pushed. Furthermore, these transactions are likely to be extremely expensive to retry. However, they are unlikely to be latency-sensitive and thus might pair nicely with a mode that allows reads to push them but blocks contended writes.
Describe the solution you'd like
The solution I'd like to see is a transaction mode whereby all reads acquire a replicated, durable read lock over the spans they read. This read lock would eliminate the need to refresh reads when the transaction's timestamp is pushed. It might make sense to switch to this mode automatically when a transaction enters its second epoch.
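The proposed mode switch might look roughly like the following sketch: in the first epoch, reads are tracked as refresh spans; from the second epoch onward, they instead acquire replicated read locks, so a timestamp push no longer forces a refresh. All types and names here are illustrative, not CockroachDB internals.

```go
package main

import "fmt"

// txn sketches the proposed policy. Epochs are zero-indexed, so epoch >= 1
// means the transaction has already restarted at least once.
type txn struct {
	epoch        int
	refreshSpans []string // spans that must be re-validated on a push
	readLocks    []string // spans protected by replicated read locks
}

// read records a read span. In the proposed mode (second epoch onward),
// the span is protected by a replicated read lock instead of being
// tracked for refresh.
func (t *txn) read(span string) {
	if t.epoch >= 1 {
		t.readLocks = append(t.readLocks, span)
	} else {
		t.refreshSpans = append(t.refreshSpans, span)
	}
}

// needsRefreshOnPush reports whether a timestamp push forces the
// transaction to re-validate any of its reads.
func (t *txn) needsRefreshOnPush() bool {
	return len(t.refreshSpans) > 0
}

func main() {
	first := &txn{epoch: 0}
	first.read("/t/1/{a-z}")
	fmt.Println(first.needsRefreshOnPush()) // true: push requires a refresh

	second := &txn{epoch: 1}
	second.read("/t/1/{a-z}")
	fmt.Println(second.needsRefreshOnPush()) // false: the read lock protects the span
}
```

The trade-off, as described above, is that the read locks block contended writers while the transaction remains pushable by other reads.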
Describe alternatives you've considered
The primary alternative, in the context of transactional schema changes, is to accept that retries may happen in the face of contention and that clients need to deal with them.
Additional context
The concept of ranged read locks has long been blocked on the existence of a separated lock table. This seems to be possible in the 21.1 timeframe.
Jira issue: CRDB-3922