
kv: server-side refresh of non-WriteTooOld retry errors when replacing all reads with writes #91417

@nvb

Description


Read refreshes are the mechanism transactions use to validate that reads performed at ts1 are still valid at a later ts2. They are typically performed ahead of transaction commit, when a transaction has experienced read-write contention and needs to increase its WriteTimestamp beyond its initial ReadTimestamp.
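The validity rule behind a refresh can be sketched as follows. This is a deliberately simplified model, not CockroachDB's API: the `timestamp`, `write`, and `refreshValid` names are hypothetical. The idea is that a read at ts1 remains valid at ts2 only if no other transaction committed a write to one of the read keys in the interval (ts1, ts2].

```go
package main

import "fmt"

// timestamp and write are illustrative stand-ins for MVCC timestamps and
// committed versions; they are not CockroachDB types.
type timestamp int64

type write struct {
	key string
	ts  timestamp
}

// refreshValid reports whether reads over readKeys performed at ts1 can be
// refreshed to ts2, i.e. whether re-reading at ts2 would observe the same
// values it saw at ts1.
func refreshValid(readKeys []string, ts1, ts2 timestamp, committed []write) bool {
	inSpan := make(map[string]bool, len(readKeys))
	for _, k := range readKeys {
		inSpan[k] = true
	}
	for _, w := range committed {
		// A committed write to a read key inside the refresh window
		// (ts1, ts2] invalidates the refresh.
		if inSpan[w.key] && w.ts > ts1 && w.ts <= ts2 {
			return false
		}
	}
	return true
}

func main() {
	committed := []write{{key: "a", ts: 5}, {key: "b", ts: 15}}
	fmt.Println(refreshValid([]string{"a"}, 10, 20, committed)) // true: "a" last written before ts1
	fmt.Println(refreshValid([]string{"b"}, 10, 20, committed)) // false: "b" rewritten inside (ts1, ts2]
}
```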

CockroachDB contains an important optimization called "server-side refreshes", where a BatchRequest is granted permission by the txn client to implicitly refresh the transaction's ReadTimestamp if the batch happens to experience contention. This optimization avoids a sequence of communication between a KV client and server in some cases. More importantly, it preserves the possibility that a txn can hit the 1PC fast-path (commit and resolve locks in the same Raft batch) in certain cases.

Server-side refreshes are currently permitted only if a transaction has not yet performed any reads:

```go
// canForwardReadTimestampWithoutRefresh returns whether the transaction can
// forward its read timestamp without refreshing any read spans. This allows for
// the "server-side refresh" optimization, where batches are re-evaluated at a
// higher read-timestamp without returning to the transaction coordinator.
//
// This requires that the transaction has encountered no spans which require
// refreshing at the forwarded timestamp and that the transaction's timestamp
// has not leaked. If either of those conditions does not hold, a client-side
// refresh is required.
//
// Note that when deciding whether a transaction can be bumped to a particular
// timestamp, the transaction's deadline must also be taken into account.
func (sr *txnSpanRefresher) canForwardReadTimestampWithoutRefresh(txn *roachpb.Transaction) bool {
	return sr.canForwardReadTimestamp(txn) && !sr.refreshInvalid && sr.refreshFootprint.empty()
}
```

Proposal

We should extend this server-side refresh mechanism to support cases where a transaction has performed some reads.

Specifically, it should extend to cases where a transaction is replacing all of its prior reads with writes. In doing so, the writes serve as an implied refresh. If these writes have their timestamp pushed due to a read-write conflict but do not otherwise experience a WriteTooOld error, a server-side refresh should be allowed. On the other hand, a WriteTooOld error indicates a write-write conflict and implies that a refresh of the original reads would fail. In these cases, a server-side refresh should not be allowed.
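The proposed rule can be sketched as a predicate. All names here (`span`, `covered`, `canServerSideRefresh`) are hypothetical simplifications of the txnSpanRefresher logic, not actual APIs: a server-side refresh is permitted when every span the transaction has read is also written by the current batch (so the writes imply a refresh), and the batch observed only a timestamp push rather than a WriteTooOld error.

```go
package main

import "fmt"

// span is a stand-in for a key span in the transaction's refresh footprint.
type span string

// covered reports whether every read span is also present among the batch's
// write spans, i.e. whether the writes replace all prior reads.
func covered(reads, writes []span) bool {
	w := make(map[span]bool, len(writes))
	for _, s := range writes {
		w[s] = true
	}
	for _, r := range reads {
		if !w[r] {
			return false
		}
	}
	return true
}

// canServerSideRefresh mirrors the proposal: allow the implicit refresh iff
// all prior reads are replaced by writes in this batch and no write-write
// conflict (WriteTooOld) was observed.
func canServerSideRefresh(readSpans, batchWriteSpans []span, wroteTooOld bool) bool {
	return covered(readSpans, batchWriteSpans) && !wroteTooOld
}

func main() {
	// UPDATE-style read-modify-write on one key, timestamp pushed: allowed.
	fmt.Println(canServerSideRefresh([]span{"k"}, []span{"k"}, false)) // true
	// WriteTooOld implies the implied refresh would fail: not allowed.
	fmt.Println(canServerSideRefresh([]span{"k"}, []span{"k"}, true)) // false
	// A read not replaced by a write still needs a client-side refresh.
	fmt.Println(canServerSideRefresh([]span{"k", "j"}, []span{"k"}, false)) // false
}
```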

Impact

Several transaction patterns would benefit from this optimization. The one we care about most is a simple UPDATE statement that reads a row (GetRequest) and then updates its value (PutRequest). Today, if the PutRequest has its timestamp pushed, this operation performs a full client-side refresh. This is expensive and also prevents the statement from hitting the 1PC fast-path. Missing the 1PC fast-path adds work to the system and expands the contention footprint of the txn, degrading throughput under contention.
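The cost difference can be modeled as a toy comparison of today's behavior with the proposed one. The `outcome` and `runUpdate` names are illustrative only; this is not CockroachDB code, just a sketch of the two paths an UPDATE's Get+Put batch can take when its Put gets pushed.

```go
package main

import "fmt"

// outcome summarizes what happens to a single-statement UPDATE.
type outcome struct {
	clientRefresh bool // extra client<->server round trips for the refresh
	onePC         bool // commit and intent resolution in one Raft batch
}

// runUpdate models an UPDATE (Get then Put on the same key) whose Put may
// have its timestamp pushed by a conflicting read.
func runUpdate(pushed, serverSideRefreshForWrites bool) outcome {
	if !pushed {
		// No contention: the batch commits via the 1PC fast-path.
		return outcome{clientRefresh: false, onePC: true}
	}
	if serverSideRefreshForWrites {
		// Proposed: the Put replaces the Get's read, so the batch refreshes
		// server-side and still commits via the 1PC fast-path.
		return outcome{clientRefresh: false, onePC: true}
	}
	// Today: a full client-side refresh, and the 1PC fast-path is lost.
	return outcome{clientRefresh: true, onePC: false}
}

func main() {
	fmt.Println(runUpdate(true, false)) // {true false}: today, a push costs a refresh and 1PC
	fmt.Println(runUpdate(true, true))  // {false true}: proposed, a push costs nothing extra
}
```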

To protect an UPDATE from experiencing this behavior, we acquire an implicit FOR UPDATE lock on its initial row scan. This blocks non-locking reads and prevents them from reading the same key. As a result, it prevents them from bumping the timestamp cache "beneath" the read-modify-write sequence performed by the UPDATE statement, so the UPDATE doesn't need to refresh. However, blocking non-locking reads on locks also limits concurrency in the system, is a deviation from Postgres, and works against our proposal to eliminate blocking across write-read conflicts.

The quintessential example of such a workload is YCSB-A. Today, if we changed non-locking reads to ignore locks, we would see a large regression in YCSB-A. This optimization would avoid that regression. It would then let us make the locking changes proposed in the non-blocking write-read conflict doc, which would improve concurrency and increase throughput in YCSB-A.

Challenges

The key challenge here will be adding client-side code to detect cases where we can hit this optimization without that code being overly expensive. This change cannot cause a performance regression, especially for transactions that are not benefiting from the optimization.

Alternatives

This proposal is similar in spirit (but not in implementation) to the intuition that a transaction should be able to avoid a refresh if all of its reads are pessimistic (acquire locks) and all of its writes check that those read locks are still present before committing. The benefit of what we propose here is that the optimization applies to both pessimistic (SHARED-locking) and optimistic (non-locking) reads.

Jira issue: CRDB-21251

Epic CRDB-25218

Labels: A-kv-transactions (relating to MVCC and the transactional model), A-non-blocking-reads, C-enhancement, T-kv (KV Team)
