
kv: server-side refresh of non-WriteTooOld retry errors when replacing all reads with writes #91417

@nvb

Description


Read refreshes are the mechanism transactions use to validate that reads performed at ts1 are still valid at a later ts2. They are typically performed ahead of transaction commit, when a transaction has experienced read-write contention and needs to increase its WriteTimestamp beyond its initial ReadTimestamp.
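The validity rule behind a refresh can be sketched as follows. This is a deliberately simplified model, not CockroachDB's API: the `timestamp`, `write`, and `refreshValid` names are hypothetical. The idea is that a read at ts1 remains valid at ts2 only if no other transaction committed a write to one of the read keys in the interval (ts1, ts2].

```go
package main

import "fmt"

// timestamp and write are illustrative stand-ins for MVCC timestamps and
// committed versions; they are not CockroachDB types.
type timestamp int64

type write struct {
	key string
	ts  timestamp
}

// refreshValid reports whether reads over readKeys performed at ts1 can be
// refreshed to ts2, i.e. whether re-reading at ts2 would observe the same
// values it saw at ts1.
func refreshValid(readKeys []string, ts1, ts2 timestamp, committed []write) bool {
	inSpan := make(map[string]bool, len(readKeys))
	for _, k := range readKeys {
		inSpan[k] = true
	}
	for _, w := range committed {
		// A committed write to a read key inside the refresh window
		// (ts1, ts2] invalidates the refresh.
		if inSpan[w.key] && w.ts > ts1 && w.ts <= ts2 {
			return false
		}
	}
	return true
}

func main() {
	committed := []write{{key: "a", ts: 5}, {key: "b", ts: 15}}
	fmt.Println(refreshValid([]string{"a"}, 10, 20, committed)) // true: "a" last written before ts1
	fmt.Println(refreshValid([]string{"b"}, 10, 20, committed)) // false: "b" rewritten inside (ts1, ts2]
}
```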

CockroachDB contains an important optimization called "server-side refreshes", where a BatchRequest is granted permission by the txn client to implicitly refresh the transaction's ReadTimestamp if the batch happens to experience contention. This optimization avoids a sequence of communication between a KV client and server in some cases. More importantly, it preserves the possibility that a txn can hit the 1PC fast-path (commit and resolve locks in the same Raft batch) in certain cases.

Server-side refreshes are currently permitted only if a transaction has not yet performed any reads:

```go
// canForwardReadTimestampWithoutRefresh returns whether the transaction can
// forward its read timestamp without refreshing any read spans. This allows for
// the "server-side refresh" optimization, where batches are re-evaluated at a
// higher read-timestamp without returning to the transaction coordinator.
//
// This requires that the transaction has encountered no spans which require
// refreshing at the forwarded timestamp and that the transaction's timestamp
// has not leaked. If either of those conditions does not hold, a client-side
// refresh is required.
//
// Note that when deciding whether a transaction can be bumped to a particular
// timestamp, the transaction's deadline must also be taken into account.
func (sr *txnSpanRefresher) canForwardReadTimestampWithoutRefresh(txn *roachpb.Transaction) bool {
	return sr.canForwardReadTimestamp(txn) && !sr.refreshInvalid && sr.refreshFootprint.empty()
}
```

Proposal

We should extend this server-side refresh mechanism to support cases where a transaction has performed some reads.

Specifically, it should extend to cases where a transaction is replacing all of its prior reads with writes. In doing so, the writes serve as an implied refresh. If these writes have their timestamp pushed due to a read-write conflict but do not otherwise experience a WriteTooOld error, a server-side refresh should be allowed. On the other hand, a WriteTooOld error indicates a write-write conflict and implies that a refresh of the original reads would fail. In these cases, a server-side refresh should not be allowed.
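The proposed rule can be sketched as a predicate. All names here (`span`, `covered`, `canServerSideRefresh`) are hypothetical simplifications of the txnSpanRefresher logic, not actual APIs: a server-side refresh is permitted when every span the transaction has read is also written by the current batch (so the writes imply a refresh), and the batch observed only a timestamp push rather than a WriteTooOld error.

```go
package main

import "fmt"

// span is a stand-in for a key span in the transaction's refresh footprint.
type span string

// covered reports whether every read span is also present among the batch's
// write spans, i.e. whether the writes replace all prior reads.
func covered(reads, writes []span) bool {
	w := make(map[span]bool, len(writes))
	for _, s := range writes {
		w[s] = true
	}
	for _, r := range reads {
		if !w[r] {
			return false
		}
	}
	return true
}

// canServerSideRefresh mirrors the proposal: allow the implicit refresh iff
// all prior reads are replaced by writes in this batch and no write-write
// conflict (WriteTooOld) was observed.
func canServerSideRefresh(readSpans, batchWriteSpans []span, wroteTooOld bool) bool {
	return covered(readSpans, batchWriteSpans) && !wroteTooOld
}

func main() {
	// UPDATE-style read-modify-write on one key, timestamp pushed: allowed.
	fmt.Println(canServerSideRefresh([]span{"k"}, []span{"k"}, false)) // true
	// WriteTooOld implies the implied refresh would fail: not allowed.
	fmt.Println(canServerSideRefresh([]span{"k"}, []span{"k"}, true)) // false
	// A read not replaced by a write still needs a client-side refresh.
	fmt.Println(canServerSideRefresh([]span{"k", "j"}, []span{"k"}, false)) // false
}
```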

Impact

Several transaction patterns would benefit from this optimization. The one we care about most is a simple UPDATE statement that reads a row (GetRequest) and then updates its value (PutRequest). Today, if the PutRequest has its timestamp pushed, this operation performs a full client-side refresh. This is expensive and also prevents the statement from hitting the 1PC fast-path. Missing the 1PC fast-path adds work to the system and expands the contention footprint of the txn, degrading throughput under contention.
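The cost difference can be modeled as a toy comparison of today's behavior with the proposed one. The `outcome` and `runUpdate` names are illustrative only; this is not CockroachDB code, just a sketch of the two paths an UPDATE's Get+Put batch can take when its Put gets pushed.

```go
package main

import "fmt"

// outcome summarizes what happens to a single-statement UPDATE.
type outcome struct {
	clientRefresh bool // extra client<->server round trips for the refresh
	onePC         bool // commit and intent resolution in one Raft batch
}

// runUpdate models an UPDATE (Get then Put on the same key) whose Put may
// have its timestamp pushed by a conflicting read.
func runUpdate(pushed, serverSideRefreshForWrites bool) outcome {
	if !pushed {
		// No contention: the batch commits via the 1PC fast-path.
		return outcome{clientRefresh: false, onePC: true}
	}
	if serverSideRefreshForWrites {
		// Proposed: the Put replaces the Get's read, so the batch refreshes
		// server-side and still commits via the 1PC fast-path.
		return outcome{clientRefresh: false, onePC: true}
	}
	// Today: a full client-side refresh, and the 1PC fast-path is lost.
	return outcome{clientRefresh: true, onePC: false}
}

func main() {
	fmt.Println(runUpdate(true, false)) // {true false}: today, a push costs a refresh and 1PC
	fmt.Println(runUpdate(true, true))  // {false true}: proposed, a push costs nothing extra
}
```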

To protect an UPDATE from experiencing this behavior, we acquire an implicit FOR UPDATE lock on its initial row scan. This blocks non-locking reads and prevents them from reading the same key. As a result, it prevents them from bumping the timestamp cache "beneath" the read-modify-write sequence performed by the UPDATE statement, so the UPDATE doesn't need to refresh. However, blocking non-locking reads on locks also limits concurrency in the system, is a deviation from Postgres, and works against our proposal to eliminate blocking across write-read conflicts.

The quintessential example of such a workload is YCSB-A. Today, if we changed non-locking reads to ignore locks, we would see a large regression in YCSB-A. This optimization would avoid that regression. It would then let us make the locking changes proposed in the non-blocking write-read conflict doc, which would improve concurrency and increase throughput in YCSB-A.

Challenges

The key challenge here will be adding client-side code to detect cases where we can hit this optimization without that code being overly expensive. This change cannot cause a performance regression, especially for transactions that are not benefiting from the optimization.

Alternatives

This proposal is similar in spirit (but not in implementation) to the intuition that a transaction should be able to avoid a refresh if all of its reads are pessimistic (acquire locks) and all of its writes check that those read locks are still present before committing. The benefit of what we propose here is that the optimization applies to both pessimistic (SHARED-locking) and optimistic (non-locking) reads.

Jira issue: CRDB-21251

Epic CRDB-25218

Labels: A-kv-transactions (relating to MVCC and the transactional model), A-non-blocking-reads, C-enhancement, T-kv (KV Team)
