kv: resolving intents from transactions that have been successfully PUSH_TIMESTAMP-ed is O(num intents) #103126

@arulajmani

Describe the problem

Higher priority readers in CRDB are able to push lower priority writers above their timestamp, which allows readers to proceed without conflicting with the writer. However, if the writer has written intents at the lower timestamp, the reader needs to resolve them (i.e., move them to the higher timestamp) before it can proceed with its scan.

Today, this process is O(num intents) -- the reader pushes the writer every time it discovers a conflicting intent and resolves that intent before proceeding to the next one. This happens here:

return w.ir.ResolveIntent(ctx, resolve, opts)
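
To make the cost concrete, here is a minimal sketch of the per-intent pattern described above. It is a toy model, not CRDB code: the `intent` and `costs` types and the `scanPerIntent` function are hypothetical names introduced only for illustration, and each counter increment stands in for one round trip.

```go
package main

// intent is a hypothetical stand-in for a write intent: a key plus the ID of
// the transaction that wrote it.
type intent struct {
	key   string
	txnID string
}

// costs counts the requests a reader issues while scanning.
type costs struct {
	pushes   int
	resolves int
}

// scanPerIntent models the behaviour described above: every conflicting
// intent triggers one push of its writer, followed by one resolution of that
// single intent, before the scan moves on to the next intent.
func scanPerIntent(intents []intent) costs {
	var c costs
	for range intents {
		c.pushes++   // push the writer above the reader's timestamp
		c.resolves++ // move just this one intent to the pushed timestamp
	}
	return c
}
```

In this model, calling `scanPerIntent` on 1000 intents from a single writer yields 1000 pushes and 1000 resolutions, which is the O(num intents) behaviour this issue is about.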

To Reproduce

-- session 1
CREATE TABLE keys (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys SELECT generate_series(1, 1000);

-- session 2
BEGIN PRIORITY HIGH; SELECT count(*) FROM keys;

-- takes ~7ms per intent

Proposed solution

Prior to #49218, this problem existed for finalized (committed, but more notably, aborted) transactions as well. That patch introduced the finalizedTxnCache, which is populated here:

// If the transaction is finalized, add it to the finalizedTxnCache. This
// avoids needing to push it again if we find another one of its locks and
// allows for batching of intent resolution.
if pusheeTxn.Status.IsFinalized() {
	w.lt.TransactionIsFinalized(pusheeTxn)
}

This ensures that a subsequent re-scan of the lock table (post intent resolution) does not have to push the same transaction again to recognize that it is finalized. Instead, it can just collect intents from the finalized transaction and batch resolve them in one go. This means that intent resolution is O(num ranges) for finalized transactions instead of O(num intents)[*].

We should extend this concept to transactions that are known to have been pushed to a higher timestamp as well. This would allow high priority readers to collect and batch resolve intents in a similar fashion.

[*] Assuming no async intent resolution and that the reader's read set includes all intents written by the writer.
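
As a rough illustration of the proposal, the sketch below models a cache of pushed transactions alongside per-range batching. Everything in it is an assumption made for the example: `lockedKey`, `pushedTxnCache`, `scanBatched`, the integer timestamps, and the `push` callback are hypothetical and do not correspond to the actual lock table or intent resolver APIs. Timestamps are assumed to be positive so that a zero value in the cache means "not known to be pushed".

```go
package main

// lockedKey is a hypothetical stand-in for an intent found during a scan.
type lockedKey struct {
	key     string
	rangeID int
	txnID   string
	ts      int64 // timestamp the intent is currently written at
}

// pushedTxnCache remembers the highest timestamp each transaction is known
// to have been pushed to. This is an illustrative analogue of the
// finalizedTxnCache, not an existing CRDB type.
type pushedTxnCache map[string]int64

// record keeps the larger of the cached and newly observed timestamps.
func (c pushedTxnCache) record(txnID string, pushedTS int64) {
	if pushedTS > c[txnID] {
		c[txnID] = pushedTS
	}
}

// scanBatched pushes a writer only when the cache does not already show it
// above readTS, and groups conflicting intents by range so that each range
// can be resolved with a single batch. It returns the number of pushes
// issued and the keys to resolve per range.
func scanBatched(
	intents []lockedKey, readTS int64, cache pushedTxnCache,
	push func(txnID string) int64,
) (int, map[int][]string) {
	pushes := 0
	batches := make(map[int][]string)
	for _, in := range intents {
		if in.ts > readTS {
			continue // intent is already above the read timestamp
		}
		if cache[in.txnID] <= readTS {
			// Not known to be pushed high enough: push once and remember it.
			cache.record(in.txnID, push(in.txnID))
			pushes++
		}
		batches[in.rangeID] = append(batches[in.rangeID], in.key)
	}
	return pushes, batches
}
```

With 1000 intents from one writer spread over two ranges, this model issues one push and produces two resolution batches, so the work scales with the number of ranges rather than the number of intents, under the same assumptions as the footnote above.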

Additional context

Notably, this impacts backups, which eventually run high priority ExportRequests so that they aren't starved by concurrent writers. However, if a writer is writing a large enough number of intents, the backup can still be starved.

cc @nvanbenschoten @adityamaru

Jira issue: CRDB-27844

Labels: A-kv-transactions, C-bug, O-support, T-kv, v23.1.9
