storage: replayed large transactions must serially resolve each intent #36876

@nvb

Description

Extracted from #18684 (comment).

The workload is running workload init tpcc, so it's bulk ingesting a large amount of data. The specific queries being run are INSERT INTO ... VALUES ... statements with 1000 rows each. What we see in the trace of this specific statement is that it sends its InitPut requests to ranges serially, because we blow the DistSender concurrency limit out of the water while bulk loading. We then see the first InitPut hit a conflicting intent. It checks the intent's txn, finds that it is aborted, resolves the intent, then writes. Harmless enough, although the whole process takes around 200ms. Then the next InitPut hits a conflicting intent. Same story. Then all 998 other writes in the statement hit intents from the exact same aborted transaction. We must be retrying the same statement after an earlier attempt was aborted, and that attempt laid down intents on all 1000 keys but failed to clean them up!
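
To make the cost concrete, here is a minimal sketch in Go that contrasts resolving each intent as it is discovered with grouping intents by their owning transaction first. The Intent type and the groupByTxn helper are hypothetical names invented for this illustration; they are not CockroachDB's types or its resolution code, and the sketch only counts resolution rounds rather than modeling the KV layer.

```go
package main

import "fmt"

// Intent is a hypothetical, simplified stand-in for a provisional write
// left behind by a transaction. It is not CockroachDB's actual type.
type Intent struct {
	Key   string
	TxnID string
}

// groupByTxn buckets intent keys by the transaction that wrote them, so
// that a single status check per transaction could cover all of its keys.
func groupByTxn(intents []Intent) map[string][]string {
	byTxn := make(map[string][]string)
	for _, in := range intents {
		byTxn[in.TxnID] = append(byTxn[in.TxnID], in.Key)
	}
	return byTxn
}

func main() {
	// Model the scenario in the trace: 1000 keys, all carrying intents
	// from the same aborted transaction.
	intents := make([]Intent, 0, 1000)
	for i := 0; i < 1000; i++ {
		intents = append(intents, Intent{
			Key:   fmt.Sprintf("k%04d", i),
			TxnID: "aborted-txn",
		})
	}

	// Serial handling: one check-and-resolve round per conflicting write.
	serialRounds := len(intents)

	// Grouped handling: one round per distinct owning transaction.
	groupedRounds := len(groupByTxn(intents))

	fmt.Printf("serial rounds: %d, grouped rounds: %d\n", serialRounds, groupedRounds)
}
```

Under the assumption that every round costs roughly the 200ms seen for the first InitPut, 1000 serial rounds add up to something on the order of 200 seconds for one statement, whereas grouping by transaction would need a single round in this scenario.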

Jira issue: CRDB-4478

Metadata

Assignees

No one assigned

    Labels

    A-kv-transactions: Relating to MVCC and the transactional model.
    C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
    T-kv: KV Team
    X-stale
    no-issue-activity
