storage: replayed large transactions must serially resolve each intent #36876
Description
Extracted from #18684 (comment).
The workload is running workload init tpcc, so it's bulk ingesting a large amount of data. The specific queries being run are INSERT INTO ... VALUES ... statements with 1000 rows each. What we see happen in the trace of this specific statement is that it divides up its InitPut requests across ranges serially, because we blow the DistSender concurrency limit out of the water while bulk loading. We then see the first InitPut hit a conflicting intent. It checks the intent's txn, finds that it is aborted, resolves the intent, then writes. Harmless enough, although we do see that the whole process takes around 200ms. Then the next InitPut hits a conflicting intent. Same story. Then all 998 other writes in the statement hit intents from the exact same aborted transaction. We must be retrying the same statement after an earlier attempt was aborted: that attempt laid down intents on all 1000 keys but failed to clean them up!
Jira issue: CRDB-4478