Skip to content

kv: READ_UNCOMMITTED scan observes WriteIntentError, hits assertion #47219

@nvb

Description

@nvb

In #47120, we saw instances of the following assertion fire:

case *roachpb.WriteIntentError:
// Drop latches, but retain lock wait-queues.
g.AssertLatches()

At this point in the executeBatchWithConcurrencyRetries retry loop, we expect any request that hits a WriteIntentError to be holding latches. The assertion was firing because a meta2 scan, which uses the READ_UNCOMMITTED read consistency level and therefore does not acquire latches, was hitting a WriteIntentError. This should not be possible, as the MVCC layer is extra careful to never return this form of error to inconsistent reads.

Further debugging revealed that the WriteIntentError was actually coming from the ScanRequest's call into CollectIntentRows . CollectIntentRows is called to look up the provisional value associated with each intent observed by a READ_UNCOMMITTED scan. This is critical to avoid deadlock during range splits, where a range lookup may need to consider a meta2 intent to be the source of truth in terms of range addressing in order to resolve that intent.

It's completely unexpected that CollectIntentRows would be returning a WriteIntentError, because it uses MVCCGetAsTxn to lookup the provisional values of intents that we just saw during the scan. However, in the failure, we saw that the WriteIntentError was for a different transaction than the one responsible for the original intent. How could this be? Shouldn't we be working off of a consistent engine snapshot in MVCCScan?

Well, no. Read-only requests use a "read-only engine" returned from Engine.NewReadOnly. This engine by itself does not guarantee a consistent snapshot across all uses. The read-only engine contains two cached iterators, one for normal iteration and one for prefix iteration. Each iterator has an implicit snapshot of the DB, but those can be slightly different snapshots. In this case, the MVCCScan used a non-prefix iterator and the MVCCGetAsTxn used a prefix iterator. And since the READ_UNCOMMITTED scan wasn't holding latches, it had no guarantee of isolation within its keyspan across its two iterators.

So to summarize what happened here:

  • a meta2 READ_UNCOMMITTED came in and did not acquire latches
  • is scanned using a non-prefix iterator and hit an intent
  • it passed this intent to CollectIntentRows
  • this used a different prefix iterator and hit a different intent
  • some other write must have slipped in and resolved and replaced the first intent with the second intent between the two iterator creations
  • the scan returned a WriteIntentError to the concurrency manager
  • the assertion blew up

Metadata

Metadata

Assignees

Labels

A-kv-transactionsRelating to MVCC and the transactional model.C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.release-blockerIndicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions