
storage: 4.5 seconds after a cluster creation is a bad time to run your transactions #32495

@andreimatei

A transaction running around 4.5s after cluster creation (think tests) can easily catch a TransactionAbortedError. This is because all ranges start out with expiration-based leases (likely a path dependency / accident: they all descend from the first range, and splits keep the original expiration-based lease on both sides). 4.5s later, these leases become eligible for a refresh. The new leases are generally epoch-based, so they are not Equivalent() to the old ones. Each new lease therefore gets a new Sequence, which in turn causes the lease acquisition to trigger the code that resets the timestamp cache.

After that timestamp cache wipe, a concurrent BeginTxn can fail its tscache check, resulting in TransactionAbortedError(ABORT_REASON_TIMESTAMP_CACHE_REJECTED_POSSIBLE_REPLAY).
I believe we've seen this be a cause of flakiness for multiple tests.
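
To make the chain of events concrete, here is a toy model in Go. It is not the actual storage code: `Lease`, `Replica`, `ApplyLease`, and `CanBeginTxn` are hypothetical stand-ins, timestamps are plain millisecond integers, and the timestamp cache is reduced to a single low-water mark. Under those assumptions it shows how a same-holder refresh from an expiration-based to an epoch-based lease bumps the Sequence, resets the cache, and rejects a transaction that started just before the refresh.

```go
package main

import "fmt"

// LeaseType distinguishes the two lease flavors discussed above.
type LeaseType int

const (
	Expiration LeaseType = iota
	Epoch
)

// Lease is a simplified stand-in for the real lease proto (hypothetical).
type Lease struct {
	Holder   int
	Type     LeaseType
	Sequence int
}

// Equivalent models the check in simplified form: two leases are
// equivalent only if they have the same holder and the same type.
func (l Lease) Equivalent(o Lease) bool {
	return l.Holder == o.Holder && l.Type == o.Type
}

// Replica tracks the current lease and a timestamp cache low-water mark.
type Replica struct {
	lease    Lease
	lowWater int64
}

// ApplyLease installs next at time now. An equivalent lease keeps the old
// Sequence; a non-equivalent one gets a new Sequence and resets the
// timestamp cache by raising the low-water mark to now.
func (r *Replica) ApplyLease(next Lease, now int64) {
	if r.lease.Equivalent(next) {
		next.Sequence = r.lease.Sequence
	} else {
		next.Sequence = r.lease.Sequence + 1
		r.lowWater = now
	}
	r.lease = next
}

// CanBeginTxn models the tscache check: a transaction whose timestamp is
// at or below the low-water mark is rejected as a possible replay.
func (r *Replica) CanBeginTxn(txnTS int64) bool {
	return txnTS > r.lowWater
}

func main() {
	// The range starts with an expiration-based lease held by node 1.
	r := &Replica{lease: Lease{Holder: 1, Type: Expiration, Sequence: 1}}

	// At t=4500ms the same node refreshes the lease as epoch-based.
	r.ApplyLease(Lease{Holder: 1, Type: Epoch}, 4500)

	// A transaction that picked its timestamp at t=4400ms now fails.
	fmt.Println(r.lease.Sequence, r.CanBeginTxn(4400)) // prints "2 false"
}
```

Note that the holder never changes in this run; the rejection comes purely from the lease type change.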

Discussing with @bdarnell, it seems that we have a couple of options:

  1. if the rhs of a split wants epoch-based leases, have it not inherit the expiration-based lease from the lhs. Exactly how to do that is still unclear. Can a range not have a lease at all? Perhaps we can give the rhs an expired lease.
  2. make Lease.Equivalent() understand this transition from expiration-based to epoch-based, and have it consider the two leases equivalent.
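
A rough sketch of what option 2 could look like, again on a toy model rather than the real `Lease.Equivalent()`. The names `strictEquivalent` and `relaxedEquivalent` are hypothetical, and the sketch assumes that only the same-holder upgrade from expiration-based to epoch-based should count as equivalent. Whether the real check would need further conditions (for example, on the time span each lease covers) is left open here.

```go
package main

import "fmt"

// LeaseType distinguishes expiration-based from epoch-based leases.
type LeaseType int

const (
	Expiration LeaseType = iota
	Epoch
)

// Lease is a simplified stand-in for the real lease proto (hypothetical).
type Lease struct {
	Holder int
	Type   LeaseType
}

// strictEquivalent models the behavior described in this issue:
// a change of lease type is never equivalent.
func strictEquivalent(old, next Lease) bool {
	return old.Holder == next.Holder && old.Type == next.Type
}

// relaxedEquivalent sketches option 2: the same holder upgrading from an
// expiration-based lease to an epoch-based one is also treated as
// equivalent, so no new Sequence and no timestamp cache reset would follow.
func relaxedEquivalent(old, next Lease) bool {
	if old.Holder != next.Holder {
		return false
	}
	if old.Type == next.Type {
		return true
	}
	return old.Type == Expiration && next.Type == Epoch
}

func main() {
	old := Lease{Holder: 1, Type: Expiration}
	next := Lease{Holder: 1, Type: Epoch}

	fmt.Println(strictEquivalent(old, next))  // prints "false"
	fmt.Println(relaxedEquivalent(old, next)) // prints "true"

	// A different holder is still not equivalent under the relaxed rule.
	fmt.Println(relaxedEquivalent(old, Lease{Holder: 2, Type: Epoch})) // prints "false"
}
```

The relaxation is deliberately one-directional in this sketch: going from epoch-based back to expiration-based is still treated as a non-equivalent lease.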

cc @tbg @benesch

Labels

A-kv-replication: Relating to Raft, consensus, and coordination.
C-test-failure: Broken test (automatically or manually discovered).
