-
Notifications
You must be signed in to change notification settings - Fork 4.1k
kv: DeleteRange anomaly at SNAPSHOT #6240
Description
TestTxnDBLostUpdateAnomaly_DeleteRange currently fails with variations of
iso=[SNAPSHOT SNAPSHOT SNAPSHOT], pri=[3 2 1]
I1.1(A)[1] C1.1 DR3.1(A-C) R2.1(A)[1] C3.1 W2.1(B-A)[1] C2.1
and
iso=[SNAPSHOT SNAPSHOT SNAPSHOT], pri=[2 3 1]
I1.1(A)[1] C1.1 R2.1(A)[1] DR3.1(A-C) W2.1(B-A)[1] C3.1 C2.1
In the first
- the value A=1 is written and committed
- transaction 3 deletes A-C (which means leaving an intent on A and
populating the write timestamp cache interval A-C. - transaction 2 pushes the intent on A and reads A=1.
The write timestamp cache entry on A isn't updated properly in the process
(which is an unrelated issue that likely incurs anomalies). - transaction 2 writes B=1, commits.
This is the anomaly: logically, there is DeleteRange intent sitting there
with a higher timestamp, so this should be a write-write conflict. But there
isn't an actual intent there (since the key was previously vacant, just like
the countably infinite rest of vacant keys around it). - transaction 3 commits (at its higher timestamp), so now A=,
but note that we still have B=1 (written by transaction 2).
Note that this example should fail in the same way with only transaction 3 at
SNAPSHOT isolation.
The one obvious fix that comes to mind is to "somehow" push the whole range
entry in the timestamp cache forward along with the push of one of its intents.
However, that seems brittle and may not reliably be possible since the
timestamp cache's current implementation freely splits up its intervals as new
entries are added.
In the second one, the read of A and the range deletion are interchanged:
iso=[SNAPSHOT SNAPSHOT SNAPSHOT], pri=[2 3 1]
I1.1(A)[1] C1.1 R2.1(A)[1] DR3.1(A-C) W2.1(B-A)[1] C3.1 C2.1
In other words:
- write and commit A=1 as before
- txn2 reads A=1
- txn3 deletes A-C (so
ts(txn3) > ts(txn2)) - txn2 writes B=1 (pushing its timestamp due to tsCache entry on A-C); now
againts(txn3) < ts(txn2) - both commit and the visible values are A=, B=1; we've again lost txn3's
deletion of B.
This seems a little less anomalous than the above since essentially what
happens is that txn2 has write skew: It reads a stale value of A and writes
it somewhere, except that this "somewhere" would be a write-write conflict in
most monolithic databases.
See #6227 for the PR that introduced these tests.