-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: any DELETE upgrades a SNAPSHOT txn to SERIALIZABLE #19876
Description
At the KV level, any DelRange request sets the RetryOnPush field on the txn, causing the transaction to return a retriable error if it is pushed (reason RETRY_DELETE_RANGE). For SNAPSHOT txns that means that they'll behave like SERIALIZABLE txns. Why? See [1].
SQL uses DelRange for deleting rows, even for deleting a single row with a known PK. So, as soon as a DELETE statement is used in a SNAPSHOT txn, the point of the SNAPSHOT txn goes out the window. Actually, it's even worse, as SQL does not perform eager retry for these pushed SNAPSHOT txns like it does for pushed SERIALIZABLE txns.
I believe SQL does not need to use DelRange. It could explicitly Delete every column family for the row(s) in question. It's not entirely clear to me if we'd need to Delete every col fam that could possibly exist or only the ones that actually exist (col fams with only NULL vals are not actually present in KV).
A concern that @danhhz brought up was about deletions performed during a schema change that removes a column family -> if we were to explicitly delete col fams, would the col fam in question be left as garbage and not cleaned up properly because the descriptor used for the ? I hope this is FUD and everything works fine due to the "delete_only" state that a col fam goes through during the schema change. But I haven't really thought about it.
[1] Details about why these txns need to restart can be seen in #6240. Essentially, since a DelRange only lays down intents for keys that exist, if the txn is pushed later on, it's possible for another txn to write a key that should be deleted but isn't, which is analogous to a lost update (I think), which is not permitted under snapshot isolation.
cc @vivekmenezes @tschottdorf @bdarnell
Vivek, this came about because some people are complaining about seeing these RETRY_DELETE_RANGE restarts in SNAPSHOT txns, and I feel for them. The point of SNAPSHOT is quite severely put in question by this fact, and so I think we should do something about it.