-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: any DELETE upgrades a SNAPSHOT txn to SERIALIZABLE #19876
Copy link
Copy link
Closed
Labels
C-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)C-performancePerf of queries or internals. Solution not expected to change functional behavior.Perf of queries or internals. Solution not expected to change functional behavior.O-communityOriginated from the communityOriginated from the community
Milestone
Metadata
Metadata
Assignees
Labels
C-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)C-performancePerf of queries or internals. Solution not expected to change functional behavior.Perf of queries or internals. Solution not expected to change functional behavior.O-communityOriginated from the communityOriginated from the community
At the KV level, any
DelRangerequest sets theRetryOnPushfield on the txn, causing the transaction to return a retriable error if it is pushed (reasonRETRY_DELETE_RANGE). For SNAPSHOT txns that means that they'll behave like SERIALIZABLE txns. Why? See [1].SQL uses
DelRangefor deleting rows, even for deleting a single row with a known PK. So, as soon as a DELETE statement is used in a SNAPSHOT txn, the point of the SNAPSHOT txn goes out the window. Actually, it's even worse, as SQL does not perform eager retry for these pushed SNAPSHOT txns like it does for pushed SERIALIZABLE txns.I believe SQL does not need to use
DelRange. It could explicitlyDeleteevery column family for the row(s) in question. It's not entirely clear to me if we'd need toDeleteevery col fam that could possibly exist or only the ones that actually exist (col fams with only NULL vals are not actually present in KV).A concern that @danhhz brought up was about deletions performed during a schema change that removes a column family -> if we were to explicitly delete col fams, would the col fam in question be left as garbage and not cleaned up properly because the descriptor used for the ? I hope this is FUD and everything works fine due to the "delete_only" state that a col fam goes through during the schema change. But I haven't really thought about it.
[1] Details about why these txns need to restart can be seen in #6240. Essentially, since a
DelRangeonly lays down intents for keys that exist, if the txn is pushed later on, it's possible for another txn to write a key that should be deleted but isn't, which is analogous to a lost update (I think), which is not permitted under snapshot isolation.cc @vivekmenezes @tschottdorf @bdarnell
Vivek, this came about because some people are complaining about seeing these
RETRY_DELETE_RANGErestarts in SNAPSHOT txns, and I feel for them. The point of SNAPSHOT is quite severely put in question by this fact, and so I think we should do something about it.