storage: introduce an "ignore list" for seqnums in MVCC reads #41612
Closed
Labels
A-storage: Relating to our storage engine (Pebble) on-disk storage.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Description
Required for SQL savepoints, as discussed in #41569.
To support partial txn rollbacks (savepoint rollbacks), MVCC reads need to skip over values that were written earlier at seqnums that have since been rolled back.
Today, the MVCC read logic already skips all values written after a specific seqnum, namely the one stored in the txn meta proto (Sequence).
We want to extend the read logic to also skip over values written at seqnums that are part of an "ignore list":
- The txn meta proto is extended with a new "ignore list" field, which contains a set (ordered if need be) of seqnum ranges to skip over.
- During MVCC reads, when looking up a value "at" a particular seqnum, we extract the list of all writes before that seqnum. Then, to find the most recent write, we iterate over that list in reverse order and stop at the first write whose seqnum is not part of the ignore list.
- During intent resolution, the same logic should be applied.
This logic should be available for both the RocksDB and Pebble engines.
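The read-side steps in the list above could be sketched roughly as follows. This is only an illustration, not the actual storage-layer code: `SeqRange`, `SeqWrite`, `isIgnored`, and `visibleWrite` are hypothetical names invented here, ranges are assumed to be closed (both ends inclusive), and a write at exactly the read seqnum is assumed to be visible.

```go
package main

// SeqRange is a hypothetical closed range [Start, End] of rolled-back seqnums.
// The real proto field layout is not specified in this issue.
type SeqRange struct {
	Start, End int32
}

// SeqWrite is a hypothetical record of one write a txn made to a key.
type SeqWrite struct {
	Seq   int32
	Value string
}

// isIgnored reports whether seq falls inside any range of the ignore list.
// A linear scan is used for clarity; an ordered list would allow binary search.
func isIgnored(seq int32, ignored []SeqRange) bool {
	for _, r := range ignored {
		if seq >= r.Start && seq <= r.End {
			return true
		}
	}
	return false
}

// visibleWrite returns the most recent write at or before readSeq whose
// seqnum is not in the ignore list. writes must be sorted by ascending Seq.
// The boolean is false when every candidate write was rolled back or is
// newer than readSeq.
func visibleWrite(writes []SeqWrite, readSeq int32, ignored []SeqRange) (SeqWrite, bool) {
	// Walk from newest to oldest and stop at the first acceptable write.
	for i := len(writes) - 1; i >= 0; i-- {
		w := writes[i]
		if w.Seq > readSeq {
			continue // existing behavior: skip writes after the read seqnum
		}
		if isIgnored(w.Seq, ignored) {
			continue // new behavior: skip writes at rolled-back seqnums
		}
		return w, true
	}
	return SeqWrite{}, false
}
```

For example, with writes at seqnums 1 through 4 and an ignore list containing the single range [3, 4], a read at seqnum 4 would return the write at seqnum 2. If the ignore list covered every write, the lookup would report that no value from this txn is visible, and the caller would presumably fall back to the previously committed value.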