Description
Bug Report
Currently, when TiKV runs `UnsafeDestroyRange`, its first step is to run `delete_files_in_range` on all CFs. However, doing this on the lock CF is not appropriate. The `delete_files_in_range` operation is unsafe: it can make keys that were already deleted show up again. The reason is that `delete_files_in_range` physically drops SST files that lie completely within the specified range. If an SST file in an upper level that contains a tombstone for a key is deleted while an SST file in a lower level that still contains the key is kept, the key may become visible again. That's one of the reasons why we call the API unsafe, and why we need to guarantee that the range won't be accessed again once it's destroyed.
However, we currently don't have enough protection to prevent this kind of access.
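The resurrection effect can be illustrated with a toy model. The sketch below is not RocksDB or TiKV code: `ToyLsm`, `SstFile`, and `build_example` are hypothetical names, and each "file" is just a sorted map in which `None` stands for a tombstone. It only assumes what the description says, namely that files fully inside the range are dropped and partially overlapping files are kept.

```rust
use std::collections::BTreeMap;

/// One toy SST file: sorted keys mapped to `Some(value)` or `None` (a tombstone).
type SstFile = BTreeMap<&'static str, Option<&'static str>>;

/// A toy LSM tree (hypothetical model): index 0 is the newest (upper) level.
struct ToyLsm {
    levels: Vec<Vec<SstFile>>,
}

impl ToyLsm {
    /// Reads top-down; the first entry found wins, so a tombstone hides older data.
    fn get(&self, key: &str) -> Option<&'static str> {
        for level in &self.levels {
            for file in level {
                if let Some(entry) = file.get(key) {
                    return *entry;
                }
            }
        }
        None
    }

    /// Drops every file whose keys all fall inside `[start, end)`.
    /// Files that only partially overlap the range are kept untouched.
    fn delete_files_in_range(&mut self, start: &str, end: &str) {
        for level in &mut self.levels {
            level.retain(|file| {
                let fully_inside = file.keys().all(|k| *k >= start && *k < end);
                !fully_inside
            });
        }
    }
}

/// Builds a two-level tree in which "k1" is deleted by a tombstone in the upper level.
fn build_example() -> ToyLsm {
    // Upper level: the tombstone for "k1" sits in a file entirely inside ["k0", "k2").
    let upper: SstFile = BTreeMap::from([("k1", None)]);
    // Lower level: the stale entry for "k1" shares a file with "k9",
    // so this file is not completely inside the range and survives.
    let lower: SstFile = BTreeMap::from([("k1", Some("lock")), ("k9", Some("other"))]);
    ToyLsm { levels: vec![vec![upper], vec![lower]] }
}
```

Before the call, `get("k1")` returns `None` because the tombstone shadows the older entry. After `delete_files_in_range("k0", "k2")`, only the upper file is dropped, and the stale entry in the lower file becomes readable again.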
Consider this case:
- Transaction $T$ writes keys `"k1"` and `"k5"`, and `"k1"` is the primary.
- $T$ commits the primary `"k1"`, but its commit-secondary step is terminated abnormally, leaving the lock on `"k5"` uncommitted. This doesn't affect the transaction's committed state, as the primary is committed.
- GC performs `UnsafeDestroyRange` on the range `"k0"` to `"k2"`, which causes the lock on `"k1"` to show up again.
- Another transaction $T_2$ meets the lock on `"k5"` and tries to resolve it. It then finds the primary lock on `"k1"` and rolls it back. As a result, transaction $T$ loses its durability.
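The scenario above can be replayed in a heavily simplified model. This is a sketch under stated assumptions, not TiKV's actual lock-resolution code: `ToyStore`, `Outcome`, `resolve_secondary`, and `run_scenario` are hypothetical names, timestamps and TTLs are ignored, and a visible primary lock is always treated as expired and rolled back, as the scenario describes.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq, Clone, Copy)]
enum Outcome {
    Committed,
    RolledBack,
}

/// Hypothetical stand-in for a store with a lock CF and a write CF.
struct ToyStore {
    /// Keys that currently have a visible lock.
    locks: HashSet<&'static str>,
    /// Final status recorded per key.
    writes: HashMap<&'static str, Outcome>,
}

impl ToyStore {
    /// Resolves the lock on `secondary` by consulting the state of `primary`.
    fn resolve_secondary(&mut self, secondary: &'static str, primary: &'static str) -> Outcome {
        let outcome = if self.locks.remove(primary) {
            // Assumption: a visible primary lock means "not committed yet",
            // so the resolver rolls the primary back.
            self.writes.insert(primary, Outcome::RolledBack);
            Outcome::RolledBack
        } else {
            // No primary lock: follow whatever status was recorded for the primary.
            self.writes.get(primary).copied().unwrap_or(Outcome::RolledBack)
        };
        self.locks.remove(secondary);
        self.writes.insert(secondary, outcome);
        outcome
    }
}

/// Replays the scenario; `lock_resurrected` models the effect of
/// `delete_files_in_range` bringing the lock on "k1" back.
fn run_scenario(lock_resurrected: bool) -> Outcome {
    let mut store = ToyStore { locks: HashSet::new(), writes: HashMap::new() };
    // T committed its primary "k1" but left the lock on "k5" behind.
    store.writes.insert("k1", Outcome::Committed);
    store.locks.insert("k5");
    if lock_resurrected {
        store.locks.insert("k1");
    }
    // T_2 meets the lock on "k5" and resolves it through the primary "k1".
    store.resolve_secondary("k5", "k1")
}
```

In this model, `run_scenario(false)` ends with `"k5"` committed, while `run_scenario(true)` ends with both keys rolled back even though $T$ had already committed.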
This case is very unlikely to happen, as TiDB's GC procedure always performs a global `ResolveLocks` phase before deleting these ranges. However, as mentioned above, we don't have enough protection to rule it out. For example, nothing prevents a transaction whose `start_ts` is before the GC safe point from being committed after the `ResolveLocks` phase has finished. To completely avoid this case, it's better not to run `delete_files_in_range` on the lock CF.
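One possible shape for the proposed change is sketched below. It is illustrative only: `Step` and `destroy_range_plan` are hypothetical names, not TiKV's API, and the sketch assumes that the file-level deletion is followed by a key-level deletion of whatever remains in the range, which this report does not spell out.

```rust
/// Hypothetical steps of destroying a range in one column family.
#[derive(Debug, PartialEq)]
enum Step {
    /// Physically drop SST files that lie completely inside the range (unsafe).
    DeleteFilesInRange,
    /// Assumed follow-up step: delete the remaining keys in the range.
    DeleteKeysInRange,
}

/// Returns the steps to run for a given CF, skipping the file-level
/// deletion on the lock CF so that no lock tombstone is dropped.
fn destroy_range_plan(cf: &str) -> Vec<Step> {
    if cf == "lock" {
        vec![Step::DeleteKeysInRange]
    } else {
        vec![Step::DeleteFilesInRange, Step::DeleteKeysInRange]
    }
}
```

With this plan, the lock CF is cleaned only through ordinary deletions, so a deleted lock cannot reappear because of a dropped tombstone file.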