UnsafeDestroyRange should not do delete_files_in_range for the lock CF #18091

@MyonKeminta

Bug Report

Currently, when TiKV runs UnsafeDestroyRange, its first step is to run delete_files_in_range on all CFs. However, doing this on the lock CF is not appropriate. The delete_files_in_range operation is unsafe: it can make keys that were already deleted show up again. The reason is that delete_files_in_range physically drops the SST files that lie completely within the specified range. If an SST file in an upper level that contains a tombstone for a key is dropped while an SST file in a lower level that still holds an older entry for that key is kept, the key may become visible again. That's one of the reasons why we call the API unsafe, and why we need to guarantee that the range won't be accessed again once it's destroyed.
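The resurrection effect can be reproduced with a toy model. The sketch below is not RocksDB or TiKV code: `ToyLsm`, its methods, and the half-open range check are assumptions made purely for illustration. It only shows how dropping a whole file that holds a tombstone exposes an older entry in a file that was kept.

```python
# Toy model of an LSM tree; hypothetical names, not the RocksDB API.
TOMBSTONE = object()  # marker for a deleted key


class ToyLsm:
    def __init__(self):
        # levels[0] is the newest (upper) level. Each level is a list of
        # "SST files", and each file is a dict: key -> value or TOMBSTONE.
        self.levels = []

    def add_level_on_top(self, *files):
        self.levels.insert(0, list(files))

    def get(self, key):
        # Search from the newest level down; the first entry found wins.
        for level in self.levels:
            for sst in level:
                if key in sst:
                    value = sst[key]
                    return None if value is TOMBSTONE else value
        return None

    def delete_files_in_range(self, start, end):
        # Drop only the files whose keys ALL fall inside [start, end).
        # Files that merely overlap the range are kept untouched.
        for level in self.levels:
            level[:] = [
                sst for sst in level
                if not all(start <= k < end for k in sst)
            ]


lsm = ToyLsm()
# Older file: holds the lock on "k1" but also a key outside the range.
lsm.add_level_on_top({"k1": "lock", "k9": "other"})
# Newer file: holds only the tombstone for "k1", entirely inside the range.
lsm.add_level_on_top({"k1": TOMBSTONE})

before = lsm.get("k1")                 # tombstone hides the old lock
lsm.delete_files_in_range("k0", "k2")  # drops only the tombstone file
after = lsm.get("k1")                  # the old lock is visible again
```

In this model `before` is `None` and `after` is `"lock"`: the newer file is fully covered by the range and gets dropped, while the older file survives because "k9" lies outside the range.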

However, we currently don't have enough protection to prevent this kind of access.

Consider this case:

  1. Transaction $T$ writes keys "k1" and "k5", with "k1" as the primary.
  2. $T$ commits the primary "k1", but its commit-secondary step terminates abnormally, leaving the lock on "k5" uncommitted. This doesn't affect the transaction's committed state, since the primary is committed.
  3. GC performs UnsafeDestroyRange on the range from "k0" to "k2", which causes the lock on "k1" to show up again.
  4. Another transaction $T_2$ encounters the lock on "k5" and tries to resolve it. It finds the primary lock on "k1" and rolls it back. As a result, transaction $T$ loses durability.
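The four steps above can be walked through with a small simulation. This is a heavily simplified, hypothetical model: `run_scenario`, the dict-based lock CF, and the timestamps are illustrative assumptions, and the resolver here only checks whether the primary lock exists, whereas the real lock-resolution logic in TiKV involves more checks.

```python
def run_scenario(resurrect_primary_lock):
    """Return how T2 resolves the lock on "k5" (hypothetical model)."""
    lock_cf = {}   # key -> lock info; stands in for the lock CF
    commits = {}   # key -> commit_ts; stands in for commit records
    start_ts, commit_ts = 10, 20  # arbitrary illustrative timestamps

    # Step 1: T prewrites "k1" (primary) and "k5" (secondary).
    for key in ("k1", "k5"):
        lock_cf[key] = {"primary": "k1", "start_ts": start_ts}
    stale_primary_lock = dict(lock_cf["k1"])

    # Step 2: T commits the primary only; the secondary stays locked.
    del lock_cf["k1"]
    commits["k1"] = commit_ts

    # Step 3: model the deleted primary lock becoming visible again.
    if resurrect_primary_lock:
        lock_cf["k1"] = stale_primary_lock

    # Step 4: T2 meets the lock on "k5" and consults the primary.
    primary = lock_cf["k5"]["primary"]
    if primary in lock_cf:
        # Primary lock is visible, so this simplified resolver treats T
        # as uncommitted and rolls it back.
        del lock_cf[primary]
        del lock_cf["k5"]
        return "rolled back"
    # No primary lock: follow the primary's commit record.
    commits["k5"] = commits[primary]
    del lock_cf["k5"]
    return "committed"


normal = run_scenario(resurrect_primary_lock=False)
broken = run_scenario(resurrect_primary_lock=True)
```

Without the resurrected lock, the secondary is resolved as committed. With it, the same already-committed transaction is resolved as rolled back, which is the failure described in step 4.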

This case is very unlikely to happen, as TiDB's GC procedure always performs a global ResolveLocks phase before deleting these ranges. However, as mentioned above, we don't have enough protection to prevent it. For example, nothing prevents a transaction whose start_ts is before the GC safe point from committing after the ResolveLocks phase has finished. To completely avoid this case, it's better not to run delete_files_in_range on the lock CF.
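A minimal sketch of the proposed behavior, under stated assumptions: the CF names mirror TiKV's data column families, but `plan_destroy_range` and the `"remove_keys_individually"` step are hypothetical placeholders for whatever tombstone-based deletion follows the file-dropping step. This is not TiKV's actual implementation.

```python
DATA_CFS = ("default", "lock", "write")


def plan_destroy_range(cfs=DATA_CFS):
    """Build an ordered list of (cf, operation) steps (hypothetical)."""
    plan = []
    # File-level dropping is skipped for the lock CF, because it could
    # expose lock entries that had already been deleted.
    for cf in cfs:
        if cf != "lock":
            plan.append((cf, "delete_files_in_range"))
    # Placeholder for the later, tombstone-based removal, which still
    # covers every CF including the lock CF.
    for cf in cfs:
        plan.append((cf, "remove_keys_individually"))
    return plan


plan = plan_destroy_range()
```

The point of the design is that the lock CF is still cleaned up, only through deletions that cannot bring older entries back into view.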

Metadata

    Labels

    affects-5.4: This bug affects the 5.4.x (LTS) versions.
    affects-6.1: This bug affects the 6.1.x (LTS) versions.
    affects-6.5: This bug affects the 6.5.x (LTS) versions.
    affects-7.1: This bug affects the 7.1.x (LTS) versions.
    affects-7.5: This bug affects the 7.5.x (LTS) versions.
    affects-8.1: This bug affects the 8.1.x (LTS) versions.
    affects-8.5: This bug affects the 8.5.x (LTS) versions.
    severity/moderate
    type/bug: The issue is confirmed as a bug.
