Memory leak in LockTable due to crossbeam-skiplist RefRange iterator bug #19285

@ekexium

Bug Report

What happened

Memory usage in TiKV grows unboundedly over time, particularly in the concurrency_manager LockTable. After leader eviction (QPS drops to 0), memory usage remains high and does not decrease.

What did you expect to happen

Memory should be reclaimed after locks are released and GC runs.

Root Cause

The bug is in crossbeam-skiplist's RefRange iterator (base.rs).

RefRange::next() and next_back() use clone_from() to update self.head/self.tail:

self.head.clone_from(&next_head);

Since RefEntry has no Drop implementation (by design: callers must explicitly call release()), clone_from() discards the old entry without decrementing its refcount. The node behind that entry can then never be reclaimed, so the memory leaks permanently.

The correct pattern (used in RefIter::next()) is:

if let Some(e) = mem::replace(&mut self.head, next_head.clone()) {
    unsafe { e.node.decrement(guard); }
}
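
The difference between the two patterns can be reproduced without crossbeam at all. The following is a minimal sketch, not the crossbeam-skiplist code: `Node`, `Entry`, `advance_leaky`, `advance_fixed`, and `leftover_refs` are hypothetical stand-ins invented for illustration, and the refcount is a plain `Cell<usize>` rather than an atomic. It assumes only what is described above: cloning an entry increments the count, and there is no Drop implementation to decrement it.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical stand-in for a skiplist node: it only carries a reference count.
struct Node {
    refs: Cell<usize>,
}

// Hypothetical stand-in for RefEntry: cloning increments the count, and there is
// deliberately no Drop impl, so the count only goes down through release().
struct Entry<'a> {
    node: &'a Node,
}

impl<'a> Entry<'a> {
    fn acquire(node: &'a Node) -> Self {
        node.refs.set(node.refs.get() + 1);
        Entry { node }
    }

    fn release(self) {
        self.node.refs.set(self.node.refs.get() - 1);
    }
}

impl<'a> Clone for Entry<'a> {
    fn clone(&self) -> Self {
        Entry::acquire(self.node)
    }
}

// Leaky pattern: clone_from() overwrites the old entry, which is simply dropped.
// With no Drop impl, its reference is never given back.
fn advance_leaky<'a>(head: &mut Option<Entry<'a>>, next: &Option<Entry<'a>>) {
    head.clone_from(next);
}

// Fixed pattern: take the old entry out with mem::replace and release it explicitly.
fn advance_fixed<'a>(head: &mut Option<Entry<'a>>, next: &Option<Entry<'a>>) {
    if let Some(old) = mem::replace(head, next.clone()) {
        old.release();
    }
}

// Walks a cursor over four nodes and returns the sum of refcounts left behind.
fn leftover_refs(fixed: bool) -> usize {
    let nodes: Vec<Node> = (0..4).map(|_| Node { refs: Cell::new(0) }).collect();
    let mut head: Option<Entry<'_>> = None;
    for node in &nodes {
        // The entry that would be handed back to the caller of next().
        let next = Some(Entry::acquire(node));
        if fixed {
            advance_fixed(&mut head, &next);
        } else {
            advance_leaky(&mut head, &next);
        }
        // The caller releases the entry it received.
        if let Some(e) = next {
            e.release();
        }
    }
    // End of iteration: release whatever the cursor still holds.
    if let Some(e) = head.take() {
        e.release();
    }
    nodes.iter().map(|n| n.refs.get()).sum()
}
```

In this sketch, `leftover_refs(false)` leaves one outstanding reference on every node the cursor moved past, while `leftover_refs(true)` returns all counts to zero. In the real skiplist a nonzero count would keep a removed node from being freed.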

Impact

Any code using SkipMap::range() iterators leaks memory. In TiKV, this affects:

  • LockTable::check_range()
  • LockTable::find_first()

Reproduction

Stress test results (10 seconds, range iteration + insert/remove cycle):

before fix:
t=10s ops=11710 len=3422 alloc=613MB
after fix:
t=10s ops=11786 len=7530 alloc=3MB

Versions

  • v6.5.6 <= version <= v8.5

Labels

  • affects-6.5: This bug affects the 6.5.x (LTS) versions.
  • affects-7.1: This bug affects the 7.1.x (LTS) versions.
  • affects-7.5: This bug affects the 7.5.x (LTS) versions.
  • affects-8.1: This bug affects the 8.1.x (LTS) versions.
  • affects-8.5: This bug affects the 8.5.x (LTS) versions.
  • severity/major
  • sig/engine (SIG: Engine)
  • sig/transaction (SIG: Transaction)
  • type/bug: The issue is confirmed as a bug.
