Checklist
Motivation
Summary
| # |
Proposal |
Worst-case (current) |
Worst-case (proposed) |
| 1 |
O(1) LRU scan resume via per-node insertion counter — bump a counter on every MRU insert; eviction maintains a cursor and resumes after each eviction (including cascade) instead of restarting from tail. |
O(M × K) — M evictions each re-walk the same K skip nodes (locked nodes / non-leaf ancestors) |
O(M + K) — each skip node visited at most once across the whole eviction batch |
| 2 |
Root-MRU chain ordering globally — flip reset_node_and_parents_mru so the root-side ancestor lands closest to head and the leaf closest to tail within the touched chain. Applies to FULL, Mamba, and SWA uniformly. |
O(M × N × D) — N recently-touched chains of depth D force (D−1) non-leaf skips per chain on every restart |
O(N × D) — non-leaf ancestors skipped exactly once during a single resume walk |
The two proposals are complementary: Proposal 1 eliminates restart cost; Proposal 2 ensures the resume walk visits each chain's leaf before its ancestors.
Proposal 1: O(1) LRU scan resume via per-node insertion counter
Statement
UnifiedRadixCache eviction restarts its LRU scan from tail after every leaf eviction. When the path between tail and the next evictable leaf contains many "skip" nodes (locked nodes or non-leaf ancestors left over from cascade tombstone cleanup), each restart re-walks the same prefix of the list, producing O(M × K) total scan cost for M evictions over K average skips. We propose adding a monotonically increasing insertion counter to each node (bumped on every insert_mru / reset_node_mru) so the scan can resume at the correct position in O(1) after each eviction (including cascade), reducing total cost to O(M + K).
Where the restart happens (code pointers)
FullComponent.drive_eviction (full_component.py:58-70) — restarts on every iteration:
def drive_eviction(self, params, tracker):
request = params.num_tokens
lru = self.cache.lru_lists[self.component_type]
while tracker[self.component_type] < request:
x = lru.get_leaf_lru_no_lock() # ← starts from tail every loop
if x is None:
break
self.cache._evict_component_and_detach_lru(x, self, is_leaf=True, tracker=tracker)
self.cache._cascade_evict(x, self, tracker)
SWAComponent.drive_eviction (swa_component.py:208-230) — restarts on every leaf eviction:
if len(x.children) > 0:
x_next = lru.get_prev_no_lock(x) # internal: cursor-style continue
self.cache._evict_component_and_detach_lru(x, self, is_leaf=False, tracker=tracker)
self.cache._cascade_evict(x, self, tracker)
x = x_next
else:
self.cache._evict_component_and_detach_lru(x, self, is_leaf=True, tracker=tracker)
self.cache._cascade_evict(x, self, tracker)
x = lru.get_lru_no_lock() # ← leaf: restart from tail
MambaComponent.drive_eviction (mamba_component.py:137-159) — same pattern as SWA, also restarts after leaf eviction.
UnifiedLRUList.get_leaf_lru_no_lock / get_lru_no_lock (unified_radix_cache.py:151-155) — both anchor on self.tail:
def get_lru_no_lock(self):
return self.get_prev_no_lock(self.tail, check_id=False) # ← anchored at tail
def get_leaf_lru_no_lock(self):
return self.get_prev_leaf_no_lock(self.tail, check_id=False) # ← anchored at tail
UnifiedLRUList.get_prev_leaf_no_lock (unified_radix_cache.py:140-149) — the inner walk that pays the per-restart cost; every restart re-skips the same locked / non-leaf nodes:
def get_prev_leaf_no_lock(self, node, check_id=True):
...
x = node.lru_prev[ct]
while x.component_data[ct].lock_ref > 0 or len(x.children) > 0:
x = x.lru_prev[ct] # ← re-walked from tail every iter
if x == self.head:
return None
return x
The fundamental issue: drive_eviction discards the scan position after each eviction, so the next iteration must rediscover it from scratch via another tail-anchored walk through the same skip nodes.
Worst-case scenario
LRU layout (skip nodes interleaved with evictable leaves):
head ┄ S₁ ┄ L₁ ┄ S₂ ┄ L₂ ┄ S₃ ┄ L₃ ┄ ... ┄ Sₖ ┄ Lₖ ┄ tail
│ │ │ │ │ │ │ │
skip evict skip evict skip evict skip evict
Eviction trace under current restart-from-tail:
iter 1: tail ─►Lₖ [0 skips]
iter 2: tail ─►Sₖ─►L(k-1) [1 skip ]
iter 3: tail ─►Sₖ─►S(k-1)─►L(k-2) [2 skips]
...
iter k: tail ─►Sₖ─►...─►S₂─►L₁ [k-1 skips]
Total = 1 + 2 + ... + k = O(k²)
Proposed change
- Add
lru_insert_seq: int to UnifiedTreeNode. Bump a per-UnifiedLRUList global counter inside _add_node / _add_node_after and assign it to the node.
- Maintain a per-eviction-call cursor inside
drive_eviction; replace the unconditional lru.get_leaf_lru_no_lock() / lru.get_lru_no_lock() restart with a resume from the cursor, walking only the remaining portion toward head.
- On cascade, when
_iteratively_delete_tombstone_leaf deletes ancestor P:
- If
P.lru_insert_seq < cursor.lru_insert_seq → P was behind the cursor; the deletion just updates list pointers, no extra scan needed.
- If
P.lru_insert_seq ≥ cursor.lru_insert_seq → P was ahead of the cursor; either evict opportunistically during cascade or advance the cursor past P's neighbors.
Proposal 2: Root-MRU chain ordering globally
Statement
reset_node_and_parents_mru currently uses leaf-MRU ordering: when a chain root → A → A1 → A11 is touched, A11 lands closest to head and A closest to tail within the chain. This forces leaf-eviction scans (FULL) to walk past every chain's non-leaf ancestors stacked between the leaf and tail. We propose flipping the ordering to root-MRU for all components uniformly. FULL benefits the most from cheaper leaf scans; Mamba and SWA gain the same scan-cost benefit while remaining functionally correct (their drive_eviction already walks any node in pure LRU order).
Worst-case scenario
Current leaf-MRU layout (FULL leaf-only scan from tail):
head ◄── A11 ◄── A1 ◄── A ◄── B11 ◄── B1 ◄── B ◄── C11 ◄── C1 ◄── C ◄── ...older... ◄── tail
leaf │ │ leaf │ │ leaf │ │
nonleaf nonleaf nonleaf nonleaf nonleaf nonleaf
▲ ▲ ▲ ▲ ▲ ▲
└───────┴──── must skip these to find each leaf ──┘
Every restart-from-tail re-walks each chain's (D-1) non-leaf ancestors.
Proposed root-MRU layout for the same chains:
head ◄── A ◄── A1 ◄── A11 ◄── B ◄── B1 ◄── B11 ◄── C ◄── C1 ◄── C11 ◄── ...older... ◄── tail
│ │ leaf │ │ leaf │ │ leaf
nonleaf nonleaf nonleaf nonleaf nonleaf nonleaf
Scan from tail (with Proposal 1's resume cursor):
── first hit is C11 (leaf — done)
── resume past C11 → C1 (skip) → C (skip) → B11 (leaf — done)
── resume past B11 → B1 (skip) → B (skip) → A11 (leaf — done)
Every non-leaf ancestor skipped exactly ONCE total.
Proposed change
Replace the body of reset_node_and_parents_mru:
def reset_node_and_parents_mru(self, node, root_node, should_include):
chain = []
while node != root_node:
if should_include(node):
chain.append(node)
node = node.parent
# chain = [leaf, parent, grandparent, ..., root_child]
# Insert root-side first so leaf lands closest to tail within the chain
prev = self.head
for n in reversed(chain):
self._remove_node(n)
self._add_node_after(prev, n)
prev = n
Also flip the last_access_time direction in _match_post_processor (unified_radix_cache.py:549-553) for consistency — root gets the newest stamp, leaf gets the oldest within the chain. This field is currently only used by __lt__, so the change is cosmetic but removes a confusing inconsistency.
Related resources
No response
Checklist
Motivation
Summary
tail.reset_node_and_parents_mruso the root-side ancestor lands closest toheadand the leaf closest totailwithin the touched chain. Applies to FULL, Mamba, and SWA uniformly.The two proposals are complementary: Proposal 1 eliminates restart cost; Proposal 2 ensures the resume walk visits each chain's leaf before its ancestors.
Proposal 1: O(1) LRU scan resume via per-node insertion counter
Statement
UnifiedRadixCacheeviction restarts its LRU scan fromtailafter every leaf eviction. When the path betweentailand the next evictable leaf contains many "skip" nodes (locked nodes or non-leaf ancestors left over from cascade tombstone cleanup), each restart re-walks the same prefix of the list, producing O(M × K) total scan cost for M evictions over K average skips. We propose adding a monotonically increasing insertion counter to each node (bumped on everyinsert_mru/reset_node_mru) so the scan can resume at the correct position in O(1) after each eviction (including cascade), reducing total cost to O(M + K).Where the restart happens (code pointers)
FullComponent.drive_eviction(full_component.py:58-70) — restarts on every iteration:SWAComponent.drive_eviction(swa_component.py:208-230) — restarts on every leaf eviction:MambaComponent.drive_eviction(mamba_component.py:137-159) — same pattern as SWA, also restarts after leaf eviction.UnifiedLRUList.get_leaf_lru_no_lock/get_lru_no_lock(unified_radix_cache.py:151-155) — both anchor onself.tail:UnifiedLRUList.get_prev_leaf_no_lock(unified_radix_cache.py:140-149) — the inner walk that pays the per-restart cost; every restart re-skips the same locked / non-leaf nodes:The fundamental issue:
drive_evictiondiscards the scan position after each eviction, so the next iteration must rediscover it from scratch via anothertail-anchored walk through the same skip nodes.Worst-case scenario
Proposed change
lru_insert_seq: inttoUnifiedTreeNode. Bump a per-UnifiedLRUListglobal counter inside_add_node/_add_node_afterand assign it to the node.drive_eviction; replace the unconditionallru.get_leaf_lru_no_lock()/lru.get_lru_no_lock()restart with a resume from the cursor, walking only the remaining portion toward head._iteratively_delete_tombstone_leafdeletes ancestorP:P.lru_insert_seq < cursor.lru_insert_seq→ P was behind the cursor; the deletion just updates list pointers, no extra scan needed.P.lru_insert_seq ≥ cursor.lru_insert_seq→ P was ahead of the cursor; either evict opportunistically during cascade or advance the cursor past P's neighbors.Proposal 2: Root-MRU chain ordering globally
Statement
reset_node_and_parents_mrucurrently uses leaf-MRU ordering: when a chainroot → A → A1 → A11is touched,A11lands closest toheadandAclosest totailwithin the chain. This forces leaf-eviction scans (FULL) to walk past every chain's non-leaf ancestors stacked between the leaf and tail. We propose flipping the ordering to root-MRU for all components uniformly. FULL benefits the most from cheaper leaf scans; Mamba and SWA gain the same scan-cost benefit while remaining functionally correct (theirdrive_evictionalready walks any node in pure LRU order).Worst-case scenario
Proposed change
Replace the body of
reset_node_and_parents_mru:Also flip the
last_access_timedirection in_match_post_processor(unified_radix_cache.py:549-553) for consistency — root gets the newest stamp, leaf gets the oldest within the chain. This field is currently only used by__lt__, so the change is cosmetic but removes a confusing inconsistency.Related resources
No response