
[Roadmap] Unified Hybrid Radix Cache Refactor #20415

@ispobock


Motivation

The current radix cache implementations (RadixCache, MambaRadixCache, SWARadixCache, etc.) share a large amount of logic but are maintained as separate copies that have diverged over time. This leads to code duplication, inconsistent behavior, and a high maintenance burden when extending cache functionality to new model types (e.g., hybrid linear models, sliding window attention models).

The goal of this refactor is to unify the hybrid radix cache hierarchy around a common base interface, making it easier to:

  • Add new cache variants without duplicating logic
  • Ensure consistency across all cache implementations
  • Enable future extensions (e.g., HiCache, PD disaggregation, new model architectures)

@hzh0425 @yizhang2077 @pansicheng @ispobock
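
To make the "common base interface" idea concrete, here is a minimal sketch. It is not the actual design: `BaseCacheSketch`, `ListBackedCache`, and `uncached_suffix` are hypothetical names for illustration, and the toy variant stores whole sequences in a list rather than a radix tree. The point is only that shared logic lives once in the base class, while each variant implements a small set of primitives.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class BaseCacheSketch(ABC):
    """Hypothetical shared interface; not the project's actual base class."""

    @abstractmethod
    def insert(self, tokens: Sequence[int]) -> None:
        """Store a token sequence in the cache."""

    @abstractmethod
    def match_prefix(self, tokens: Sequence[int]) -> int:
        """Return the length of the longest cached prefix of `tokens`."""

    def uncached_suffix(self, tokens: Sequence[int]) -> List[int]:
        # Shared logic is written once here, on top of the primitives.
        return list(tokens[self.match_prefix(tokens):])


class ListBackedCache(BaseCacheSketch):
    """Toy variant that keeps whole sequences in a list (no tree)."""

    def __init__(self) -> None:
        self._seqs: List[List[int]] = []

    def insert(self, tokens: Sequence[int]) -> None:
        self._seqs.append(list(tokens))

    def match_prefix(self, tokens: Sequence[int]) -> int:
        best = 0
        for seq in self._seqs:
            n = 0
            for a, b in zip(seq, tokens):
                if a != b:
                    break
                n += 1
            best = max(best, n)
        return best
```

A new variant would subclass the base and override only `insert` and `match_prefix`; helpers such as `uncached_suffix` are inherited unchanged.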

Progress

Stage 0: Unify Radix Tree Interface

Stage 1: Support Unified HybridRadixTree V2

Stage 2: Tree Interface Cleanup and Optimization

  • Clarify whether req should be mutable or immutable across the entire tree interface. @ispobock @hzh0425
  • Redesign confusing APIs such as cache_finished and cache_unfinished so their names accurately reflect their behavior. @hnyls2002
  • Separate the responsibilities of match_prefix and match_prefix_and_split more clearly, especially around whether they mutate the tree. @pansicheng
  • Explore tree compression strategies, including leaf-node merging and block-size-based merging. Evaluate whether merge logic can be integrated into match_prefix without making the API semantics harder to understand. @hzh0425
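
One possible way to separate the two matching functions is sketched below, under the assumption that match_prefix is strictly read-only and match_prefix_and_split is the only variant allowed to split a node. The `Node` layout and helper names are hypothetical and much simpler than the real tree (no values, locking, or eviction); the final semantics are still to be decided.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    key: List[int] = field(default_factory=list)
    children: Dict[int, "Node"] = field(default_factory=dict)


def _common_len(a: List[int], b: List[int]) -> int:
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def _split(parent: Node, child: Node, n: int) -> Node:
    """Split `child` after its first n tokens; return the new prefix node."""
    prefix = Node(key=child.key[:n])
    child.key = child.key[n:]
    prefix.children[child.key[0]] = child
    parent.children[prefix.key[0]] = prefix
    return prefix


def insert(root: Node, tokens: List[int]) -> None:
    node, i = root, 0
    while i < len(tokens):
        child = node.children.get(tokens[i])
        if child is None:
            node.children[tokens[i]] = Node(key=list(tokens[i:]))
            return
        n = _common_len(child.key, tokens[i:])
        if n < len(child.key):
            child = _split(node, child, n)
        node, i = child, i + n


def match_prefix(root: Node, tokens: List[int]) -> int:
    """Read-only: return the matched length and never modify the tree."""
    node, i = root, 0
    while i < len(tokens):
        child = node.children.get(tokens[i])
        if child is None:
            break
        n = _common_len(child.key, tokens[i:])
        i += n
        if n < len(child.key):
            break
        node = child
    return i


def match_prefix_and_split(root: Node, tokens: List[int]) -> Tuple[int, Node]:
    """Mutating: split a partially matched node so the match ends on a node boundary."""
    node, i = root, 0
    while i < len(tokens):
        child = node.children.get(tokens[i])
        if child is None:
            break
        n = _common_len(child.key, tokens[i:])
        if n < len(child.key):
            return i + n, _split(node, child, n)
        node, i = child, i + n
    return i, node


def count_nodes(node: Node) -> int:
    return 1 + sum(count_nodes(c) for c in node.children.values())
```

With this split of responsibilities, callers that only need a hit length can use the read-only function freely, while any caller that needs a node handle at the match boundary opts into mutation explicitly.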

Stage 3: Long-Term Rewrite

  • Replace List[int] with a contiguous in-memory array for token storage, as a prerequisite for the Rust migration.
  • Plan for a Rust rewrite once the tree design has sufficiently converged and the API semantics are stable.
  • Use the stabilized unified design and cleaned-up interfaces as the foundation for the Rust implementation.
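
As a rough illustration of the token-storage item, the sketch below uses Python's standard `array` module as a stand-in for a contiguous buffer. The choice of `array("i")` and the helper `common_prefix_len` are assumptions for illustration only; the actual storage type (for example a NumPy or tensor buffer) is not decided here.

```python
from array import array

# A List[int] stores pointers to separate int objects; array("i") stores
# the values themselves in one contiguous buffer of C signed ints.
tokens = array("i", [101, 2023, 2003, 1037, 3231])

# memoryview exposes the buffer; slicing it does not copy the tokens.
view = memoryview(tokens)
prefix = view[:3]


def common_prefix_len(a: memoryview, b: memoryview) -> int:
    """Hypothetical helper: length of the shared prefix of two token buffers."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


other = memoryview(array("i", [101, 2023, 9999]))
shared = common_prefix_len(view, other)
```

A layout like this maps directly onto a slice of integers on the Rust side, which is what makes it a useful stepping stone for the migration.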

Related Issues
