
[Core] Add eviction for CB#2893

Merged
ApostaC merged 1 commit into dev from localdev/blend-eviction
Apr 1, 2026

Conversation

@YaoJiayi
Collaborator

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements a lazy eviction mechanism for the BlendTokenRangeMatcher by adding a remove_chunks method and updating the lookup logic to identify and clear stale entries. Review feedback highlighted a critical scalability issue where the lack of compact_id reuse will eventually cause an IndexError and a memory leak as the internal list grows beyond the fixed table size. Additionally, the implementation currently fails to handle duplicate token_hash registrations, which could lead to orphaned entries in the fingerprint table and potential data corruption during matching.
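The lazy-eviction pattern summarized above can be illustrated with a minimal sketch. This is a hypothetical simplification, not the actual `BlendTokenRangeMatcher`: the attribute names (`_chunk_token_hash`, `_token_hash_to_compact_id`) follow the snippets quoted in this review, but the class, `register_chunk`, and `lookup` signatures here are illustrative assumptions.

```python
# Hypothetical sketch of lazy eviction: remove_chunks marks slots None
# instead of compacting the list, and lookups treat None as a miss.
# Names follow the review excerpts; the real matcher differs in detail.
class LazyEvictionTable:
    def __init__(self):
        self._chunk_token_hash = []          # compact_id -> token_hash (None = evicted)
        self._token_hash_to_compact_id = {}  # reverse map for eviction

    def register_chunk(self, token_hash):
        # compact_id is derived from the current list length (the pattern
        # the review flags as unbounded without id reuse).
        cid = len(self._chunk_token_hash)
        self._chunk_token_hash.append(token_hash)
        self._token_hash_to_compact_id[token_hash] = cid
        return cid

    def remove_chunks(self, token_hashes):
        # Lazy eviction: clear the slot, never shrink the list.
        for th in token_hashes:
            cid = self._token_hash_to_compact_id.pop(th, None)
            if cid is not None:
                self._chunk_token_hash[cid] = None

    def lookup(self, cid):
        # Stale (evicted) entries read back as None.
        if cid >= len(self._chunk_token_hash):
            return None
        return self._chunk_token_hash[cid]
```

Note that after `remove_chunks`, the list still holds a `None` placeholder for every evicted id, which is exactly the growth behavior the review comments below take issue with.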

            slot = int(chunk_hashes[i]) & int(self._mask)
            self._chunk_token_hash.append(th)
            self._token_hash_to_start[th] = i * self.chunk_size
            self._compact_id_to_slot[cid] = slot
Contributor


critical

There is a critical issue here that will lead to an IndexError and a crash.

The compact_ids are generated based on the length of self._chunk_token_hash. The remove_chunks method evicts entries by setting them to None in this list, but it never reclaims the space or allows compact_ids to be reused.

As a result, len(self._chunk_token_hash) grows without bound. Since self._compact_id_to_slot has a fixed size of _TABLE_SIZE, this will inevitably cause an IndexError here when a cid is generated that is >= _TABLE_SIZE. This also constitutes a memory leak, as the list will fill with None values that are never removed.

To fix this, implement a mechanism to reuse compact_ids. A common pattern is to maintain a free list of evicted IDs: when registering new chunks, pull an ID from the free list first, and only extend _chunk_token_hash (generating a new ID) when the free list is empty.
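The free-list pattern suggested above could look like the following sketch. The class name `CompactIdAllocator` and the `allocate`/`release` methods are hypothetical; only `_chunk_token_hash` comes from the code under review.

```python
# Hypothetical sketch of compact_id reuse via a free list: evicted ids
# are recycled, so the list stops growing once ids start being released.
class CompactIdAllocator:
    def __init__(self):
        self._chunk_token_hash = []  # compact_id -> token_hash (None = free)
        self._free_ids = []          # compact_ids reclaimed on eviction

    def allocate(self, token_hash):
        if self._free_ids:
            # Reuse an evicted id instead of growing the list.
            cid = self._free_ids.pop()
            self._chunk_token_hash[cid] = token_hash
        else:
            # No free id available: generate a new one by growing the list.
            cid = len(self._chunk_token_hash)
            self._chunk_token_hash.append(token_hash)
        return cid

    def release(self, cid):
        # Evict the entry and make its id available for reuse.
        self._chunk_token_hash[cid] = None
        self._free_ids.append(cid)
```

With this in place, `len(self._chunk_token_hash)` is bounded by the peak number of live entries, so a fixed-size `_compact_id_to_slot` table can no longer be indexed out of range by an ever-growing cid.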

Comment on lines 182 to +189
        for i in range(n):
            th = token_hashes[i]
            cid = int(compact_ids[i])
            slot = int(chunk_hashes[i]) & int(self._mask)
            self._chunk_token_hash.append(th)
            self._token_hash_to_start[th] = i * self.chunk_size
            self._compact_id_to_slot[cid] = slot
            self._token_hash_to_compact_id[th] = cid
Contributor


high

The current implementation does not handle the case where the same token_hash is registered for multiple different chunks. This can lead to incorrect behavior and orphaned entries in the matcher.

If a token_hash is reused, the reverse-lookup maps (_token_hash_to_start and _token_hash_to_compact_id) are overwritten with the details of the latest registration. This can cause several problems:

  • remove_chunks will only be able to evict the most recently registered chunk for a given hash, leaving earlier chunks with the same hash as orphans in the table.
  • match_sub_sequence could return incorrect old_st positions for these orphaned chunks if they are matched, leading to data corruption downstream.

To ensure correctness, you should enforce that all registered token_hash values are unique. A straightforward way to enforce this is to check for existing hashes before registration and raise an error. This makes the requirement on the caller explicit.

        # Check for duplicates before modifying state
        for th in token_hashes:
            if th in self._token_hash_to_compact_id:
                raise ValueError(f"Attempted to register duplicate token hash: {th.hex()}")

        # Persist compact_id → token_hash, token_hash → start, and reverse maps
        for i in range(n):
            th = token_hashes[i]
            cid = int(compact_ids[i])
            slot = int(chunk_hashes[i]) & int(self._mask)
            self._chunk_token_hash.append(th)
            self._token_hash_to_start[th] = i * self.chunk_size
            self._compact_id_to_slot[cid] = slot
            self._token_hash_to_compact_id[th] = cid

Contributor

@sammshen sammshen left a comment


LGTM!

Contributor

@ApostaC ApostaC left a comment


LGTM!

@ApostaC ApostaC added the full Run comprehensive tests on this PR label Mar 31, 2026
@ApostaC ApostaC enabled auto-merge (squash) March 31, 2026 23:54
@ApostaC ApostaC merged commit 0f51fab into dev Apr 1, 2026
34 checks passed
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
add eviction for cb

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
