fix(lmcache): fix KV cache hash inconsistency due to None in extra_keys #1897
kobe0938 merged 3 commits into LMCache:dev
Conversation
Summary of Changes: Hello @JianDan0212, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request addresses a critical bug in LMCache's KV cache sharing mechanism that caused hash inconsistencies when `extra_keys` is `None`.
Code Review
This pull request fixes a hash inconsistency issue in the KV cache caused by None values in extra_keys. The proposed change correctly resolves the inconsistency by ensuring no None values are part of the hashed tuple. However, the current implementation introduces a breaking change by altering the structure of the data being hashed, which will invalidate existing cache entries. My review includes a suggestion to mitigate this while still fixing the underlying issue, and I've pointed out a discrepancy between the implementation and the PR description.
```python
# Build the tuple to be hashed piece by piece, excluding any None values.
items_to_hash = []
if prefix_hash is not None:
    items_to_hash.append(prefix_hash)

items_to_hash.append(tokens_tuple)

if extra_keys is not None:
    items_to_hash.append(tuple(extra_keys))  # Also make sure lists are converted to tuples

return self.hash_func(tuple(items_to_hash))
```
This change correctly addresses the hash inconsistency issue. However, the implementation differs from the description, and it introduces a breaking change.

- Implementation vs. Description: The PR description states the fix is to "treat `extra_keys` as an empty tuple when it is `None`". However, the code omits `extra_keys` from the hashed tuple if it's `None`. Similarly, `prefix_hash` is also omitted if `None`. This makes the hashed tuple variable in length, which is a larger change than described.
- Breaking Change: Because the structure of the hashed tuple changes (from a fixed-size 3-element tuple to a variable-sized one), this will invalidate all existing cache entries. Hashes for the same inputs will differ from what older versions of the code would produce. This is a significant side effect and should be clearly communicated in the PR description.
If the goal is to stick to the described fix and minimize the breaking change, you could consider replacing None values with stable defaults instead of omitting them. This maintains a fixed tuple structure for hashing. Here is a suggestion:

```python
_prefix_hash = prefix_hash if prefix_hash is not None else NONE_HASH
_extra_keys = tuple(extra_keys) if extra_keys is not None else ()
return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
```
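To make the structural difference between the two approaches concrete, here is a small self-contained sketch; `NONE_HASH` and the builtin `hash` are stand-ins for the project's actual constant and `hash_func`:

```python
NONE_HASH = 0  # stand-in for the real sentinel constant

def hash_by_omission(prefix_hash, tokens_tuple, extra_keys):
    # The PR's original approach: drop None fields entirely,
    # producing a variable-length tuple.
    items = []
    if prefix_hash is not None:
        items.append(prefix_hash)
    items.append(tokens_tuple)
    if extra_keys is not None:
        items.append(tuple(extra_keys))
    return hash(tuple(items))

def hash_by_defaults(prefix_hash, tokens_tuple, extra_keys):
    # The suggested approach: substitute stable defaults,
    # keeping a fixed 3-element structure.
    _prefix = prefix_hash if prefix_hash is not None else NONE_HASH
    _extra = tuple(extra_keys) if extra_keys is not None else ()
    return hash((_prefix, tokens_tuple, _extra))

tokens = (1, 2, 3)
# Omission collapses (prefix, tokens, extra) down to just (tokens,):
assert hash_by_omission(None, tokens, None) == hash((tokens,))
# Defaults preserve the fixed 3-tuple shape older code also used:
assert hash_by_defaults(None, tokens, None) == hash((NONE_HASH, tokens, ()))
```

Both versions still hash differently from the pre-fix `hash((None, tokens, None))` layout, but the defaults version keeps the tuple shape stable across call sites.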
run the pre-commit first please!
(force-pushed from 335dbd6 to 59c8242)
I encountered network issues with the pre-commit hooks accessing GitHub. However, I have manually installed and run isort and black on the files to fix the formatting, and I updated the hash logic as discussed. Please review.
(force-pushed from 59c8242 to cb6877f)
the <90-character line checks are failing
can you explain why hashing `None` is inconsistent?
The hash value of None itself is fixed, but including None can make the structure of the hash inputs unstable (e.g., the data types of the tuple elements change between call sites), which in turn leads to inconsistent hash results. The code replaces None with fixed default values (empty string / empty tuple) to stabilize the structure of the hash inputs and ensure consistent results.
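The point about input structure can be illustrated with a minimal sketch (the names here are illustrative, not from the codebase):

```python
def canonicalize(prefix_hash, extra_keys):
    # Replace None with fixed defaults so every call site produces
    # the same (int, tuple) shape.
    return (
        prefix_hash if prefix_hash is not None else 0,
        tuple(extra_keys) if extra_keys is not None else (),
    )

# Three call sites that mean the same request but pass the optional
# arguments differently all yield an identical hash input:
a = canonicalize(None, None)
b = canonicalize(None, [])
c = canonicalize(0, ())
assert a == b == c == (0, ())
assert hash(a) == hash(b) == hash(c)
```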
fix KV cache hash inconsistency due to None in extra_keys Signed-off-by: JianDan0212 <jd805245992@163.com>
(force-pushed from cb6877f to bc8a310)
```python
# This ensures consistency across processes and avoids
# breaking changes in tuple length.

_prefix_hash = prefix_hash if prefix_hash is not None else ""
```
why not just change the default values of the arguments passed into the function?
I still don't understand why None is unstable?
Code Review
This pull request correctly identifies and fixes a hash inconsistency bug in _hash_tokens caused by None values, which is crucial for cross-process cache sharing. The approach of providing default values for prefix_hash and extra_keys is sound. However, the change as-is will break existing unit tests (e.g., in test_segment_token_database.py) which rely on the old hashing logic. These tests must be updated. I've also provided a suggestion to improve the implementation by maintaining type consistency for prefix_hash and using more concise code for extra_keys.
```python
_prefix_hash = prefix_hash if prefix_hash is not None else ""
_extra_keys = tuple(extra_keys) if extra_keys is not None else ()

return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
```
While this change correctly addresses the hash inconsistency issue, it will break existing unit tests that have hardcoded the old hashing logic (e.g., test_segment_token_database in tests/v1/test_token_database.py uses hash((None, ..., None))). The tests must be updated to reflect the new hashing logic.

Additionally, I have a couple of suggestions to improve the implementation:

- For `_prefix_hash`, using `""` as a default for a variable of type `Optional[int]` is unconventional and mixes types. Using `0` would maintain type consistency and is a common default for hash-related integers; `NONE_HASH` also often defaults to `0` in fallback scenarios.
- The assignment for `_extra_keys` can be made more concise and Pythonic using an `or` expression.
Here is a suggested implementation incorporating these points:

```python
_prefix_hash = prefix_hash if prefix_hash is not None else 0
_extra_keys = tuple(extra_keys or [])
return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
```
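One subtlety of the `or` form worth noting: `tuple(extra_keys or [])` maps both `None` and an empty list to `()`, which is exactly the behavior wanted here (though it would also swallow any other falsy value). A quick illustration:

```python
# tuple(extra_keys or []) normalizes None and [] identically;
# "lora-adapter-1" is a made-up example key.
for extra_keys in (None, [], ["lora-adapter-1"]):
    print(tuple(extra_keys or []))
# -> ()
#    ()
#    ('lora-adapter-1',)
```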
Oh I see what you're trying to do. A potential helper to make it clear:

```python
def _canonicalize_hash_inputs(
    self,
    prefix_hash: Optional[int],
    tokens_tuple: tuple[int, ...],
    extra_keys: Optional[list[Any]],
) -> tuple[int, tuple[int, ...], tuple[Any, ...]]:
    """
    Canonicalize hash inputs so that semantically identical requests
    produce structurally identical hash inputs across instances.

    - prefix_hash: int, or NONE_HASH if None
    - tokens_tuple: tuple of token IDs
    - extra_keys: tuple of additional keys, empty if None
    """
    return (
        prefix_hash if prefix_hash is not None else NONE_HASH,
        tokens_tuple,
        tuple(extra_keys) if extra_keys is not None else (),
    )
```
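A related caveat for the cross-process goal: a canonical input structure only helps if `hash_func` itself is deterministic across processes. Python's builtin `hash` of strings is salted per process unless PYTHONHASHSEED is pinned, so a content-based digest is the safer pattern. A minimal sketch of such a function (not the project's actual `hash_func`):

```python
import hashlib

def deterministic_hash(items: tuple) -> int:
    # repr of a tuple of ints/strings/nested tuples is stable across
    # processes, and sha256 does not depend on PYTHONHASHSEED.
    digest = hashlib.sha256(repr(items).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

# Identical canonical inputs always produce the identical digest:
assert deterministic_hash((0, (1, 2, 3), ())) == deterministic_hash((0, (1, 2, 3), ()))
```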
btw the pre-commit is still failing
Signed-off-by: JianDan0212 <zhangyj0212@gmail.com>
(force-pushed from eaec63f to dd6820c)
Thanks for the suggestion! The _canonicalize_hash_inputs helper definitely makes the logic cleaner and explicitly handles the type consistency. I have refactored the code to use your suggested structure and used 0 (as NONE_HASH) instead of an empty string to maintain type stability for prefix_hash.

Regarding the failure case: the inconsistency happened when prefix_hash was explicitly passed as None in some calls, but defaulted to different values (or was serialized differently) in cross-process sharing scenarios. By forcing None to a fixed integer (0) and extra_keys to an empty tuple (), we ensure that the hash input structure remains strictly (int, tuple, tuple) regardless of how the arguments were passed.

I have also updated the unit tests to reflect the new hashing logic and fixed the failing pre-commit checks. Ready for review again!
Could you tell me if there is any issue here that hasn't been checked yet?
PTAL @feixiangpeng
@JianDan0212 Can you test it again? I was using centralised KV sharing and did not get this problem. Can you try with the new LMCache version and see if you still have this problem.

Commands I used:
- Server storage set up:
- Terminal vllm1:
- Terminal vllm2:
- Requests:
- Then same request to vllm2 (port 8001):

I also ran export PYTHONHASHSEED=0 in the terminals as recommended.
Thank you for your work. I haven't used the 0.13.0 version of lmcache yet. Could you confirm whether there are any new changes in this release? I can confirm that in previous versions of lmcache, caching across instances did not work correctly, because the hash calculation for None was inconsistent between different processes. I will verify this with 0.3.13 lmcache shortly.
@JianDan0212 can you try a longer prompt? Can you try my prompt: "$(printf 'Explain the significance of KV cache in language models. %.0s' {1..50})".
@feixiangpeng Unfortunately, even after following your suggestion to use more tokens, the result was the same: I still couldn't get any cache hits at all. When I debugged this issue earlier, I noticed that the official program generates different hash values for each request. You can reproduce this with a fresh Docker container: even identical requests will produce different hashes. The root cause of this inconsistent hashing is the handling of None values. For this reason, I will keep my modification in place.
@JianDan0212 can you update to the newest version of vllm: 0.15.0
@feixiangpeng I'm not sure why this approach was suggested, but I followed your request to update to vLLM 0.15.0, and the result was identical: the second request still didn't hit any tokens.
@JianDan0212 Just to confirm, is the config file you're using to set up both instances: LMCache/examples/kv_cache_reuse/share_across_instances/centralized_sharing/example.yaml?

example.yaml:

Environment & Commands

Environment:

Commands:
- Terminal 1:
- Terminal 2:
- Terminal 3:
- Terminal 4:
@JianDan0212
@feixiangpeng No, the results on my end are completely normal.
@JianDan0212 I've replicated your same error. I'm trying to see why I didn't have this before; I think it comes down to some dependency differences.
@feixiangpeng I don't understand why it works on your end, but it has never worked for me before. One possibility for your success is that you are using a different hash function, one that computes the same hash value for None across processes. This is exactly the issue addressed in this PR, so you can investigate along that line. Additionally, I believe merging this PR is necessary, as it would effectively resolve the problem.
@JianDan0212 are you exporting PYTHONHASHSEED=0 in every terminal? Mine still works now. aiofile 3.9.0
@JianDan0212 following up on this
@kobe0938 The Comprehensive Tests failed due to a timeout in the local_disk.yaml benchmark (expected 0.09s, actual 0.12s). It looks like a flaky CI issue rather than a code error. Since I don't have permission to trigger a rebuild on Buildkite, could you please help re-run the CI? Thank you!
#2511 (comment) |
…ys (LMCache#1897) Signed-off-by: shaoxiawjc <wjc2800@163.com>
…ys (LMCache#1897) Signed-off-by: Aaron Wu <aaron.wu@dell.com>














fix KV cache hash inconsistency due to None in extra_keys
Fixes #1842
What this PR does / why we need it:
This PR fixes the cross-instance KV cache sharing failure in LMCache >= 0.3.4 caused by hash inconsistency. The root cause is that the _hash_tokens function in lmcache/v1/token_database.py includes extra_keys in hash calculation but doesn't handle cases where extra_keys is None. This leads to inconsistent hash values for the same input across different processes (even with PYTHONHASHSEED set), preventing instances from retrieving each other's KV caches.
The fix modifies _hash_tokens to treat extra_keys as an empty tuple when it is None, ensuring consistent hashing across processes. This restores cross-instance KV cache sharing functionality (both P2P and centralized storage) while maintaining compatibility with scenarios where extra_keys is explicitly set.
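A minimal standalone sketch of the normalization described above; `hash_tokens` and the builtin `hash` are simplified stand-ins for the real `_hash_tokens` and `self.hash_func` in lmcache/v1/token_database.py:

```python
def hash_tokens(tokens_tuple, prefix_hash=None, extra_keys=None):
    # Normalize extra_keys so None never enters the hashed tuple and the
    # input structure stays fixed across call sites.
    _extra_keys = tuple(extra_keys) if extra_keys is not None else ()
    return hash((prefix_hash, tokens_tuple, _extra_keys))

# Structurally identical requests hash the same whether extra_keys is
# omitted, None, or an empty list:
assert hash_tokens((1, 2, 3)) == hash_tokens((1, 2, 3), extra_keys=[])
assert hash_tokens((1, 2, 3)) == hash_tokens((1, 2, 3), extra_keys=None)
```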
Special notes for your reviewers:
The core change is in lmcache/v1/token_database.py where extra_keys is normalized to an empty tuple if None, avoiding None in the hashed tuple.
Verified that hash values for identical inputs are consistent across processes after the fix, and cross-instance cache reuse works as expected with LMCache 0.3.6.
No impact on existing functionality when extra_keys is provided (non-None cases).