
fix(lmcache): fix KV cache hash inconsistency due to None in extra_keys #1897

Merged
kobe0938 merged 3 commits into LMCache:dev from JianDan0212:fix/kv-cache-sharing-hash-none
Mar 4, 2026

Conversation

@JianDan0212
Contributor

fix KV cache hash inconsistency due to None in extra_keys
Fixes #1842

What this PR does / why we need it:
This PR fixes the cross-instance KV cache sharing failure in LMCache >= 0.3.4 caused by hash inconsistency. The root cause is that the _hash_tokens function in lmcache/v1/token_database.py includes extra_keys in hash calculation but doesn't handle cases where extra_keys is None. This leads to inconsistent hash values for the same input across different processes (even with PYTHONHASHSEED set), preventing instances from retrieving each other's KV caches.
The fix modifies _hash_tokens to treat extra_keys as an empty tuple when it is None, ensuring consistent hashing across processes. This restores cross-instance KV cache sharing functionality (both P2P and centralized storage) while maintaining compatibility with scenarios where extra_keys is explicitly set.
Special notes for your reviewers:
The core change is in lmcache/v1/token_database.py where extra_keys is normalized to an empty tuple if None, avoiding None in the hashed tuple.
Verified that hash values for identical inputs are consistent across processes after the fix, and cross-instance cache reuse works as expected with LMCache 0.3.6.
No impact on existing functionality when extra_keys is provided (non-None cases).
If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

@gemini-code-assist
Contributor

Summary of Changes

Hello @JianDan0212, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in LMCache's KV cache sharing mechanism that caused hash inconsistencies when the extra_keys parameter was None. By refining the _hash_tokens function to normalize None extra_keys to an empty tuple, the change ensures consistent hash generation across different processes. This effectively re-enables reliable cross-instance KV cache sharing, which was previously failing due to these inconsistencies.

Highlights

  • KV Cache Hash Inconsistency Fix: Resolved an issue where the KV cache hash calculation was inconsistent when extra_keys was None, leading to failures in cross-instance KV cache sharing.
  • Consistent Hashing Logic: Modified the _hash_tokens function to explicitly handle None values for extra_keys by treating them as an empty tuple, ensuring deterministic hash values across different processes.
  • Restored Cross-Instance Cache Sharing: The fix restores the functionality of cross-instance KV cache sharing (both P2P and centralized storage) in LMCache versions 0.3.4 and above.

Contributor

@gemini-code-assist Bot left a comment


Code Review

This pull request fixes a hash inconsistency issue in the KV cache caused by None values in extra_keys. The proposed change correctly resolves the inconsistency by ensuring no None values are part of the hashed tuple. However, the current implementation introduces a breaking change by altering the structure of the data being hashed, which will invalidate existing cache entries. My review includes a suggestion to mitigate this while still fixing the underlying issue, and I've pointed out a discrepancy between the implementation and the PR description.

Comment thread lmcache/v1/token_database.py Outdated
Comment on lines +142 to +152
# Build the tuple to be hashed piece by piece, excluding any None values.
items_to_hash = []
if prefix_hash is not None:
    items_to_hash.append(prefix_hash)

items_to_hash.append(tokens_tuple)

if extra_keys is not None:
    # Also make sure lists are converted to tuples
    items_to_hash.append(tuple(extra_keys))

return self.hash_func(tuple(items_to_hash))
Contributor


Severity: high

This change correctly addresses the hash inconsistency issue. However, the implementation differs from the description, and it introduces a breaking change.

  1. Implementation vs. Description: The PR description states the fix is to "treat extra_keys as an empty tuple when it is None". However, the code omits extra_keys from the hashed tuple if it's None. Similarly, prefix_hash is also omitted if None. This makes the hashed tuple variable in length, which is a larger change than described.

  2. Breaking Change: Because the structure of the hashed tuple is changing (from a fixed-size 3-element tuple to a variable-sized one), this will invalidate all existing cache entries. Hashes for the same inputs will be different from what older versions of the code would produce. This is a significant side-effect and should be clearly communicated in the PR description.

If the goal is to stick to the described fix and minimize the breaking change, you could consider replacing None values with stable defaults instead of omitting them. This maintains a fixed tuple structure for hashing. Here is a suggestion:

        _prefix_hash = prefix_hash if prefix_hash is not None else NONE_HASH
        _extra_keys = tuple(extra_keys) if extra_keys is not None else ()
        return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
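The structural difference the review describes can be illustrated with a minimal sketch (hypothetical helper names, not LMCache code): dropping None fields yields a variable-length tuple, while substituting stable defaults keeps a fixed three-element structure.

```python
def omit_none_fields(prefix_hash, tokens_tuple, extra_keys):
    # The approach the review flags: None fields are dropped, so the
    # tuple length varies with which arguments were provided.
    items = []
    if prefix_hash is not None:
        items.append(prefix_hash)
    items.append(tokens_tuple)
    if extra_keys is not None:
        items.append(tuple(extra_keys))
    return tuple(items)


def default_none_fields(prefix_hash, tokens_tuple, extra_keys):
    # The suggested alternative: stable defaults preserve a fixed
    # (prefix, tokens, extras) structure. NONE_HASH assumed to be 0.
    return (
        prefix_hash if prefix_hash is not None else 0,
        tokens_tuple,
        tuple(extra_keys) if extra_keys is not None else (),
    )


assert omit_none_fields(None, (1,), None) == ((1,),)           # 1 element
assert default_none_fields(None, (1,), None) == (0, (1,), ())  # always 3
```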

@sammshen
Contributor

run the pre-commit first please!

@JianDan0212 JianDan0212 force-pushed the fix/kv-cache-sharing-hash-none branch from 335dbd6 to 59c8242 Compare December 19, 2025 05:50
@JianDan0212
Contributor Author

I encountered network issues with the pre-commit hooks accessing GitHub. However, I have manually installed and run isort and black on the files to fix the formatting, and I updated the hash logic as discussed. Please review.

@JianDan0212 JianDan0212 force-pushed the fix/kv-cache-sharing-hash-none branch from 59c8242 to cb6877f Compare December 19, 2025 06:17
@sammshen
Contributor

the <90-character line-length check is failing

@sammshen
Contributor

can you explain why hashing None causes inconsistency?

@JianDan0212
Contributor Author

The hash value of None may itself be fixed within one process, but including it destabilizes the structure of the hash inputs (e.g., the data types of the tuple elements change), which in turn leads to inconsistent hash results. The code replaces None with fixed default values (empty string / empty tuple) to stabilize the structure of the hash inputs and ensure consistent results.
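One way to see the stabilization (an illustrative sketch, not the LMCache implementation): once None is replaced by fixed defaults, a content-based digest of the hash inputs is identical across separate interpreter processes, because it depends only on the values.

```python
import subprocess
import sys

# Script run in fresh interpreter processes: normalize the inputs,
# then digest their repr, which depends only on the values.
SNIPPET = r"""
import hashlib
prefix_hash, tokens, extra_keys = None, (10, 20, 30), None
normalized = (
    prefix_hash if prefix_hash is not None else 0,
    tokens,
    tuple(extra_keys) if extra_keys is not None else (),
)
print(hashlib.sha256(repr(normalized).encode()).hexdigest())
"""

digests = {
    subprocess.run(
        [sys.executable, "-c", SNIPPET], capture_output=True, text=True
    ).stdout.strip()
    for _ in range(3)
}
print(len(digests))  # 1: every process produced the same digest
```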

fix KV cache hash inconsistency due to None in extra_keys

Signed-off-by: JianDan0212 <jd805245992@163.com>
@JianDan0212 JianDan0212 force-pushed the fix/kv-cache-sharing-hash-none branch from cb6877f to bc8a310 Compare December 24, 2025 09:20
Comment thread lmcache/v1/token_database.py Outdated
# This ensures consistency across processes and avoids
# breaking changes in tuple length.

_prefix_hash = prefix_hash if prefix_hash is not None else ""
Contributor


why not just change the default values of the arguments passed into the function?

I still don't understand why None is unstable?

Contributor

@gemini-code-assist Bot left a comment


Code Review

This pull request correctly identifies and fixes a hash inconsistency bug in _hash_tokens caused by None values, which is crucial for cross-process cache sharing. The approach of providing default values for prefix_hash and extra_keys is sound. However, the change as-is will break existing unit tests (e.g., in test_segment_token_database.py) which rely on the old hashing logic. These tests must be updated. I've also provided a suggestion to improve the implementation by maintaining type consistency for prefix_hash and using more concise code for extra_keys.

Comment thread lmcache/v1/token_database.py Outdated
Comment on lines +244 to +247
_prefix_hash = prefix_hash if prefix_hash is not None else ""
_extra_keys = tuple(extra_keys) if extra_keys is not None else ()

return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
Contributor


Severity: critical

While this change correctly addresses the hash inconsistency issue, it will break existing unit tests that have hardcoded the old hashing logic (e.g., test_segment_token_database in tests/v1/test_token_database.py uses hash((None, ..., None))). The tests must be updated to reflect the new hashing logic.

Additionally, I have a couple of suggestions to improve the implementation:

  1. For _prefix_hash, using "" as a default for a variable of type Optional[int] is unconventional and mixes types. Using 0 would maintain type consistency and is a common default for hash-related integers. NONE_HASH also often defaults to 0 in fallback scenarios.
  2. The assignment for _extra_keys can be made more concise and Pythonic using an or expression.

Here is a suggested implementation incorporating these points:

Suggested change:

- _prefix_hash = prefix_hash if prefix_hash is not None else ""
- _extra_keys = tuple(extra_keys) if extra_keys is not None else ()
- return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))
+ _prefix_hash = prefix_hash if prefix_hash is not None else 0
+ _extra_keys = tuple(extra_keys or [])
+ return self.hash_func((_prefix_hash, tokens_tuple, _extra_keys))

@sammshen
Contributor

sammshen commented Dec 28, 2025

Oh I see what you're trying to do.

A potential helper to make it clear.

def _canonicalize_hash_inputs(
    self,
    prefix_hash: Optional[int],
    tokens_tuple: tuple[int, ...],
    extra_keys: Optional[list[Any]],
) -> tuple[int, tuple[int, ...], tuple[Any, ...]]:
    """
    Canonicalize hash inputs so that semantically identical requests
    produce structurally identical hash inputs across instances.

    - prefix_hash: int or NONE_HASH if None
    - tokens_tuple: tuple of token IDs
    - extra_keys: tuple of additional keys, empty if None
    """
    return (
        prefix_hash if prefix_hash is not None else NONE_HASH,
        tokens_tuple,
        tuple(extra_keys) if extra_keys is not None else (),
    )

@sammshen
Contributor

btw the pre-commit is still failing

Signed-off-by: JianDan0212 <zhangyj0212@gmail.com>
@JianDan0212 JianDan0212 force-pushed the fix/kv-cache-sharing-hash-none branch from eaec63f to dd6820c Compare January 5, 2026 07:47
@JianDan0212
Contributor Author

Thanks for the suggestion! The _canonicalize_hash_inputs helper definitely makes the logic cleaner and explicitly handles the type consistency. I have refactored the code to use your suggested structure and used 0 (as NONE_HASH) instead of an empty string to maintain type stability for prefix_hash.

Regarding the failure case: The inconsistency happened when prefix_hash was explicitly passed as None in some calls, but defaulted to different values (or was serialized differently) in cross-process sharing scenarios. By forcing None to a fixed integer (0) and extra_keys to an empty tuple (), we ensure that the hash input structure remains strict (int, tuple, tuple) regardless of how the arguments were passed.

I have also updated the unit tests to reflect this new hashing logic and fixed the failing pre-commit checks. Ready for review again!

@JianDan0212
Contributor Author

Could you tell me if there is any issue here that hasn’t been checked yet?

@sammshen
Contributor

sammshen commented Jan 29, 2026

PTAL @feixiangpeng

Contributor

@sammshen left a comment


LGTM

@feixiangpeng

feixiangpeng commented Feb 2, 2026

@JianDan0212 Can you test it again? I was using centralised KV sharing and did not get this problem. Can you try with the new LMCache version and see if you still have it? I used:
LMCache version 0.3.13

Commands I used:

Server storage set up:
lmcache_server localhost 65432

Terminal vllm1 :
LMCACHE_CONFIG_FILE=example.yaml CUDA_VISIBLE_DEVICES=0 vllm serve mistralai/Mistral-7B-Instruct-v0.2 --gpu-memory-utilization 0.8 --port 8000 --kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'

Terminal vllm 2:
LMCACHE_CONFIG_FILE=example.yaml CUDA_VISIBLE_DEVICES=4 vllm serve mistralai/Mistral-7B-Instruct-v0.2 --gpu-memory-utilization 0.8 --port 8001 --kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'

Requests:
Request to vllm1 (port 8000) — LONG prompt:
curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d "{
        \"model\": \"mistralai/Mistral-7B-Instruct-v0.2\",
        \"prompt\": \"$(printf 'Explain the significance of KV cache in language models. %.0s' {1..50})\",
        \"max_tokens\": 10
      }"

Then same request to vllm2 (port 8001):
curl -X POST http://localhost:8001/v1/completions \
  -H "Content-Type: application/json" \
  -d "{
        \"model\": \"mistralai/Mistral-7B-Instruct-v0.2\",
        \"prompt\": \"$(printf 'Explain the significance of KV cache in language models. %.0s' {1..50})\",
        \"max_tokens\": 10
      }"

I also ran export PYTHONHASHSEED=0 in the terminals as recommended

Terminal results I saw after the request to vllm2: (screenshot)

@JianDan0212
Contributor Author

Thank you for your work. I haven't used the 0.3.13 release of LMCache yet. Could you confirm whether there are any new changes in this release? I can confirm that in previous versions of LMCache, caching across instances did not work correctly, because the hash calculation for None was inconsistent between different processes. I will verify this with LMCache 0.3.13 shortly.

@JianDan0212
Contributor Author

Hello, I'm not sure why I'm still unable to get caching to work correctly with lmcache version 0.3.13. I'm using lmcache 0.3.13 together with vLLM 0.14.0, and these are my test results. I also set export PYTHONHASHSEED=0, but it didn't produce the expected outcome — the second request didn't cache any tokens at all.


@feixiangpeng

@JianDan0212 Can you try a longer prompt? For example mine: "$(printf 'Explain the significance of KV cache in language models. %.0s' {1..50})".
There's a separate ongoing issue for prompts smaller than 256 tokens, and I think it might be related to this. Can you try either my prompt or one that is longer than 256 tokens?

@JianDan0212
Contributor Author

@feixiangpeng Unfortunately, even after following your suggestion to use more tokens, the result was the same — I still couldn't get any cache hits at all. When I debugged this issue earlier, I noticed that the official program generates different hash values for each request. You can reproduce this with a fresh Docker container: even identical requests will produce different hashes. The root cause of this inconsistent hashing is the handling of None values. For this reason, I will keep my modification in place.

@feixiangpeng

@JianDan0212 can you update to the newest version of vllm: 0.15.0

@JianDan0212
Contributor Author

@feixiangpeng I'm not sure why this approach was suggested, but I followed your request to update to vLLM 0.15.0, and the result was identical — the second request still didn't hit any tokens.

@feixiangpeng

@JianDan0212 Just to confirm, is the config file you're using to set up both instances: LMCache/examples/kv_cache_reuse/share_across_instances/centralized_sharing/example.yaml?
In addition to this, do you mind sending me all your terminal commands from start to finish?

@JianDan0212
Contributor Author

example.yaml:

chunk_size: 256
local_cpu: True
remote_url: "lm://localhost:65432"
remote_serde: "cachegen"

Environment & Commands

Environment:

export PYTHONHASHSEED=0

Commands:

Terminal 1:

lmcache_server localhost 65432

Terminal 2:

LMCACHE_CONFIG_FILE=example.yaml CUDA_VISIBLE_DEVICES=0 vllm serve Qwen/Qwen3-4B --gpu-memory-utilization 0.8 --port 8000 --kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'

Terminal 3:

LMCACHE_CONFIG_FILE=example.yaml CUDA_VISIBLE_DEVICES=1 vllm serve Qwen/Qwen3-4B --gpu-memory-utilization 0.8 --port 8001 --kv-transfer-config '{"kv_connector":"LMCacheConnectorV1", "kv_role":"kv_both"}'

Terminal 4:

curl command

@feixiangpeng

feixiangpeng commented Feb 5, 2026

@JianDan0212
(screenshot: lmcache server error)
Does your lmcache server terminal (the one where you ran lmcache_server localhost 65432) show an error like the one in this screenshot?

@JianDan0212
Contributor Author

@feixiangpeng No, the results on my end are completely normal.

@feixiangpeng

@JianDan0212 I've replicated your same error. I'm trying to see why I didn't have this before; I think it comes down to some dependency differences.

@JianDan0212
Contributor Author

@feixiangpeng I don't understand why it works on your end, but it has never worked for me before. One possibility for your success is that you are using a different hash function, one that computes the same hash value for None. This is exactly the issue addressed in this PR. You could investigate along this line. Additionally, I believe merging this PR is necessary, as it would effectively resolve this problem.

@sammshen sammshen requested a review from kobe0938 February 6, 2026 05:18
@kobe0938 kobe0938 enabled auto-merge (squash) February 6, 2026 05:33
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Feb 6, 2026
@feixiangpeng

@JianDan0212 are you exporting PYTHONHASHSEED=0 in every terminal? Mine still works now.
(screenshots: working cache hits across both instances)
I think the difference between my working version now and what wasn't working was just dependencies. These are my dependencies:
(LMCache) tensormesh@GPU-H100-lccn13:~/p2p_bug_test/LMCache$ uv pip list
Package Version Editable project location
aiofile 3.9.0
aiofiles 25.1.0
aiohappyeyeballs 2.6.1
aiohttp 3.13.3
aiosignal 1.4.0
annotated-doc 0.0.4
annotated-types 0.7.0
anthropic 0.78.0
anyio 4.12.1
apache-tvm-ffi 0.1.8.post2
astor 0.8.1
attrs 25.4.0
awscrt 0.31.1
blake3 1.0.8
cachetools 7.0.0
caio 0.9.25
cbor2 5.8.0
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
click 8.3.1
cloudpickle 3.1.2
compressed-tensors 0.13.0
cryptography 46.0.4
cuda-bindings 13.1.1
cuda-pathfinder 1.3.3
cuda-python 13.1.1
cufile-python 0.2.0
cupy-cuda12x 13.6.0
depyf 0.20.0
dill 0.4.1
diskcache 5.6.3
distro 1.9.0
dnspython 2.8.0
docstring-parser 0.17.0
einops 0.8.2
email-validator 2.3.0
fastapi 0.128.2
fastapi-cli 0.0.20
fastapi-cloud-cli 0.11.0
fastar 0.8.0
fastrlock 0.8.3
filelock 3.20.3
flashinfer-python 0.6.1
frozenlist 1.8.0
fsspec 2026.2.0
gguf 0.17.1
grpcio 1.76.0
grpcio-reflection 1.76.0
h11 0.16.0
hf-xet 1.2.0
httpcore 1.0.9
httptools 0.7.1
httpx 0.28.1
httpx-sse 0.4.3
huggingface-hub 0.36.1
idna 3.11
ijson 3.4.0.post0
interegular 0.3.3
jinja2 3.1.6
jiter 0.13.0
jmespath 1.1.0
jsonschema 4.26.0
jsonschema-specifications 2025.9.1
lark 1.2.2
llguidance 1.3.0
llvmlite 0.44.0
lm-format-enforcer 0.11.3
lmcache 0.3.14.dev15 /home/tensormesh/p2p_bug_test/LMCache
loguru 0.7.3
markdown-it-py 4.0.0
markupsafe 3.0.3
mcp 1.26.0
mdurl 0.1.2
mistral-common 1.9.0
model-hosting-container-standards 0.1.13
mpmath 1.3.0
msgpack 1.1.2
msgspec 0.20.0
multidict 6.7.1
networkx 3.6.1
ninja 1.13.0
nixl 0.9.0
nixl-cu12 0.9.0
numba 0.61.2
numpy 2.2.6
nvidia-cublas-cu12 12.8.4.1
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12 9.10.2.21
nvidia-cudnn-frontend 1.18.0
nvidia-cufft-cu12 11.3.3.83
nvidia-cufile-cu12 1.13.1.3
nvidia-curand-cu12 10.3.9.90
nvidia-cusolver-cu12 11.7.3.90
nvidia-cusparse-cu12 12.5.8.93
nvidia-cusparselt-cu12 0.7.1
nvidia-cutlass-dsl 4.3.5
nvidia-ml-py 13.590.48
nvidia-nccl-cu12 2.27.5
nvidia-nvjitlink-cu12 12.8.93
nvidia-nvshmem-cu12 3.3.20
nvidia-nvtx-cu12 12.8.90
nvtx 0.2.14
openai 2.17.0
openai-harmony 0.0.8
opencv-python-headless 4.13.0.92
outlines-core 0.2.11
packaging 26.0
partial-json-parser 0.2.1.1.post7
pillow 12.1.0
prometheus-client 0.24.1
prometheus-fastapi-instrumentator 7.1.0
propcache 0.4.1
protobuf 6.33.5
psutil 7.2.2
py-cpuinfo 9.0.0
pybase64 1.4.3
pycountry 24.6.1
pycparser 3.0
pydantic 2.12.5
pydantic-core 2.41.5
pydantic-extra-types 2.11.0
pydantic-settings 2.12.0
pygments 2.19.2
pyjwt 2.11.0
python-dotenv 1.2.1
python-json-logger 4.0.0
python-multipart 0.0.22
pyyaml 6.0.3
pyzmq 27.1.0
ray 2.53.0
redis 7.1.0
referencing 0.37.0
regex 2026.1.15
requests 2.32.5
rich 14.3.2
rich-toolkit 0.18.1
rignore 0.7.6
rpds-py 0.30.0
safetensors 0.7.0
sentencepiece 0.2.1
sentry-sdk 2.52.0
setproctitle 1.3.7
setuptools 80.10.2
setuptools-scm 9.2.2
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 3.2.0
starlette 0.50.0
supervisor 4.3.0
sympy 1.14.0
tabulate 0.9.0
tiktoken 0.12.0
tokenizers 0.22.2
torch 2.9.1
torchaudio 2.9.1
torchvision 0.24.1
tqdm 4.67.3
transformers 4.57.6
triton 3.5.1
typer 0.21.1
typing-extensions 4.15.0
typing-inspection 0.4.2
urllib3 2.6.3
uvicorn 0.40.0
uvloop 0.22.1
vllm 0.15.1
watchfiles 1.1.1
websockets 16.0
xgrammar 0.1.29
yarl 1.22.0

@feixiangpeng

@JianDan0212 following up on this

@JianDan0212
Contributor Author

@JianDan0212 following up on this
I confirm that I have set PYTHONHASHSEED=0 in every terminal. Since we develop on an internal network that does not allow any outgoing messages, I cannot share my environment information. However, I have encountered this issue many times, and I believe the most reliable approach is to handle None separately.

@JianDan0212
Contributor Author

@kobe0938 The Comprehensive Tests failed due to a timeout in the local_disk.yaml benchmark (expected 0.09s, actual 0.12s). It looks like a flaky CI issue rather than a code error. Since I don't have the permission to trigger a rebuild on Buildkite, could you please help re-run the CI? Thank you!

@JianDan0212
Contributor Author

@JianDan0212 following up on this

#2511 (comment)
The user here has also encountered the same issue. However, based on his tests, can we infer that this is caused by a Python version compatibility problem? Does upgrading to Python 3.12 resolve this issue entirely?
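A quick way to probe the Python-version angle (assumption: on CPython before 3.12, hash(None) is derived from the object's address, while from 3.12 on it is a fixed constant) is to compare hash(None) across fresh interpreter processes:

```python
import os
import subprocess
import sys

# Run hash(None) in several fresh processes, with PYTHONHASHSEED pinned
# the same way the repro instructions recommend.
env = {**os.environ, "PYTHONHASHSEED": "0"}
values = {
    subprocess.run(
        [sys.executable, "-c", "print(hash(None))"],
        capture_output=True, text=True, env=env,
    ).stdout.strip()
    for _ in range(3)
}
# A single distinct value means hash(None) is process-stable on this
# interpreter; multiple distinct values reproduce the reported
# cross-process inconsistency.
print(sorted(values))
```

Note that PYTHONHASHSEED only stabilizes str/bytes hashing; it does not affect an address-derived hash(None), which is why normalizing None away (as this PR does) fixes the problem on all versions.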

@kobe0938 kobe0938 merged commit 2849282 into LMCache:dev Mar 4, 2026
24 checks passed
hlin99 pushed a commit to hlin99/LMCache that referenced this pull request Mar 4, 2026
mauryaavinash95 pushed a commit to mauryaavinash95/LMCache that referenced this pull request Mar 7, 2026
shaoxiawjc pushed a commit to shaoxiawjc/LMCache that referenced this pull request Mar 11, 2026
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026

Labels

full Run comprehensive tests on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LMCache version 0.3.6 is unable to achieve cross-instance KVCache sharing.

4 participants