
[Misc] Remove Hash from IPCCacheEngineKey#2700

Merged
ApostaC merged 9 commits into LMCache:dev from Oasis-Git:misc on Mar 12, 2026

Conversation

@Oasis-Git
Member

What this PR does / why we need it:

Remove the hash from IPCCacheEngineKey. The chunk-hash list is now generated directly from the IPCCacheEngineKey and set on the ObjectKey.

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
@Oasis-Git Oasis-Git added the full Run comprehensive tests on this PR label Mar 5, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors how IPCCacheEngineKey instances are handled by removing the chunk_hash field from the key itself. This change clarifies IPCCacheEngineKey's role as a purely token-based identifier, separating the concerns of token identification from chunk hashing. Hashing is now performed externally using TokenHasher, and the resulting chunk hashes are passed explicitly when converting to ObjectKey for storage operations. This improves modularity and maintains a cleaner data model within the cache engine.

Highlights

  • IPCCacheEngineKey Simplification: The IPCCacheEngineKey dataclass no longer includes a chunk_hash field, simplifying its definition to represent only token-based keys without embedded hash information.
  • Externalized Hash Computation: The to_hash_keys method has been removed from IPCCacheEngineKey, centralizing hash computation logic to be performed externally using TokenHasher.
  • Refactored ObjectKey Conversion: The ipc_keys_to_object_keys function was refactored and renamed to ipc_key_to_object_keys. It now accepts a single IPCCacheEngineKey and a list of pre-computed chunk_hashes, promoting a clearer separation of concerns.
  • Updated Call Sites: All relevant call sites in blend_server.py and server.py were updated to explicitly compute chunk hashes using TokenHasher before passing them to the new ipc_key_to_object_keys function for ObjectKey conversion.
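The refactor described in the highlights can be sketched as follows. This is a minimal stand-in inferred from this PR page, not the actual LMCache code: the dataclass fields and the body of ipc_key_to_object_keys are simplified, and ObjectKey here carries only the two fields this sketch needs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class IPCCacheEngineKey:
    # Purely token-based identity: note there is no chunk_hash field anymore.
    request_id: str
    world_size: int
    worker_id: Optional[int]
    token_ids: Tuple[int, ...]

@dataclass(frozen=True)
class ObjectKey:
    chunk_hash: bytes
    worker_id: int

def ipc_key_to_object_keys(key: IPCCacheEngineKey,
                           chunk_hashes: List[bytes]) -> List[ObjectKey]:
    """Expand one IPC key plus pre-computed chunk hashes into ObjectKeys."""
    keys = []
    for chunk_hash in chunk_hashes:
        if key.worker_id is None:
            # Lookup request: expand to every worker in the world.
            for worker_id in range(key.world_size):
                keys.append(ObjectKey(chunk_hash, worker_id))
        else:
            keys.append(ObjectKey(chunk_hash, key.worker_id))
    return keys

key = IPCCacheEngineKey("req-1", world_size=2, worker_id=None, token_ids=(1, 2, 3))
obj_keys = ipc_key_to_object_keys(key, [b"h0", b"h1"])
# 2 hashes x 2 workers -> 4 object keys
```

The key point of the design is visible in the signature: the caller computes chunk_hashes externally (via TokenHasher) and the conversion function no longer reaches into the key for hash state.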


Changelog
  • lmcache/v1/distributed/api.py
    • Renamed ipc_keys_to_object_keys to ipc_key_to_object_keys and updated its signature to accept a single IPCCacheEngineKey and a list of chunk_hashes.
    • Removed logic for handling multiple IPCCacheEngineKey instances and world size consistency checks.
  • lmcache/v1/multiprocess/blend_server.py
    • Removed the create_ipc_key_with_hashes helper function.
    • Updated create_temp_ipc_key_by_range to no longer set chunk_hash.
    • Modified cb_lookup_pre_computed, cb_store_pre_computed, cb_retrieve_pre_computed, and cb_store_final to compute chunk_hashes directly using TokenHasher and pass them to ipc_key_to_object_keys.
    • Imported TokenHasher.
  • lmcache/v1/multiprocess/custom_types.py
    • Removed the chunk_hash field from the IPCCacheEngineKey dataclass.
    • Removed the to_hash_keys method from IPCCacheEngineKey.
    • Updated docstrings to reflect these changes and the new ipc_key_to_object_keys function.
    • Removed TYPE_CHECKING import for TokenHasher as it's no longer directly used in IPCCacheEngineKey.
  • lmcache/v1/multiprocess/server.py
    • Removed update_session_for_key and resolve_key helper functions.
    • Updated store, retrieve, lookup, and free_lookup_locks methods to directly compute chunk_hashes from the session or TokenHasher and then use ipc_key_to_object_keys.
    • Updated import for ipc_keys_to_object_keys to ipc_key_to_object_keys.
  • lmcache/v1/multiprocess/token_hasher.py
    • Updated the docstring for hash_to_bytes to clarify its use for ObjectKey.chunk_hash.
  • tests/v1/multiprocess/test_free_locks.py
    • Updated test cases to mock TokenHasher.compute_chunk_hashes and ipc_key_to_object_keys directly, removing reliance on IPCCacheEngineKey.to_hash_keys.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the code to remove the chunk_hash from IPCCacheEngineKey, generating chunk hashes on-the-fly and passing them to a new ipc_key_to_object_keys function, which is a good design improvement. However, the changes introduce multiple Denial of Service (DoS) vulnerabilities due to a lack of validation for client-provided input in the IPCCacheEngineKey. Specifically, unvalidated world_size, request_id, start, end, and token_ids can be exploited to crash worker threads or exhaust server resources. Additionally, my review includes suggestions to improve maintainability by reducing code duplication that has emerged from these changes.

for chunk_hash in chunk_hashes:
    if ipc_key.worker_id is None:
        # For look up request, we want to expand to all workers
        for worker_id in range(ipc_key.world_size):

security-medium

The ipc_key_to_object_keys function uses key.world_size directly in a range() call within a loop. Since key is provided by the client and its fields are not validated, an attacker can provide a very large world_size to cause excessive CPU and memory consumption on the server, leading to a Denial of Service (DoS). It is recommended to validate that world_size is within a reasonable range (e.g., 1 to 1024) before processing. Additionally, the if/else block contains duplicated code for creating ObjectKey instances; consider refactoring for conciseness.
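A bounds check of the kind the reviewer suggests could look like this. MAX_WORLD_SIZE and validate_world_size are hypothetical names, and the limit itself is deployment-specific; this is a sketch, not the fix the PR adopts.

```python
MAX_WORLD_SIZE = 1024  # hypothetical cap; tune for your deployment

def validate_world_size(world_size: int) -> None:
    """Reject client-provided world sizes before they reach range()."""
    if not isinstance(world_size, int) or not (1 <= world_size <= MAX_WORLD_SIZE):
        raise ValueError(
            f"world_size must be an int in [1, {MAX_WORLD_SIZE}], got {world_size!r}"
        )
```

Called at the top of ipc_key_to_object_keys, this turns an attacker-controlled allocation into a cheap, explicit error.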

session = self.session_manager.get_or_create(key.request_id)
session.set_tokens(list(key.token_ids))
chunk_hashes = [
    TokenHasher.hash_to_bytes(h) for h in session.get_hashes(key.start, key.end)

security-medium

The store and retrieve methods call session.get_hashes(key.start, key.end), which contains assertions that start and end must be multiples of chunk_size. Since these client-provided values are not validated, an attacker can provide unaligned values to trigger an assertion failure and crash the worker thread, leading to a Denial of Service. It is recommended to validate the alignment of start and end and handle errors gracefully. Additionally, the logic for initializing a session and retrieving chunk hashes is duplicated; consider extracting it into a private helper method.
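The alignment validation the reviewer asks for could be sketched as below; validate_chunk_range is a hypothetical helper name, and the exact range semantics (half-open [start, end), chunk-aligned) are an assumption based on the comment above.

```python
def validate_chunk_range(start: int, end: int, chunk_size: int) -> None:
    """Reject unaligned [start, end) ranges instead of tripping an assert."""
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be positive, got {chunk_size}")
    if start % chunk_size != 0 or end % chunk_size != 0:
        raise ValueError(
            f"start={start} and end={end} must be multiples of chunk_size={chunk_size}"
        )
    if not (0 <= start < end):
        raise ValueError(f"invalid range [{start}, {end})")
```

Raising ValueError (rather than relying on assert, which disappears under python -O) lets the request handler return an error response instead of crashing the worker thread.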

Comment thread lmcache/v1/multiprocess/blend_server.py Outdated
chunk_hashes = [
    TokenHasher.hash_to_bytes(h)
    for h in self.token_hasher.compute_chunk_hashes(
        list(temp_ipc_key.token_ids), True, self.BLEND_HASH_PREFIX

security-medium

In cb_lookup_pre_computed, list(temp_ipc_key.token_ids) creates a copy of client-provided token IDs. If an attacker sends a request with an extremely large number of token IDs, this can lead to excessive memory consumption and a Denial of Service. It is recommended to validate the length of token_ids before processing. Furthermore, this block for computing chunk hashes is repeated multiple times; consider extracting it into a private helper method within the BlendEngine class to reduce duplication.
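A length cap of the sort suggested here could be sketched as follows; MAX_TOKEN_IDS and checked_token_ids are hypothetical names, and the limit is a placeholder to be sized against real request distributions.

```python
MAX_TOKEN_IDS = 1_000_000  # hypothetical cap on tokens per request

def checked_token_ids(token_ids) -> list:
    """Copy client-provided token ids only after bounding their length."""
    n = len(token_ids)
    if n > MAX_TOKEN_IDS:
        raise ValueError(f"token_ids length {n} exceeds limit {MAX_TOKEN_IDS}")
    return list(token_ids)
```

The check runs before list() materializes a copy, so an oversized request is rejected without first paying the memory cost.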

Comment thread lmcache/v1/multiprocess/server.py Outdated
# 1. the world size is the same between keys
# 2. the lookup sort the keys in prefix order and breaks at the first failure
found_count = found_count // ipc_keys[0].world_size
found_count = found_count // key.world_size

security-medium

The lookup method uses key.world_size as a divisor without checking if it is zero. A malicious client can provide a world_size of 0 in the IPCCacheEngineKey to trigger a ZeroDivisionError, which will crash the worker thread handling the request. This can be exploited to cause a Denial of Service by repeatedly crashing all available worker threads.
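Guarding the division could be as simple as the sketch below; normalized_found_count is a hypothetical helper, not a function in the PR.

```python
def normalized_found_count(found_count: int, world_size: int) -> int:
    """Divide by world_size only after rejecting non-positive values."""
    if world_size <= 0:
        raise ValueError(f"world_size must be positive, got {world_size}")
    return found_count // world_size
```

If a validate_world_size-style check already runs when the key is deserialized, this guard becomes redundant; keeping it anyway is cheap defense in depth against a crash from a single malformed key.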

"""
update_session_for_key(key, self.session_manager)
ipc_keys = resolve_key(key, self.session_manager)
session = self.session_manager.get_or_create(key.request_id)

security-medium

The store and retrieve methods call self.session_manager.get_or_create(key.request_id), which creates and stores a new Session object for every unique request_id provided by the client. There is no limit on the number of active sessions and no automatic cleanup of expired sessions in the request handling flow. An attacker can send a large number of requests with unique request_ids to exhaust the server's memory. It is recommended to implement a session limit and periodic cleanup of expired sessions.
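One way to bound session growth, as the reviewer suggests, is a capped store with LRU eviction. BoundedSessionManager below is a hypothetical stand-in for the server's session manager (the session payload here is a plain dict, not the real Session object):

```python
from collections import OrderedDict

class BoundedSessionManager:
    """Session store with a hard cap and eviction of the least recently used entry."""

    def __init__(self, max_sessions: int = 10_000):
        self._sessions: "OrderedDict[str, dict]" = OrderedDict()
        self._max_sessions = max_sessions

    def get_or_create(self, request_id: str) -> dict:
        if request_id in self._sessions:
            self._sessions.move_to_end(request_id)  # mark as recently used
            return self._sessions[request_id]
        if len(self._sessions) >= self._max_sessions:
            self._sessions.popitem(last=False)  # evict least recently used
        session = {"tokens": None}  # stand-in for the real Session object
        self._sessions[request_id] = session
        return session
```

A real implementation would also want time-based expiry and thread safety, but even the cap alone converts unbounded memory growth into bounded cache churn.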

Comment thread lmcache/v1/multiprocess/server.py Outdated
Comment on lines +383 to +386
chunk_hashes = [
    TokenHasher.hash_to_bytes(h)
    for h in self.token_hasher.compute_chunk_hashes(list(key.token_ids), True)
]

medium

The logic to compute chunk hashes is duplicated in the free_lookup_locks method (lines 420-423). To improve code reuse, consider extracting this into a private helper method.
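The extraction the reviewer proposes could look like the sketch below. Both classes are hypothetical: _FakeTokenHasher is a toy stand-in for LMCache's TokenHasher (its rolling hash is illustrative only), and _chunk_hashes_for is an invented helper name.

```python
class _FakeTokenHasher:
    """Toy stand-in for lmcache's TokenHasher, just for this sketch."""

    @staticmethod
    def hash_to_bytes(h: int) -> bytes:
        return h.to_bytes(8, "big")

    def compute_chunk_hashes(self, token_ids, final, prefix=None):
        # Toy rolling hash; the real class hashes fixed-size token chunks.
        acc, out = 0, []
        for t in token_ids:
            acc = (acc * 31 + t) % (1 << 32)
            out.append(acc)
        return out

class HashingServer:
    def __init__(self):
        self.token_hasher = _FakeTokenHasher()

    def _chunk_hashes_for(self, token_ids, prefix=None) -> list:
        """Single home for the repeated 'compute hashes, convert to bytes' step."""
        raw = self.token_hasher.compute_chunk_hashes(list(token_ids), True, prefix)
        return [_FakeTokenHasher.hash_to_bytes(h) for h in raw]
```

With a helper like this, store, free_lookup_locks, and the blend-server callbacks would each shrink to a one-line call, and any future change to the hashing pipeline lands in one place.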


@ApostaC ApostaC left a comment


Otherwise LGTM!

Comment thread lmcache/v1/multiprocess/server.py Outdated
Comment on lines +426 to +427
TokenHasher.hash_to_bytes(h)
for h in self.token_hasher.compute_chunk_hashes(list(key.token_ids), True)

Can we put hash_to_bytes into compute_chunk_hashes?
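Folding hash_to_bytes into compute_chunk_hashes, as asked above, might look like this sketch. The class is a simplified stand-in: the hashing body is a toy placeholder, and the to_bytes flag is an invented parameter illustrating one way to keep the raw-integer path available.

```python
class TokenHasher:
    """Sketch: compute_chunk_hashes returns bytes directly, absorbing hash_to_bytes."""

    @staticmethod
    def hash_to_bytes(h: int) -> bytes:
        return h.to_bytes(8, "big")

    def compute_chunk_hashes(self, token_ids, to_bytes: bool = True):
        # Toy hashing; the real implementation hashes fixed-size token chunks.
        acc, raw = 0, []
        for t in token_ids:
            acc = (acc * 131 + t) % (1 << 32)
            raw.append(acc)
        # Callers no longer need a [hash_to_bytes(h) for h in ...] wrapper.
        return [self.hash_to_bytes(h) for h in raw] if to_bytes else raw
```

This removes the repeated conversion comprehension at every call site, at the cost of the hasher deciding the wire representation.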

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>

@royyhuang royyhuang left a comment


LGTM!


@ApostaC ApostaC left a comment


Otherwise LGTM!

Comment thread lmcache/v1/multiprocess/token_hasher.py Outdated
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>

@ApostaC ApostaC left a comment


LGTM!

@ApostaC ApostaC enabled auto-merge (squash) March 12, 2026 00:54
@ApostaC ApostaC merged commit dd8e6f2 into LMCache:dev Mar 12, 2026
27 of 29 checks passed
@Oasis-Git Oasis-Git deleted the misc branch March 12, 2026 21:45
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>

Labels

full Run comprehensive tests on this PR

3 participants