[MP][optimize] unified touch all keys in end session request #3020
[MP][optimize] unified touch all keys in end session request #3020chunxiaozheng merged 9 commits intoLMCache:devfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements a unified touch mechanism for L1 cache keys, deferring key access updates until the end of a session to improve efficiency. Key changes include updating the session management to track lookup keys, enhancing get_hashes to support negative indices with automatic chunk alignment, and adding a new on_l1_keys_accessed event across the distributed storage layer. Feedback focuses on ensuring thread safety in L1Manager.touch_keys by adding the @l1_mgr_synchronized decorator and adhering to the style guide by providing explicit -> None return type hints for all new public methods.
88d7a75 to
29fe5f9
Compare
|
@ApostaC @maobaolong could you help take a look? |
| keys: The list of object keys to touch. | ||
| """ | ||
| for listener in self._registered_listeners: | ||
| listener.on_l1_keys_accessed(keys) |
There was a problem hiding this comment.
Missing return type hints on new public functions
Low Severity
touch_keys and touch_l1_keys are new public functions missing -> None return type annotations. The project's AGENTS.md states "All functions and methods must have type hints for their arguments and return values," and the review style guide classifies missing type hints on public functions as error-level. Other methods in the same classes (e.g., clear) include explicit -> None.
Additional Locations (1)
Triggered by project rule: LMCache Code Review Style Guide
Reviewed by Cursor Bugbot for commit f7b89b9. Configure here.
ApostaC
left a comment
There was a problem hiding this comment.
Thanks! Please see the comments.
| pass | ||
|
|
||
| def on_l1_keys_read_finished(self, keys: list[ObjectKey]): | ||
| self._policy.on_keys_touched(keys) |
There was a problem hiding this comment.
We probably want to keep this as well. In the P2P scenario, the keys might be read by a remote peer.
(A potential design concern here is whether the P2P peer will cause the same problem as vLLM APC).
There was a problem hiding this comment.
@ApostaC Thanks for your reminder, I have kept this place. (
The current implements will trigger touch both in retrieve and store, and then touch all keys at the end_session, this may have some redundancy, but it doesn't seem like a big problem.) WDYT.
There was a problem hiding this comment.
@ApostaC By the way, as show in the picture, when mla is enabled and tp > 1, the keys in all worker is same but will trigger multi finished_read, the keys will be touched multiple times.

Is it more reasonable to only touch keys that non-temp and non-read-locked, I also updated this in current PR.
for key in keys:
...
if not entry.read_lock.is_locked():
if entry.is_temporary:
# NOTE: temporary objects shouldn't have write-locks
need_to_free.append(entry.memory_obj)
need_to_free_keys.append(key)
del self._objects[key]
else:
# only the following two conditions are met,
# trigger `on_keys_touched` to update the LRU order.
# 1. non-temporary: temporary keys will be deleted;
# 2. non-read-locked: read-locked keys will not be evicted.
touch_keys.append(key)
...
for listener in self._registered_listeners:
listener.on_l1_keys_read_finished(touch_keys)
...
...
f7b89b9 to
9cfa32a
Compare
|
@ApostaC Thanks for your review, I have updated! PTAL. |
9cfa32a to
2d8a356
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 40d9d62. Configure here.
| Args: | ||
| keys (list[ObjectKey]): The keys that have been created | ||
| """ | ||
| if not keys: |
There was a problem hiding this comment.
This check can avoid hold lock and release
40d9d62 to
3863688
Compare
ApostaC
left a comment
There was a problem hiding this comment.
Oh I just realized I haven't submitted my review. My bad...
| else: | ||
| # only the following two conditions are met, | ||
| # trigger `on_keys_touched` to update the LRU order. | ||
| # 1. non-temporary: temporary keys will be deleted; | ||
| # 2. non-read-locked: read-locked keys will not be evicted. | ||
| touch_keys.append(key) |
There was a problem hiding this comment.
Quick question: can you explain why we don't touch the read-locked keys? The touch operation simply changes the "priority" of the keys in the eviction policy. I'm not sure why the key being locked or not impacts the touch operation.
There was a problem hiding this comment.
@ApostaC The touchis to update the order of lru queue, but only keys without locks will be evicted. Therefore, for read-locaked keys, just wait for the last touch.
Without this change, trigger touch in each finish_read, when mla is enabled and tp > 1, the same keys will be touched tp times.
In my tests, tp=2, and the log is as follows:

There was a problem hiding this comment.
@chunxiaozheng thanks for the explanation! I would suggest we keep the old code since it does not create foreseeable correctness issues, and the old code is quite easier to understand -- unless we see a huge performance penalty of touching the chunk multiple times. WDYT?
|
|
||
| for listener in self._registered_listeners: | ||
| listener.on_l1_keys_read_finished(successful_keys) | ||
| listener.on_l1_keys_read_finished(touch_keys) |
There was a problem hiding this comment.
A small nit here: naming the variable touch_key but actually calling on_l1_keys_read_finished is a bit weird.
(Still trying to understand why we need to introduce touch_keys here)
There was a problem hiding this comment.
@ApostaC thanks for your suggestion, how about naming read_finished_keys?
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
3863688 to
e37c426
Compare


Note
Medium Risk
Changes L1 eviction recency tracking by introducing an explicit
on_l1_keys_accessedevent and by touching keys only when read locks fully release, which can alter eviction order and cache hit rates across MP requests.Overview
Implements a unified “touch” path for L1 eviction recency: adds
L1ManagerListener.on_l1_keys_accessed, wires it throughL1Manager.touch_keys/StorageManager.touch_l1_keys, and bridges it inL1EvictionPolicytoEvictionPolicy.on_keys_touched.Updates MP request lifecycle so
lookuprecords the session’slookup_ipc_key, andend_sessioncomputes all chunkObjectKeys from the session and touches them in one call (covering both retrieved and stored chunks).Adjusts
L1Manager.finish_readto only report “touched” keys once the final read lock is released and the key is non-temporary, and hardensLRUEvictionPolicyagainst empty key lists. Adds comprehensive tests for the new touch semantics and session API changes (includingSession.get_hashes(start)andSessionManager.removereturning the removed session).Reviewed by Cursor Bugbot for commit 3863688. Bugbot is set up for automated code reviews on this repo. Configure here.