Skip to content

[MP][optimize] unified touch all keys in end session request #3020

Merged
chunxiaozheng merged 9 commits intoLMCache:devfrom
chunxiaozheng:mp/lru-fix
Apr 18, 2026
Merged

[MP][optimize] unified touch all keys in end session request #3020
chunxiaozheng merged 9 commits intoLMCache:devfrom
chunxiaozheng:mp/lru-fix

Conversation

@chunxiaozheng
Copy link
Copy Markdown
Collaborator

@chunxiaozheng chunxiaozheng commented Apr 13, 2026

Note

Medium Risk
Changes L1 eviction recency tracking by introducing an explicit on_l1_keys_accessed event and by touching keys only when read locks fully release, which can alter eviction order and cache hit rates across MP requests.

Overview
Implements a unified “touch” path for L1 eviction recency: adds L1ManagerListener.on_l1_keys_accessed, wires it through L1Manager.touch_keys/StorageManager.touch_l1_keys, and bridges it in L1EvictionPolicy to EvictionPolicy.on_keys_touched.

Updates MP request lifecycle so lookup records the session’s lookup_ipc_key, and end_session computes all chunk ObjectKeys from the session and touches them in one call (covering both retrieved and stored chunks).

Adjusts L1Manager.finish_read to only report “touched” keys once the final read lock is released and the key is non-temporary, and hardens LRUEvictionPolicy against empty key lists. Adds comprehensive tests for the new touch semantics and session API changes (including Session.get_hashes(start) and SessionManager.remove returning the removed session).

Reviewed by Cursor Bugbot for commit 3863688. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a unified touch mechanism for L1 cache keys, deferring key access updates until the end of a session to improve efficiency. Key changes include updating the session management to track lookup keys, enhancing get_hashes to support negative indices with automatic chunk alignment, and adding a new on_l1_keys_accessed event across the distributed storage layer. Feedback focuses on ensuring thread safety in L1Manager.touch_keys by adding the @l1_mgr_synchronized decorator and adhering to the style guide by providing explicit -> None return type hints for all new public methods.

Comment thread lmcache/v1/distributed/l1_manager.py
Comment thread lmcache/v1/distributed/eviction.py
Comment thread lmcache/v1/distributed/internal_api.py
Comment thread lmcache/v1/distributed/storage_controllers/store_controller.py Outdated
Comment thread lmcache/v1/distributed/storage_manager.py
Comment thread lmcache/v1/distributed/storage_controllers/store_controller.py
@chunxiaozheng
Copy link
Copy Markdown
Collaborator Author

@ApostaC @maobaolong could you help take a look?

keys: The list of object keys to touch.
"""
for listener in self._registered_listeners:
listener.on_l1_keys_accessed(keys)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type hints on new public functions

Low Severity

touch_keys and touch_l1_keys are new public functions missing -> None return type annotations. The project's AGENTS.md states "All functions and methods must have type hints for their arguments and return values," and the review style guide classifies missing type hints on public functions as error-level. Other methods in the same classes (e.g., clear) include explicit -> None.

Additional Locations (1)
Fix in Cursor Fix in Web

Triggered by project rule: LMCache Code Review Style Guide

Reviewed by Cursor Bugbot for commit f7b89b9. Configure here.

Copy link
Copy Markdown
Contributor

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Please see the comments.

Comment thread lmcache/v1/distributed/l1_manager.py Outdated
pass

def on_l1_keys_read_finished(self, keys: list[ObjectKey]):
self._policy.on_keys_touched(keys)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to keep this as well. In the P2P scenario, the keys might be read by a remote peer.
(A potential design concern here is whether the P2P peer will cause the same problem as vLLM APC).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ApostaC Thanks for your reminder, I have kept this place. (
The current implements will trigger touch both in retrieve and store, and then touch all keys at the end_session, this may have some redundancy, but it doesn't seem like a big problem.) WDYT.

Copy link
Copy Markdown
Collaborator Author

@chunxiaozheng chunxiaozheng Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ApostaC By the way, as show in the picture, when mla is enabled and tp > 1, the keys in all worker is same but will trigger multi finished_read, the keys will be touched multiple times.
Clipboard_Screenshot_1776151455

Is it more reasonable to only touch keys that non-temp and non-read-locked, I also updated this in current PR.

    for key in keys:
            ...
            if not entry.read_lock.is_locked():
                if entry.is_temporary:
                    # NOTE: temporary objects shouldn't have write-locks
                    need_to_free.append(entry.memory_obj)
                    need_to_free_keys.append(key)
                    del self._objects[key]
                else:
                    # only the following two conditions are met,
                    # trigger `on_keys_touched` to update the LRU order.
                    # 1. non-temporary: temporary keys will be deleted;
                    # 2. non-read-locked: read-locked keys will not be evicted.
                    touch_keys.append(key)

        ...
        for listener in self._registered_listeners:
            listener.on_l1_keys_read_finished(touch_keys)
            ...
        ...

Comment thread lmcache/v1/distributed/storage_manager.py Outdated
Comment thread lmcache/v1/multiprocess/session.py Outdated
Comment thread tests/v1/multiprocess/test_unified_touch.py Outdated
@chunxiaozheng
Copy link
Copy Markdown
Collaborator Author

@ApostaC Thanks for your review, I have updated! PTAL.

Comment thread lmcache/v1/distributed/l1_manager.py Outdated
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 40d9d62. Configure here.

Comment thread lmcache/v1/distributed/eviction.py
Args:
keys (list[ObjectKey]): The keys that have been created
"""
if not keys:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check can avoid hold lock and release

Copy link
Copy Markdown
Collaborator

@maobaolong maobaolong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chunxiaozheng chunxiaozheng added mp Buildkite trigger for multi-processing mode test full Run comprehensive tests on this PR labels Apr 15, 2026
Copy link
Copy Markdown
Contributor

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just realized I haven't submitted my review. My bad...

Comment thread lmcache/v1/distributed/l1_manager.py Outdated
Comment on lines +356 to +361
else:
# only the following two conditions are met,
# trigger `on_keys_touched` to update the LRU order.
# 1. non-temporary: temporary keys will be deleted;
# 2. non-read-locked: read-locked keys will not be evicted.
touch_keys.append(key)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick question: can you explain why we don't touch the read-locked keys? The touch operation simply changes the "priority" of the keys in the eviction policy. I'm not sure why the key being locked or not impacts the touch operation.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ApostaC The touchis to update the order of lru queue, but only keys without locks will be evicted. Therefore, for read-locaked keys, just wait for the last touch.

Without this change, trigger touch in each finish_read, when mla is enabled and tp > 1, the same keys will be touched tp times.
In my tests, tp=2, and the log is as follows:
image

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chunxiaozheng thanks for the explanation! I would suggest we keep the old code since it does not create foreseeable correctness issues, and the old code is quite easier to understand -- unless we see a huge performance penalty of touching the chunk multiple times. WDYT?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay

Comment thread lmcache/v1/distributed/l1_manager.py Outdated

for listener in self._registered_listeners:
listener.on_l1_keys_read_finished(successful_keys)
listener.on_l1_keys_read_finished(touch_keys)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small nit here: naming the variable touch_key but actually calling on_l1_keys_read_finished is a bit weird.
(Still trying to understand why we need to introduce touch_keys here)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ApostaC thanks for your suggestion, how about naming read_finished_keys?

Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
Copy link
Copy Markdown
Contributor

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@chunxiaozheng chunxiaozheng enabled auto-merge (squash) April 18, 2026 08:27
@chunxiaozheng chunxiaozheng merged commit aad1230 into LMCache:dev Apr 18, 2026
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

full Run comprehensive tests on this PR mp Buildkite trigger for multi-processing mode test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants