
[MP] Add a new argument to specify whether retain_in_l1#2813

Merged
chunxiaozheng merged 2 commits into LMCache:dev from maobaolong:specifyPrefetchToHotCache
Apr 12, 2026

Conversation

Collaborator

@maobaolong maobaolong commented Mar 18, 2026

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Adds a new prefetch policy that changes L1 lifetime semantics for prefetched data and can increase memory usage if enabled. The default behavior remains temporary prefetch, so impact is limited to users selecting the new policy.

Overview
Adds an L2 prefetch retention mechanism by extending PrefetchPolicy with select_l1_retentions() and wiring it into PrefetchController so reserve_write() can mark prefetched entries as temporary vs permanent.

Introduces a new retain prefetch policy (RetainPrefetchPolicy) that keeps all prefetched keys permanently in L1, updates MP docs for the new --l2-prefetch-policy retain option, and adds unit tests covering both default (temporary) and retain behaviors.
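Based on the names given above (`PrefetchPolicy`, `select_l1_retentions()`, `RetainPrefetchPolicy`), the policy extension might look roughly like the following sketch. The `ObjectKey` type and the exact signatures are simplified assumptions for illustration, not the actual LMCache source:

```python
# Sketch of the prefetch-policy extension described in this PR.
# Class and method names come from the PR description; ObjectKey and
# the exact signatures are assumptions, not the real LMCache code.
from typing import NewType

ObjectKey = NewType("ObjectKey", str)  # placeholder for the real key type


class PrefetchPolicy:
    def select_l1_retentions(self, keys: list[ObjectKey]) -> list[bool]:
        """Default policy: every prefetched entry is temporary and is
        deleted from L1 after the prefetched object is consumed."""
        return [False] * len(keys)


class RetainPrefetchPolicy(PrefetchPolicy):
    def select_l1_retentions(self, keys: list[ObjectKey]) -> list[bool]:
        """Retain policy: keep every prefetched key permanently in L1,
        leaving eviction to the EvictionController."""
        return [True] * len(keys)
```

Per the docs change, the retain behavior would be selected with the new `--l2-prefetch-policy retain` option, while the default policy preserves today's temporary-prefetch semantics.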

Reviewed by Cursor Bugbot for commit 529032c.


@maobaolong maobaolong requested a review from ApostaC March 18, 2026 09:17
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances LMCache's prefetching capabilities by introducing a new configuration option that dictates whether prefetched L2 data should persist in the L1 cache after its initial consumption. This allows for more flexible cache management, enabling users to optimize for either temporary data handling or a more persistent "hot cache" mode where prefetched items are managed by the EvictionController.

Highlights

  • New Configuration Argument: Introduced a new command-line argument, --prefetch-retain-l1, to control the retention behavior of prefetched L2 data in the L1 cache.
  • L1 Cache Retention Policy: Modified the prefetching mechanism to allow prefetched data to either be temporary (default behavior, deleted after consumption) or retained in L1, where it is managed by the EvictionController for hot cache scenarios.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new configuration option, prefetch_retain_l1, which allows users to control whether prefetched L2 data is retained in the L1 cache after consumption. By default, this option is False, meaning prefetched data is temporary and deleted from L1. When set to True, the data persists in L1 and is managed by the EvictionController, effectively enabling a 'hot cache' mode for prefetched items. The change includes updating the StorageManagerConfig, adding a command-line argument for this setting, modifying the PrefetchController to utilize this flag when reserving L1 cache space, and documenting the new option.

@maobaolong maobaolong added the full Run comprehensive tests on this PR label Mar 18, 2026
Contributor

@ApostaC ApostaC left a comment


Can we do it as part of the prefetch policy?
We can introduce a new interface in the prefetch policy, for example:

def select_l1_retentions(
    self,
    keys: list[ObjectKey],
) -> list[bool]:
    """
    Determine which keys need to be retained/deleted after the prefetched objects are consumed.
    Args: ......
    Returns: ...... (the number of bools needs to be the same as the number of input keys)
    """

Then, we can call it just before the l1_mgr.reserve_write call in the prefetch controller.
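As a rough sketch of that wiring: `PrefetchController` and `reserve_write` are names used in this thread, but the `reserve_write` signature, the `FakeL1Manager`, and the stand-in policy below are invented for illustration only:

```python
# Illustrative-only wiring of select_l1_retentions into a prefetch
# controller. The reserve_write signature and the helper classes are
# assumptions, not the real LMCache API.
class FakeL1Manager:
    def __init__(self):
        self.reservations = []

    def reserve_write(self, key, permanent):
        # Record whether each prefetched entry was reserved as permanent.
        self.reservations.append((key, permanent))


class RetainAllPolicy:
    # Stand-in for the PR's RetainPrefetchPolicy: retain every key.
    def select_l1_retentions(self, keys):
        return [True] * len(keys)


class PrefetchController:
    def __init__(self, policy, l1_mgr):
        self.policy = policy
        self.l1_mgr = l1_mgr

    def prefetch(self, keys):
        # Ask the policy which keys to retain, just before reserving
        # L1 space for the prefetched data.
        retentions = self.policy.select_l1_retentions(keys)
        for key, retain in zip(keys, retentions):
            self.l1_mgr.reserve_write(key, permanent=retain)
```

The design keeps the retention decision inside the policy object, so the controller stays policy-agnostic and new retention strategies need no controller changes.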

@maobaolong
Collaborator Author

@ApostaC Thanks for the previous review, addressed the comment, PTAL.

Contributor

@ApostaC ApostaC left a comment


LGTM!

Quoted snippet from the default implementation under review:

    A list of bools with the same length as *keys*.
    ``True`` = retain (permanent), ``False`` = temporary.
    """
    return [False] * len(keys)
Collaborator


@maobaolong Good catch, but I have a small question: all the implementations directly return False, so when will it return True?

Collaborator Author


The RetainPrefetchPolicy implementation was missing; it has now been added.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
@maobaolong maobaolong force-pushed the specifyPrefetchToHotCache branch from 53010fb to 8d065cb on April 12, 2026 03:16

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 8d065cb.

Comment thread on lmcache/v1/distributed/storage_controllers/prefetch_policy.py (outdated)
@maobaolong
Collaborator Author

@chunxiaozheng Thanks for the reminder, PTAL.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
Collaborator

@chunxiaozheng chunxiaozheng left a comment


lgtm!

@chunxiaozheng chunxiaozheng enabled auto-merge (squash) April 12, 2026 04:08
@chunxiaozheng chunxiaozheng merged commit 755362a into LMCache:dev Apr 12, 2026
39 checks passed
Oasis-Git pushed a commit to Oasis-Git/LMCache that referenced this pull request Apr 13, 2026
* [MP] Add a new argument to specify whether retain_in_l1

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Remove redundant override method.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

---------

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
maobaolong added a commit to maobaolong/LMCache that referenced this pull request Apr 14, 2026
ftian1 pushed a commit to ftian1/LMCache that referenced this pull request Apr 20, 2026

Labels

full Run comprehensive tests on this PR


3 participants