
[MP] Add a new argument to specify whether retain_in_l1#2813

Merged
chunxiaozheng merged 2 commits into LMCache:dev from maobaolong:specifyPrefetchToHotCache
Apr 12, 2026

Conversation

Collaborator

@maobaolong maobaolong commented Mar 18, 2026

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Adds a new prefetch policy that changes L1 lifetime semantics for prefetched data and can increase memory usage if enabled. The default behavior remains temporary prefetch, so impact is limited to users selecting the new policy.

Overview
Adds an L2 prefetch retention mechanism by extending PrefetchPolicy with select_l1_retentions() and wiring it into PrefetchController so reserve_write() can mark prefetched entries as temporary vs permanent.

Introduces a new retain prefetch policy (RetainPrefetchPolicy) that keeps all prefetched keys permanently in L1, updates MP docs for the new --l2-prefetch-policy retain option, and adds unit tests covering both default (temporary) and retain behaviors.
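Based on the names given above (`PrefetchPolicy`, `select_l1_retentions()`, `RetainPrefetchPolicy`), the policy extension might look roughly like the following sketch. The `ObjectKey` type and the exact signatures are simplified assumptions for illustration, not the actual LMCache source:

```python
# Sketch of the prefetch-policy extension described in this PR.
# Class and method names come from the PR description; ObjectKey and
# the exact signatures are assumptions, not the real LMCache code.
from typing import NewType

ObjectKey = NewType("ObjectKey", str)  # placeholder for the real key type


class PrefetchPolicy:
    def select_l1_retentions(self, keys: list[ObjectKey]) -> list[bool]:
        """Default policy: every prefetched entry is temporary and is
        deleted from L1 after the prefetched object is consumed."""
        return [False] * len(keys)


class RetainPrefetchPolicy(PrefetchPolicy):
    def select_l1_retentions(self, keys: list[ObjectKey]) -> list[bool]:
        """Retain policy: keep every prefetched key permanently in L1,
        leaving eviction to the EvictionController."""
        return [True] * len(keys)
```

Per the docs change, the retain behavior would be selected with the new `--l2-prefetch-policy retain` option, while the default policy preserves today's temporary-prefetch semantics.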

Reviewed by Cursor Bugbot for commit 529032c.


@maobaolong maobaolong requested a review from ApostaC March 18, 2026 09:17
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances LMCache's prefetching capabilities by introducing a new configuration option that dictates whether prefetched L2 data should persist in the L1 cache after its initial consumption. This allows for more flexible cache management, enabling users to optimize for either temporary data handling or a more persistent "hot cache" mode where prefetched items are managed by the EvictionController.

Highlights

  • New Configuration Argument: Introduced a new command-line argument, --prefetch-retain-l1, to control the retention behavior of prefetched L2 data in the L1 cache.
  • L1 Cache Retention Policy: Modified the prefetching mechanism to allow prefetched data to either be temporary (default behavior, deleted after consumption) or retained in L1, where it is managed by the EvictionController for hot cache scenarios.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new configuration option, prefetch_retain_l1, which allows users to control whether prefetched L2 data is retained in the L1 cache after consumption. By default, this option is False, meaning prefetched data is temporary and deleted from L1. When set to True, the data persists in L1 and is managed by the EvictionController, effectively enabling a 'hot cache' mode for prefetched items. The change includes updating the StorageManagerConfig, adding a command-line argument for this setting, modifying the PrefetchController to utilize this flag when reserving L1 cache space, and documenting the new option.

@maobaolong maobaolong added the full Run comprehensive tests on this PR label Mar 18, 2026
Contributor

@ApostaC ApostaC left a comment


Can we do it as part of the prefetch policy?
We can introduce a new interface in the prefetch policy, for example:

def select_l1_retentions(
    self,
    keys: list[ObjectKey],
) -> list[bool]:
    """
    Determine which keys need to be retained/deleted after the prefetched objects are consumed.
    Args: ......
    Returns: ...... (the number of bools needs to be the same as the number of input keys)
    """

Then, we can call it just before the l1_mgr.reserve_write call in the prefetch controller.
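As a rough sketch of that wiring: `PrefetchController` and `reserve_write` are names used in this thread, but the `reserve_write` signature, the `FakeL1Manager`, and the stand-in policy below are invented for illustration only:

```python
# Illustrative-only wiring of select_l1_retentions into a prefetch
# controller. The reserve_write signature and the helper classes are
# assumptions, not the real LMCache API.
class FakeL1Manager:
    def __init__(self):
        self.reservations = []

    def reserve_write(self, key, permanent):
        # Record whether each prefetched entry was reserved as permanent.
        self.reservations.append((key, permanent))


class RetainAllPolicy:
    # Stand-in for the PR's RetainPrefetchPolicy: retain every key.
    def select_l1_retentions(self, keys):
        return [True] * len(keys)


class PrefetchController:
    def __init__(self, policy, l1_mgr):
        self.policy = policy
        self.l1_mgr = l1_mgr

    def prefetch(self, keys):
        # Ask the policy which keys to retain, just before reserving
        # L1 space for the prefetched data.
        retentions = self.policy.select_l1_retentions(keys)
        for key, retain in zip(keys, retentions):
            self.l1_mgr.reserve_write(key, permanent=retain)
```

The design keeps the retention decision inside the policy object, so the controller stays policy-agnostic and new retention strategies need no controller changes.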

@maobaolong
Collaborator Author

@ApostaC Thanks for the previous review, addressed the comment, PTAL.

Contributor

@ApostaC ApostaC left a comment


LGTM!

Quoted snippet from the default implementation under review:

    A list of bools with the same length as *keys*.
    ``True`` = retain (permanent), ``False`` = temporary.
    """
    return [False] * len(keys)
Collaborator


@maobaolong Good catch, but I have a small question: all the implementations directly return False, so when will it return True?

Collaborator Author


The RetainPrefetchPolicy implementation was missing; it has now been added.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
@maobaolong maobaolong force-pushed the specifyPrefetchToHotCache branch from 53010fb to 8d065cb on April 12, 2026 03:16

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 8d065cb.

Comment thread on lmcache/v1/distributed/storage_controllers/prefetch_policy.py (outdated)
@maobaolong
Collaborator Author

@chunxiaozheng Thanks for the reminder, PTAL.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
Collaborator

@chunxiaozheng chunxiaozheng left a comment


lgtm!

@chunxiaozheng chunxiaozheng enabled auto-merge (squash) April 12, 2026 04:08
@chunxiaozheng chunxiaozheng merged commit 755362a into LMCache:dev Apr 12, 2026
39 checks passed
Oasis-Git pushed a commit to Oasis-Git/LMCache that referenced this pull request Apr 13, 2026
* [MP] Add a new argument to specify whether retain_in_l1

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Remove redundant override method.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

---------

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
maobaolong added a commit to maobaolong/LMCache that referenced this pull request Apr 14, 2026
ftian1 pushed a commit to ftian1/LMCache that referenced this pull request Apr 20, 2026

Labels

full Run comprehensive tests on this PR


3 participants