[Core] Add L2 eviction in mp mode#2824

Merged
YaoJiayi merged 18 commits into dev from localdev/l2-eviction
Mar 30, 2026

Collaborator

@YaoJiayi YaoJiayi commented Mar 19, 2026

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust and configurable L2 eviction system, enabling individual L2 adapters to manage their storage capacity effectively. It refactors the eviction policy framework to be more modular and extensible, clearly separating core eviction logic from the cache tier-specific event handling. The changes ensure that L2 adapters can actively participate in the cache's lifecycle by reporting usage and responding to eviction commands, thereby improving overall cache efficiency and resource management.

Highlights

  • L2 Eviction Mechanism Introduced: A new, per-adapter, and opt-in L2 eviction mechanism has been implemented, allowing each L2 adapter instance to independently declare an eviction policy via its JSON configuration.
  • Dedicated L2EvictionController: A dedicated L2EvictionController background thread is now created for each L2 adapter with an eviction configuration. This controller monitors storage utilization and evicts keys according to the configured policy (e.g., LRU).
  • Refactored Eviction Policy Architecture: The core EvictionPolicy has been refactored into a pure abstract base class, separating policy logic from tier-specific event handling. New L1EvictionPolicy and L2EvictionPolicy classes now bridge events from L1ManagerListener and L2AdapterListener respectively.
  • Enhanced L2AdapterInterface: The L2AdapterInterface now includes methods for delete(keys) and get_usage(), along with a listener registration mechanism (register_listener) and notification helpers (_notify_keys_stored, _notify_keys_accessed, _notify_keys_deleted) to support eviction.
  • Adapter Implementations Updated: MockL2Adapter and NixlStoreL2Adapter have been updated to implement the new delete and get_usage methods, and to fire appropriate listener events. FSL2Adapter and NativeConnectorL2Adapter provide no-op implementations for these methods.
  • StorageManager Integration: The StorageManager now creates and manages L2EvictionController instances for L2 adapters that have eviction configured, alongside the existing L1EvictionController.
  • New Design Document: A comprehensive design document, docs/design/l2_adapters/l2_eviction.md, has been added to detail the L2 adapter eviction mechanism, its architecture, configuration, and implementation guidelines.
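
To make the adapter-side contract in the highlights above more concrete, here is a minimal sketch of an adapter that exposes delete, get_usage, register_listener and the three notification helpers. It is not the code from this PR: InMemoryL2Adapter, RecordingListener, the store/load methods, the string keys, and the listener callback names (on_keys_stored and so on) are hypothetical, and the reading of get_usage() as (used_bytes, capacity_bytes) is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Optional


class L2AdapterListener(ABC):
    """Receives key-level events from an adapter (callback names are assumed)."""

    @abstractmethod
    def on_keys_stored(self, keys: list[str]) -> None: ...

    @abstractmethod
    def on_keys_accessed(self, keys: list[str]) -> None: ...

    @abstractmethod
    def on_keys_deleted(self, keys: list[str]) -> None: ...


class RecordingListener(L2AdapterListener):
    """Toy listener that just records every event it receives."""

    def __init__(self) -> None:
        self.events: list[tuple[str, list[str]]] = []

    def on_keys_stored(self, keys: list[str]) -> None:
        self.events.append(("stored", list(keys)))

    def on_keys_accessed(self, keys: list[str]) -> None:
        self.events.append(("accessed", list(keys)))

    def on_keys_deleted(self, keys: list[str]) -> None:
        self.events.append(("deleted", list(keys)))


class InMemoryL2Adapter:
    """Hypothetical adapter backed by a dict instead of real L2 storage."""

    def __init__(self, capacity_bytes: int) -> None:
        self._capacity = capacity_bytes
        self._data: dict[str, bytes] = {}
        self._listeners: list[L2AdapterListener] = []

    def register_listener(self, listener: L2AdapterListener) -> None:
        self._listeners.append(listener)

    def store(self, key: str, value: bytes) -> None:
        self._data[key] = value
        self._notify_keys_stored([key])

    def load(self, key: str) -> Optional[bytes]:
        value = self._data.get(key)
        if value is not None:
            self._notify_keys_accessed([key])
        return value

    def delete(self, keys: list[str]) -> None:
        # Only report keys that were actually present.
        removed = [k for k in keys if self._data.pop(k, None) is not None]
        if removed:
            self._notify_keys_deleted(removed)

    def get_usage(self) -> tuple[float, float]:
        # ASSUMPTION: the tuple means (used_bytes, capacity_bytes).
        used = float(sum(len(v) for v in self._data.values()))
        return (used, float(self._capacity))

    def _notify_keys_stored(self, keys: list[str]) -> None:
        for listener in self._listeners:
            listener.on_keys_stored(keys)

    def _notify_keys_accessed(self, keys: list[str]) -> None:
        for listener in self._listeners:
            listener.on_keys_accessed(keys)

    def _notify_keys_deleted(self, keys: list[str]) -> None:
        for listener in self._listeners:
            listener.on_keys_deleted(keys)
```

In this sketch the adapter never decides what to evict; it only reports events and usage, which leaves the choice of victims to whatever listener is registered.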

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new L2 eviction mechanism, mirroring the existing L1 eviction. Key changes include:

  • defining an L2AdapterListener interface for L2 adapter events;
  • refactoring EvictionPolicy into a pure abstract base class, with L1EvictionPolicy and L2EvictionPolicy subclasses to bridge events;
  • implementing delete() and get_usage() methods in the L2 adapters (MockL2Adapter, NixlStoreL2Adapter, FSL2Adapter, NativeConnectorL2Adapter).

A new L2EvictionController is added to manage L2 eviction based on adapter usage and configured policies, and the StorageManager is updated to instantiate these L2 controllers per adapter.

Review comments highlight several improvement opportunities:

  • ensuring the EvictionController base class calls super().__init__();
  • enhancing ValueError messages with the available options;
  • including exception information in log messages for better debugging;
  • delegating L1ManagerListener implementations to the eviction policy;
  • clarifying the naming of the L1 eviction controller;
  • adding a missing assertion in a test case to verify storage pool slot release.
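
The split between tier-agnostic policy logic and tier-specific event handling described in this review can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the method names touch, forget and select_victims, the LRUPolicy class, and the L2PolicyBridge class are all hypothetical, and the bridge uses composition for brevity even though the PR describes L1EvictionPolicy and L2EvictionPolicy as subclasses.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict
from itertools import islice


class EvictionPolicy(ABC):
    """Pure abstract policy: knows about keys, not about cache tiers."""

    @abstractmethod
    def touch(self, keys: list[str]) -> None: ...

    @abstractmethod
    def forget(self, keys: list[str]) -> None: ...

    @abstractmethod
    def select_victims(self, count: int) -> list[str]: ...


class LRUPolicy(EvictionPolicy):
    """Least-recently-used ordering kept in an OrderedDict."""

    def __init__(self) -> None:
        self._order = OrderedDict()

    def touch(self, keys: list[str]) -> None:
        for key in keys:
            self._order[key] = None
            self._order.move_to_end(key)  # most recently used goes last

    def forget(self, keys: list[str]) -> None:
        for key in keys:
            self._order.pop(key, None)

    def select_victims(self, count: int) -> list[str]:
        # The front of the OrderedDict holds the least recently used keys.
        return list(islice(self._order, count))


class L2PolicyBridge:
    """Hypothetical bridge: translates adapter events into policy calls."""

    def __init__(self, policy: EvictionPolicy) -> None:
        self._policy = policy

    def on_keys_stored(self, keys: list[str]) -> None:
        self._policy.touch(keys)

    def on_keys_accessed(self, keys: list[str]) -> None:
        self._policy.touch(keys)

    def on_keys_deleted(self, keys: list[str]) -> None:
        self._policy.forget(keys)

    def select_victims(self, count: int) -> list[str]:
        return self._policy.select_victims(count)
```

With this shape, the same LRUPolicy could in principle sit behind an L1 bridge or an L2 bridge; only the event names on the bridge change.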

Comment thread lmcache/v1/distributed/internal_api.py
Comment thread lmcache/v1/distributed/storage_controllers/eviction_controller.py Outdated
Comment thread lmcache/v1/distributed/config.py
Comment thread lmcache/v1/distributed/eviction.py
Comment thread lmcache/v1/distributed/l2_adapters/base.py Outdated
Comment thread lmcache/v1/distributed/l2_adapters/config.py
Comment thread lmcache/v1/distributed/l2_adapters/mock_l2_adapter.py
Comment thread lmcache/v1/distributed/storage_controllers/eviction_controller.py Outdated
Comment thread lmcache/v1/distributed/storage_manager.py
Comment thread tests/v1/distributed/test_nixl_store_l2_adapter.py
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Contributor

@ApostaC ApostaC left a comment

Main comment on the overall design: let's have a single, unified L2 eviction controller rather than one per L2 adapter.

Other things:

  • Let's avoid multiple inheritance.
  • The eviction controller doesn't need to extend the "Listener" interface, because it does not need to be aware of individual key accesses, insertions, or deletions, only of the memory usage information. Only the eviction policy needs to extend the listener interface, so that it can monitor key accesses.
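
One possible reading of this suggestion, sketched under assumptions: a single controller walks every registered adapter, looks only at its usage, and asks that adapter's policy for victims when usage crosses a threshold. UnifiedL2EvictionController, add_adapter, run_once, high_watermark, batch_size, FakeAdapter and OldestFirst are hypothetical names, and get_usage() is again assumed to return (used, capacity). A real controller would presumably call something like run_once from a background thread.

```python
from typing import Protocol


class UsageSource(Protocol):
    def get_usage(self) -> tuple[float, float]: ...
    def delete(self, keys: list[str]) -> None: ...


class VictimSelector(Protocol):
    def select_victims(self, count: int) -> list[str]: ...


class UnifiedL2EvictionController:
    """One controller for all adapters; it is not a listener itself."""

    def __init__(self, high_watermark: float = 0.8, batch_size: int = 2) -> None:
        self._high_watermark = high_watermark
        self._batch_size = batch_size
        self._targets: list[tuple[UsageSource, VictimSelector]] = []

    def add_adapter(self, adapter: UsageSource, policy: VictimSelector) -> None:
        self._targets.append((adapter, policy))

    def run_once(self) -> int:
        """Single eviction pass; returns how many keys were evicted."""
        evicted = 0
        for adapter, policy in self._targets:
            used, capacity = adapter.get_usage()
            # A non-positive capacity (e.g. a no-op adapter) disables eviction.
            if capacity <= 0 or used / capacity < self._high_watermark:
                continue
            victims = policy.select_victims(self._batch_size)
            if victims:
                adapter.delete(victims)
                evicted += len(victims)
        return evicted


class FakeAdapter:
    """Toy adapter where every key counts as one unit of usage."""

    def __init__(self, keys: list[str], capacity: int) -> None:
        self.keys = list(keys)
        self.capacity = capacity

    def get_usage(self) -> tuple[float, float]:
        return (float(len(self.keys)), float(self.capacity))

    def delete(self, keys: list[str]) -> None:
        self.keys = [k for k in self.keys if k not in keys]


class OldestFirst:
    """Toy selector: proposes the adapter's oldest keys first."""

    def __init__(self, adapter: FakeAdapter) -> None:
        self._adapter = adapter

    def select_victims(self, count: int) -> list[str]:
        return self._adapter.keys[:count]
```

The controller here depends only on get_usage() and delete(); key-level bookkeeping stays in the selector, which matches the separation asked for in the comment above.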

Comment thread lmcache/v1/distributed/storage_controllers/prefetch_controller.py
        policy: StorePolicy,
    ) -> None:
        super().__init__(l1_manager)
        self._l1_manager = l1_manager
Contributor

Same question, what's the issue with using super().__init__?

Collaborator Author

Because the L2 eviction controller does not need to hold a reference to l1_manager for now.
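
A rough illustration of the trade-off discussed in this thread, with all names hypothetical (StorageControllerBase, L1StyleController, L2StyleController and their fields are not taken from the PR): if the shared base class no longer accepts l1_manager, controllers that need it store it themselves, and controllers that do not need it never hold the reference.

```python
class StorageControllerBase:
    """Hypothetical base class that owns only state every controller needs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running = False

    def start(self) -> None:
        self.running = True


class L1StyleController(StorageControllerBase):
    """Needs the L1 manager, so it keeps its own reference."""

    def __init__(self, l1_manager: object) -> None:
        super().__init__("l1")
        self._l1_manager = l1_manager


class L2StyleController(StorageControllerBase):
    """Works on adapters only and never sees an L1 manager."""

    def __init__(self, adapters: list) -> None:
        super().__init__("l2")
        self._adapters = list(adapters)
```

Both subclasses still call super().__init__(), so shared setup stays in one place; the only change in this sketch is which arguments the base class accepts.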

Comment thread docs/design/l2_adapters/l2_eviction.md Outdated
Comment thread lmcache/v1/mp_observability/logger/l2_stats_logger.py Outdated
Comment thread lmcache/v1/distributed/l2_adapters/base.py Outdated
Comment thread lmcache/v1/distributed/l2_adapters/base.py Outdated
Comment thread lmcache/v1/distributed/l2_adapters/mock_l2_adapter.py
Comment on lines +252 to +259
    def delete(self, keys: list[ObjectKey]) -> None:
        # Not implemented for the native connector adapter.
        pass

    def get_usage(self) -> tuple[float, float]:
        # Not implemented for the native connector adapter.
        return (0.0, 0.0)

Contributor

cc @sammshen, we may need to consider supporting eviction in the native L2 adapter.

Comment thread lmcache/v1/distributed/storage_manager.py Outdated
Comment thread lmcache/v1/distributed/storage_controllers/eviction_controller.py Outdated
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Contributor

@ApostaC ApostaC left a comment

just nit comments, otherwise LGTM!

Comment thread lmcache/v1/distributed/storage_controllers/eviction_controller.py Outdated
Comment thread lmcache/v1/distributed/storage_controllers/eviction_controller.py Outdated
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
@YaoJiayi YaoJiayi requested a review from ApostaC March 26, 2026 22:09
@ApostaC ApostaC added the "full" label (Run comprehensive tests on this PR) Mar 26, 2026
Contributor

@ApostaC ApostaC left a comment

LGTM!

Contributor

@sammshen sammshen left a comment

LGTM!

@YaoJiayi YaoJiayi enabled auto-merge (squash) March 27, 2026 03:37
YaoJiayi and others added 7 commits March 27, 2026 00:12
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
@YaoJiayi YaoJiayi merged commit 5634dbc into dev Mar 30, 2026
30 checks passed
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
* add l2 eviction

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* add docs

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* fix observability

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* fix l2-related tests

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* add minor changes

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* fix return val confusion

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* fix comments

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* add unit tests

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

* improve according on comments

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>

---------

Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Labels

full: Run comprehensive tests on this PR

3 participants