[4/4][MP] L2 Prefetch controller foundation #2658

Merged
YaoJiayi merged 3 commits into LMCache:dev from ApostaC:local-dev/l2-adapter-4
Mar 2, 2026

@ApostaC ApostaC commented Feb 28, 2026

What this PR does / why we need it:
Part of #2562

This is the foundation PR for the L2 prefetch controller. It adds the building blocks needed before implementing the full controller (next PR):

  1. L1Manager.finish_write_and_reserve_read() — Atomic transition from write-locked to read-locked state. After loading L2 data into write-reserved L1 buffers, this prevents a race window where eviction could remove the key between separate finish_write() + reserve_read() calls.

  2. PrefetchPolicy ABC + DefaultPrefetchPolicy — Policy interface for deciding which L2 adapter loads which key when multiple adapters have the same data. The default policy assigns each key to the lowest-indexed adapter that has it.

  3. PrefetchHandle redesign — Extended from a single prefix_hit_chunks field to include request_id (for async L2 tracking), l1_prefix_hit_count, and total_requested_keys.

  4. More interfaces in Bitmap — Expose more interfaces in Bitmap for the prefetch controller to use.
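
The race described in item 1 can be pictured with a minimal sketch. ToyL1Manager, its fields, and the boolean return values are hypothetical and are not LMCache's actual L1Manager; only the method name finish_write_and_reserve_read comes from this PR. The point is that both state changes happen under a single lock acquisition, so eviction never sees the key unlocked in between.

```python
import threading
from dataclasses import dataclass
from typing import Dict


@dataclass
class _Entry:
    write_locked: bool = False
    read_count: int = 0


class ToyL1Manager:
    """Hypothetical stand-in for an L1 manager; not LMCache's real class."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, _Entry] = {}

    def reserve_write(self, key: str) -> bool:
        # Create the entry in the write-locked state.
        with self._lock:
            if key in self._entries:
                return False
            self._entries[key] = _Entry(write_locked=True)
            return True

    def finish_write_and_reserve_read(self, key: str) -> bool:
        # Drop the write lock and take a read lock in one critical section.
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or not entry.write_locked:
                return False
            entry.write_locked = False
            entry.read_count += 1
            return True

    def finish_read(self, key: str) -> bool:
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or entry.read_count == 0:
                return False
            entry.read_count -= 1
            return True

    def evict(self, key: str) -> bool:
        # Eviction only succeeds for keys with no write lock and no readers.
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or entry.write_locked or entry.read_count > 0:
                return False
            del self._entries[key]
            return True
```

With separate finish_write() and reserve_read() calls, an evict() could run between the two lock acquisitions; in the sketch above it cannot.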

Note: This PR needs to be rebased after #2646 is merged, and it requires #2662 to be merged first.

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the foundational groundwork for the L2 prefetch controller within LMCache. It introduces critical building blocks to enable robust and efficient data management between L1 and L2 caches. The changes focus on enhancing atomic operations for cache consistency, establishing a flexible policy system for L2 data distribution, and improving prefetch request tracking. These additions are essential prerequisites for the subsequent implementation of the full L2 prefetch controller, ensuring a stable and extensible architecture for distributed caching.

Highlights

  • Atomic L1 Write-to-Read Transition: Introduced L1Manager.finish_write_and_reserve_read() to atomically transition keys from a write-locked to a read-locked state, preventing race conditions during L2 data loading into L1 buffers.
  • L2 Prefetch Policy Interface: Defined PrefetchPolicy as an abstract base class and provided DefaultPrefetchPolicy to determine which L2 adapter should load a specific key when multiple adapters hold the same data, prioritizing the lowest-indexed adapter.
  • PrefetchHandle Redesign: Extended the PrefetchHandle dataclass to include request_id for asynchronous L2 tracking, l1_prefix_hit_count for initial L1 hits, and total_requested_keys, while maintaining backward compatibility for query_prefetch_status().
  • Asynchronous L1-to-L2 Store Controller: Added StoreController to asynchronously copy data from L1 to L2 after writes complete, using a background thread and an event-driven loop, and introduced StorePolicy for configurable L2 storage and L1 deletion decisions.
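
As a rough illustration of the "lowest-indexed adapter" rule in the prefetch policy highlight, here is a minimal sketch. The class names, the assign method, and its signature are hypothetical and chosen for illustration; they are not the interface of PrefetchPolicy or DefaultPrefetchPolicy in this PR.

```python
from abc import ABC, abstractmethod
from typing import Dict, Hashable, List, Sequence, Set


class PrefetchPolicySketch(ABC):
    """Hypothetical policy interface: map adapter index -> keys it should load."""

    @abstractmethod
    def assign(
        self,
        keys: Sequence[Hashable],
        adapter_hits: Sequence[Set[Hashable]],
    ) -> Dict[int, List[Hashable]]:
        ...


class LowestIndexPolicySketch(PrefetchPolicySketch):
    """Assign each key to the lowest-indexed adapter that reports having it."""

    def assign(self, keys, adapter_hits):
        plan: Dict[int, List[Hashable]] = {}
        for key in keys:
            for idx, hits in enumerate(adapter_hits):
                if key in hits:
                    plan.setdefault(idx, []).append(key)
                    break  # the first (lowest-indexed) adapter wins
        return plan


# Adapter 0 holds {a, b}; adapter 1 holds {b, c}; nobody holds d.
plan = LowestIndexPolicySketch().assign(
    ["a", "b", "c", "d"], [{"a", "b"}, {"b", "c"}]
)
```

In this sketch a key that no adapter holds is simply left out of the plan; how the real policy treats such keys is not stated in this conversation.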

Changelog
  • lmcache/v1/distributed/l1_manager.py
    • Added finish_write_and_reserve_read method to atomically transition keys from write-locked to read-locked state, preventing race conditions.
  • lmcache/v1/distributed/l2_adapters/__init__.py
    • Added create_l2_adapter function to instantiate L2 adapter instances from their configuration.
  • lmcache/v1/distributed/l2_adapters/config.py
    • Added get_type_name_for_config function for reverse-lookup of registered L2 adapter type names.
  • lmcache/v1/distributed/l2_adapters/mock_l2_adapter.py
    • Added debug/test-only methods (debug_get_stored_object_count, debug_get_locked_key_count, debug_has_key) for inspecting the mock adapter's state.
  • lmcache/v1/distributed/storage_controllers/__init__.py
    • Imported and exposed the new StoreController in the package's __all__ list.
  • lmcache/v1/distributed/storage_controllers/prefetch_policy.py
    • Added new file defining the PrefetchPolicy abstract base class and DefaultPrefetchPolicy for selecting L2 adapters for key loading.
  • lmcache/v1/distributed/storage_controllers/store_controller.py
    • Added new file implementing StoreController for asynchronous L1-to-L2 data storage, including StoreListener and InFlightStoreTask.
  • lmcache/v1/distributed/storage_controllers/store_policy.py
    • Added new file defining the StorePolicy abstract base class, AdapterDescriptor, and DefaultStorePolicy for L1-to-L2 storage decisions and L1 deletion.
  • lmcache/v1/distributed/storage_manager.py
    • Updated PrefetchHandle dataclass to include request_id, l1_prefix_hit_count, and total_requested_keys.
    • Initialized L2 adapters and the new StoreController during StorageManager instantiation.
    • Removed TODO comments related to L2 controller triggering from finish_write.
    • Updated submit_prefetch_task to use the new PrefetchHandle fields.
    • Modified query_prefetch_status to return handle.l1_prefix_hit_count.
    • Added _store_controller.stop() and L2 adapter close() calls to the close method for proper resource management.
  • tests/v1/distributed/test_l1_manager.py
    • Added a new test suite TestFinishWriteAndReserveRead with multiple test cases for the new finish_write_and_reserve_read method, covering normal transitions, error conditions, and mixed results.
  • tests/v1/distributed/test_prefetch_policy.py
    • Added new file containing unit tests for PrefetchPolicy and DefaultPrefetchPolicy, verifying key assignment logic across single and multiple adapters with various overlap scenarios.
  • tests/v1/distributed/test_store_controller.py
    • Added new file containing integration tests for StoreController, covering its lifecycle, L1 write triggers, L2 storage, read lock release, and custom policy interactions.
  • tests/v1/distributed/test_store_policy.py
    • Added new file containing unit tests for StorePolicy and DefaultStorePolicy, verifying target selection and L1 deletion behavior.
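
The PrefetchHandle fields listed under storage_manager.py can be pictured as a small dataclass. This is only a sketch: the field names come from this PR, but the field types, the frozen dataclass, the helper functions, and the reading of "prefix hit" as the number of leading L1 hits are all assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PrefetchHandleSketch:
    """Hypothetical shape of the handle; types are assumed, names are from the PR."""

    request_id: int            # used to track the asynchronous L2 load
    l1_prefix_hit_count: int   # keys already served from L1 as a prefix
    total_requested_keys: int  # size of the original request


def count_prefix_hits(l1_hits: Sequence[bool]) -> int:
    """Count leading True values (assumed meaning of a 'prefix hit')."""
    count = 0
    for hit in l1_hits:
        if not hit:
            break
        count += 1
    return count


def make_handle(request_id: int, l1_hits: Sequence[bool]) -> PrefetchHandleSketch:
    return PrefetchHandleSketch(
        request_id=request_id,
        l1_prefix_hit_count=count_prefix_hits(l1_hits),
        total_requested_keys=len(l1_hits),
    )


handle = make_handle(7, [True, True, False, True])
```
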
Activity
  • ApostaC created this pull request to introduce foundational components for the L2 prefetch controller.
  • The pull request includes new atomic L1 operations, a prefetch policy interface, and an asynchronous store controller, along with corresponding tests.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the foundational components for the L2 prefetch controller, including an atomic finish_write_and_reserve_read operation in L1Manager, policies for prefetching and storing data (PrefetchPolicy, StorePolicy), and the asynchronous StoreController to manage data transfer from L1 to L2. The changes are well-structured, robust, and thoroughly tested. My review identifies one minor opportunity for a performance optimization.

Comment on lines +69 to +72
    for name, cls in _L2_ADAPTER_CONFIG_REGISTRY.items():
        if type(config) is cls:
            return name
    raise ValueError(f"Unregistered L2 adapter config type: {type(config).__name__}")

medium

This reverse lookup iterates through the entire registry on each call, which has O(N) complexity where N is the number of adapter types. While this is acceptable for a small number of adapters, it could become a performance concern if the registry grows.

A more performant approach would be to maintain a reverse mapping from the config class to its type name, allowing for an O(1) lookup. This could be populated when adapters are registered.
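
One way the suggested reverse mapping could look, as a sketch only: get_type_name_for_config and the error message are taken from the snippet under review, while register_config, the two module-level dictionaries, and the example config classes are hypothetical. The lookup keys on the exact type, which matches the `type(config) is cls` check in the original loop.

```python
from typing import Dict

# Hypothetical registries; the real module keeps its own registry.
_CONFIG_REGISTRY: Dict[str, type] = {}
_TYPE_NAME_BY_CONFIG: Dict[type, str] = {}


def register_config(name: str, cls: type) -> None:
    """Populate the forward and reverse maps together at registration time."""
    _CONFIG_REGISTRY[name] = cls
    _TYPE_NAME_BY_CONFIG[cls] = name


def get_type_name_for_config(config: object) -> str:
    """O(1) reverse lookup by exact config type (subclasses do not match)."""
    try:
        return _TYPE_NAME_BY_CONFIG[type(config)]
    except KeyError:
        raise ValueError(
            f"Unregistered L2 adapter config type: {type(config).__name__}"
        ) from None


class MockAdapterConfig:  # hypothetical config class for the example
    pass


class UnknownConfig:  # hypothetical, never registered
    pass


register_config("mock", MockAdapterConfig)
name = get_type_name_for_config(MockAdapterConfig())
```

For a registry with only a handful of adapter types the linear scan is likely just as fast in practice, so this is mainly a readability and scaling choice.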

@ApostaC ApostaC mentioned this pull request Feb 28, 2026
8 tasks
ApostaC added 2 commits March 1, 2026 03:04
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: ApostaC <yihua98@uchicago.edu>
@ApostaC ApostaC force-pushed the local-dev/l2-adapter-4 branch from 4612841 to 0a6fd2d Compare March 1, 2026 03:04
Signed-off-by: ApostaC <yihua98@uchicago.edu>
@ApostaC ApostaC added the full Run comprehensive tests on this PR label Mar 1, 2026

# Reserve write (key is now write-locked)
write_result = manager.reserve_write([key], [False], basic_layout)
assert write_result[key][0] == L1Error.SUCCESS

Naming-wise, L1Error.SUCCESS sounds weird: an L1Error is an error, so it should not carry a SUCCESS value. I'm fine with not changing it in this PR, though.

Contributor Author

Here I followed the naming convention of CUDA errors, i.e., cudaSuccess is the first value of cudaError_t.
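
For readers unfamiliar with that convention, a small sketch of a status enum whose first value is a success code, in the style of cudaError_t where cudaSuccess is 0. Only the SUCCESS member name comes from this PR; the enum name, the other members, their numeric values, and the check helper are hypothetical.

```python
from enum import IntEnum


class L1ErrorSketch(IntEnum):
    """Hypothetical status enum; SUCCESS is listed first, like cudaSuccess."""

    SUCCESS = 0
    KEY_NOT_FOUND = 1  # hypothetical member
    KEY_LOCKED = 2     # hypothetical member


def check(code: L1ErrorSketch) -> None:
    """Raise if the status code reports anything other than success."""
    if code != L1ErrorSketch.SUCCESS:
        raise RuntimeError(f"L1 operation failed: {code.name}")


check(L1ErrorSketch.SUCCESS)  # no exception for the success value
```
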

@KuntaiDu KuntaiDu left a comment

LGTM

@YaoJiayi YaoJiayi self-requested a review March 2, 2026 02:29
@YaoJiayi YaoJiayi merged commit 94edee0 into LMCache:dev Mar 2, 2026
32 of 33 checks passed
hlin99 pushed a commit to hlin99/LMCache that referenced this pull request Mar 2, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
oferki pushed a commit to oferki/LMCache that referenced this pull request Mar 3, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>
mauryaavinash95 pushed a commit to mauryaavinash95/LMCache that referenced this pull request Mar 7, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
shaoxiawjc pushed a commit to shaoxiawjc/LMCache that referenced this pull request Mar 11, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: shaoxiawjc <wjc2800@163.com>
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
* [add] backbone for prefetch controller

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] bitmap operations for better prefetch support

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* fix precommit issue

Signed-off-by: ApostaC <yihua98@uchicago.edu>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>

Labels

full Run comprehensive tests on this PR

3 participants