Plugin L2 Adapter Framework for MP Mode #2715

Merged
chunxiaozheng merged 10 commits into LMCache:dev from
maobaolong:l2_storage_plugin
Mar 15, 2026

Conversation

@maobaolong
Collaborator

What this PR does / why we need it:

Adds a "plugin" L2 adapter type that allows third-party developers to extend LMCache with custom L2 storage backends in MP mode — without modifying any LMCache source code.

  • Key configuration:
--l2-adapter '{"type":"plugin","module_path":"lmc_external_l2_adapter","class_name":"InMemoryL2Adapter","adapter_params":{"max_size_gb":1.0,"mock_bandwidth_gb":20.0}}'
  • E2E test (TP=1, chunk_size=256, prompt_tokens=272)

    • Step 1: Start LMCache MP Server with plugin L2 adapter
python3 -m lmcache.v1.multiprocess.server \
    --host localhost --port 15556 --chunk-size 256 --l1-size-gb 5 \
    --eviction-policy LRU --max-workers 1 \
    --l2-adapter '{"type":"plugin","module_path":"lmc_external_l2_adapter","class_name":"InMemoryL2Adapter","adapter_params":{"max_size_gb":1.0,"mock_bandwidth_gb":20.0}}'
    • Step 2: Start vLLM
python3 -m vllm.entrypoints.cli.main serve $MODEL_PATH \
    -tp 1 --load-format dummy --gpu_memory_utilization 0.85 \
    --no-enable-prefix-caching --enforce-eager --max-model-len 8192 --port 8001 \
    --kv-transfer-config '{"kv_connector":"LMCacheMPConnector","kv_role":"kv_both","kv_connector_extra_config":{"lmcache.mp.port":15556}}'
    • Step 3: Send a long prompt (272 tokens > 256 chunk_size) twice

  • Result: Both requests succeed (prompt_tokens=272, completion_tokens=512).

  • Server logs confirm:

InMemoryL2Adapter created: max_size_gb=1.0, mock_bandwidth_gb=20.0
Stored 256 tokens in 0.024 seconds
Stored 256 tokens in 0.001 seconds
Stored 256 tokens in 0.001 seconds
Prefetch request completed (L1+L2): 1/1 prefix hits (1 L1, 0 L2) in 0.0 ms
Retrieved 256 tokens in 0.001 seconds
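
For readers following along, the `--l2-adapter` value used above is a plain JSON object. The sketch below shows one way such a value could be parsed and checked before use. `PluginSpec` and `parse_plugin_spec` are hypothetical names introduced for illustration, not LMCache APIs; only the field names (`type`, `module_path`, `class_name`, `adapter_params`) are taken from the command line above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class PluginSpec:
    """Hypothetical container for the fields seen in the --l2-adapter JSON."""

    module_path: str
    class_name: str
    adapter_params: Dict[str, Any] = field(default_factory=dict)


def parse_plugin_spec(raw: str) -> PluginSpec:
    """Parse the JSON string and check the keys a plugin spec needs."""
    data = json.loads(raw)
    if data.get("type") != "plugin":
        raise ValueError(f"expected type 'plugin', got {data.get('type')!r}")
    for key in ("module_path", "class_name"):
        if not isinstance(data.get(key), str) or not data[key]:
            raise ValueError(f"missing or empty required field: {key}")
    return PluginSpec(
        module_path=data["module_path"],
        class_name=data["class_name"],
        # adapter_params is optional in this sketch and defaults to {}.
        adapter_params=dict(data.get("adapter_params", {})),
    )


spec = parse_plugin_spec(
    '{"type":"plugin","module_path":"lmc_external_l2_adapter",'
    '"class_name":"InMemoryL2Adapter",'
    '"adapter_params":{"max_size_gb":1.0,"mock_bandwidth_gb":20.0}}'
)
print(spec.module_path, spec.class_name, spec.adapter_params)
```

Validating the spec up front means a typo in the JSON surfaces as a clear error at startup rather than as an import failure later.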

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the extensibility of LMCache by introducing a plugin-based architecture for L2 adapters in multi-process (MP) mode. This change allows external developers to integrate custom L2 storage solutions seamlessly, fostering a more modular and adaptable caching system. The new design streamlines the process of adding new adapter types by moving towards a self-registration model, reducing the need for modifications to the core framework when new adapters are introduced.

Highlights

  • Plugin L2 Adapter Framework: Introduced a new 'plugin' L2 adapter type, enabling third-party developers to extend LMCache with custom L2 storage backends without modifying core LMCache source code. This is achieved through dynamic loading of external Python modules.
  • Dynamic Adapter Discovery and Factory Registry: Refactored the L2 adapter creation mechanism to use an auto-discovery and factory registry pattern. Adapter modules now self-register their configuration types and factory functions, simplifying the addition of new adapter types.
  • Example External L2 Adapter: Added a complete, pip-installable example of an external in-memory L2 adapter (InMemoryL2Adapter) along with its configuration, build system (pyproject.toml), installation/test script, and unit tests. This serves as a reference implementation for plugin authors.
  • Updated Adapter Configuration and Creation: Modified the create_l2_adapter function to leverage the new factory registry, removing explicit type-checking branches for each adapter. Existing MockL2Adapter and NixlStoreL2Adapter were updated to use the self-registration mechanism.
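
The self-registration pattern described in these highlights can be sketched roughly as follows. The function names `register_l2_adapter_factory` and `create_l2_adapter_from_registry` appear in this PR's changelog, but the signatures, the registry variable, and `DummyAdapter` below are assumptions made for illustration and may differ from the merged code.

```python
from typing import Any, Callable, Dict

# Assumed shape of the registry: adapter type name -> factory callable.
_FACTORY_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_l2_adapter_factory(type_name: str, factory: Callable[..., Any]) -> None:
    """Record a factory under a type name and reject duplicate names."""
    if type_name in _FACTORY_REGISTRY:
        raise ValueError(f"adapter type already registered: {type_name}")
    _FACTORY_REGISTRY[type_name] = factory


def create_l2_adapter_from_registry(type_name: str, config: dict) -> Any:
    """Look up the factory for type_name and call it with the config."""
    try:
        factory = _FACTORY_REGISTRY[type_name]
    except KeyError:
        raise ValueError(f"unknown adapter type: {type_name}") from None
    return factory(config)


class DummyAdapter:
    """Hypothetical adapter used only to exercise the registry."""

    def __init__(self, config: dict) -> None:
        self.config = config


# An adapter module would run this line at import time (self-registration).
register_l2_adapter_factory("dummy", DummyAdapter)

adapter = create_l2_adapter_from_registry("dummy", {"max_size_gb": 1.0})
print(type(adapter).__name__, adapter.config)
```

With this shape, the creation function needs no per-adapter branches: adding an adapter type only requires that its module be imported.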

Changelog
  • examples/lmc_external_l2_adapter/configs/l2_adapter_example.json
    • Added an example JSON configuration for the external InMemoryL2Adapter plugin.
  • examples/lmc_external_l2_adapter/pyproject.toml
    • Added project configuration for the lmc-external-l2-adapter example plugin.
  • examples/lmc_external_l2_adapter/scripts/install_and_test.sh
    • Added a shell script to install and test the example external L2 adapter plugin.
  • examples/lmc_external_l2_adapter/src/lmc_external_l2_adapter/__init__.py
    • Added the __init__.py file to define the lmc_external_l2_adapter package and expose InMemoryL2Adapter.
  • examples/lmc_external_l2_adapter/src/lmc_external_l2_adapter/adapter.py
    • Added the InMemoryL2Adapter class, a minimal yet functional in-memory L2 adapter intended as a reference implementation for external plugins.
  • examples/lmc_external_l2_adapter/tests/test_plugin.py
    • Added comprehensive unit tests for the lmc_external_l2_adapter plugin, covering configuration parsing, import, and full store/lookup/load round-trips with eviction.
  • lmcache/v1/distributed/l2_adapters/__init__.py
    • Modified the L2 adapter factory to automatically discover and import modules within the package, enabling self-registration of adapter types and factories.
    • Updated the create_l2_adapter function to use the new create_l2_adapter_from_registry for dynamic instantiation.
  • lmcache/v1/distributed/l2_adapters/config.py
    • Introduced L2AdapterFactory and _L2_ADAPTER_FACTORY_REGISTRY to manage adapter factory callables.
    • Added register_l2_adapter_factory and create_l2_adapter_from_registry functions to support the new dynamic adapter creation mechanism.
    • Moved the NixlStoreL2AdapterConfig class definition and its registration to after the parse_args_to_l2_adapters_config function, and added its factory registration.
  • lmcache/v1/distributed/l2_adapters/design_docs/plugin.md
    • Added a design document detailing the Plugin L2 Adapter framework, its components, loading flow, plugin contract, threading model, and an example.
  • lmcache/v1/distributed/l2_adapters/mock_l2_adapter.py
    • Moved the MockL2AdapterConfig class definition within the file.
    • Added self-registration for MockL2AdapterConfig and its factory using the new registry functions.
  • lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py
    • Updated import statements to reflect changes in config.py.
  • lmcache/v1/distributed/l2_adapters/plugin_l2_adapter.py
    • Added the PluginL2AdapterConfig class for configuring external L2 adapter plugins.
    • Added the _create_plugin_adapter factory function responsible for dynamically loading and instantiating external adapter classes.
    • Implemented self-registration for the 'plugin' type and its factory.
  • tests/v1/distributed/test_distributed_storage_manager.py
    • Updated the import path for MockL2AdapterConfig.
  • tests/v1/distributed/test_mock_l2_adapter.py
    • Updated the import path for MockL2Adapter and MockL2AdapterConfig.
  • tests/v1/distributed/test_prefetch_controller.py
    • Updated the import path for MockL2Adapter and MockL2AdapterConfig.
  • tests/v1/distributed/test_prefetch_policy.py
    • Updated the import path for MockL2AdapterConfig.
  • tests/v1/distributed/test_store_controller.py
    • Updated the import path for MockL2Adapter and MockL2AdapterConfig.
  • tests/v1/distributed/test_store_policy.py
    • Updated the import path for MockL2AdapterConfig.
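
The changelog entry for the l2_adapters package mentions automatically discovering and importing modules so that they self-register. A common way to do this in Python is `pkgutil.iter_modules` combined with `importlib.import_module`. The sketch below builds a throwaway package named `demo_adapters` (a hypothetical name) in a temporary directory so that it runs on its own. It illustrates the general technique and is not the code from this PR.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

# Build a throwaway package so the example is self-contained.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_adapters"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("REGISTERED = []\n")
for name in ("alpha", "beta"):
    # Each submodule registers itself as a side effect of being imported.
    (pkg_dir / f"{name}.py").write_text(
        "from demo_adapters import REGISTERED\n"
        f"REGISTERED.append({name!r})\n"
    )
sys.path.insert(0, str(tmp_root))


def discover_submodules(package_name: str) -> list:
    """Import every direct submodule so its import-time registration runs."""
    package = importlib.import_module(package_name)
    imported = []
    for info in pkgutil.iter_modules(package.__path__):
        importlib.import_module(f"{package_name}.{info.name}")
        imported.append(info.name)
    return sorted(imported)


names = discover_submodules("demo_adapters")
registered = sorted(importlib.import_module("demo_adapters").REGISTERED)
print(names, registered)
```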
Activity
  • The pull request introduces a new architectural feature and provides a comprehensive example and design documentation.
  • The author has included end-to-end tests for the example plugin, demonstrating its functionality and adherence to the L2 adapter interface.
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a well-designed plugin framework for L2 adapters, which is a great step towards making LMCache more extensible. However, it introduces a significant security risk by allowing arbitrary Python modules to be imported based on user-supplied configuration strings, potentially leading to Remote Code Execution (RCE) if the configuration source is untrusted or the Python search path is compromised. It is recommended to restrict allowed modules to a specific namespace or use a more secure plugin discovery mechanism. Additionally, my review suggests improving robustness through more specific error logging and enhancing the consistency of the new plugin architecture.
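
One way to act on the namespace-restriction suggestion is an allowlist check before the dynamic import. The sketch below is illustrative only: `ALLOWED_PLUGIN_PREFIXES`, `is_allowed_module`, and `load_plugin_class` are hypothetical names, and nothing in this thread indicates that the merged code performs such a check. The standard-library `json` package stands in for an installed plugin package so the example runs anywhere.

```python
import importlib
from typing import Iterable

# Hypothetical allowlist of top-level packages that plugins may live in.
# "json" is included only as a stand-in so this demo can import something.
ALLOWED_PLUGIN_PREFIXES = ("lmc_external_l2_adapter", "json")


def is_allowed_module(module_path: str, prefixes: Iterable[str]) -> bool:
    """True only if module_path equals a prefix or is a submodule of one."""
    return any(
        module_path == p or module_path.startswith(p + ".") for p in prefixes
    )


def load_plugin_class(module_path, class_name, prefixes=ALLOWED_PLUGIN_PREFIXES):
    """Run the allowlist check first, then import the module and fetch the class."""
    if not is_allowed_module(module_path, prefixes):
        raise PermissionError(f"module not in plugin allowlist: {module_path}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


decoder_cls = load_plugin_class("json.decoder", "JSONDecoder")
print(decoder_cls.__name__)
```

Comparing against `prefix + "."` rather than using a bare `startswith(prefix)` keeps a look-alike name such as `jsonx` from passing the check.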

Comment thread lmcache/v1/distributed/l2_adapters/plugin_l2_adapter.py
Comment thread lmcache/v1/distributed/l2_adapters/config.py Outdated
"L2AdapterInterface" % (config.module_path, config.class_name)
)

return adapter_cls(**kwargs, **config.adapter_params)
Collaborator

@maobaolong adapter_cls(**kwargs, **config.adapter_params) is not very general. I think adapter_cls(config.adapter_params, **kwargs) is more suitable. What do you think?

Collaborator Author

Updated
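
The thread does not spell out why the dict form is preferable. One plausible reason is that unpacking two dicts into the same call fails as soon as they share a key, while passing `adapter_params` as a single argument keeps user parameters and framework keyword arguments in separate namespaces. The sketch below demonstrates that difference with a hypothetical `DictParamsAdapter`; the key names are made up for the demonstration.

```python
class DictParamsAdapter:
    """Hypothetical plugin: one params dict plus framework keyword arguments."""

    def __init__(self, adapter_params: dict, **kwargs) -> None:
        self.adapter_params = adapter_params
        self.framework_kwargs = kwargs


def call_with_unpacking(cls, framework_kwargs: dict, adapter_params: dict):
    # Both dicts are unpacked into one keyword namespace.
    return cls(**framework_kwargs, **adapter_params)


def call_with_dict(cls, framework_kwargs: dict, adapter_params: dict):
    # adapter_params stays a single positional argument.
    return cls(adapter_params, **framework_kwargs)


framework_kwargs = {"l1_memory_desc": "framework-value"}
adapter_params = {"l1_memory_desc": "user-value", "max_size_gb": 1.0}

try:
    call_with_unpacking(DictParamsAdapter, framework_kwargs, adapter_params)
    collision = None
except TypeError as exc:
    # Python rejects a call that receives the same keyword twice.
    collision = str(exc)

adapter = call_with_dict(DictParamsAdapter, framework_kwargs, adapter_params)
print(collision)
print(adapter.adapter_params["l1_memory_desc"])
print(adapter.framework_kwargs["l1_memory_desc"])
```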

@ApostaC
Contributor

ApostaC commented Mar 10, 2026

@maobaolong Do we need a rebase after #2704 is merged?

@maobaolong
Collaborator Author

@ApostaC Either order is fine with me, whether this PR or #2704 is merged first.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
@maobaolong
Collaborator Author

@ApostaC @chunxiaozheng Now that #2704 has been merged, this PR is ready to be reviewed again. Thanks!

Contributor

@ApostaC ApostaC left a comment

LGTM in general. Mostly nit comments.

Comment on lines +19 to +24
from lmcache.v1.distributed.l2_adapters.base import (
    L2AdapterInterface,
)

# First Party
from lmcache.v1.distributed.l2_adapters.base import L2AdapterInterface as _L2AI
Contributor

nit: L2AdapterInterface is imported both under TYPE_CHECKING and as a normal import. Importing it once should be enough.

Collaborator Author

Done.

Constructor accepts either an
``InMemoryL2AdapterConfig`` instance (matching
the built-in adapter convention) **or** a plain
dict (legacy plugin mode) for backward compatibility.
Contributor

What do "legacy plugin mode" and "backward compatibility" mean here? Isn't the plugin system a new module introduced by this PR?

Collaborator Author

This was left over from an intermediate state of my development. I have updated the comment.


def _create_plugin_adapter(
    config: L2AdapterConfigBase,
    l1_memory_desc: "Optional[L1MemoryDesc]" = None,
Contributor

The l1_memory_desc is never used in L2 plugins. Is it expected, or do we have some reason to ignore it?

If the plugin needs to register memory (like using RDMA), the l1_memory_desc will be useful.

Collaborator Author

l1_memory_desc is now forwarded to the plugin adapter constructor via **kwargs when it is not None. This way, plugins that need to register L1 memory (e.g., for RDMA) can access it through the l1_memory_desc keyword argument, while existing plugins that don't need it remain unaffected.
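
The forwarding behaviour described in this reply can be sketched as follows. `build_adapter`, `SimplePlugin`, and `RdmaLikePlugin` are hypothetical names, and the constructor shape (a params dict plus keyword arguments) is an assumption based on the earlier review comment, not a copy of the merged code.

```python
from typing import Any, Optional


def build_adapter(adapter_cls, adapter_params: dict, l1_memory_desc: Optional[Any] = None):
    """Forward l1_memory_desc as a keyword argument only when it is set."""
    kwargs = {}
    if l1_memory_desc is not None:
        kwargs["l1_memory_desc"] = l1_memory_desc
    return adapter_cls(adapter_params, **kwargs)


class SimplePlugin:
    """A plugin whose constructor does not know about l1_memory_desc."""

    def __init__(self, adapter_params: dict) -> None:
        self.params = adapter_params


class RdmaLikePlugin:
    """A plugin that wants the L1 memory description, e.g. to register memory."""

    def __init__(self, adapter_params: dict, l1_memory_desc: Any = None) -> None:
        self.params = adapter_params
        self.l1_memory_desc = l1_memory_desc


simple = build_adapter(SimplePlugin, {"max_size_gb": 1.0})
rdma = build_adapter(
    RdmaLikePlugin, {"max_size_gb": 1.0}, l1_memory_desc={"size": 4096}
)
print(simple.params, rdma.l1_memory_desc)
```

In this sketch, `SimplePlugin` works because no descriptor is supplied. Accepting `**kwargs` in the constructor is one way for a plugin to stay compatible whether or not a descriptor is passed.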

Contributor

Can we add some simple unit tests to guard the registration and initialization process of the plugin adapters? That way, future PRs will be less likely to break the L2 plugin system.

Collaborator Author

Added! The new tests are in test_l2_adapter_factory.py.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
@maobaolong
Collaborator Author

@ApostaC Thanks for the review, PTAL.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
Contributor

@ApostaC ApostaC left a comment

LGTM!

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
Collaborator

@chunxiaozheng chunxiaozheng left a comment

Thanks for the contribution. I have tested it in my environment. LGTM!

@chunxiaozheng chunxiaozheng enabled auto-merge (squash) March 15, 2026 04:39
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Mar 15, 2026
@chunxiaozheng chunxiaozheng merged commit bd757eb into LMCache:dev Mar 15, 2026
35 of 38 checks passed
hyunyul-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Mar 20, 2026
* Plugin L2 Adapter Framework for MP Mode

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Remove useless file

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Fix related failed UT.

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Skip if not import lmc_external_l2_adapter

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Add comments from chunxiaozheng

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* Adapter existing adapter constructor

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* address comment

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* add missing

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* fix

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

* fix

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

---------

Signed-off-by: baoloongmao <baoloongmao@tencent.com>
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 25, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
Labels

full Run comprehensive tests on this PR

3 participants