[MP] Introduce MP runtime plugin framework #2956

Merged
ApostaC merged 17 commits into LMCache:dev from maobaolong:mp_plugin
Apr 22, 2026

@maobaolong
Collaborator

@maobaolong maobaolong commented Apr 5, 2026

What this PR does / why we need it:

  • Tested locally.
(base) ~/projects/LMCache git:[mp_plugin]
python -m lmcache.v1.multiprocess.server \
    --host localhost --port 5555 \
    --l1-size-gb 10 \
    --eviction-policy LRU \
    --runtime-plugin-locations examples/mp_runtime_plugins/
[2026-04-07 16:25:48,567] LMCache INFO: OTel MeterProvider initialised with Prometheus fallback (http://0.0.0.0:9090/metrics) (otel_init.py:71:lmcache.v1.mp_observability.otel_init)
[2026-04-07 16:25:48,574] LMCache INFO: Starting L1EvictionController... (eviction_controller.py:41:lmcache.v1.distributed.storage_controllers.eviction_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting L2EvictionController... (eviction_controller.py:175:lmcache.v1.distributed.storage_controllers.eviction_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting StoreController... (store_controller.py:211:lmcache.v1.distributed.storage_controllers.store_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting PrefetchController... (prefetch_controller.py:374:lmcache.v1.distributed.storage_controllers.prefetch_controller)
[2026-04-07 16:25:48,575] LMCache INFO: Using blake3 hash function (token_hasher.py:79:lmcache.v1.multiprocess.token_hasher)
INFO 04-07 16:25:51 __init__.py:207] Automatically detected platform cpu.
[2026-04-07 16:25:51,750] LMCache INFO: Computed NONE_HASH=b"~>\xff\x9e\xf7a:P4\x0e\xb1&w\xee\x03\xb8:\xc7Y\xfc\xba\x9b\xb3'\x05\xf0rH\xd5mG;" using hash function (token_hasher.py:171:lmcache.v1.multiprocess.token_hasher)
[2026-04-07 16:25:51,751] LMCache INFO: TokenHasher initialized: chunk_size=256, hash_algorithm=blake3 (token_hasher.py:67:lmcache.v1.multiprocess.token_hasher)
[2026-04-07 16:25:51,754] LMCache INFO: LMCache ZMQ cache server is running on tcp://localhost:5555 (server.py:974:__main__)
[2026-04-07 16:25:51,754] LMCache INFO: LMCache cache server is running... (server.py:984:__main__)
[2026-04-07 16:25:51,767] LMCache INFO: Launched runtime plugin: examples/mp_runtime_plugins/mp_plugin.py with /opt/homebrew/Caskroom/miniconda/base/bin/python (runtime_plugin_launcher.py:126:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,785] LMCache INFO: Launched runtime plugin: examples/mp_runtime_plugins/mp_heartbeat.sh with /bin/bash (runtime_plugin_launcher.py:126:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,786] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] Started (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] Started (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] config: {"mp_config": {"host": "localhost", "port": 5555, "chunk_size": 256, "max_workers": 1, "max_gpu_workers": 1, "max_cpu_workers": 1, "hash_algorithm": "blake3", "engine_type": "default", "runtime_plugin_locations": ["examples/mp_runtime_plugins/"]}, "storage_manager_config": {"l1_manager_config": {"memory_config": {"size_in_bytes": 10737418240, "use_lazy": true, "init_size_in_bytes": 10737418240, "align_bytes": 4096}, "write_ttl_seconds": 600, "read_ttl_seconds": 300}, "eviction_config": {"eviction_policy": "LRU", "trigger_watermark": 0.8, "eviction_ratio": 0.2}, "l2_adapter_config": {"adapters": []}, "store_policy": "default", "prefetch_policy": "default", "prefetch_max_in_flight": 8}, "obs_config": {"enabled": true, "max_queue_size": 10000, "metrics_enabled": true, "logging_enabled": true, "tracing_enabled": false, "otlp_endpoint": null, "prometheus_port": 9090}} (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] heartbeat #0 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,792] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] config: {"mp_config":{"host":"localhost","port":5555,"chunk_size":256,"max_workers":1,"max_gpu_workers":1,"max_cpu_workers":1,"hash_algorithm":"blake3","engine_type":"default","runtime_plugin_locations":["examples/mp_runtime_plugins/"]},"storage_manager_config":{"l1_manager_config":{"memory_config":{"size_in_bytes":10737418240,"use_lazy":true,"init_size_in_bytes":10737418240,"align_bytes":4096},"write_ttl_seconds":600,"read_ttl_seconds":300},"eviction_config":{"eviction_policy":"LRU","trigger_watermark":0.8,"eviction_ratio":0.2},"l2_adapter_config":{"adapters":[]},"store_policy":"default","prefetch_policy":"default","prefetch_max_in_flight":8},"obs_config":{"enabled":true,"max_queue_size":10000,"metrics_enabled":true,"logging_enabled":true,"tracing_enabled":false,"otlp_endpoint":null,"prometheus_port":9090}} (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,792] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] heartbeat #0 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:26:21,793] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] heartbeat #1 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:26:21,807] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] heartbeat #1 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Adds optional subprocess launching during server startup/shutdown and changes plugin filtering/env-var behavior (including PYTHONUNBUFFERED), which could impact deployments that rely on runtime plugins.

Overview
Adds an MP-mode runtime plugin framework that can launch .py/.sh scripts alongside the multiprocess server, passing an aggregated JSON config blob via LMCACHE_RUNTIME_PLUGIN_CONFIG.
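
A plugin script on the receiving end of that environment variable might look roughly like the sketch below. This is not the code of `examples/mp_runtime_plugins/mp_plugin.py`; the function names, the `example_plugin` log prefix, and the fallback host and port are hypothetical. Only the variable name `LMCACHE_RUNTIME_PLUGIN_CONFIG` and the `mp_config.host` / `mp_config.port` keys are taken from the PR text and the log output above.

```python
import json
import os
import time


def load_plugin_config(env=None):
    """Parse the JSON blob from LMCACHE_RUNTIME_PLUGIN_CONFIG; return {} if unset."""
    env = os.environ if env is None else env
    raw = env.get("LMCACHE_RUNTIME_PLUGIN_CONFIG", "")
    return json.loads(raw) if raw else {}


def server_endpoint(config):
    """Build the server address from the "mp_config" section of the blob."""
    mp = config.get("mp_config", {})
    # The fallback values here are illustrative assumptions, not documented defaults.
    return f"tcp://{mp.get('host', 'localhost')}:{mp.get('port', 5555)}"


def run(beats, interval_s, env=None):
    """Print a start line, the resolved endpoint, then a fixed number of heartbeats."""
    config = load_plugin_config(env)
    # flush=True so each line reaches the parent process immediately.
    print("[example_plugin] Started", flush=True)
    print(f"[example_plugin] server: {server_endpoint(config)}", flush=True)
    for i in range(beats):
        print(f"[example_plugin] heartbeat #{i}", flush=True)
        if i + 1 < beats:
            time.sleep(interval_s)


if __name__ == "__main__":
    # Bounded for demonstration; a real plugin would likely loop until terminated.
    run(beats=2, interval_s=0.1)
```

The heartbeats in the log above are about 30 seconds apart, so a longer `interval_s` would be closer to the example plugins' observed behavior.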

This introduces RuntimePluginConfig and new CLI flags (--runtime-plugin-locations, --runtime-plugin-config) in MP config parsing, wires plugin lifecycle into multiprocess/http_server.py startup/shutdown via a new MPRuntimePluginLauncher, and updates the base RuntimePluginLauncher to support role=None (disabling role/worker filename filtering) and to force unbuffered Python output for real-time log capture. Docs, examples, and unit tests are added to document and validate the MP launcher behavior.
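
The launching side described above can be sketched as follows, under stated assumptions. This is not the `MPRuntimePluginLauncher` implementation; `launch_and_capture` and its parameters are hypothetical names. It only illustrates the general technique: start a Python child with the config blob and `PYTHONUNBUFFERED` set in its environment, then read its output line by line as it is produced.

```python
import json
import os
import subprocess
import sys


def launch_and_capture(script_args, config, prefix):
    """Run a Python child process and return (exit_code, prefixed_output_lines)."""
    env = dict(os.environ)
    # Hand the aggregated config to the child as a single JSON string.
    env["LMCACHE_RUNTIME_PLUGIN_CONFIG"] = json.dumps(config)
    # Any non-empty value makes the child's stdout/stderr unbuffered,
    # so lines arrive as they are printed instead of when a buffer fills.
    env["PYTHONUNBUFFERED"] = "1"
    proc = subprocess.Popen(
        [sys.executable, *script_args],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
    )
    lines = []
    assert proc.stdout is not None
    for line in proc.stdout:  # yields each line as soon as it is available
        lines.append(f"[{prefix}] {line.rstrip()}")
    proc.wait()
    return proc.returncode, lines


if __name__ == "__main__":
    child = (
        "import json, os; "
        "print(json.loads(os.environ['LMCACHE_RUNTIME_PLUGIN_CONFIG'])['mp_config']['port'])"
    )
    code, out = launch_and_capture(["-c", child], {"mp_config": {"port": 5555}}, "demo")
    print(code, out)
```

This sketch blocks until the child exits. A launcher that runs alongside a server would presumably read the output on a background thread and terminate the child at shutdown.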

Reviewed by Cursor Bugbot for commit 67d201a.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a multiprocess runtime plugin framework, including the new MPRuntimePluginLauncher class and integration into the cache server and HTTP server lifecycles. It also extends the server configuration to support plugin locations via CLI arguments. Feedback indicates that the new feature lacks required unit tests and design documentation. Additionally, several functions, including the new launcher's constructor and the updated run_cache_server implementations, require updated type hints and docstrings to comply with the project's style guide.

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py
Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py Outdated
Comment thread lmcache/v1/multiprocess/blend_server_v2.py Outdated
Comment thread lmcache/v1/multiprocess/server.py Outdated
Comment thread lmcache/v1/multiprocess/server.py
Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py
Comment thread tests/v1/multiprocess/test_mp_runtime_plugin_launcher.py
Comment thread tests/v1/multiprocess/test_mp_runtime_plugin_launcher.py Outdated
Comment thread lmcache/v1/multiprocess/server.py
@maobaolong
Collaborator Author

@sammshen @chunxiaozheng Would you like to take a look at this PR? Thanks!

Comment thread lmcache/v1/plugin/runtime_plugin_launcher.py Outdated
Comment thread lmcache/v1/multiprocess/server.py Outdated
maobaolong added a commit to maobaolong/LMCache that referenced this pull request Apr 8, 2026
Signed-off-by: baoloongmao <baoloongmao@tencent.com>
@maobaolong maobaolong added the mp Buildkite trigger for multi-processing mode test label Apr 9, 2026
@sammshen
Contributor

Particularly for MP mode, now that we are introducing the LMCache CLI, I believe modifications to the internals of the configs, or running sidecars, should be done through that interface. Using Kubernetes as an example: etcd is only accessed through the kube-apiserver, and the controllers, the scheduler, and the kubelet all change the cluster through that controlled interface.

In the future, maybe we could start with a concrete design that states which existing modules a brand-new feature will plug into, with a clear definition of inputs and outputs?

cc @ApostaC @KuntaiDu

Comment thread lmcache/v1/multiprocess/config.py
Contributor

@ApostaC ApostaC left a comment

Otherwise LGTM

Comment thread lmcache/v1/multiprocess/http_server.py Outdated
Comment thread docs/design/v1/multiprocess/mp_runtime_plugin.md
Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py
Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py
maobaolong and others added 6 commits April 22, 2026 07:54
Signed-off-by: baoloongmao <baoloongmao@tencent.com>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit adec2fb.

Comment thread examples/mp_runtime_plugins/README.md
Collaborator

@chunxiaozheng chunxiaozheng left a comment

lgtm

@ApostaC ApostaC enabled auto-merge (squash) April 22, 2026 20:11
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Apr 22, 2026
@ApostaC ApostaC merged commit 6fbec46 into LMCache:dev Apr 22, 2026
51 of 58 checks passed
Labels

full: Run comprehensive tests on this PR
mp: Buildkite trigger for multi-processing mode test

4 participants