[MP] Introduce MP runtime plugin framework by maobaolong · Pull Request #2956 · LMCache/LMCache

maobaolong · 2026-04-05T01:42:06Z

What this PR does / why we need it:

Tested locally.

(base) ~/projects/LMCache git:[mp_plugin]
python -m lmcache.v1.multiprocess.server \
    --host localhost --port 5555 \
    --l1-size-gb 10 \
    --eviction-policy LRU \
    --runtime-plugin-locations examples/mp_runtime_plugins/
[2026-04-07 16:25:48,567] LMCache INFO: OTel MeterProvider initialised with Prometheus fallback (http://0.0.0.0:9090/metrics) (otel_init.py:71:lmcache.v1.mp_observability.otel_init)
[2026-04-07 16:25:48,574] LMCache INFO: Starting L1EvictionController... (eviction_controller.py:41:lmcache.v1.distributed.storage_controllers.eviction_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting L2EvictionController... (eviction_controller.py:175:lmcache.v1.distributed.storage_controllers.eviction_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting StoreController... (store_controller.py:211:lmcache.v1.distributed.storage_controllers.store_controller)
[2026-04-07 16:25:48,574] LMCache INFO: Starting PrefetchController... (prefetch_controller.py:374:lmcache.v1.distributed.storage_controllers.prefetch_controller)
[2026-04-07 16:25:48,575] LMCache INFO: Using blake3 hash function (token_hasher.py:79:lmcache.v1.multiprocess.token_hasher)
INFO 04-07 16:25:51 __init__.py:207] Automatically detected platform cpu.
[2026-04-07 16:25:51,750] LMCache INFO: Computed NONE_HASH=b"~>\xff\x9e\xf7a:P4\x0e\xb1&w\xee\x03\xb8:\xc7Y\xfc\xba\x9b\xb3'\x05\xf0rH\xd5mG;" using hash function (token_hasher.py:171:lmcache.v1.multiprocess.token_hasher)
[2026-04-07 16:25:51,751] LMCache INFO: TokenHasher initialized: chunk_size=256, hash_algorithm=blake3 (token_hasher.py:67:lmcache.v1.multiprocess.token_hasher)
[2026-04-07 16:25:51,754] LMCache INFO: LMCache ZMQ cache server is running on tcp://localhost:5555 (server.py:974:__main__)
[2026-04-07 16:25:51,754] LMCache INFO: LMCache cache server is running... (server.py:984:__main__)
[2026-04-07 16:25:51,767] LMCache INFO: Launched runtime plugin: examples/mp_runtime_plugins/mp_plugin.py with /opt/homebrew/Caskroom/miniconda/base/bin/python (runtime_plugin_launcher.py:126:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,785] LMCache INFO: Launched runtime plugin: examples/mp_runtime_plugins/mp_heartbeat.sh with /bin/bash (runtime_plugin_launcher.py:126:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,786] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] Started (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] Started (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] config: {"mp_config": {"host": "localhost", "port": 5555, "chunk_size": 256, "max_workers": 1, "max_gpu_workers": 1, "max_cpu_workers": 1, "hash_algorithm": "blake3", "engine_type": "default", "runtime_plugin_locations": ["examples/mp_runtime_plugins/"]}, "storage_manager_config": {"l1_manager_config": {"memory_config": {"size_in_bytes": 10737418240, "use_lazy": true, "init_size_in_bytes": 10737418240, "align_bytes": 4096}, "write_ttl_seconds": 600, "read_ttl_seconds": 300}, "eviction_config": {"eviction_policy": "LRU", "trigger_watermark": 0.8, "eviction_ratio": 0.2}, "l2_adapter_config": {"adapters": []}, "store_policy": "default", "prefetch_policy": "default", "prefetch_max_in_flight": 8}, "obs_config": {"enabled": true, "max_queue_size": 10000, "metrics_enabled": true, "logging_enabled": true, "tracing_enabled": false, "otlp_endpoint": null, "prometheus_port": 9090}} (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,790] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] heartbeat #0 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,792] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] config: {"mp_config":{"host":"localhost","port":5555,"chunk_size":256,"max_workers":1,"max_gpu_workers":1,"max_cpu_workers":1,"hash_algorithm":"blake3","engine_type":"default","runtime_plugin_locations":["examples/mp_runtime_plugins/"]},"storage_manager_config":{"l1_manager_config":{"memory_config":{"size_in_bytes":10737418240,"use_lazy":true,"init_size_in_bytes":10737418240,"align_bytes":4096},"write_ttl_seconds":600,"read_ttl_seconds":300},"eviction_config":{"eviction_policy":"LRU","trigger_watermark":0.8,"eviction_ratio":0.2},"l2_adapter_config":{"adapters":[]},"store_policy":"default","prefetch_policy":"default","prefetch_max_in_flight":8},"obs_config":{"enabled":true,"max_queue_size":10000,"metrics_enabled":true,"logging_enabled":true,"tracing_enabled":false,"otlp_endpoint":null,"prometheus_port":9090}} (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:25:51,792] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] heartbeat #0 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:26:21,793] LMCache INFO: [examples/mp_runtime_plugins/mp_plugin.py] [mp_plugin] heartbeat #1 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)
[2026-04-07 16:26:21,807] LMCache INFO: [examples/mp_runtime_plugins/mp_heartbeat.sh] [mp_heartbeat] heartbeat #1 (runtime_plugin_launcher.py:180:lmcache.v1.plugin.runtime_plugin_launcher)

Special notes for your reviewers:

If applicable:

this PR contains user facing changes - docs added
this PR contains unit tests

Note

Medium Risk
Adds optional subprocess launching during server startup/shutdown and changes plugin filtering/env-var behavior (including PYTHONUNBUFFERED), which could impact deployments that rely on runtime plugins.

Overview
Adds an MP-mode runtime plugin framework that can launch .py/.sh scripts alongside the multiprocess server, passing an aggregated JSON config blob via LMCACHE_RUNTIME_PLUGIN_CONFIG.
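As a rough illustration of the plugin side, a Python plugin could read the JSON blob from `LMCACHE_RUNTIME_PLUGIN_CONFIG` as sketched below. The helper names (`load_plugin_config`, `server_endpoint`) and the fallback defaults are hypothetical; only the environment variable name and the `mp_config` / `host` / `port` keys are taken from the PR text and the test log above. This is not the example plugin shipped in the PR.

```python
import json
import os

# Environment variable name as stated in the PR overview.
ENV_VAR = "LMCACHE_RUNTIME_PLUGIN_CONFIG"


def load_plugin_config(environ=None):
    """Decode the config blob; return {} when the variable is unset or empty.

    `environ` is a hypothetical injection point so the function can be
    exercised without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "")
    if not raw:
        return {}
    return json.loads(raw)


def server_endpoint(config):
    """Build "host:port" from the mp_config section seen in the test log.

    The fallback values are assumptions made for this sketch.
    """
    mp = config.get("mp_config", {})
    return f"{mp.get('host', 'localhost')}:{mp.get('port', 5555)}"


if __name__ == "__main__":
    cfg = load_plugin_config()
    # flush=True keeps the line visible immediately even if stdout is buffered.
    print(f"[my_plugin] Started, server at {server_endpoint(cfg)}", flush=True)
```

Returning an empty dict when the variable is missing lets the same script run standalone during development, outside the server.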

This introduces RuntimePluginConfig and new CLI flags (--runtime-plugin-locations, --runtime-plugin-config) in MP config parsing, wires plugin lifecycle into multiprocess/http_server.py startup/shutdown via a new MPRuntimePluginLauncher, and updates the base RuntimePluginLauncher to support role=None (disabling role/worker filename filtering) and to force unbuffered Python output for real-time log capture. Docs, examples, and unit tests are added to document and validate the MP launcher behavior.
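The launch step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `RuntimePluginLauncher` or `MPRuntimePluginLauncher`: the function names (`build_env`, `launch_plugin`, `run_demo`) and the extension-to-interpreter table are assumptions, chosen to match the interpreters shown in the test log (`python` for `.py`, `/bin/bash` for `.sh`).

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical mapping from script extension to interpreter.
INTERPRETERS = {".py": sys.executable, ".sh": "/bin/bash"}


def build_env(config, base_env=None):
    """Copy the environment, then add the config blob and unbuffered mode."""
    env = dict(os.environ if base_env is None else base_env)
    env["LMCACHE_RUNTIME_PLUGIN_CONFIG"] = json.dumps(config)
    # Without this, a Python child may hold its output in a buffer,
    # so the parent would not see log lines in real time.
    env["PYTHONUNBUFFERED"] = "1"
    return env


def launch_plugin(path, config):
    """Start one plugin script; return None for unsupported extensions."""
    interpreter = INTERPRETERS.get(os.path.splitext(path)[1])
    if interpreter is None:
        return None
    return subprocess.Popen(
        [interpreter, path],
        env=build_env(config),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stream
        text=True,
    )


def run_demo():
    """Launch a throwaway Python plugin and return its output lines."""
    source = (
        "import json, os\n"
        'print(json.loads(os.environ["LMCACHE_RUNTIME_PLUGIN_CONFIG"])["port"])\n'
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_plugin.py")
        with open(path, "w") as f:
            f.write(source)
        proc = launch_plugin(path, {"port": 5555})
        out, _ = proc.communicate(timeout=30)
    return out.splitlines()
```

A long-running launcher would read `proc.stdout` line by line and forward each line to its logger instead of calling `communicate()`, and would terminate the child processes on server shutdown.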

Reviewed by Cursor Bugbot for commit 67d201a. Bugbot is set up for automated code reviews on this repo.

gemini-code-assist

Code Review

This pull request introduces a multiprocess runtime plugin framework, including the new MPRuntimePluginLauncher class and integration into the cache server and HTTP server lifecycles. It also extends the server configuration to support plugin locations via CLI arguments. Feedback indicates that the new feature lacks required unit tests and design documentation. Additionally, several functions, including the new launcher's constructor and the updated run_cache_server implementations, require updated type hints and docstrings to comply with the project's style guide.

maobaolong · 2026-04-07T09:15:10Z

@sammshen @chunxiaozheng Would you like to take a look at this PR? Thanks!

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

sammshen · 2026-04-15T04:33:31Z

Particularly for MP mode, now that we are introducing the LMCache CLI, I believe modifications to config internals or running sidecars should go through that interface. Using Kubernetes as an example: etcd is only accessed through the kube-apiserver, and the controllers, scheduler, and kubelet all change the cluster through that controlled interface.

In the future, maybe we could start with a concrete design that states which existing modules a brand-new feature will plug into, with a clear definition of its inputs and outputs?

cc @ApostaC @KuntaiDu

ApostaC

Otherwise LGTM

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit adec2fb.

chunxiaozheng

lgtm

gemini-code-assist Bot reviewed Apr 5, 2026

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py Outdated

Comment thread lmcache/v1/multiprocess/blend_server_v2.py Outdated

Comment thread lmcache/v1/multiprocess/server.py Outdated

cursor Bot reviewed Apr 5, 2026

Comment thread lmcache/v1/multiprocess/server.py

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py

cursor Bot reviewed Apr 5, 2026

Comment thread tests/v1/multiprocess/test_mp_runtime_plugin_launcher.py

maobaolong force-pushed the mp_plugin branch from 4f067de to 8278207 Compare April 5, 2026 02:48

maobaolong requested review from ApostaC, deng451e, hickeyma, royyhuang and sammshen as code owners April 7, 2026 08:21

cursor Bot reviewed Apr 7, 2026

Comment thread tests/v1/multiprocess/test_mp_runtime_plugin_launcher.py Outdated

cursor Bot reviewed Apr 7, 2026

Comment thread lmcache/v1/multiprocess/server.py

cursor Bot reviewed Apr 7, 2026

Comment thread lmcache/v1/plugin/runtime_plugin_launcher.py Outdated

cursor Bot reviewed Apr 7, 2026

Comment thread lmcache/v1/multiprocess/server.py Outdated

maobaolong added a commit to maobaolong/LMCache that referenced this pull request Apr 8, 2026

Backport: [MP] Introduce MP runtime plugin framework LMCache#2956

46f873a

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong added the mp label (Buildkite trigger for multi-processing mode test) Apr 9, 2026

maobaolong force-pushed the mp_plugin branch from 9ed3161 to d8f4985 Compare April 20, 2026 02:51

cursor Bot reviewed Apr 20, 2026

Comment thread lmcache/v1/multiprocess/config.py

ApostaC approved these changes Apr 20, 2026

Comment thread lmcache/v1/multiprocess/http_server.py Outdated

cursor Bot reviewed Apr 21, 2026

Comment thread docs/design/v1/multiprocess/mp_runtime_plugin.md

cursor Bot reviewed Apr 21, 2026

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py

maobaolong force-pushed the mp_plugin branch from 6e35c0e to 4c7bb72 Compare April 21, 2026 11:13

cursor Bot reviewed Apr 21, 2026

Comment thread lmcache/v1/multiprocess/mp_runtime_plugin_launcher.py

maobaolong and others added 6 commits April 22, 2026 07:54

Introduce MP runtime plugin framework

4089f8d

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

fix comment

52c1a92

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Add examples

89fa215

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Add examples

bcdf1a6

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Add examples

0b29435

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

fix suggestion

688ac9f

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong added 10 commits April 22, 2026 07:54

fix suggestion

93f9e0c

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

fix suggestion

6094efb

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Fix UT

9c4e487

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Revert the change of server.py

fe554d3

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Revert the change of blend_server_v2.py

1634117

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Address comments: improve design doc and use for http_server only

94b5124

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Rename the http_config

fae08d0

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Move design doc to the correct place.

e5512fc

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

fix to_json

9978155

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

Add log for MPRuntimePluginLauncher

adec2fb

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong force-pushed the mp_plugin branch from cf8d152 to adec2fb Compare April 21, 2026 23:54

cursor Bot reviewed Apr 22, 2026

Comment thread examples/mp_runtime_plugins/README.md

chunxiaozheng approved these changes Apr 22, 2026

Merge branch 'dev' into mp_plugin

67d201a

ApostaC enabled auto-merge (squash) April 22, 2026 20:11

github-actions Bot added the full label (Run comprehensive tests on this PR) Apr 22, 2026

ApostaC merged commit 6fbec46 into LMCache:dev Apr 22, 2026
51 of 58 checks passed

This was referenced Apr 22, 2026

[MP]: Refactor http server to make it extensible #3017

Merged

[Chore][docs] daily drift check — multi-process mode (2026-04-24) #3132

Merged

4 participants