[LMCache] Pass TP size in lookup for MLA multi-reader locking by maobaolong · Pull Request #36129 · vllm-project/vllm

maobaolong · 2026-03-05T11:31:04Z

Summary

When MLA (Multi-head Latent Attention) is enabled, the world_size passed to LMCache is divided by tp_size, since all TP ranks share the same KV cache in MLA models. However, the read lock count during lookup still needs the original TP size, so that enough locks are acquired for every worker that will independently retrieve the same cached chunks.

This PR adds a tp_size parameter to LMCacheMPSchedulerAdapter and propagates it through IPCCacheEngineKey so the LMCache server can use it for multi-reader locking.
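To make the mismatch concrete, here is a minimal arithmetic sketch. The helper name `world_size_for_lmcache` and the `use_mla` flag are hypothetical and used only for illustration; they are not taken from vLLM or LMCache.

```python
def world_size_for_lmcache(world_size: int, tp_size: int, use_mla: bool) -> int:
    """Return the world size reported to LMCache (hypothetical helper)."""
    if use_mla:
        # All TP ranks share one KV cache, so the TP dimension is folded out.
        return world_size // tp_size
    return world_size


# Example: a pure tensor-parallel deployment with 4 ranks and MLA enabled.
world_size, tp_size = 4, 4
reported = world_size_for_lmcache(world_size, tp_size, use_mla=True)

# LMCache sees a world size of 1, yet 4 workers still retrieve each chunk,
# so a lock count derived from the reported world size would be too small.
print(f"reported world_size={reported}, workers reading each chunk={tp_size}")
```

This is why the original TP size has to travel alongside the reduced world size.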

Changes

vLLM side (this PR)

create_scheduler_adapter: extracts tp_size from parallel_config and passes it to the scheduler adapter
_adapter_accepts_tp_size(): inspect-based compatibility check so newer vLLM works with older LMCache
Fallback LMCacheMPSchedulerAdapter: accepts tp_size and propagates it in key creation
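An inspect-based compatibility check of this kind could look roughly like the sketch below. The two adapter classes, their constructor arguments, and the `build_adapter` helper are stand-ins invented for illustration; the real `_adapter_accepts_tp_size()` in the PR may be structured differently.

```python
import inspect


class OldAdapter:
    """Stand-in for an adapter from an older LMCache (no tp_size)."""

    def __init__(self, server_url, world_size):
        self.server_url = server_url
        self.world_size = world_size


class NewAdapter:
    """Stand-in for an adapter from a newer LMCache (accepts tp_size)."""

    def __init__(self, server_url, world_size, tp_size=1):
        self.server_url = server_url
        self.world_size = world_size
        self.tp_size = tp_size


def adapter_accepts_tp_size(adapter_cls) -> bool:
    """Check whether the adapter constructor declares a tp_size parameter."""
    params = inspect.signature(adapter_cls.__init__).parameters
    return "tp_size" in params


def build_adapter(adapter_cls, server_url, world_size, tp_size):
    """Pass tp_size only when the installed adapter understands it."""
    kwargs = {}
    if adapter_accepts_tp_size(adapter_cls):
        kwargs["tp_size"] = tp_size
    return adapter_cls(server_url, world_size, **kwargs)


old = build_adapter(OldAdapter, "tcp://localhost:5555", 1, 4)
new = build_adapter(NewAdapter, "tcp://localhost:5555", 1, 4)
print(hasattr(old, "tp_size"), new.tp_size)
```

Checking the signature rather than a version string keeps the caller independent of how LMCache numbers its releases.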

LMCache side (companion PR: LMCache/LMCache#2697)

IPCCacheEngineKey: adds tp_size: int = 1 field (backward compatible default)
LMCacheMPSchedulerAdapter: accepts tp_size parameter
server.py lookup(): uses key.tp_size instead of key.world_size for extra_readers
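The LMCache-side changes can be pictured with the following sketch. `KeySketch`, `ReadLockTable`, and this `lookup` function are hypothetical simplifications, not the actual `IPCCacheEngineKey` or `server.py` code, and the exact relationship between `tp_size` and `extra_readers` is not shown in this PR, so the sketch simply takes one read lock per TP worker.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class KeySketch:
    """Simplified stand-in for a cache key carrying the TP size."""

    chunk_hash: int
    world_size: int
    tp_size: int = 1  # default keeps callers that never set it working


class ReadLockTable:
    """Counts outstanding readers per chunk (a toy multi-reader lock)."""

    def __init__(self):
        self._readers = defaultdict(int)

    def acquire(self, chunk_hash, count):
        self._readers[chunk_hash] += count

    def release(self, chunk_hash):
        if self._readers[chunk_hash] <= 0:
            raise RuntimeError("release without matching acquire")
        self._readers[chunk_hash] -= 1

    def held(self, chunk_hash):
        return self._readers[chunk_hash]


def lookup(locks, key):
    """Take one read lock per TP worker that will retrieve the chunk."""
    locks.acquire(key.chunk_hash, key.tp_size)


locks = ReadLockTable()
key = KeySketch(chunk_hash=42, world_size=1, tp_size=4)
lookup(locks, key)
print(locks.held(42))  # one lock per TP worker

# Each worker releases its own lock after retrieving the chunk.
for _ in range(key.tp_size):
    locks.release(42)
print(locks.held(42))
```

Had the lock count been taken from `world_size` (1 in this MLA example), the first worker's release would drop the count to zero while three other workers were still reading.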

Backward Compatibility

New vLLM + old LMCache: _adapter_accepts_tp_size() detects the old signature and skips passing tp_size
Old vLLM + new LMCache: tp_size defaults to 1 in both the adapter and the key
New vLLM + new LMCache: full TP-aware multi-reader locking enabled

gemini-code-assist

Code Review

This pull request introduces the tp_size parameter to LMCacheMPSchedulerAdapter to enable multi-reader locking in LMCache for MLA models, incorporating a backward-compatibility check. While no security vulnerabilities were identified, a critical issue remains: the LMCacheMPWorkerAdapter has not been updated to handle tp_size. This omission is likely to cause key mismatches between scheduler lookups and worker retrievals, thereby breaking the cache functionality.
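The kind of mismatch the reviewer describes can be illustrated with a toy key type. This sketch assumes that tp_size takes part in key equality and hashing, which the PR text does not confirm for the real IPCCacheEngineKey; `KeySketch` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeySketch:
    """Toy key in which tp_size participates in equality and hashing."""

    chunk_hash: int
    tp_size: int = 1


store = {}

# Scheduler side: builds the key with the real TP size.
scheduler_key = KeySketch(chunk_hash=0xABC, tp_size=4)
store[scheduler_key] = "cached chunk"

# Worker side: not updated, so the key falls back to the default tp_size.
worker_key = KeySketch(chunk_hash=0xABC)

print(worker_key in store)  # False: same chunk, different key
```

If the real key compares only on fields that both sides set identically, this problem would not arise.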

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong · 2026-03-05T13:06:55Z

@ApostaC Would you like to take a look at this PR? Thanks

ApostaC

LGTM! Let's merge it after the LMCache-side PR is merged.

maobaolong · 2026-03-09T08:20:04Z

@ApostaC Thanks for the review, the LMCache side has been merged.

The ci/pr check looks flaky. Could you help trigger a retry? Thanks a lot.


maobaolong requested review from ApostaC, NickLucche and orozery as code owners March 5, 2026 11:31

mergify Bot added the kv-connector label Mar 5, 2026

gemini-code-assist Bot reviewed Mar 5, 2026


Comment thread vllm/distributed/kv_transfer/kv_connector/v1/lmcache_mp_connector.py

maobaolong force-pushed the lmcache/pass-tp-size-in-lookup branch from 302c5d5 to cea1669 Compare March 5, 2026 11:47

Pass TP size to LMCacheMPSchedulerAdapter for MLA multi-reader locking

730d142

Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong force-pushed the lmcache/pass-tp-size-in-lookup branch from cea1669 to 730d142 Compare March 5, 2026 11:57

maobaolong mentioned this pull request Mar 5, 2026

Fix to support mla multiple tp failed to read issue LMCache/LMCache#2697

Merged

2 tasks

ApostaC approved these changes Mar 6, 2026


ApostaC added the ready label (ONLY add when PR is ready to merge/full CI is needed) Mar 6, 2026

ApostaC added 3 commits March 9, 2026 18:01

Merge branch 'main' into lmcache/pass-tp-size-in-lookup

6ef43b9

Merge branch 'main' into lmcache/pass-tp-size-in-lookup

f3c7d75

Merge branch 'main' into lmcache/pass-tp-size-in-lookup

d5679ef

ApostaC enabled auto-merge (squash) March 11, 2026 17:49

ApostaC merged commit 12001f2 into vllm-project:main Mar 11, 2026
48 checks passed

mtparet pushed a commit to blackfuel-ai/vllm that referenced this pull request Apr 9, 2026

[LMCache] Pass TP size in lookup for MLA multi-reader locking (vllm-p…

a79ae21

…roject#36129) Signed-off-by: baoloongmao <baoloongmao@tencent.com> Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>

ApostaC merged 4 commits into vllm-project:main from maobaolong:lmcache/pass-tp-size-in-lookup
