Copy a snapshot of lmcache_mp_connector.py for vllm 0.18.0 #2887
ApostaC merged 2 commits into LMCache:dev
Code Review
This pull request introduces the LMCacheMPConnector to integrate LMCache with vLLM's multi-process KV transfer system, including request tracking and metadata management. The reviewer identified several style guide violations, such as missing docstrings for public methods and properties, and a missing return type hint for the wait_for_save method.
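The kind of fix the review asks for can be sketched as follows. This is only an illustration under stated assumptions: `ExampleConnector`, `pending_saves`, `missing_docstrings`, and `has_return_hint` are hypothetical names invented for this example and are not taken from the PR or from vLLM. Only the method name `wait_for_save` comes from the review summary, and its body here is a placeholder.

```python
import inspect
from typing import get_type_hints


class ExampleConnector:
    """Hypothetical stand-in class; not the real LMCacheMPConnector."""

    def __init__(self) -> None:
        self._pending_saves: list[str] = []

    @property
    def pending_saves(self) -> int:
        """Number of save operations that have not completed yet."""
        return len(self._pending_saves)

    def wait_for_save(self) -> None:
        """Block until every pending save operation has finished.

        In this sketch, "finishing" only means clearing the list.
        """
        self._pending_saves.clear()


def missing_docstrings(cls: type) -> list[str]:
    """Return names of public methods and properties of cls without a docstring."""
    missing = []
    for name, member in vars(cls).items():
        if name.startswith("_"):
            continue  # skip private and dunder members
        is_public_api = inspect.isfunction(member) or isinstance(member, property)
        if is_public_api and not (member.__doc__ or "").strip():
            missing.append(name)
    return missing


def has_return_hint(func) -> bool:
    """Report whether func declares a return type annotation."""
    return "return" in get_type_hints(func)
```

A check like `missing_docstrings(ExampleConnector)` could back a small CI test, although the style guide itself may be enforced differently in the repository.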
This file is copied from the vLLM repository. Should I fix the code suggestions from gemini-bot? @ApostaC
ApostaC left a comment
@maobaolong I have left a comment in vllm-project/vllm#38314. I think that PR looks good to me, but it's better to hear another owner's opinion.
In the worst case, we could rename the class here to something like LMCacheMPConnectorDev to avoid naming conflicts. WDYT?
@ApostaC It would be better to rename the file, but not the class, since …
…MCache#2887 Signed-off-by: baoloongmao <baoloongmao@tencent.com>
ApostaC left a comment
LGTM! We probably want to have some CI for this file as the next step.
@sammshen Baolong and his team may want a dev-version of the …
Signed-off-by: baoloongmao <baoloongmao@tencent.com>
Signed-off-by: baoloongmao <baoloongmao@tencent.com>
20a53b3 to c858999
What this PR does / why we need it:
Special notes for your reviewers:
If applicable:
Note
Medium Risk
Adds a large new vLLM integration module that implements async KV cache lookup/retrieve/store orchestration; while mostly additive, it touches performance- and correctness-sensitive cache-transfer logic (GPU/ZMQ/stream synchronization).
Overview
Adds a version-pinned LMCacheMPConnector implementation (lmcache_mp_connector_0180.py) for vLLM 0.18.0, including request tracking/state transitions, lookup vs. vLLM-hit reconciliation, batched async retrieve/store submission, and MLA-aware rank/world-size handling.

Updates tooling configs to exclude lmcache_mp_connector_*.py snapshots from pre-commit, ruff, mypy, and codespell, reducing lint/typecheck noise for these vendored compatibility files.

Reviewed by Cursor Bugbot for commit c858999.
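To see which files the snapshot pattern from the overview covers, the glob can be tried with Python's standard `fnmatch` module. This is a minimal sketch, not the repository's configuration: each tool (pre-commit, ruff, mypy, codespell) has its own exclude syntax, so the actual config entries may look different. The helper name `is_snapshot` and the example directory `some/dir/` are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Glob quoted from the PR overview; it requires an underscore-separated
# suffix after "lmcache_mp_connector".
SNAPSHOT_PATTERN = "lmcache_mp_connector_*.py"


def is_snapshot(path: str) -> bool:
    """Return True if the file name of path matches the snapshot glob."""
    filename = PurePosixPath(path).name  # compare the base name only
    return fnmatchcase(filename, SNAPSHOT_PATTERN)


if __name__ == "__main__":
    for candidate in (
        "lmcache_mp_connector_0180.py",  # version-pinned snapshot
        "lmcache_mp_connector.py",       # no version suffix
        "some/dir/lmcache_mp_connector_0180.py",  # hypothetical location
    ):
        print(candidate, is_snapshot(candidate))
```

Under this glob, the version-pinned file lmcache_mp_connector_0180.py matches, while a file named lmcache_mp_connector.py without a suffix does not, so only the snapshot copies would be skipped by the linters.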