
[Chore][Revert] MP adapter signature shim from #3100 (#3111)

Merged

ApostaC merged 1 commit into LMCache:dev from sammshen:revert-3100-mp-adapter-only on Apr 23, 2026

Conversation

@sammshen
Contributor

@sammshen sammshen commented Apr 22, 2026

PR #3100 bundled two unrelated changes: the CI sitecustomize/SCM fix and a backward-compat shim on LMCacheMPScheduler/WorkerAdapter that accepted the old positional (world_size, kv_rank, vllm_block_size) call form used by the vllm-bundled lmcache_mp_connector.

This reverts only the adapter shim (restores the pre-#3100 signature requiring parallel_strategy). The CI sitecustomize and SETUPTOOLS_SCM_PRETEND_VERSION changes from #3100 are kept.

Callers on an un-updated vllm that still pass the old positional args will break until the vllm side is updated to pass parallel_strategy.

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Breaking change to LMCacheMPSchedulerAdapter/LMCacheMPWorkerAdapter constructor signatures; older vLLM callers using positional (world_size, kv_rank, vllm_block_size) will fail until updated.

Overview
Restores the pre-#3100 API for vLLM multiprocess adapters by removing the backward-compat constructor shim on LMCacheMPSchedulerAdapter and LMCacheMPWorkerAdapter.

Both constructors now require vllm_block_size and a non-optional parallel_strategy, and no longer accept/derive values from world_size, kv_rank, or tp_size (including dropping related typing/imports).
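To illustrate the reverted call convention, here is a minimal sketch. The `ParallelStrategy` fields and the adapter body are simplified stand-ins (the real lmcache classes also take `server_url`/`context`/timeout handling as described in this PR), not the actual implementation:

```python
from dataclasses import dataclass

@dataclass
class ParallelStrategy:
    """Simplified stand-in for lmcache's ParallelStrategy; the real class
    also carries use_mla and other fields."""
    kv_world_size: int = 1
    kv_worker_id: int = 0
    use_mla: bool = False

class LMCacheMPWorkerAdapter:
    """Sketch of the reverted constructor shape: vllm_block_size and
    parallel_strategy are required, and world_size/kv_rank/tp_size are gone."""
    def __init__(self, server_url, context, model_name,
                 vllm_block_size: int, parallel_strategy: ParallelStrategy):
        self.vllm_block_size = vllm_block_size
        self.parallel_strategy = parallel_strategy

# New-style call: the caller constructs ParallelStrategy itself instead of
# passing world_size/kv_rank positionally.
adapter = LMCacheMPWorkerAdapter(
    "tcp://localhost:5555", None, "llama-3",
    vllm_block_size=16,
    parallel_strategy=ParallelStrategy(kv_world_size=2, kv_worker_id=0),
)
```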

Reviewed by Cursor Bugbot for commit 18fbeab.

Collaborator

@deng451e deng451e left a comment


LGTM


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


```diff
-    tp_size: int = 1,
-    parallel_strategy: Optional[ParallelStrategy] = None,
+    vllm_block_size: int,
+    parallel_strategy: ParallelStrategy,
```

In-repo connector still uses old adapter call convention

High Severity

The signature of LMCacheMPSchedulerAdapter and LMCacheMPWorkerAdapter was reverted to require (vllm_block_size, parallel_strategy), but the in-repo caller lmcache_mp_connector_0180.py (create_scheduler_adapter / create_worker_adapter) still passes (world_size, kv_rank, block_size) positionally. This causes kv_rank (an int) to land on parallel_strategy (expects ParallelStrategy), and block_size to collide with the keyword mq_timeout, producing a TypeError at runtime. The connector file needs to be updated to construct a ParallelStrategy and pass the new signature.
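The failure mode can be reproduced with a self-contained sketch. Note this stand-in raises the `TypeError` via an explicit `isinstance` check for clarity; in the real adapter the error would surface when the int is used as a `ParallelStrategy`:

```python
from dataclasses import dataclass

@dataclass
class ParallelStrategy:
    # Hypothetical stand-in; field names follow the docstring quoted in this PR
    kv_world_size: int = 1
    kv_worker_id: int = 0

DEFAULT_MQ_TIMEOUT = 1.0

class LMCacheMPSchedulerAdapter:
    # Sketch of the reverted signature
    def __init__(self, server_url, context, model_name,
                 vllm_block_size: int,
                 parallel_strategy: ParallelStrategy,
                 mq_timeout: float = DEFAULT_MQ_TIMEOUT):
        if not isinstance(parallel_strategy, ParallelStrategy):
            raise TypeError(
                f"parallel_strategy must be ParallelStrategy, "
                f"got {type(parallel_strategy).__name__}"
            )

# Old positional call (world_size=2, kv_rank=0, block_size=16): world_size
# lands on vllm_block_size, kv_rank lands on parallel_strategy, and
# block_size lands on mq_timeout -- the int triggers the TypeError.
try:
    LMCacheMPSchedulerAdapter("tcp://localhost:5555", None, "llama", 2, 0, 16)
except TypeError as exc:
    print(f"TypeError: {exc}")
```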


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the LMCacheMPSchedulerAdapter and LMCacheMPWorkerAdapter classes by removing legacy parameters and making vllm_block_size and parallel_strategy mandatory. The review identifies that these changes break existing internal integrations in lmcache_mp_connector_0180.py and that the LMCacheMPWorkerAdapter constructor lacks a required docstring per the project's style guide.

Comment on lines 213 to 222

```diff
 def __init__(
     self,
     server_url: str,
     context: zmq.Context,
     model_name: str,
-    world_size: int = 1,
-    kv_rank: int = 0,
-    vllm_block_size: int = 16,
-    tp_size: int = 1,
-    parallel_strategy: Optional[ParallelStrategy] = None,
+    vllm_block_size: int,
+    parallel_strategy: ParallelStrategy,
     mq_timeout: float = DEFAULT_MQ_TIMEOUT,
     heartbeat_interval: float = DEFAULT_HEARTBEAT_INTERVAL,
 ):
```
Contributor


High severity

This change breaks the internal integration code in lmcache/integration/vllm/lmcache_mp_connector_0180.py. The create_scheduler_adapter function (lines 123-133) still passes arguments using the old signature, which will now cause a TypeError because it attempts to pass an integer (kv_rank) to the parallel_strategy parameter. Since this connector is part of the repository, it should be updated in this PR to maintain a working state.
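One way the connector side could be brought in line, sketched with a hypothetical stand-in `ParallelStrategy` (the actual adapter construction is elided, since this is not the real `create_scheduler_adapter` implementation): translate the legacy `(world_size, kv_rank, block_size)` inputs into the new required arguments before constructing the adapter.

```python
from dataclasses import dataclass

@dataclass
class ParallelStrategy:
    # Hypothetical stand-in for the real lmcache class
    kv_world_size: int = 1
    kv_worker_id: int = 0

def create_scheduler_adapter(server_url, context, model_name,
                             world_size, kv_rank, block_size):
    # Build a ParallelStrategy from the connector's legacy inputs,
    # then pass it under the new required signature, e.g.:
    #   LMCacheMPSchedulerAdapter(server_url, context, model_name,
    #                             vllm_block_size=block_size,
    #                             parallel_strategy=strategy)
    strategy = ParallelStrategy(kv_world_size=world_size, kv_worker_id=kv_rank)
    return strategy  # stand-in return for this sketch

strategy = create_scheduler_adapter("tcp://localhost:5555", None, "m", 4, 1, 16)
```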

Comment on lines 555 to 564

```diff
 def __init__(
     self,
     server_url: str,
     context: zmq.Context,
     model_name: str,
-    world_size: int = 1,
-    kv_rank: int = 0,
-    vllm_block_size: int = 16,
-    tp_size: int = 1,
-    parallel_strategy: Optional[ParallelStrategy] = None,
+    vllm_block_size: int,
+    parallel_strategy: ParallelStrategy,
     mq_timeout: float = DEFAULT_MQ_TIMEOUT,
     heartbeat_interval: float = DEFAULT_HEARTBEAT_INTERVAL,
 ):
```
Contributor


High severity

This change breaks create_worker_adapter in lmcache/integration/vllm/lmcache_mp_connector_0180.py (lines 148-157) because the positional arguments no longer match the new signature. Additionally, per the repository style guide (line 26), this public method should include a docstring describing its arguments.

```python
    def __init__(
        self,
        server_url: str,
        context: zmq.Context,
        model_name: str,
        vllm_block_size: int,
        parallel_strategy: ParallelStrategy,
        mq_timeout: float = DEFAULT_MQ_TIMEOUT,
        heartbeat_interval: float = DEFAULT_HEARTBEAT_INTERVAL,
    ):
        """
        Args:
            server_url: The server URL for the LMCache message queue
            context: The ZMQ context
            model_name: The model name used for LMCache keys
            vllm_block_size: The block size used in vLLM
            parallel_strategy:
                The parallel strategy, which includes use_mla,
                kv_world_size, kv_worker_id and so on
            mq_timeout: Timeout in seconds for message queue requests.
            heartbeat_interval: Interval in seconds between heartbeat pings.
        """
```

References
  1. All new public functions have docstrings (what, args, return, exceptions)

@sammshen sammshen added the full Run comprehensive tests on this PR label Apr 22, 2026
Contributor

@ApostaC ApostaC left a comment


LGTM!

@ApostaC ApostaC enabled auto-merge (squash) April 22, 2026 21:11
@ApostaC ApostaC merged commit afd5c4b into LMCache:dev Apr 23, 2026
39 of 41 checks passed
