
fix: relax worker port count assertion#2867

Merged
DongDongJu merged 2 commits into LMCache:dev from can-sun:fix/relax-worker-port-assertion
Apr 1, 2026

Conversation

@can-sun
Contributor

@can-sun can-sun commented Mar 25, 2026

Problem

The strict == assertions on lmcache_worker_ports length in LMCacheWorker reject valid configurations where more ports are provided than the effective world_size or worker count.

This is problematic in two scenarios:

  1. MLA models reduce the effective world_size (vLLM divides by tp_size), so users providing ports based on total GPU count (TP × PP) hit the assertion. For example, on a p5 instance with 8 GPUs, providing 8 ports fails when MLA reduces world_size to 1.
  2. Config reuse across instance types — a single config with 8 ports (for p5) fails on g5.24xlarge which only has 4 GPUs.
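The failure mode in both scenarios can be sketched as follows. This is an illustrative simplification, not the actual worker.py code; the function name and parameters are hypothetical stand-ins for the real config/metadata fields:

```python
# Illustrative sketch of the port-count check (names simplified from worker.py).
def check_ports(worker_ports, world_size, strict=True):
    """Validate the worker port list against the effective world size."""
    if strict:
        # Old behavior: exact match required, so extra ports are rejected.
        assert len(worker_ports) == world_size, "port count mismatch"
    else:
        # New behavior: extra ports are tolerated and simply left unused.
        assert len(worker_ports) >= world_size, "too few ports"


ports = [8100 + i for i in range(8)]  # 8 ports, one per GPU on a p5

# MLA reduces the effective world_size to 1, so the strict check rejects
# an otherwise valid config...
try:
    check_ports(ports, world_size=1, strict=True)
except AssertionError:
    print("strict == check fails")

# ...while the relaxed check accepts it.
check_ports(ports, world_size=1, strict=False)
print("relaxed >= check passes")
```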

Fix

Change == to >= in both assertions in worker.py:

  • len(config.lmcache_worker_ports) >= metadata.world_size
  • len(config.lmcache_worker_ports) >= len(lmcache_worker_ids)

Extra ports are simply unused — they are never bound to sockets or registered with the controller. The port selection logic indexes by worker_id or lmcache_worker_ids.index(), so extra entries are harmless. This is consistent with how P2P ports (p2p_init_ports, p2p_lookup_ports) already work.
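Why extra entries are harmless can be sketched with a toy version of the index-based selection. The helper below is an assumed shape for illustration only; in LMCache the real logic lives inside LMCacheWorker:

```python
# Sketch of index-based port selection (illustrative; the real code indexes
# by worker_id or lmcache_worker_ids.index(), per the PR description).
def select_port(worker_ports, lmcache_worker_ids, worker_id):
    # Each worker picks the port at its own index. Entries past the last
    # worker index are never read, bound, or registered, so a longer port
    # list than worker list is safe.
    return worker_ports[lmcache_worker_ids.index(worker_id)]


ports = [8100, 8101, 8102, 8103, 8104, 8105, 8106, 8107]  # 8 configured
worker_ids = [0, 1]                                       # 2 effective workers

assert len(ports) >= len(worker_ids)       # relaxed >= check passes
print(select_port(ports, worker_ids, 1))   # -> 8101; ports[2:] stay unused
```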

Split out from #2732 per reviewer feedback.


Note

Low Risk
Low risk: this change only loosens config validation for lmcache_worker_ports and does not touch the runtime socket-binding logic. The main risk is that a misconfigured port list that is too short for the selected workers could slip through and only surface later as an index error elsewhere.

Overview
Relaxes LMCacheWorker startup validation so lmcache_worker_ports can contain more entries than the effective world_size or explicit lmcache_worker_ids list.

The strict == length checks are replaced with >=, with updated inline documentation explaining why extra ports are safe (especially for MLA where vLLM may report a reduced world_size).


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the flexibility of LMCacheWorker port configurations by relaxing strict assertion checks. Previously, the system would reject valid setups if the number of provided lmcache_worker_ports did not exactly match the effective world_size or worker count. This change addresses issues encountered with MLA models and configuration reuse across varied instance types, allowing users to provide more ports than strictly required without causing assertion failures.

Highlights

  • Port Assertion Relaxation: The strict equality ("==") assertions for lmcache_worker_ports length in LMCacheWorker have been relaxed to greater-than-or-equal (">="). This allows for more flexible port configurations, preventing errors in scenarios like MLA models reducing world_size or reusing configurations across different instance types.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request modifies lmcache/v1/cache_controller/worker.py to relax the assertion conditions for lmcache_worker_ports. The changes allow the number of configured worker ports to be greater than or equal to the number of active workers, providing more flexibility in port allocation. There is no feedback to provide.

@can-sun can-sun changed the title from "Relax worker port count assertion" to "fix: Relax worker port count assertion" Mar 25, 2026
@can-sun can-sun changed the title from "fix: Relax worker port count assertion" to "fix: relax worker port count assertion" Mar 25, 2026
@sammshen
Contributor

I think this makes sense, but could you add some documentation in the code, above the conditional, on exactly how the number of worker ports is decided and on the semantics of world_size for different models?

Collaborator

@DongDongJu DongDongJu left a comment


So IIUC, this is a user error: the user configured more ports than the actual number of GPUs needed for the workload.
Is that right? If so, I'm not sure why we designed this so strictly in the first place.
@ApostaC Do you have any idea on this?

@sammshen
Contributor

@cursor review

@cursor

cursor Bot commented Mar 31, 2026

Skipping Bugbot: Bugbot is disabled for this repository. Visit the Bugbot dashboard to update your settings.

@sammshen
Contributor

@cursor review


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Comment thread lmcache/v1/cache_controller/worker.py
Signed-off-by: Can Sun <sucan@amazon.com>
@can-sun can-sun force-pushed the fix/relax-worker-port-assertion branch from 7f8ada7 to ee262c6 on April 1, 2026 16:35
Contributor

@sammshen sammshen left a comment


LGTM

Collaborator

@DongDongJu DongDongJu left a comment


LGTM

@DongDongJu DongDongJu enabled auto-merge (squash) April 1, 2026 19:56
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Apr 1, 2026
@DongDongJu DongDongJu merged commit 521c23b into LMCache:dev Apr 1, 2026
34 of 35 checks passed
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
Relax worker port count assertion

Signed-off-by: Can Sun <sucan@amazon.com>
