[Build] Add lmcache-cli lightweight wheel by deng451e · Pull Request #2959 · LMCache/LMCache

deng451e · 2026-04-05T22:05:55Z

Introduces a lightweight PyPI package lmcache-cli for CLI-only usage (no CUDA/GPU, cross-platform).
• Adds pyproject_cli.toml and requirements/cli.txt for a pure-Python wheel
• Updates publish.yml to build and publish CLI artifacts
• Updates docs with CLI-only install instructions and comparison table
• Logs a clear error when the CLI's server dependencies are missing

Note

Medium Risk
Adds a new packaging/release path and additional PyPI publish jobs, so misconfiguration could impact CI releases or publish the wrong artifacts; runtime code changes are limited to clearer dependency errors for lmcache server.

Overview
Introduces a new lmcache-cli CLI-only PyPI package built from pyproject_cli.toml with its own minimal dependencies (requirements/cli.txt), intended to work without CUDA/GPU while still providing the lmcache entry point.

Updates the publish workflow to build this CLI wheel, upload it as a build artifact, and publish it separately to both TestPyPI and PyPI (including GitHub Release uploads), and expands the docs to explain the two-package install options and warn against installing both.

Improves lmcache/cli/commands/server.py to gracefully handle missing server/CUDA dependencies by skipping argument registration and exiting with an actionable error message when lmcache server is invoked in a CLI-only install.
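The handling described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in lmcache/cli/commands/server.py: the class name, the _import_server_deps helper, the --port option, and the simulated ImportError are all hypothetical and exist only for the example.

```python
import argparse
import sys


def _import_server_deps():
    """Hypothetical stand-in for importing the CUDA-backed server modules.

    It always raises here to simulate a CLI-only install.
    """
    raise ImportError("No module named 'hypothetical_cuda_ext'")


class ServerCommand:
    """Illustrative command class; not the real LMCache implementation."""

    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        try:
            _import_server_deps()
        except ImportError:
            # Dependencies are missing: skip registering server-only options.
            return
        parser.add_argument("--port", type=int, default=8000)  # assumed option

    def execute(self, args: argparse.Namespace) -> int:
        try:
            _import_server_deps()
        except ImportError as e:
            # Report on stderr so stdout stays clean for piped output.
            print(
                f"Failed to import server dependencies: {e}. "
                "Install the full lmcache package to use 'lmcache server'.",
                file=sys.stderr,
            )
            return 1  # non-zero exit status signals the failure
        return 0
```

In this sketch the parser is still usable for other commands after registration is skipped, and the failure only surfaces when the server command itself is executed.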

Reviewed by Cursor Bugbot for commit 05d31c4. Bugbot is set up for automated code reviews on this repo.

Signed-off-by: deng451e <838677410@qq.com>

gemini-code-assist

Code Review

This pull request introduces a lightweight lmcache-cli package, enabling CLI usage without GPU dependencies, and updates the documentation and build configurations accordingly. The server command now includes error handling for missing CUDA extensions, and the TTFT calculation was refined for cases with no content tokens. Feedback highlights a non-existent setuptools version in the build requirements, the unnecessary inclusion of ninja for a pure-Python package, and a Python version mismatch between the code's use of PEP 604 and the documentation. Additionally, a standard library import needs to be relocated to follow the project's import organization rules.

gemini-code-assist · 2026-04-05T22:08:25Z

+# Thus, we will still lock a torch version here because we can choose to release wheels
+# in sync with vllm and update our torch version accordingly
+requires = [
+    "ninja",


Since lmcache-cli is intended to be a lightweight, pure-Python wheel without CUDA extensions, ninja is not required in the build-system dependencies. Removing it will reduce the build environment's footprint and avoid unnecessary downloads during the build process.

ApostaC · 2026-04-06T22:40:33Z

Can we make it pip install lmcache[cli] (and pip install lmcache[all])?

sammshen

LGTM!

Signed-off-by: deng451e <838677410@qq.com>

The publish-cli-pypi job does not need a checkout: gh release upload only requires the downloaded dist/ artifacts and GITHUB_TOKEN. This aligns with publish-pypi, which has the same structure without a checkout step. The harden-runner egress policy also does not allow github.com or objects.githubusercontent.com, so the checkout would fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: deng451e <838677410@qq.com>

ApostaC

Please see the comments, thanks!

ApostaC · 2026-04-10T01:03:15Z

+        except ImportError:
+            return


Should we log any error here?

ApostaC · 2026-04-10T01:03:42Z

+# Thus, we will still lock a torch version here because we can choose to release wheels
+# in sync with vllm and update our torch version accordingly
+requires = [
+    "ninja",


Do we really need ninja for lmcache-cli?

ApostaC · 2026-04-10T01:04:46Z

@@ -0,0 +1 @@
+transformers>=4.51.1


Is it possible to remove transformers? It is also a very large dependency.
Additionally, lmcache bench needs the openai package. We probably need to revisit the dependencies.

cursor · 2026-04-10T06:12:18Z

+            id-token: write
+            contents: write
+        runs-on: ubuntu-latest
+        needs: [changes, build-cli-artifacts, test, code-quality]


CLI PyPI publish skips test PyPI validation gate

Medium Severity

The publish-cli-pypi job's needs array is [changes, build-cli-artifacts, test, code-quality], missing a dependency on publish-cli-test-pypi. The main publish-pypi job includes publish-test-pypi in its needs, ensuring a broken wheel is caught on Test PyPI before going to production. The CLI production publish has no such gate, so a broken lmcache-cli wheel could be published directly to PyPI without passing Test PyPI validation first.

Additional Locations (1)

.github/workflows/publish.yml#L237-L251
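The missing gate can be expressed as a small lint-style check. The sketch below is hypothetical and is not part of the PR: it works on a plain dictionary shaped like the jobs section of a workflow, the needs list for publish-cli-pypi is copied from the comment above, and the publish-pypi entry is abbreviated to the one dependency the comment mentions. The function name and the gates mapping are assumptions for illustration.

```python
def missing_gates(jobs: dict, gates: dict) -> list:
    """Return (job, gate) pairs where the job's `needs` does not list the gate."""
    missing = []
    for job, gate in gates.items():
        needs = jobs.get(job, {}).get("needs", [])
        if isinstance(needs, str):
            # GitHub Actions accepts either a single string or a list for `needs`.
            needs = [needs]
        if gate not in needs:
            missing.append((job, gate))
    return missing


# Abbreviated, illustrative view of the workflow's jobs.
jobs = {
    # Only the dependency named in the review comment is shown here.
    "publish-pypi": {"needs": ["publish-test-pypi"]},
    "publish-cli-pypi": {
        "needs": ["changes", "build-cli-artifacts", "test", "code-quality"]
    },
}

# Each production publish job and the Test PyPI job expected to gate it.
gates = {
    "publish-pypi": "publish-test-pypi",
    "publish-cli-pypi": "publish-cli-test-pypi",
}

print(missing_gates(jobs, gates))
```

Under these assumptions the check reports only the CLI publish job, which matches the asymmetry the comment points out.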

Reviewed by Cursor Bugbot for commit fdaa8a8.

- Make openai import lazy in bench/engine_bench (request_sender, config): any lmcache command (including --help) no longer crashes with ImportError when openai is not installed; error is deferred to lmcache bench engine invocation with a clear install hint
- Remove transformers from requirements/cli.txt (already a lazy optional import with try/except fallback in prompt.py)
- Revert lmcache/cli/request.py to dev (ttft_s = -1.0 when no token)
- pyproject_cli.toml: drop ninja (not needed for pure-Python wheel), relax setuptools to >=68.0.0

Signed-off-by: deng451e <838677410@qq.com>
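A lazy optional import of the kind this commit describes might look like the sketch below. It is an assumption-laden illustration rather than the code in bench/engine_bench: the require_optional helper, the run_bench_engine function, and the wording of the install hint are all hypothetical.

```python
import importlib


def require_optional(module_name: str, install_hint: str):
    """Import a module on demand, adding an install hint if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this command. {install_hint}"
        ) from exc


def run_bench_engine():
    """Hypothetical command body: openai is imported only when this runs."""
    openai = require_optional("openai", "Install it with: pip install openai")
    return openai
```

Because nothing imports openai at module load time, commands that never call run_bench_engine (including a --help invocation) are unaffected when the package is absent.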

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit 60249fc.

cursor · 2026-04-10T06:26:35Z

+                f"Failed to import server dependencies: {e}. "
+                "Install the full lmcache package to use 'lmcache server'."
+            )
+            return


Server import error prints to stdout on every invocation

Medium Severity

The ImportError handler in add_arguments prints to stdout (missing file=sys.stderr), unlike the analogous handler in execute which correctly uses stderr. Since BaseCommand.register() calls add_arguments for every command during startup (via main() iterating ALL_COMMANDS), this error message will appear on stdout for every lmcache-cli invocation — even unrelated commands like lmcache ping or lmcache query. This pollutes stdout and breaks piped/parsed output.
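The effect on piped output can be reproduced with a small simulation. This is a sketch under assumptions, not LMCache code: the command list, the register_all and run_ping functions, and the shortened message are hypothetical stand-ins for the startup loop the comment describes.

```python
import contextlib
import io
import sys

ALL_COMMANDS = ["ping", "query", "server"]  # illustrative names only


def register_all(use_stderr: bool) -> None:
    """Mimic startup: every command registers, and 'server' reports missing deps."""
    for name in ALL_COMMANDS:
        if name == "server":
            stream = sys.stderr if use_stderr else sys.stdout
            print("Failed to import server dependencies", file=stream)


def run_ping(use_stderr: bool) -> str:
    """Run a fake 'ping' command and return only what reached stdout."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        register_all(use_stderr)
        print("pong")  # the command's real output
    return out.getvalue()
```

With use_stderr=False the captured stdout contains the import warning ahead of the command output, which is the pollution described above; with use_stderr=True it contains only the command output.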

Reviewed by Cursor Bugbot for commit 60249fc.

Signed-off-by: deng451e <838677410@qq.com>

ApostaC

LGTM!

* cli wheel build and release
* add openai in cli dependency

Signed-off-by: deng451e <838677410@qq.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>

deng451e added 2 commits April 5, 2026 19:32

cli wheel build and release

a0857b2

Signed-off-by: deng451e <838677410@qq.com>

update doc

4ef771e

Signed-off-by: deng451e <838677410@qq.com>

deng451e requested review from ApostaC, KuntaiDu, hickeyma, royyhuang and sammshen as code owners April 5, 2026 22:05

gemini-code-assist Bot reviewed Apr 5, 2026

cursor Bot reviewed Apr 5, 2026

Comment thread requirements/cli.txt Outdated

sammshen approved these changes Apr 8, 2026

add step for release

84df850

Signed-off-by: deng451e <838677410@qq.com>

deng451e force-pushed the pip_wheel_split branch from 8f1778a to 84df850 April 9, 2026 00:52

deng451e force-pushed the pip_wheel_split branch from 8315e5b to e1bde6f April 9, 2026 01:05

ApostaC reviewed Apr 10, 2026

cursor Bot reviewed Apr 10, 2026

deng451e force-pushed the pip_wheel_split branch from fdaa8a8 to d4b16bf April 10, 2026 06:13

cursor Bot reviewed Apr 10, 2026

Comment thread pyproject_cli.toml

deng451e force-pushed the pip_wheel_split branch from d4b16bf to 60249fc April 10, 2026 06:20

cursor Bot reviewed Apr 10, 2026

deng451e and others added 2 commits April 10, 2026 21:53

add openai in cli dependency

794dcb6

Signed-off-by: deng451e <838677410@qq.com>

Merge branch 'dev' into pip_wheel_split

05d31c4

ApostaC approved these changes Apr 11, 2026

ApostaC enabled auto-merge (squash) April 11, 2026 00:44

github-actions Bot added the full label (Run comprehensive tests on this PR) Apr 11, 2026

ApostaC merged commit f70fd14 into LMCache:dev Apr 12, 2026
40 checks passed
