[Build] Add lmcache-cli lightweight wheel #2959

Merged
ApostaC merged 7 commits into LMCache:dev from deng451e:pip_wheel_split on Apr 12, 2026

Collaborator

deng451e commented Apr 5, 2026

Introduces a lightweight PyPI package lmcache-cli for CLI-only usage (no CUDA/GPU, cross-platform).
• Adds pyproject_cli.toml and requirements/cli.txt for a pure-Python wheel
• Updates publish.yml to build and publish CLI artifacts
• Updates docs with CLI-only install instructions and comparison table
• Logs a clear error when the dependencies needed by lmcache server are missing


Note

Medium Risk
Adds a new packaging/release path and additional PyPI publish jobs, so misconfiguration could impact CI releases or publish the wrong artifacts; runtime code changes are limited to clearer dependency errors for lmcache server.

Overview
Introduces a new lmcache-cli CLI-only PyPI package built from pyproject_cli.toml with its own minimal dependencies (requirements/cli.txt), intended to work without CUDA/GPU while still providing the lmcache entry point.

Updates the publish workflow to build the CLI wheel, upload it as an artifact, and publish it separately to both TestPyPI and PyPI (including GitHub Release uploads), and expands the docs to explain the two install options and warn against installing both packages.

Improves lmcache/cli/commands/server.py to gracefully handle missing server/CUDA dependencies by skipping argument registration and exiting with an actionable error message when lmcache server is invoked in a CLI-only install.

Reviewed by Cursor Bugbot for commit 05d31c4. Bugbot is set up for automated code reviews on this repo.
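
The behavior summarized above for lmcache/cli/commands/server.py (skip argument registration when imports fail, then exit with an actionable message when the command is actually run) might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: the class name ServerCommandSketch, the module name hypothetical_server_deps, and the --port flag are all invented for the example. Only the error wording is taken from the diff quoted in this conversation.

```python
import argparse
import importlib
import sys
from typing import Optional


class ServerCommandSketch:
    """Hypothetical command that needs heavy optional dependencies."""

    # Assumed placeholder; stands in for the real server/CUDA modules.
    deps_module = "hypothetical_server_deps"

    def __init__(self) -> None:
        self.import_error: Optional[ImportError] = None

    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        try:
            importlib.import_module(self.deps_module)
        except ImportError as exc:
            # CLI-only install: remember the failure and skip registration
            # so that unrelated commands keep working.
            self.import_error = exc
            return
        # Hypothetical flag, registered only when the imports succeed.
        parser.add_argument("--port", type=int, default=8000)

    def execute(self, args: argparse.Namespace) -> int:
        if self.import_error is not None:
            # Report on stderr and return a non-zero exit code.
            print(
                f"Failed to import server dependencies: {self.import_error}. "
                "Install the full lmcache package to use 'lmcache server'.",
                file=sys.stderr,
            )
            return 1
        return 0
```

Storing the exception instead of printing it during registration keeps startup quiet for every other command; the message appears only when the server command itself is invoked.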

deng451e added 2 commits April 5, 2026 19:32
Signed-off-by: deng451e <838677410@qq.com>
Signed-off-by: deng451e <838677410@qq.com>
Contributor

gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a lightweight lmcache-cli package, enabling CLI usage without GPU dependencies, and updates the documentation and build configurations accordingly. The server command now includes error handling for missing CUDA extensions, and the TTFT calculation was refined for cases with no content tokens. Feedback highlights a non-existent setuptools version in the build requirements, the unnecessary inclusion of ninja for a pure-Python package, and a Python version mismatch between the code's use of PEP 604 and the documentation. Additionally, a standard library import needs to be relocated to follow the project's import organization rules.

Comment thread pyproject_cli.toml Outdated
Comment thread pyproject_cli.toml Outdated
# Thus, we will still lock a torch version here because we can choose to release wheels
# in sync with vllm and update our torch version accordingly
requires = [
    "ninja",
Contributor

medium

Since lmcache-cli is intended to be a lightweight, pure-Python wheel without CUDA extensions, ninja is not required in the build-system dependencies. Removing it will reduce the build environment's footprint and avoid unnecessary downloads during the build process.

Comment thread pyproject_cli.toml
Comment thread lmcache/cli/commands/server.py
Comment thread requirements/cli.txt Outdated
Contributor

ApostaC commented Apr 6, 2026

Can we make it pip install lmcache[cli] (and pip install lmcache[all])?

Contributor

sammshen left a comment

LGTM!

Signed-off-by: deng451e <838677410@qq.com>
The publish-cli-pypi job does not need a checkout — gh release upload
only requires the downloaded dist/ artifacts and GITHUB_TOKEN. Aligns
with publish-pypi which has the same structure without a checkout step.
The harden-runner egress policy also does not allow github.com or
objects.githubusercontent.com, so the checkout would fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: deng451e <838677410@qq.com>
Contributor

ApostaC left a comment

Please see the comments, thanks!

Comment thread lmcache/cli/request.py Outdated
Comment thread lmcache/cli/commands/server.py Outdated
Comment on lines +50 to +51
except ImportError:
    return
Contributor

Should we log any error here?

Comment thread pyproject_cli.toml Outdated
# Thus, we will still lock a torch version here because we can choose to release wheels
# in sync with vllm and update our torch version accordingly
requires = [
    "ninja",
Contributor

Do we really need ninja for lmcache-cli?

Comment thread requirements/cli.txt Outdated
@@ -0,0 +1 @@
transformers>=4.51.1
Contributor

Is it possible to remove transformers? It is also a very large dependency.
Additionally, lmcache bench needs the openai package. We probably need to revisit the dependencies.

id-token: write
contents: write
runs-on: ubuntu-latest
needs: [changes, build-cli-artifacts, test, code-quality]
CLI PyPI publish skips test PyPI validation gate

Medium Severity

The publish-cli-pypi job's needs array is [changes, build-cli-artifacts, test, code-quality], missing a dependency on publish-cli-test-pypi. The main publish-pypi job includes publish-test-pypi in its needs, ensuring a broken wheel is caught on Test PyPI before going to production. The CLI production publish has no such gate, so a broken lmcache-cli wheel could be published directly to PyPI without passing Test PyPI validation first.


Reviewed by Cursor Bugbot for commit fdaa8a8.
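
The missing gate described in this comment is a property of the job dependency graph, so it can be checked mechanically. The sketch below is a hypothetical lint, not part of this PR or of publish.yml. The needs list for publish-cli-pypi is quoted from the comment; the other entries in JOBS are assumptions, reduced to the single edge the comment mentions for publish-pypi.

```python
from typing import Dict, List, Set

# Job graph using the job names from the review comment. Only the
# publish-cli-pypi entry is quoted from the PR; the rest are assumed.
JOBS: Dict[str, List[str]] = {
    "publish-test-pypi": [],
    "publish-pypi": ["publish-test-pypi"],
    "publish-cli-test-pypi": [],
    "publish-cli-pypi": ["changes", "build-cli-artifacts", "test", "code-quality"],
}

# Each production publish job and the Test PyPI job that should gate it.
GATES: Dict[str, str] = {
    "publish-pypi": "publish-test-pypi",
    "publish-cli-pypi": "publish-cli-test-pypi",
}


def upstream(job: str, jobs: Dict[str, List[str]]) -> Set[str]:
    """Return every job that `job` depends on, directly or transitively."""
    seen: Set[str] = set()
    stack = list(jobs.get(job, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(jobs.get(current, []))
    return seen


def missing_gates(jobs: Dict[str, List[str]], gates: Dict[str, str]) -> List[str]:
    """List production jobs that can run without their gate job finishing."""
    return [prod for prod, gate in gates.items() if gate not in upstream(prod, jobs)]
```

With the graph above, missing_gates(JOBS, GATES) flags publish-cli-pypi; adding publish-cli-test-pypi to its needs list would clear the finding.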

- Make openai import lazy in bench/engine_bench (request_sender, config):
  any lmcache command (including --help) no longer crashes with
  ImportError when openai is not installed; error is deferred to
  lmcache bench engine invocation with a clear install hint
- Remove transformers from requirements/cli.txt (already a lazy optional
  import with try/except fallback in prompt.py)
- Revert lmcache/cli/request.py to dev (ttft_s = -1.0 when no token)
- pyproject_cli.toml: drop ninja (not needed for pure-Python wheel),
  relax setuptools to >=68.0.0

Signed-off-by: deng451e <838677410@qq.com>
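
The lazy-import change in the first bullet of this commit message follows a common pattern: move the import from module level into the function that needs it, and turn a failure into an error that carries an install hint. The sketch below assumes that pattern. The helper names require and run_bench_engine and the hint text are hypothetical and do not come from lmcache.

```python
import importlib
from types import ModuleType


def require(module_name: str, install_hint: str) -> ModuleType:
    """Import a module on first use, or raise an error with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"'{module_name}' is required for this command. {install_hint}"
        ) from exc


def run_bench_engine() -> str:
    # The import happens here, not at module import time, so commands that
    # never reach this function are unaffected when openai is missing.
    openai = require("openai", "Try: pip install openai")
    return openai.__name__
```

Because nothing is imported until run_bench_engine is called, building the argument parser or printing --help does not touch the optional package.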
Comment thread pyproject_cli.toml

cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Reviewed by Cursor Bugbot for commit 60249fc.

f"Failed to import server dependencies: {e}. "
"Install the full lmcache package to use 'lmcache server'."
)
return
Server import error prints to stdout on every invocation

Medium Severity

The ImportError handler in add_arguments prints to stdout (missing file=sys.stderr), unlike the analogous handler in execute which correctly uses stderr. Since BaseCommand.register() calls add_arguments for every command during startup (via main() iterating ALL_COMMANDS), this error message will appear on stdout for every lmcache-cli invocation — even unrelated commands like lmcache ping or lmcache query. This pollutes stdout and breaks piped/parsed output.

Reviewed by Cursor Bugbot for commit 60249fc.
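
The stdout concern raised here can be demonstrated in isolation. In the sketch below, a diagnostic is written with file=sys.stderr while an unrelated command prints machine-readable output on stdout, and the two streams are captured separately. The function names, the JSON payload, and the module name in the simulated ImportError are hypothetical and are not taken from lmcache.

```python
import contextlib
import io
import sys
from typing import Tuple


def warn_missing_deps(exc: ImportError) -> None:
    # Diagnostics go to stderr so stdout stays clean for piped output.
    print(f"Failed to import server dependencies: {exc}.", file=sys.stderr)


def run_ping() -> None:
    # Hypothetical machine-readable result of an unrelated command.
    print('{"status": "ok"}')


def capture() -> Tuple[str, str]:
    """Run both steps and return (stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        warn_missing_deps(ImportError("No module named 'hypothetical_dep'"))
        run_ping()
    return out.getvalue(), err.getvalue()
```

Here the captured stdout contains only the JSON line, so a downstream parser still works even though the warning was emitted first.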

deng451e and others added 2 commits April 10, 2026 21:53
Contributor

ApostaC left a comment

LGTM!

ApostaC enabled auto-merge (squash) on April 11, 2026 00:44
github-actions (bot) added the "full" label (Run comprehensive tests on this PR) on Apr 11, 2026
ApostaC merged commit f70fd14 into LMCache:dev on Apr 12, 2026
40 checks passed
Oasis-Git pushed a commit to Oasis-Git/LMCache that referenced this pull request Apr 13, 2026
* cli wheel build and release

* add openai in cli dependency


Signed-off-by: deng451e <838677410@qq.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>
ftian1 pushed a commit to ftian1/LMCache that referenced this pull request Apr 20, 2026
* cli wheel build and release

* add openai in cli dependency


Signed-off-by: deng451e <838677410@qq.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>