[Build] Add lmcache-cli lightweight wheel #2959
Signed-off-by: deng451e <838677410@qq.com>
Code Review
This pull request introduces a lightweight lmcache-cli package, enabling CLI usage without GPU dependencies, and updates the documentation and build configurations accordingly. The server command now includes error handling for missing CUDA extensions, and the TTFT calculation was refined for cases with no content tokens. Feedback highlights a non-existent setuptools version in the build requirements, the unnecessary inclusion of ninja for a pure-Python package, and a Python version mismatch between the code's use of PEP 604 and the documentation. Additionally, a standard library import needs to be relocated to follow the project's import organization rules.
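The PEP 604 point in the summary above can be illustrated with a minimal sketch. The helper names are hypothetical and not taken from this PR. A `float | None` union evaluated at runtime needs Python 3.10 or newer, while `typing.Optional` imports cleanly on older interpreters, so the spelling used in the code should match the minimum version the docs advertise.

```python
import sys
from typing import Optional


def format_ttft(ttft_s: Optional[float]) -> str:
    """Hypothetical helper: Optional[...] works on Python 3.9 and earlier too."""
    return "n/a" if ttft_s is None else f"{ttft_s:.3f}s"


def pep604_available() -> bool:
    """True when `float | None` can be evaluated at runtime (Python 3.10+)."""
    return sys.version_info >= (3, 10)


if __name__ == "__main__":
    print(format_ttft(None), format_ttft(0.25), pep604_available())
```

If the project prefers the PEP 604 spelling, the alternative is to raise the documented minimum Python version rather than change the annotations.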
# Thus, we will still lock a torch version here because we can choose to release wheels
# in sync with vllm and update our torch version accordingly
requires = [
    "ninja",
Can we make it |
Force-pushed from 8f1778a to 84df850.
The publish-cli-pypi job does not need a checkout: gh release upload only requires the downloaded dist/ artifacts and GITHUB_TOKEN. This aligns with publish-pypi, which has the same structure without a checkout step. The harden-runner egress policy also does not allow github.com or objects.githubusercontent.com, so the checkout would fail.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: deng451e <838677410@qq.com>
Force-pushed from 8315e5b to e1bde6f.
ApostaC left a comment
Please see the comments, thanks!
except ImportError:
    return
Should we log any error here?
# Thus, we will still lock a torch version here because we can choose to release wheels
# in sync with vllm and update our torch version accordingly
requires = [
    "ninja",
Do we really need ninja for lmcache-cli?
@@ -0,0 +1 @@
transformers>=4.51.1
Is it possible to remove transformers? It is also a very large dependency.
Additionally, lmcache bench needs the openai package, so we probably need to revisit the dependencies.
  id-token: write
  contents: write
runs-on: ubuntu-latest
needs: [changes, build-cli-artifacts, test, code-quality]
CLI PyPI publish skips test PyPI validation gate
Medium Severity
The publish-cli-pypi job's needs array is [changes, build-cli-artifacts, test, code-quality], missing a dependency on publish-cli-test-pypi. The main publish-pypi job includes publish-test-pypi in its needs, ensuring a broken wheel is caught on Test PyPI before going to production. The CLI production publish has no such gate, so a broken lmcache-cli wheel could be published directly to PyPI without passing Test PyPI validation first.
Reviewed by Cursor Bugbot for commit fdaa8a8.
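The missing gate described in this finding can be checked mechanically. The following is a rough sketch, not part of the PR: the `missing_gates` helper is hypothetical, and the `needs` list is copied from the finding rather than parsed from publish.yml.

```python
from typing import Dict, List


def missing_gates(jobs: Dict[str, List[str]], job: str, required: List[str]) -> List[str]:
    """Return the required upstream jobs that `job` does not list in its needs."""
    needs = jobs.get(job, [])
    return [gate for gate in required if gate not in needs]


# Job name -> needs list. The value mirrors the one quoted in the finding.
jobs = {
    "publish-cli-pypi": ["changes", "build-cli-artifacts", "test", "code-quality"],
}

# The production publish should wait for the Test PyPI publish.
gaps = missing_gates(jobs, "publish-cli-pypi", ["publish-cli-test-pypi"])
print(gaps)  # ['publish-cli-test-pypi']
```

An empty result would mean the production publish is gated on the Test PyPI job, matching how the finding describes publish-pypi.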
Force-pushed from fdaa8a8 to d4b16bf.
- Make openai import lazy in bench/engine_bench (request_sender, config): any lmcache command (including --help) no longer crashes with ImportError when openai is not installed; error is deferred to lmcache bench engine invocation with a clear install hint
- Remove transformers from requirements/cli.txt (already a lazy optional import with try/except fallback in prompt.py)
- Revert lmcache/cli/request.py to dev (ttft_s = -1.0 when no token)
- pyproject_cli.toml: drop ninja (not needed for pure-Python wheel), relax setuptools to >=68.0.0

Signed-off-by: deng451e <838677410@qq.com>
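The lazy-import change in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code in bench/engine_bench: `require_module` and `run_bench_engine` are hypothetical names, and the install hint text is invented for the example.

```python
import importlib
from types import ModuleType


def require_module(name: str, hint: str) -> ModuleType:
    """Import `name` on first use and re-raise ImportError with an install hint."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(f"'{name}' is required for this command. {hint}") from exc


def run_bench_engine() -> None:
    """Hypothetical entry point: openai is imported only when the command runs."""
    openai = require_module("openai", "Install it with: pip install openai")
    # ... build the client and send requests here ...
    del openai  # placeholder so the sketch has no unused variable
```

Because nothing imports openai at module load time, commands such as `lmcache --help` do not depend on it; only the bench invocation raises, and it does so with the hint attached.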
Force-pushed from d4b16bf to 60249fc.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit 60249fc.
        f"Failed to import server dependencies: {e}. "
        "Install the full lmcache package to use 'lmcache server'."
    )
    return
Server import error prints to stdout on every invocation
Medium Severity
The ImportError handler in add_arguments prints to stdout (missing file=sys.stderr), unlike the analogous handler in execute which correctly uses stderr. Since BaseCommand.register() calls add_arguments for every command during startup (via main() iterating ALL_COMMANDS), this error message will appear on stdout for every lmcache-cli invocation — even unrelated commands like lmcache ping or lmcache query. This pollutes stdout and breaks piped/parsed output.
Reviewed by Cursor Bugbot for commit 60249fc.
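One way to address this finding is to stay silent during argument registration and report the failure on stderr only when the server command actually runs. The sketch below is an assumption-laden illustration, not the code in lmcache/cli/commands/server.py: `ServerCommand`, the `--port` flag, and the placeholder module name are all hypothetical, and the placeholder import is chosen so that it always fails.

```python
import argparse
import sys
from typing import Optional


class ServerCommand:
    """Hypothetical stand-in for the real command class."""

    def __init__(self) -> None:
        self._import_error: Optional[ImportError] = None

    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        # Runs for every CLI invocation, so it must not print anything.
        try:
            import _lmcache_server_deps_hypothetical  # noqa: F401  (always missing here)
        except ImportError as exc:
            self._import_error = exc  # remember the failure for execute()
            return
        parser.add_argument("--port", type=int, default=8000)  # hypothetical flag

    def execute(self, args: argparse.Namespace) -> int:
        # Runs only for 'lmcache server', so this is the place to report.
        if self._import_error is not None:
            print(
                f"Failed to import server dependencies: {self._import_error}. "
                "Install the full lmcache package to use 'lmcache server'.",
                file=sys.stderr,
            )
            return 1
        return 0
```

With this split, unrelated commands such as lmcache ping produce no extra output, and piped stdout stays clean even when the server command fails.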
* cli wheel build and release
* add openai in cli dependency

Signed-off-by: deng451e <838677410@qq.com>
Co-authored-by: Yihua Cheng <yihua98@uchicago.edu>


Introduces a lightweight PyPI package lmcache-cli for CLI-only usage (no CUDA/GPU, cross-platform).
• Adds pyproject_cli.toml and requirements/cli.txt for a pure-Python wheel
• Updates publish.yml to build and publish CLI artifacts
• Updates docs with CLI-only install instructions and comparison table
• Logs a clear error when CLI server dependencies are missing
Note
Medium Risk
Adds a new packaging/release path and additional PyPI publish jobs, so misconfiguration could impact CI releases or publish the wrong artifacts; runtime code changes are limited to clearer dependency errors for lmcache server.
Overview
Introduces a new lmcache-cli CLI-only PyPI package built from pyproject_cli.toml with its own minimal dependencies (requirements/cli.txt), intended to work without CUDA/GPU while still providing the lmcache entry point.
Updates the publish workflow to build, artifact, and publish this CLI wheel separately to both TestPyPI and PyPI (including GitHub Release uploads), and expands docs to explain the two-package install options and warn against installing both.
Improves lmcache/cli/commands/server.py to gracefully handle missing server/CUDA dependencies by skipping argument registration and exiting with an actionable error message when lmcache server is invoked in a CLI-only install.
Reviewed by Cursor Bugbot for commit 05d31c4. Bugbot is set up for automated code reviews on this repo.