
[CLI] Refactor query command #2995

Merged
ApostaC merged 4 commits into LMCache:dev from deng451e:update_query
Apr 21, 2026


@deng451e
Collaborator

@deng451e deng451e commented Apr 10, 2026

Description:

Restructures lmcache query into a dedicated subpackage and removes local tokenizer usage for prompt token counting.

Changes:

  • Move lmcache/cli/commands/query.py → lmcache/cli/commands/query/__init__.py
  • Move lmcache/cli/prompt.py and lmcache/cli/request.py into lmcache/cli/commands/query/, mirroring the bench/ layout
  • Remove all tokenizer-based token counting (transformers.AutoTokenizer, per-document breakdown, _load_tokenizer, _token_weights,

Note

Medium Risk
Moderate risk: the PR changes how lmcache query computes its metrics (token counts now come from engine-reported usage, and TTFT behaves differently when no tokens stream) and moves modules, which could affect imports and packaging.

Overview
Refactors lmcache query into a dedicated lmcache/cli/commands/query/ subpackage, relocating prompt expansion and HTTP request logic and updating docs to match the new layout.

Changes query engine metrics output to report prompt_tokens/output_tokens directly from the engine’s include_usage stream data (removing local tokenizer-based counting and per-document breakdown), and adjusts TTFT calculation to fall back to total round-trip time when no token is observed.

Reviewed by Cursor Bugbot for commit d24787e. Bugbot is set up for automated code reviews on this repo.
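
The metric behavior described in the Overview can be illustrated with a minimal sketch. It is not the code from this PR: the function name `summarize_stream`, the injected `clock` parameter, and the chunk layout are assumptions, based on the OpenAI-compatible streaming format in which a late chunk carries a `usage` object (with `prompt_tokens` and `completion_tokens`) when `include_usage` is requested.

```python
import time
from typing import Any, Callable, Iterable, Optional


def summarize_stream(
    chunks: Iterable[dict[str, Any]],
    start: float,
    clock: Callable[[], float] = time.perf_counter,
) -> dict[str, Any]:
    """Hypothetical helper: derive TTFT and token counts from parsed stream chunks."""
    first_token_at: Optional[float] = None
    usage: dict[str, Any] = {}

    for chunk in chunks:
        # A token is "observed" once any choice carries non-empty text.
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            text = delta.get("content") or choice.get("text")
            if text and first_token_at is None:
                first_token_at = clock()
        # Assumed layout: the usage object arrives on a late chunk.
        if chunk.get("usage"):
            usage = chunk["usage"]

    total = clock() - start
    # Fall back to the total round-trip time when no token was observed.
    ttft = first_token_at - start if first_token_at is not None else total

    def as_int(key: str) -> Optional[int]:
        # Report None rather than guessing when the engine sent no usage data.
        return int(usage[key]) if key in usage else None

    return {
        "ttft": ttft,
        "total": total,
        "prompt_tokens": as_int("prompt_tokens"),
        "output_tokens": as_int("completion_tokens"),
    }
```

Because the counts come from the engine's own usage report, no tokenizer has to be loaded on the client; the trade-off is that both counts are unavailable if the engine does not send usage data.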

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the CLI query command by moving prompt and request logic into a dedicated query subdirectory and updating the documentation. It simplifies token reporting by using usage data directly from the engine instead of local estimation. Feedback was provided regarding incomplete docstrings for several public functions in lmcache/cli/commands/query/prompt.py, which violate the project's style guide requiring descriptions for arguments, return values, and exceptions.
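
For reference, a docstring that describes arguments, return values, and exceptions might look like the sketch below. The function `find_unknown_placeholders`, its signature, and the `{name}` placeholder syntax are hypothetical and chosen only for illustration; they are not the actual `expand_prompt`, `resolve_documents`, or `unknown_documents` from `prompt.py`, and the exact docstring layout the project's style guide expects is assumed here to be Google style.

```python
import re


def find_unknown_placeholders(prompt: str, documents: dict[str, str]) -> list[str]:
    """Return placeholder names used in ``prompt`` that have no document.

    Args:
        prompt: Template text containing ``{name}`` placeholders
            (hypothetical syntax).
        documents: Mapping from placeholder name to document text.

    Returns:
        Unknown placeholder names in order of first appearance, without
        duplicates.

    Raises:
        TypeError: If ``prompt`` is not a string.
    """
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    unknown: list[str] = []
    for name in re.findall(r"\{(\w+)\}", prompt):
        if name not in documents and name not in unknown:
            unknown.append(name)
    return unknown
```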

Comment thread lmcache/cli/commands/query/prompt.py Outdated
Comment thread lmcache/cli/commands/query/prompt.py Outdated
Comment thread lmcache/cli/commands/query/prompt.py
Signed-off-by: deng451e <838677410@qq.com>
Comment thread lmcache/cli/commands/query/__init__.py Outdated
…ubpackage

- Extract output_tokens as int (like prompt_tokens) to avoid float
  rendering (9.00 → 9) that mismatched docs examples
- Add complete Args/Returns/Raises sections to expand_prompt,
  resolve_documents, and unknown_documents per project style guide

Signed-off-by: deng451e <838677410@qq.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
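
The float-rendering mismatch mentioned in this commit message can be reproduced with a small sketch. The formatter `render_metric` is hypothetical and only assumes that float metrics are printed with two decimals while integers are printed as-is; the real output code in the query command may differ.

```python
def render_metric(value: object) -> str:
    """Hypothetical formatter: floats get two decimals, other values use str()."""
    if isinstance(value, float):
        return f"{value:.2f}"
    return str(value)


# A usage count that was parsed or stored as a float renders with decimals.
usage = {"completion_tokens": 9.0}
print(render_metric(usage["completion_tokens"]))       # 9.00
# Casting to int at extraction time yields the plain count.
print(render_metric(int(usage["completion_tokens"])))  # 9
```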
@deng451e deng451e changed the title from "[CLI] Refactor query command into subpackage; use server-reported prompt tokens" to "[CLI] Refactor query command" Apr 10, 2026
Contributor

@sammshen sammshen left a comment

LGTM

Contributor

@ApostaC ApostaC left a comment

LGTM!

@ApostaC ApostaC enabled auto-merge (squash) April 20, 2026 22:44
@github-actions github-actions Bot added the "full" label (Run comprehensive tests on this PR) Apr 20, 2026

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

MetricValue = tuple[str, Any]
MetricMap = dict[str, MetricValue]
from lmcache.cli.commands.query.prompt import PromptBuilder
from lmcache.cli.commands.query.request import Request

Design doc not updated after query restructuring

Low Severity

The design doc docs/design/cli/commands/query-command.md was not updated to reflect this restructuring. It still references the old file paths (lmcache/cli/commands/query.py, lmcache/cli/prompt.py, lmcache/cli/request.py), shows the old output format with per-document token breakdown ("Prompt documents lmcache", "Prompt query"), and describes the now-removed tokenizer-based token counting behavior. The project rules require design documents to be updated for architectural changes.
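
One way to catch this kind of drift is to scan a document for the pre-refactor paths named in this comment. The sketch below is illustrative only: the helper `find_stale_references` is hypothetical and not part of the repository or its CI.

```python
# Old module paths listed in the review comment above.
STALE_PATHS = (
    "lmcache/cli/commands/query.py",
    "lmcache/cli/prompt.py",
    "lmcache/cli/request.py",
)


def find_stale_references(
    text: str, stale_paths: tuple[str, ...] = STALE_PATHS
) -> list[tuple[int, str]]:
    """Return (line_number, path) pairs for every stale path found in ``text``."""
    hits: list[tuple[int, str]] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for path in stale_paths:
            if path in line:
                hits.append((line_no, path))
    return hits


doc = (
    "Prompt expansion lives in lmcache/cli/prompt.py.\n"
    "New location: lmcache/cli/commands/query/prompt.py\n"
)
for line_no, path in find_stale_references(doc):
    print(f"line {line_no}: stale reference to {path}")
```

The new paths do not contain the old ones as substrings, so a plain substring match is enough to tell them apart in this case.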

Triggered by project rule: LMCache Code Review Style Guide

@ApostaC ApostaC merged commit 3313527 into LMCache:dev Apr 21, 2026
30 of 34 checks passed

Labels

full Run comprehensive tests on this PR

3 participants