[LMCache CLI] Design and implementation of lmcache kvcache#2827
royyhuang merged 15 commits into LMCache:dev
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Code Review
This pull request introduces a design document for the lmcache kvcache CLI command. The document is well-structured, comprehensive, and clearly outlines the new functionality, including subcommands for inspecting, clearing, pinning, compressing, and ending sessions for KV caches on a per-request basis. The design thoughtfully considers script-friendliness with features like JSON output and specific exit codes. My feedback includes a couple of suggestions to further improve the scriptability of the JSON output and the user experience of the compress command.
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
ApostaC left a comment
LGTM overall! Just wondering which sub-commands we already support for now? I suppose only `clear`?
Other small comments:
- `end-session` should only be used by the serving engine; otherwise it may cause internal state inconsistency.
- Can we add a user-facing doc (`docs/src/mp`) for LMCache CLI as well?
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Common Patterns
---------------

**Check if a server is reachable before clearing:**
Is using the destructive `clear` as a reachability check a good pattern?

The `clear` command is not intended to perform a reachability check here. The goal is that, if `clear` fails because of a connectivity issue, its return value reflects that. I just updated the doc.
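For readers scripting around this, the point above (a failed `clear` should surface through the return value) can be sketched in Python. This is only an illustration, not code from the PR: the helper names and the `COMMAND_NOT_FOUND` value are assumptions made for this sketch, any connection options the CLI may need are omitted because they are not shown in this thread, and which non-zero code indicates a connectivity failure is defined by the design doc rather than here.

```python
import subprocess
import sys
from typing import Sequence

# Assumption for this sketch only: value returned when the executable
# cannot be found at all (mirrors the common shell convention of 127).
COMMAND_NOT_FOUND = 127


def run_and_get_exit_code(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    try:
        completed = subprocess.run(list(cmd), check=False)
    except FileNotFoundError:
        return COMMAND_NOT_FOUND
    return completed.returncode


def clear_and_report() -> int:
    """Run `lmcache kvcache clear` and report any non-zero exit code.

    The meaning of each non-zero code is left to the CLI's own docs.
    """
    code = run_and_get_exit_code(["lmcache", "kvcache", "clear"])
    if code != 0:
        print(f"clear failed with exit code {code}", file=sys.stderr)
    return code
```

A caller would branch on the value returned by `clear_and_report()` instead of issuing a separate probe first, which is the behavior the reply above describes.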
Every sub-command requires one of these to identify the target KV cache:

- **`--request-id <id>`** (required): identifies the request whose KV cache
Since the request ID is required all the time, I feel it would be more convenient to just have `lmcache kvcache <subcommand> <req_id>`.

`lmcache kvcache clear` does not take a request ID. I will make that clear in the doc.
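To make the shape discussed here concrete, a minimal `argparse` sketch follows. It is not the PR's implementation: `build_parser` is a hypothetical helper, and the sub-command names other than `clear` and `info` (`pin`, `compress`) are guesses based on the review summary. It only shows a positional request ID on per-request sub-commands while `clear` takes none.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser for `lmcache kvcache <subcommand> [req_id]`."""
    parser = argparse.ArgumentParser(prog="lmcache kvcache")
    sub = parser.add_subparsers(dest="subcommand", required=True)

    # `clear` does not take a request ID.
    sub.add_parser("clear", help="clear the KV cache")

    # Per-request sub-commands take the request ID as a positional argument.
    # The names below are assumptions for this sketch.
    for name in ("info", "pin", "compress"):
        per_request = sub.add_parser(name)
        per_request.add_argument("request_id", help="ID of the target request")

    return parser
```

With this layout, `build_parser().parse_args(["info", "req-1"])` yields a namespace with `request_id` set, while `parse_args(["clear"])` succeeds without one.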
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
…into kuntai-kvcache
maobaolong left a comment
LGTM. Thanks for this great feature.
…#2827)

* initial design of lmcache kvcache
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* changing of file
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* add lmcache kvcache -h
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* clarify that lmcache kvcache info design is temporary
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* initial implementation of lmcache kvcache
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* UX update
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* remove end-session
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* add user-facing docs
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* update doc and fix comments
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
* let request-id be append argument instead of --request-id
  Signed-off-by: KuntaiDu <kuntai@uchicago.edu>

---------

Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
Co-authored-by: Roy Huang <roy.y.huang@gmail.com>
The initial design of `lmcache kvcache`. Please refer to the changed files for details.