[refactor] move get_shapes to LMCacheEngineMetadata#2284

Merged
chunxiaozheng merged 2 commits into LMCache:dev from chunxiaozheng:dsa-1
Dec 22, 2025

Conversation

@chunxiaozheng
Collaborator

@chunxiaozheng chunxiaozheng commented Dec 19, 2025

Move get_shapes from gpu_connector to metadata; this makes it convenient to use in other places, such as LocalCPUBackend and RemoteConnector.
In addition, this lets us remove some methods from gpu_connector, such as init_group_info.
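Based on the description, the shape of the refactor is roughly the following sketch. The class and field names here are assumptions inferred from the PR text and the reviewed snippet, not the actual LMCache source:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class KVLayerGroup:
    # Hypothetical stand-in for LMCache's per-group KV layer metadata;
    # the real class and shape layout may differ.
    num_layers: int
    shape: Tuple[int, ...]  # assumed layout: (..., num_heads, head_dim)


@dataclass
class LMCacheEngineMetadata:
    # Sketch: after this PR, KV-cache shape logic lives on the metadata
    # object instead of the GPU connector, so other components such as
    # LocalCPUBackend and RemoteConnector can call it directly.
    use_mla: bool
    kv_layer_groups: List[KVLayerGroup]

    def get_shapes(self, num_tokens: int) -> List[Tuple[int, int, int]]:
        shapes = []
        for group in self.kv_layer_groups:
            layers = group.num_layers
            # MLA stores a single latent head; otherwise read the head
            # count from the group's shape (index 3, per the reviewed snippet).
            num_heads = 1 if self.use_mla else group.shape[3]
            hidden_dim_size = num_heads * group.shape[-1]
            shapes.append((layers, num_tokens, hidden_dim_size))
        return shapes


meta = LMCacheEngineMetadata(
    use_mla=False,
    kv_layer_groups=[KVLayerGroup(num_layers=4, shape=(2, 4, 256, 8, 128))],
)
print(meta.get_shapes(16))  # [(4, 16, 1024)]
```

Because the method only depends on metadata fields, any backend holding the metadata object can compute shapes without an init_group_info call on the connector.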

@gemini-code-assist
Contributor

Summary of Changes

Hello @chunxiaozheng, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the KV cache shape determination logic by relocating the get_shapes method from the GPUConnector classes to LMCacheEngineMetadata. This change centralizes the configuration-related aspects of KV cache shapes, simplifying the GPUConnector implementations and making the system more modular. It also updates the dummy memory allocator to better support multi-group KV cache configurations.

Highlights

  • Centralized KV Cache Shape Logic: The get_shapes method, responsible for determining the shapes of the KV cache, has been moved from GPUConnectorInterface and its subclasses to LMCacheEngineMetadata. This centralizes the logic within the configuration.
  • Simplified GPU Connector: The GPUConnectorInterface and VLLMPagedMemGPUConnectorV3 classes have been streamlined by removing the get_shapes method and refactoring init_group_info (now _initialize_kv_cache_pointers) to rely on LMCacheEngineMetadata for shape information.
  • Enhanced Dummy Allocator: The DummyAllocator now stores a list of shapes and dtypes in MemoryObjMetadata, improving its ability to handle multi-group KV cache configurations more robustly.
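The dummy-allocator highlight suggests metadata along these lines. This is an illustrative sketch only; the field names and values are assumptions, not the actual LMCache definition:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MemoryObjMetadata:
    # Sketch: a dummy allocation records one (shape, dtype) pair per KV
    # layer group rather than a single shape, so multi-group KV caches
    # can be described by one metadata object.
    shapes: List[Tuple[int, ...]] = field(default_factory=list)
    dtypes: List[str] = field(default_factory=list)


meta = MemoryObjMetadata()
# One entry per KV layer group (values made up for illustration).
meta.shapes.append((32, 256, 1024))
meta.dtypes.append("bfloat16")
meta.shapes.append((8, 256, 512))
meta.dtypes.append("bfloat16")
print(len(meta.shapes))  # 2
```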


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request is a well-executed refactoring that moves the get_shapes logic to LMCacheEngineMetadata, which is a more logical location for it. The changes are consistent across all modified files and improve the code structure. I have one suggestion to improve the clarity of a variable name in the new get_shapes method.

Comment thread lmcache/config.py Outdated
Contributor

@ApostaC ApostaC left a comment


One quick question. Otherwise LGTM

Comment thread lmcache/config.py Outdated
Comment on lines +72 to +74
layers = group.num_layers
num_heads = 1 if self.use_mla else group.shape[3]
hidden_dim_size = num_heads * group.shape[-1]
Contributor


Dumb question: Will similar logic be used multiple times when we use kv_layer_groups in different places? If so, we can probably implement them as some methods in the kv_layer_group class.

Collaborator Author


@ApostaC Good idea, this is a common attribute of kv_layer_group, I have updated!
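One way the agreed-upon change could look: the derived values from the reviewed snippet become properties on the layer-group class itself. This is a hypothetical sketch; the real class name and shape layout in LMCache may differ:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class KVLayerGroup:
    # Hypothetical layer-group class with the derived values promoted
    # to properties, per the review discussion. Names are assumptions.
    num_layers: int
    shape: Tuple[int, ...]  # assumed layout: (..., num_heads, head_dim)
    use_mla: bool = False

    @property
    def num_heads(self) -> int:
        # MLA collapses the KV heads into a single latent head.
        return 1 if self.use_mla else self.shape[3]

    @property
    def hidden_dim_size(self) -> int:
        return self.num_heads * self.shape[-1]


group = KVLayerGroup(num_layers=32, shape=(2, 32, 256, 8, 128))
print(group.num_heads, group.hidden_dim_size)  # 8 1024
```

With this, every caller that iterates over kv_layer_groups reads group.hidden_dim_size directly instead of duplicating the MLA-aware computation.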

Signed-off-by: idellzheng <idellzheng@tencent.com>
Signed-off-by: idellzheng <idellzheng@tencent.com>
@chunxiaozheng chunxiaozheng added the full Run comprehensive tests on this PR label Dec 22, 2025
Collaborator

@maobaolong maobaolong left a comment


lgtm

@chunxiaozheng chunxiaozheng merged commit f1e1e46 into LMCache:dev Dec 22, 2025
21 checks passed
DongDongJu pushed a commit to DongDongJu/LMCache that referenced this pull request Feb 22, 2026
* [refactor] move get_shapes to LMCacheEngineMetadata

Signed-off-by: idellzheng <idellzheng@tencent.com>

* update

Signed-off-by: idellzheng <idellzheng@tencent.com>

---------

Signed-off-by: idellzheng <idellzheng@tencent.com>
sammshen pushed a commit to sammshen/LMCache that referenced this pull request Mar 1, 2026
* [refactor] move get_shapes to LMCacheEngineMetadata

Signed-off-by: idellzheng <idellzheng@tencent.com>

* update

Signed-off-by: idellzheng <idellzheng@tencent.com>

---------

Signed-off-by: idellzheng <idellzheng@tencent.com>

Labels

full Run comprehensive tests on this PR

3 participants