[Bugfix]: fix get_num_heads for MLA format #2941

Merged
maobaolong merged 1 commit into LMCache:dev from sammshen:fix-mla-fmt on Apr 3, 2026

@sammshen (Contributor) commented Apr 3, 2026

MLA format (NL_X_NB_BS_HS) absorbs heads into the hidden dim, so get_num_heads should return 1 instead of raising ValueError. This was preventing all MLA models (e.g. DeepSeek-V2-Lite) from launching.

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Low Risk
Low risk: changes a single helper to stop raising and to return a constant for the NL_X_NB_BS_HS (MLA) KV format, affecting only head-count introspection for MLA models.

Overview
Fixes get_num_heads to handle the MLA KV cache format (GPUKVFormat.NL_X_NB_BS_HS) by returning 1 (heads folded into hidden dim) instead of raising ValueError, unblocking MLA-based model startup.

Written by Cursor Bugbot for commit 4b74496.
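
To make the "heads folded into hidden dim" statement concrete, here is a minimal sketch in plain Python. It assumes the format name reads as num_layers x [num_blocks, block_size, hidden_size], which the PR does not spell out, and it uses illustrative sizes. `add_singleton_head_axis` is a hypothetical helper, not an LMCache function.

```python
from typing import Tuple


def add_singleton_head_axis(shape: Tuple[int, int, int]) -> Tuple[int, int, int, int]:
    """View a per-layer [num_blocks, block_size, hidden_size] shape as
    [num_blocks, block_size, num_heads, head_size] with num_heads fixed at 1."""
    num_blocks, block_size, hidden_size = shape
    return (num_blocks, block_size, 1, hidden_size)


# Illustrative sizes only (assumed, not taken from the PR).
mla_layer_shape = (128, 16, 576)
num_blocks, block_size, num_heads, head_size = add_singleton_head_axis(mla_layer_shape)

# Code that sizes buffers as num_heads * head_size keeps working,
# because 1 * hidden_size equals the per-token width of the MLA layout.
per_token_width = num_heads * head_size
print(num_heads, per_token_width)
```

Under this reading, returning 1 lets callers that multiply head count by head size get the right width without a special case for MLA.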

@cursor (Bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

  elif gpu_kv_format == lmc_ops.GPUKVFormat.NL_X_NB_BS_HS:
-     raise ValueError(_ATTRIBUTE_NOT_EXIST_ERROR.format(format=gpu_kv_format))
+     # MLA: heads are absorbed into hidden dim, so num_heads = 1
+     return 1

Bug fix lacks required regression test

Low Severity

This bug fix for get_num_heads with NL_X_NB_BS_HS format has no accompanying regression test. The project's AGENTS.md and review rules require that bug fixes include corresponding tests to prevent regressions. The PR's own "this PR contains unit tests" checkbox is unchecked.

Triggered by project rule: LMCache Code Review Style Guide

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request updates the get_num_heads utility in lmcache/v1/gpu_connector/utils.py to support the MLA GPU KV format by returning 1 instead of raising a ValueError. While the logic change is correct, the reviewer noted that a regression test should be included to comply with the repository style guide regarding bug fixes.

Comment on lines +564 to +565
# MLA: heads are absorbed into hidden dim, so num_heads = 1
return 1

medium

The fix correctly handles the MLA format by returning 1 instead of raising a ValueError. However, according to the repository style guide (line 39), bug fixes should include regression tests. Please consider adding a unit test to verify this behavior and prevent future regressions, especially since this issue was blocking model launches for MLA-based models like DeepSeek-V2.

References
  1. Bug fixes should include regression tests (line 39).
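
A regression test along the lines the reviewers ask for might look like the sketch below. To stay self-contained it tests a stand-in: `FakeGPUKVFormat` and the one-argument `get_num_heads` here are hypothetical substitutes for `lmc_ops.GPUKVFormat` and the real helper in lmcache/v1/gpu_connector/utils.py, whose actual signature is not shown in this thread. The `UNSUPPORTED` member and the error message are invented for illustration. A real test would import the helper instead of redefining it.

```python
from enum import Enum, auto


class FakeGPUKVFormat(Enum):
    """Hypothetical stand-in for lmc_ops.GPUKVFormat."""

    NL_X_NB_BS_HS = auto()  # MLA layout named in this PR
    UNSUPPORTED = auto()  # invented member, only to exercise the error path


def get_num_heads(gpu_kv_format: FakeGPUKVFormat) -> int:
    """Stand-in mirroring the patched branch; the real signature may differ."""
    if gpu_kv_format == FakeGPUKVFormat.NL_X_NB_BS_HS:
        # MLA: heads are absorbed into hidden dim, so num_heads = 1
        return 1
    raise ValueError(f"num_heads is not defined for {gpu_kv_format}")


def test_mla_format_reports_single_head() -> None:
    # Before the fix this call raised ValueError and blocked MLA model startup.
    assert get_num_heads(FakeGPUKVFormat.NL_X_NB_BS_HS) == 1


def test_unknown_format_still_raises() -> None:
    # The fix should not silence errors for formats the helper does not handle.
    try:
        get_num_heads(FakeGPUKVFormat.UNSUPPORTED)
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unsupported format")
```

The functions follow the `test_` naming that pytest collects, so they could be dropped into a test module once the stand-ins are replaced with real imports.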

@deng451e (Collaborator) left a comment

LGTM

@maobaolong (Collaborator) left a comment

LGTM @sammshen Thanks for this quick fix!

@maobaolong maobaolong enabled auto-merge (squash) April 3, 2026 06:09
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Apr 3, 2026
@maobaolong maobaolong merged commit 45d4d36 into LMCache:dev Apr 3, 2026
36 checks passed