[KVEvent] User request.block_hash for parent block_hash #30544

Merged
vllm-bot merged 3 commits into vllm-project:main from heheda12345:fix_kv_event on Dec 24, 2025
@heheda12345 heheda12345 commented Dec 12, 2025

Collaborator

Purpose

The parent block can be a null block, for example:

  • mamba: the block_table looks like [null_block, null_block, ..., null_block, normal_block], because only one block is needed per decode step.
  • sliding window + KV cache connector: assume block_size 1 and window size 2, with a hit on 3 local tokens + 3 external tokens. The block table will be [NULL, NULL, NULL, NULL, 4, 5], and allocation proceeds as follows:
# first allocation
allocate_slots(delay_cache_blocks=True):
    save_new_computed_blocks() caches the first 3 blocks
# after kv cache transfer
allocate_slots(delay_cache_blocks=False):
    cache_blocks() caches the first 6 blocks, parent block is block 2 (null_block)

So we extract the parent block hash from request.block_hashes instead of reading it from the parent block, which may be a null_block without a hash.
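The difference between the two lookups can be sketched with a toy model. This is a minimal illustration, not vLLM's actual code: the `Block` and `Request` classes, `NULL_BLOCK`, and both helper functions are hypothetical stand-ins, and the hash strings are made up. It reproduces the sliding-window case above, where the first 3 blocks are already cached and the parent (block 2) is a null block.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    """Toy stand-in for a physical KV cache block (hypothetical)."""
    block_id: int
    block_hash: Optional[str] = None
    is_null: bool = False


@dataclass
class Request:
    """Toy request holding one logical hash per full block (hypothetical)."""
    block_hashes: list


# A single shared placeholder block that never carries a hash.
NULL_BLOCK = Block(block_id=0, is_null=True)


def parent_hash_from_block(blocks, num_cached):
    """Fragile lookup: read the hash off the physical parent block."""
    if num_cached == 0:
        return None
    parent = blocks[num_cached - 1]
    # Fails when the parent is a null block, which has no hash.
    assert parent.block_hash is not None, "parent block has no hash"
    return parent.block_hash


def parent_hash_from_request(request, num_cached):
    """Robust lookup: read the hash from the request's logical hash list."""
    if num_cached == 0:
        return None
    return request.block_hashes[num_cached - 1]


# Block table from the description: [NULL, NULL, NULL, NULL, 4, 5].
request = Request(block_hashes=[f"h{i}" for i in range(6)])
blocks = [NULL_BLOCK] * 4 + [Block(4), Block(5)]

# Three blocks are already cached, so the parent is block 2 (a null block).
print(parent_hash_from_request(request, 3))  # h2
```

Calling `parent_hash_from_block(blocks, 3)` on the same data raises an `AssertionError`, because the parent slot holds the null block. The request-based lookup is unaffected, since the logical hash list does not depend on which physical blocks are still resident.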

Test Plan

Test Result



Signed-off-by: Chen Zhang <zhangch99@outlook.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request addresses a potential AssertionError that could occur when determining the parent block hash for KV cache events. The issue arises in scenarios where the parent block is a null_block, which does not have an associated block hash, causing the assertion to fail. The proposed change correctly resolves this by retrieving the parent block hash directly from request.block_hashes. This is the correct approach as request.block_hashes represents the logical sequence of hashes and is the reliable source of truth, independent of the physical block implementation. The fix is clean, direct, and I find no issues with the implementation.

@mgoin mgoin left a comment

Member

Seems reasonable to me. Should we add a unit test if this failed in some case?

Signed-off-by: Yifan Qiao <yifanqiao@berkeley.edu>
ivanium commented Dec 18, 2025

Collaborator

> Seems reasonable to me. Should we add a unit test if this failed in some case?

Good idea. I added a test case for null parent block hash. PTAL

@heheda12345 heheda12345 enabled auto-merge (squash) December 21, 2025 23:30
@github-actions github-actions Bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Dec 21, 2025
@vllm-bot vllm-bot merged commit 538e830 into vllm-project:main Dec 24, 2025
46 of 47 checks passed