Refactor kv cache free#11351

Merged
cctry merged 7 commits into main from shiyang/mem_v2/free
Oct 15, 2025
Conversation

@cctry (Collaborator) commented Oct 9, 2025

Motivation

Preparation for mem_cache V2.
This PR aims to simplify the KV cache free logic.

Modifications

  • cache_finished_req
    • add an is_insert flag to skip inserting into the tree
    • change how token_ids is computed so that preempting a prefill request is supported
  • simplify release_req by calling cache_finished_req with insertion disabled
    • the previous release_req did not take the EAGLE radix tree fix into account; cc @ispobock to confirm
  • disable insertion for PD when the transfer fails, since the KV cache is corrupted in that case
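
To illustrate the shape of the change, here is a minimal toy sketch, not the code in this PR. `ToyReq`, `ToyPrefixCache`, `kv_slots`, and the dict-based "tree" are hypothetical stand-ins; only the names cache_finished_req, release_req, is_insert, and token_ids come from the description above. It assumes one KV slot per computed token and truncates token_ids to the slots actually held, which is one way a partially prefilled (preempted) request could be handled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReq:
    """Hypothetical request: prompt tokens, generated tokens, owned KV slots."""
    origin_input_ids: list
    output_ids: list = field(default_factory=list)
    kv_slots: list = field(default_factory=list)  # one slot per computed token


class ToyPrefixCache:
    """Stand-in for a radix cache: maps a token prefix to the slot of its last token."""

    def __init__(self):
        self.tree = {}        # tuple(prefix) -> slot id
        self.free_slots = []  # slots handed back to the allocator

    def cache_finished_req(self, req, is_insert=True):
        # Assumption: only tokens whose KV was computed are eligible, so a
        # preempted prefill may hold fewer slots than it has tokens.
        token_ids = (req.origin_input_ids + req.output_ids)[: len(req.kv_slots)]
        for i, slot in enumerate(req.kv_slots):
            key = tuple(token_ids[: i + 1])
            if is_insert and key not in self.tree:
                self.tree[key] = slot          # the cache takes ownership
            else:
                self.free_slots.append(slot)   # duplicate, or insertion disabled
        req.kv_slots = []                      # the request owns nothing afterwards

    def release_req(self, req):
        # Releasing is the same path with insertion disabled.
        self.cache_finished_req(req, is_insert=False)
```

In this toy model, a caller that knows the KV contents are unusable (for example after a failed PD transfer) would also pass `is_insert=False`, so every slot is freed and nothing corrupted becomes reachable through the tree.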

Accuracy Tests

Benchmarking and Profiling

Checklist

@cctry cctry force-pushed the shiyang/mem_v2/free branch from bdc7688 to 0552591 Compare October 11, 2025 01:06
@cctry cctry force-pushed the shiyang/mem_v2/free branch from 70d09a8 to 85bc836 Compare October 13, 2025 18:51
@cctry cctry marked this pull request as ready for review October 13, 2025 19:56
@cctry cctry removed the run-ci label Oct 13, 2025
@cctry cctry added the run-ci label Oct 14, 2025
@cctry cctry changed the title Refactor kv cache free code Refactor kv cache free Oct 14, 2025
@ispobock (Collaborator) commented

previous release_req does not consider the eagle radix tree fix

Yes, the calculation of last_uncached_pos should be handled correctly in the EAGLE fix; otherwise it may cause a memory leak. This preempt case was not tested in the previous fix.

@cctry cctry requested a review from merrymercy October 15, 2025 00:41
@cctry cctry merged commit 1d7f783 into main Oct 15, 2025
209 of 238 checks passed
@cctry cctry deleted the shiyang/mem_v2/free branch October 15, 2025 00:45
@ispobock ispobock mentioned this pull request Jan 20, 2026
5 tasks

5 participants