fix(l1_manager): propagate extra_count through prefetch path to prevent premature eviction by liuyumoye · Pull Request #2725 · LMCache/LMCache

liuyumoye · 2026-03-10T03:20:49Z

fix(l1_manager): propagate extra_count through prefetch path to prevent premature eviction

What this PR does / why we need it:
When a prefetch request completes and loaded keys transition from write-locked to read-locked, finish_write_and_reserve_read was always acquiring exactly 1 read lock per key, regardless of the extra_count passed by the caller. This caused premature eviction in Tensor Parallelism (TP > 1) scenarios (e.g. MLA models), where each TP worker needs to consume its own read lock — but only 1 lock was held, so the first worker's finish_read would drop the ref-count to 0 and trigger eviction before the remaining workers could access the data.

This PR propagates extra_count through the entire prefetch path:

- StorageManager.submit_prefetch_task → PrefetchController.submit_prefetch_request → _submission_queue / _pending_queue → _start_lookup_phase → InFlightPrefetchRequest.extra_count
- On prefetch completion, finish_write_and_reserve_read(loaded_keys, extra_count=request.extra_count) acquires 1 + extra_count read locks per key.
- Non-prefix loaded keys that are immediately released also pass extra_count to finish_read.
- L1Manager.finish_write_and_reserve_read now accepts extra_count and calls entry.read_lock.lock() a total of 1 + extra_count times per key.
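To make the lock accounting concrete, here is a minimal toy model of the behavior described above. It is only a sketch: ToyEntry, simulate, and the plain integer counter are assumptions made for illustration and are not LMCache's actual classes or lock implementation.

```python
from dataclasses import dataclass


@dataclass
class ToyEntry:
    """Hypothetical stand-in for an L1 entry; it only tracks a read-lock count."""
    read_locks: int = 0

    @property
    def evictable(self) -> bool:
        # In this toy model, an entry with no outstanding read locks may be evicted.
        return self.read_locks == 0


def finish_write_and_reserve_read(entry: ToyEntry, extra_count: int = 0) -> None:
    """Acquire 1 + extra_count read locks, as the PR description states."""
    for _ in range(1 + extra_count):
        entry.read_locks += 1


def finish_read(entry: ToyEntry) -> None:
    """Release exactly one read lock (one consumer is done with the data)."""
    if entry.read_locks == 0:
        raise RuntimeError("finish_read called without a held read lock")
    entry.read_locks -= 1


def simulate(tp_workers: int, extra_count: int) -> list[bool]:
    """Return, per worker, whether the entry was still protected when it read."""
    entry = ToyEntry()
    finish_write_and_reserve_read(entry, extra_count)
    protected = []
    for _ in range(tp_workers):
        protected.append(not entry.evictable)
        if not entry.evictable:
            finish_read(entry)
    return protected
```

With two workers, simulate(2, 0) reports that the second worker finds the entry unprotected, while simulate(2, 1) keeps it protected for both.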

Special notes for your reviewers:
The extra_count semantics are unchanged from the existing submit_prefetch_task / finish_read contract — this PR only ensures the value is correctly threaded through the prefetch pipeline rather than silently dropped.

_pending_queue and _submission_queue tuple types are widened from (..., MemoryLayoutDesc) to (..., MemoryLayoutDesc, int)

No behavior change when extra_count=0 (the default), so existing single-TP deployments are unaffected.
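The threading of extra_count through the queues can be pictured roughly as follows. This is a reduced sketch under stated assumptions: the leading tuple elements are elided in the note above, so the keys field, the placeholder MemoryLayoutDesc, and ToyPrefetchController are hypothetical stand-ins rather than the real definitions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLayoutDesc:
    """Placeholder only; the real class is not shown in this PR text."""
    name: str = "placeholder"


@dataclass
class InFlightPrefetchRequest:
    """Reduced stand-in with just enough fields to show extra_count surviving."""
    keys: list[str]
    layout_desc: MemoryLayoutDesc
    extra_count: int = 0


# Hypothetical queue element: (keys, layout_desc, extra_count).
QueueItem = tuple[list[str], MemoryLayoutDesc, int]


class ToyPrefetchController:
    def __init__(self) -> None:
        self._submission_queue: deque[QueueItem] = deque()

    def submit_prefetch_request(self, keys, layout_desc, extra_count: int = 0) -> None:
        # The widened tuple carries extra_count alongside the layout descriptor.
        self._submission_queue.append((list(keys), layout_desc, extra_count))

    def start_lookup_phase(self) -> InFlightPrefetchRequest:
        # Unpack the widened tuple and store extra_count on the in-flight request.
        keys, layout_desc, extra_count = self._submission_queue.popleft()
        return InFlightPrefetchRequest(keys, layout_desc, extra_count)
```

Because extra_count defaults to 0 at the entry point, callers that never pass it see the same single-lock behavior as before.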

If applicable:

this PR contains user facing changes - docs added
this PR contains unit tests

gemini-code-assist · 2026-03-10T03:21:06Z

Summary of Changes

This pull request resolves a critical issue causing premature eviction of prefetched keys in distributed environments utilizing Tensor Parallelism (TP > 1). Previously, only a single read lock was acquired per key upon prefetch completion, leading to data eviction before all TP workers could access it. The fix involves systematically propagating the extra_count parameter, which specifies additional required read locks, through the entire prefetch pipeline. This ensures that the correct number of read locks is held for each key, preventing premature eviction and maintaining data availability for all consuming workers.

Highlights

L1Manager Lock Acquisition: The L1Manager.finish_write_and_reserve_read method now accepts an extra_count parameter, enabling it to acquire 1 + extra_count read locks per key, addressing premature eviction in Tensor Parallelism (TP > 1) scenarios.
Prefetch Path Propagation: The extra_count parameter is now correctly propagated through the entire prefetch pipeline, including StorageManager.submit_prefetch_task, PrefetchController's internal queues (_submission_queue, _pending_queue), InFlightPrefetchRequest, and _start_lookup_phase.
InFlightPrefetchRequest Update: The InFlightPrefetchRequest data structure was updated to include the extra_count field, ensuring that the required number of locks is tracked throughout the prefetch process.
No Behavior Change for Default Cases: The changes introduce no behavior modification when extra_count is 0 (the default), ensuring existing single-TP deployments remain unaffected.

Changelog

lmcache/v1/distributed/l1_manager.py
- Updated finish_write_and_reserve_read method signature to include extra_count parameter.
- Added validation for the extra_count parameter.
- Modified the read lock acquisition logic to loop 1 + extra_count times for each key.
lmcache/v1/distributed/storage_controllers/prefetch_controller.py
- Extended InFlightPrefetchRequest dataclass with a new extra_count field.
- Updated type hints for _pending_queue and _submission_queue to include extra_count in their tuple structures.
- Modified submit_prefetch_request method signature to accept extra_count.
- Updated _start_lookup_phase method signature to accept extra_count.
- Ensured _finalize_load passes request.extra_count to l1_mgr.finish_write_and_reserve_read and l1_mgr.finish_read.
lmcache/v1/distributed/storage_manager.py
- Modified submit_prefetch_task to pass the extra_count argument to the _prefetch_controller.submit_prefetch_request call.

gemini-code-assist

Code Review

This pull request addresses a bug causing premature eviction in multi-worker scenarios by propagating the extra_count parameter through the prefetch path. However, it misses a critical propagation point in the read_prefetched_results context manager, which could lead to lock leaks. Additionally, the lack of type validation for the new parameter at API boundaries poses a risk of crashing background processing threads if malformed input is provided.

gemini-code-assist · 2026-03-10T03:23:46Z

            request_id = self._prefetch_controller.submit_prefetch_request(
                remaining_keys,
                layout_desc,
+                extra_count=extra_count,


The propagation of the extra_count parameter is incomplete. While it is correctly passed to the PrefetchController in submit_prefetch_task, the read_prefetched_results context manager (lines 173-233) was not updated to accept or use this parameter.

In its finally block (line 230), it calls self._l1_manager.finish_read(good_keys) with the default extra_count=0. If the objects were originally prefetched with an extra_count > 0, this call will only release one of the multiple read locks acquired. This results in a lock leak, which prevents the objects from being evicted or deleted from the L1 cache, eventually leading to resource exhaustion and a Denial of Service of the cache manager.
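The accounting this comment describes can be sketched with a toy lock table. Everything below is an assumption for illustration (the Counter-based table, reserve_read, and the release_extra_count parameter are hypothetical, not LMCache code). Whether the leftover locks are a real leak depends on whether other consumers, such as the remaining TP workers, each call finish_read themselves; the thread does not settle that.

```python
from collections import Counter
from contextlib import contextmanager

# Toy lock table: key -> number of outstanding read locks.
read_locks: Counter = Counter()


def reserve_read(keys, extra_count=0):
    """Acquire 1 + extra_count read locks per key."""
    for key in keys:
        read_locks[key] += 1 + extra_count


def finish_read(keys, extra_count=0):
    """Release 1 + extra_count read locks per key."""
    for key in keys:
        released = 1 + extra_count
        if read_locks[key] < released:
            raise RuntimeError(f"over-release of {key!r}")
        read_locks[key] -= released


@contextmanager
def read_prefetched_results(keys, release_extra_count=0):
    """Yield the keys, then release locks in the finally block."""
    try:
        yield list(keys)
    finally:
        finish_read(keys, extra_count=release_extra_count)
```

If a key is reserved with extra_count=1 and a single consumer leaves the context manager with the default, one read lock remains outstanding until a second release happens.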

gemini-code-assist · 2026-03-10T03:23:46Z

+        extra_count = _validate_extra_count(extra_count)
+        total = 1 + extra_count


The extra_count parameter is not validated to be an integer before being used in calculations and loops. While _validate_extra_count performs range checking, it does not verify the type.

In the PrefetchController, this value is queued and processed by a background thread. If an invalid type (e.g., a string or a float) is passed to the public submit_prefetch_request API, it will eventually cause the background thread to crash with a TypeError when it attempts to perform comparisons (line 101), string formatting in the logger (line 103), or use the value in range() (line 569). A crash in the background loop disables prefetch functionality for the entire application. It is recommended to validate that extra_count is a non-negative integer at the entry points of the public APIs.
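One way to do the entry-point check recommended here is sketched below. The function name validate_extra_count_strict is hypothetical; it is not the PR's _validate_extra_count, whose exact range rules are not shown in this thread. The sketch assumes "non-negative integer" is the intended contract.

```python
def validate_extra_count_strict(extra_count: object) -> int:
    """Return extra_count if it is a non-negative int, otherwise raise.

    Raising in the caller's thread keeps a malformed value from reaching
    a background worker loop.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(extra_count, bool) or not isinstance(extra_count, int):
        raise TypeError(
            f"extra_count must be an int, got {type(extra_count).__name__}"
        )
    if extra_count < 0:
        raise ValueError(f"extra_count must be >= 0, got {extra_count}")
    return extra_count
```

Calling this at the public API boundary means a string or float fails fast with a TypeError at submission time instead of inside the queue consumer.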

ApostaC · 2026-03-10T03:39:44Z

Hey @liuyumoye, can you fix the DCO issue by running the following?

git commit --amend --signoff
git push -f

Thanks!

liuyumoye · 2026-03-10T03:44:51Z

Thanks for the reminder! Fixed the DCO issue and force-pushed.

ApostaC

Can we add some unit tests for the extra_count path in the prefetch controller?

Otherwise LGTM!

liuyumoye · 2026-03-10T08:49:56Z

Thanks for the review! I've added a TestExtraCountPrefetch test class in tests/v1/distributed/test_prefetch_controller.py covering the extra_count path:

test_extra_count_zero_default_behavior – verifies extra_count=0 degrades to the original single-lock behavior: 1 lock acquired, 1 finish_read fully releases it
test_extra_count_one_requires_two_finish_reads – verifies extra_count=1 acquires 2 locks; the key remains readable after the 1st finish_read and is only released after the 2nd
test_extra_count_three_requires_four_finish_reads – same logic with extra_count=3 (4 locks total); the key stays readable through 3 finish_read calls and is released on the 4th
test_extra_count_with_prefix_trim – verifies that when L2 has a gap, only the contiguous prefix keys get 1 + extra_count locks; keys beyond the gap are never loaded into L1
test_extra_count_non_prefix_loaded_keys_fully_released – verifies that _finalize_load correctly releases all 1 + extra_count locks for keys that were loaded but fall outside the prefix, preventing lock leaks
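The prefix-trim cases in the last two tests can be illustrated with a small model. This is a sketch under assumptions: contiguous_prefix_len and finalize_load are hypothetical helpers operating on a plain dict of lock counts, and are not the real _finalize_load.

```python
def contiguous_prefix_len(requested: list[str], loaded: set[str]) -> int:
    """Count how many leading requested keys were loaded before the first gap."""
    n = 0
    for key in requested:
        if key not in loaded:
            break
        n += 1
    return n


def finalize_load(requested: list[str], loaded: set[str], extra_count: int = 0):
    """Return (lock counts per loaded key, prefix keys kept for readers)."""
    locks = {}
    # Every key that was actually loaded first receives 1 + extra_count locks.
    for key in requested:
        if key in loaded:
            locks[key] = 1 + extra_count
    prefix = requested[:contiguous_prefix_len(requested, loaded)]
    # Loaded keys past the gap are released in full so no lock is left behind.
    for key in requested[len(prefix):]:
        if key in locks:
            locks[key] -= 1 + extra_count
    return locks, prefix
```

For requested keys k0..k3 with k2 missing and extra_count=1, k0 and k1 keep two locks each, while k3 ends with zero.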

maobaolong · 2026-03-10T11:45:36Z

@liuyumoye You may need to update your branch against dev; otherwise the unit tests will not pass.

…nt premature eviction Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>

ApostaC

LGTM!

maobaolong

LGTM @liuyumoye Thanks for this fix.

…nt premature eviction (LMCache#2725) Signed-off-by: liuyumoye <adeline_ly2023@outlook.com> Co-authored-by: liuyumoye <adeline_ly2023@outlook.com> Signed-off-by: shaoxiawjc <wjc2800@163.com>

…nt premature eviction (LMCache#2725) Signed-off-by: liuyumoye <adeline_ly2023@outlook.com> Co-authored-by: liuyumoye <adeline_ly2023@outlook.com> Signed-off-by: Aaron Wu <aaron.wu@dell.com>

…nt premature eviction (LMCache#2725) Signed-off-by: liuyumoye <adeline_ly2023@outlook.com> Co-authored-by: liuyumoye <adeline_ly2023@outlook.com>

gemini-code-assist Bot reviewed Mar 10, 2026

liuyumoye force-pushed the dev branch from da2a749 to 289e619 on March 10, 2026, 03:42

liuyumoye closed this Mar 10, 2026

liuyumoye reopened this Mar 10, 2026

ApostaC reviewed Mar 10, 2026

liuyumoye force-pushed the dev branch from 289e619 to 09e5b2f on March 10, 2026, 08:17

fix(l1_manager): propagate extra_count through prefetch path to preve…

f632976

…nt premature eviction Signed-off-by: liuyumoye <adeline_ly2023@outlook.com>

liuyumoye force-pushed the dev branch from 09e5b2f to f632976 on March 10, 2026, 12:39

ApostaC approved these changes Mar 10, 2026

maobaolong approved these changes Mar 11, 2026

maobaolong enabled auto-merge (squash) March 11, 2026 02:12

github-actions Bot added the "full" label (Run comprehensive tests on this PR) Mar 11, 2026

maobaolong merged commit 038cd7c into LMCache:dev Mar 11, 2026
26 of 29 checks passed

maobaolong merged 1 commit into LMCache:dev from liuyumoye:dev

liuyumoye commented Mar 10, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot commented Mar 10, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Mar 10, 2026

Uh oh!

gemini-code-assist Bot Mar 10, 2026

Uh oh!

ApostaC commented Mar 10, 2026

Uh oh!

liuyumoye commented Mar 10, 2026

Uh oh!

ApostaC left a comment

Uh oh!

liuyumoye commented Mar 10, 2026 •

edited

Loading

Uh oh!

maobaolong commented Mar 10, 2026

Uh oh!

ApostaC left a comment

Uh oh!

maobaolong left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		extra_count = _validate_extra_count(extra_count)
		total = 1 + extra_count

Conversation

liuyumoye commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist Bot commented Mar 10, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

ApostaC commented Mar 10, 2026

Uh oh!

liuyumoye commented Mar 10, 2026

Uh oh!

ApostaC left a comment

Choose a reason for hiding this comment

Uh oh!

liuyumoye commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

maobaolong commented Mar 10, 2026

Uh oh!

ApostaC left a comment

Choose a reason for hiding this comment

Uh oh!

maobaolong left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

liuyumoye commented Mar 10, 2026 •

edited

Loading

liuyumoye commented Mar 10, 2026 •

edited

Loading