fix: replace global lock with per-device transfer_lock to prevent deadlock by maobaolong · Pull Request #2816 · LMCache/LMCache

maobaolong · 2026-03-18T14:16:44Z

What this PR does / why we need it:

The global self.lock in MPCacheEngine was acquired inside a
torch.cuda.device() context, creating a circular dependency
with the implicit CUDA driver lock (ABBA deadlock).
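
The ABBA pattern named here can be reproduced with two ordinary locks. The sketch below is only an illustration: `driver_lock` is a hypothetical stand-in for the implicit CUDA driver lock, `global_lock` stands in for `self.lock`, and the thread bodies are invented for the demo rather than taken from LMCache. Timeouts are used so the demo reports the stall instead of hanging.

```python
import threading


def abba_demo(timeout=0.2):
    """Two threads take the same two locks in opposite order.

    Both locks are plain stand-ins (assumptions for illustration):
      driver_lock ~ the implicit CUDA driver lock
      global_lock ~ the engine-wide self.lock
    """
    driver_lock = threading.Lock()
    global_lock = threading.Lock()
    # First barrier: each thread already owns its first lock.
    both_hold_first = threading.Barrier(2)
    # Second barrier: nobody releases until both have tried the second lock.
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # With a blocking acquire() this is where both threads would hang.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            both_tried_second.wait()
            if acquired:
                second.release()

    threads = [
        threading.Thread(
            target=worker,
            args=("driver_then_global", driver_lock, global_lock),
        ),
        threading.Thread(
            target=worker,
            args=("global_then_driver", global_lock, driver_lock),
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    # Neither thread obtains its second lock: the circular wait in miniature.
    print(abba_demo())
```

Each thread holds the lock the other one needs, so neither attempt succeeds. Removing one of the two locks from the cycle, or always taking them in the same order, breaks the circular wait.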

Replace the global lock with a per-device transfer_lock on
GPUCacheContext so that GPU↔CPU transfers on the same device
are serialised without cross-lock contention.
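
A minimal sketch of the per-device idea, assuming only what the description states (a `transfer_lock` attribute on `GPUCacheContext`). The constructor signature, the `device_id` field, and the `copy_to_cpu` helper are hypothetical, and a list copy stands in for the real GPU↔CPU transfer.

```python
import threading


class GPUCacheContext:
    """Hypothetical stand-in; only the transfer_lock name comes from the PR."""

    def __init__(self, device_id):
        self.device_id = device_id
        # One lock per device context instead of one lock for the whole engine.
        self.transfer_lock = threading.Lock()


def copy_to_cpu(gpu_context, gpu_buffer):
    """Serialise transfers for one device; list() stands in for the real copy."""
    with gpu_context.transfer_lock:
        return list(gpu_buffer)


ctx0 = GPUCacheContext(0)
ctx1 = GPUCacheContext(1)

# While device 0 is busy, its own lock is unavailable but device 1's is free.
with ctx0.transfer_lock:
    same_device_free = ctx0.transfer_lock.acquire(blocking=False)
    other_device_free = ctx1.transfer_lock.acquire(blocking=False)
    if other_device_free:
        ctx1.transfer_lock.release()
```

Because each context owns its lock, a long transfer on one device does not hold up transfers on another.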

Special notes for your reviewers:

If applicable:

this PR contains user facing changes - docs added
this PR contains unit tests

…dlock Signed-off-by: baoloongmao <baoloongmao@tencent.com>

chatgpt-codex-connector · 2026-03-18T14:16:53Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

gemini-code-assist · 2026-03-18T14:17:09Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical deadlock issue in the MPCacheEngine by refactoring the locking mechanism used for GPU-CPU data transfers. By moving from a single global lock to individual locks associated with each GPU device context, the change eliminates a circular dependency with the CUDA driver's implicit lock, thereby improving concurrency and stability during data operations without introducing new contention points.

Highlights

Deadlock Prevention: Replaced a global lock (self.lock) with a per-device transfer_lock to prevent an ABBA deadlock scenario that occurred when the global lock was acquired within a torch.cuda.device() context, conflicting with the implicit CUDA driver lock.
Granular Locking: Introduced a threading.Lock named transfer_lock within each GPUCacheContext instance, ensuring that GPU↔CPU data transfers are serialized on a per-device basis without blocking transfers on other devices.
Lock Usage Update: Modified the store and _retrieve_loop methods in MPCacheEngine to utilize the new gpu_context.transfer_lock instead of the problematic global self.lock for all data transfer operations.
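
The lock usage described in these highlights might look roughly like the sketch below. It is an assumption-laden illustration, not the code from the PR: `EngineSketch`, its `store` and `retrieve` methods, and the bookkeeping counters are hypothetical, and `time.sleep` stands in for the actual data transfer. Both directions take the same `gpu_context.transfer_lock`, so at most one transfer per device is in flight at a time.

```python
import threading
import time
from collections import defaultdict


class GPUCacheContext:
    """Hypothetical context holding the per-device lock."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.transfer_lock = threading.Lock()


class EngineSketch:
    """Hypothetical engine; not the real MPCacheEngine."""

    def __init__(self, num_devices):
        self.contexts = [GPUCacheContext(i) for i in range(num_devices)]
        self._stats_lock = threading.Lock()  # protects the counters only
        self.active = defaultdict(int)       # transfers in flight per device
        self.max_active = defaultdict(int)   # peak in-flight count per device
        self.completed = 0

    def _transfer(self, gpu_context):
        device = gpu_context.device_id
        with gpu_context.transfer_lock:
            with self._stats_lock:
                self.active[device] += 1
                self.max_active[device] = max(
                    self.max_active[device], self.active[device]
                )
            time.sleep(0.01)  # placeholder for the GPU<->CPU copy
            with self._stats_lock:
                self.active[device] -= 1
                self.completed += 1

    def store(self, device_id):
        """GPU -> CPU direction."""
        self._transfer(self.contexts[device_id])

    def retrieve(self, device_id):
        """CPU -> GPU direction."""
        self._transfer(self.contexts[device_id])


def run_demo(num_devices=2, transfers_per_device=4):
    engine = EngineSketch(num_devices)
    threads = []
    for device_id in range(num_devices):
        for i in range(transfers_per_device):
            target = engine.store if i % 2 == 0 else engine.retrieve
            threads.append(threading.Thread(target=target, args=(device_id,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return engine


if __name__ == "__main__":
    engine = run_demo()
    print(engine.completed, dict(engine.max_active))
```

The peak in-flight count stays at one per device, while nothing in the sketch prevents transfers on different devices from overlapping.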

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a per-device transfer_lock in GPUCacheContext to serialize GPU↔CPU data transfers, replacing the previous global lock. The change aims to prevent deadlocks with the implicit CUDA driver lock and to allow concurrent transfers across different devices. The server.py file has been updated to use this per-device lock both when storing data (GPU to CPU) and when retrieving data (CPU to GPU).

ApostaC

LGTM!

chunxiaozheng

lgtm

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com> Signed-off-by: Aaron Wu <aaron.wu@dell.com>

fix: replace global lock with per-device transfer_lock to prevent dea…

89ca3cb

…dlock Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong requested a review from ApostaC March 18, 2026 14:16

gemini-code-assist Bot reviewed Mar 18, 2026

ApostaC approved these changes Mar 19, 2026

ApostaC added the "full" label (Run comprehensive tests on this PR) Mar 19, 2026

maobaolong enabled auto-merge (squash) March 19, 2026 09:16

chunxiaozheng approved these changes Mar 19, 2026

maobaolong merged commit 9b4c713 into LMCache:dev Mar 19, 2026
27 of 29 checks passed

hyunyul-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Mar 20, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

a571e1e

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 21, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

c3c58f4

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 25, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

5750263

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

996cbbd

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

a08ff50

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026

fix: replace global lock with per-device transfer_lock to prevent dea…

441bad6

…dlock (LMCache#2816) Signed-off-by: baoloongmao <baoloongmao@tencent.com>

maobaolong merged 1 commit into LMCache:dev from maobaolong:store_deadlock
