Revert "[MP] Support MP Server restart (#2713)" by ApostaC · Pull Request #2729 · LMCache/LMCache

ApostaC · 2026-03-10T19:39:53Z

Reverts #2713

See the comments for the reason: #2713 (review)

gemini-code-assist · 2026-03-10T19:40:21Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses concerns with the previously implemented multi-process server restart support by reverting the entire feature. The core change involves simplifying the client-server communication for cache operations, moving from a detailed OperationStatus enum to a straightforward boolean success indicator. This streamlines the interaction and removes the client-side complexity of handling server restarts and re-registration, effectively removing the server restart resilience capability.

Highlights

Revert of Server Restart Support: This pull request fully reverts the changes introduced in [MP] Support MP Server restart #2713, which aimed to add multi-process server restart support.
Removal of OperationStatus Enum: The OperationStatus enum, previously used to indicate the outcome of store and retrieve operations, has been entirely removed from the codebase.
Simplified Store/Retrieve Results: The return types for store and retrieve operations have been simplified from a status integer (OperationStatus) to a boolean flag, indicating success or failure.
Removal of Auto Re-registration Logic: The client-side logic for detecting unregistered instances and automatically re-registering KV caches after a server restart has been removed.
Removed Server Restart Tests: The dedicated test file test_server_restart.py, which covered the server restart resilience scenario, has been removed.
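
As a rough illustration of the two result conventions described in the highlights above, here is a minimal sketch. The member names SUCCESS and NOT_REGISTERED appear in this PR's changelog; the numeric values, the FAILURE member, and both helper functions are assumptions made for illustration and are not taken from the LMCache source.

```python
from enum import IntEnum


class OperationStatus(IntEnum):
    """Hypothetical stand-in for the removed enum; the values are assumed."""

    SUCCESS = 0
    FAILURE = 1  # assumed member, not confirmed by the PR text
    NOT_REGISTERED = 2


def needs_reregister(status: int) -> bool:
    """Pre-revert style: a status code lets the client tell why a call failed."""
    return status == OperationStatus.NOT_REGISTERED


def status_to_bool(status: int) -> bool:
    """Post-revert style: only success or failure is reported."""
    return status == OperationStatus.SUCCESS
```

With a plain boolean, a client can no longer distinguish "instance not registered" from any other failure, which is consistent with the re-registration logic being removed in the same revert.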

Changelog

lmcache/integration/vllm/vllm_multi_process_adapter.py
- Removed OperationStatus import.
- Changed StoreResult and RetrieveResult type hints from int to bool.
- Removed the registered attribute and _do_register method.
- Modified register_kv_caches to directly send the registration request.
- Removed auto re-registration logic and need_reregister flag from get_finished.
- Updated error logging for store/retrieve results to check for boolean failure.
lmcache/v1/multiprocess/custom_types.py
- Removed IntEnum import.
- Removed the OperationStatus enum definition.
lmcache/v1/multiprocess/futures.py
- Removed self.is_done_.set() and related conditional logic in _on_raw_future_complete.
- Removed is_done_.is_set() checks in wait and query methods.
- Replaced if self.event_ is not None: with assert self.event_ is not None.
lmcache/v1/multiprocess/protocols/engine.py
- Updated response_class for STORE and RETRIEVE protocols from tuple[bytes, int] to tuple[bytes, bool].
- Updated docstrings to reflect boolean success flags for store/retrieve operations.
lmcache/v1/multiprocess/server.py
- Removed OperationStatus import.
- Changed return type hints for store and retrieve methods to tuple[bytes, bool].
- Updated docstrings for store and retrieve to reflect boolean success flags.
- Replaced warning and OperationStatus.NOT_REGISTERED return with an assert for unregistered instances.
- Changed return values from OperationStatus codes to True or False.
tests/v1/multiprocess/test_blend_server.py
- Removed OperationStatus import.
- Updated assertions from OperationStatus.SUCCESS to True.
tests/v1/multiprocess/test_blend_server_v2.py
- Removed OperationStatus import.
- Updated assertions from OperationStatus.SUCCESS to True.
tests/v1/multiprocess/test_cache_server.py
- Removed OperationStatus import.
- Updated assertions from OperationStatus.SUCCESS to True.
- Modified retrieve_keys to append the raw boolean result instead of comparing to OperationStatus.SUCCESS.
tests/v1/multiprocess/test_mq.py
- Removed OperationStatus import.
- Updated docstrings for test_mq_store and test_mq_retrieve to reflect (bytes, bool) return type.
- Updated expected_response in helper.run_test from OperationStatus.SUCCESS to True.
tests/v1/multiprocess/test_mq_handler_helpers.py
- Removed OperationStatus import.
- Updated type hints for store_handler and retrieve_handler to tuple[bytes, bool].
- Updated docstrings to reflect success flag.
- Changed return values from OperationStatus.SUCCESS to True.
tests/v1/multiprocess/test_server_restart.py
- Removed the entire test file.

Activity

This pull request is a direct revert of [MP] Support MP Server restart #2713, indicating that the previous implementation for multi-process server restart support encountered issues or was deemed not ready for production, as highlighted by the linked review comments.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu>

gemini-code-assist

Code Review

This pull request successfully reverts the support for MP Server restart, correctly removing the OperationStatus enum, related logic, and the server restart test file.

However, a potential Denial of Service (DoS) vulnerability was identified in the MPCacheEngine.store and MPCacheEngine.retrieve methods in lmcache/v1/multiprocess/server.py. Using assert statements to validate instance_id from external requests could lead to unhandled exceptions and server crashes; if assertions are disabled in production, the check is skipped altogether. It is recommended to replace these assertions with explicit error handling and return a failure status to the client.

Minor suggestions have also been made to improve logging consistency and fix a typo.
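
The recommendation in the review above can be sketched roughly as follows. ToyEngine, its registered set, and both method names are hypothetical stand-ins written for this illustration; they are not the real MPCacheEngine API, and only the (bytes, bool) return shape is taken from the changelog.

```python
import logging

logger = logging.getLogger(__name__)


class ToyEngine:
    """Hypothetical stand-in for a cache engine; not the real MPCacheEngine."""

    def __init__(self) -> None:
        self.registered: set[int] = set()

    def store_with_assert(self, key: bytes, instance_id: int) -> tuple[bytes, bool]:
        # Raises AssertionError for an unknown instance. Under `python -O`
        # the assert is stripped, so the check does not run at all.
        assert instance_id in self.registered
        return key, True

    def store_with_check(self, key: bytes, instance_id: int) -> tuple[bytes, bool]:
        # Explicit validation: always runs and reports failure to the caller.
        if instance_id not in self.registered:
            logger.warning("store from unregistered instance_id=%s", instance_id)
            return key, False
        return key, True
```

The explicit check keeps the boolean result convention of this revert while letting the server reject an unknown instance_id without raising.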

gemini-code-assist · 2026-03-10T19:43:52Z

+                    "store request for request_id=%s",
                     request_id,

For consistency with the error logging for retrieve requests, consider including the s_result in the log message. This will be helpful for debugging failed store requests.

Suggested change

-                    "store request for request_id=%s",
-                    request_id,
+                    "store request for request_id=%s, result=%s",
+                    request_id, s_result,
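
A self-contained sketch of the suggested logging call is shown below. The logger name and the log_store_failure wrapper are hypothetical, and the format string is only the fragment visible in the diff, not the full message used in the adapter.

```python
import logging

logger = logging.getLogger("store_demo")  # hypothetical logger name


def log_store_failure(request_id: str, s_result: bool) -> None:
    # Arguments are passed separately so the message is only formatted
    # if a handler actually emits the record.
    logger.error(
        "store request for request_id=%s, result=%s",
        request_id,
        s_result,
    )
```

Including the result value mirrors what the review says the retrieve path already logs.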

sammshen

LGTM!

DongDongJu

LGTM.

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: shaoxiawjc <wjc2800@163.com>

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu> Signed-off-by: Aaron Wu <aaron.wu@dell.com>

ApostaC requested review from DongDongJu and maobaolong March 10, 2026 19:40

Revert "[MP] Support MP Server restart (#2713)"

b9f439d

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu>

ApostaC force-pushed the revert-2713-mp_server_restart branch from 5a39ed1 to b9f439d Compare March 10, 2026 19:41

gemini-code-assist Bot reviewed Mar 10, 2026

ApostaC added the "full" label (Run comprehensive tests on this PR) Mar 10, 2026

sammshen approved these changes Mar 10, 2026

sammshen reviewed Mar 10, 2026

sammshen approved these changes Mar 10, 2026

Oasis-Git approved these changes Mar 10, 2026

DongDongJu approved these changes Mar 10, 2026


ApostaC merged commit 9af5412 into dev Mar 10, 2026
30 of 37 checks passed

jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026

Revert "[MP] Support MP Server restart (LMCache#2713)" (LMCache#2729)

4ea4763

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu>

jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026

Revert "[MP] Support MP Server restart (LMCache#2713)" (LMCache#2729)

fd6d84e

This reverts commit 55622e8. Signed-off-by: ApostaC <yihua98@uchicago.edu>
