
Support non-contiguous alloc in MemoryAllocator #2767

Merged
ApostaC merged 7 commits into LMCache:dev from chunxiaozheng:mp-memory-allocate on Mar 19, 2026

Conversation

@chunxiaozheng (Collaborator) commented on Mar 13, 2026

Support non-contiguous alloc in MemoryAllocator.

As running time increases, the last block of each request is always the first to be evicted, which eventually makes it impossible to allocate a large contiguous region.
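The fragmentation problem described above can be seen with a toy free-list model (illustration only, not LMCache code): evicting the last block of every request leaves many small scattered holes, so a later large contiguous request fails even though enough total space is free.

```python
# Toy model of the fragmentation problem (not LMCache code).
# A 100-unit pool is fully used by 10 requests of 10 units each;
# evicting the last 2-unit block of every request leaves ten
# scattered 2-unit holes.

def can_alloc_contiguous(free_ranges, size):
    """Return True if any single free range can hold `size` units."""
    return any(end - start >= size for start, end in free_ranges)

def total_free(free_ranges):
    """Total free space across all ranges."""
    return sum(end - start for start, end in free_ranges)

free_ranges = [(i * 10 + 8, i * 10 + 10) for i in range(10)]

print(total_free(free_ranges))                # 20 units free in total,
print(can_alloc_contiguous(free_ranges, 16))  # but no single 16-unit hole: False
print(can_alloc_contiguous(free_ranges, 2))   # while 2-unit allocations still fit: True
```

With non-contiguous allocation, the 20 free units could still serve a batched request as ten individual 2-unit allocations instead of failing.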

@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the memory allocation system by introducing a new configuration option, contiguous_alloc, which allows users to specify whether memory should be allocated contiguously or non-contiguously. This provides greater flexibility in memory management strategies, enabling optimization for different workloads and potentially improving resource utilization by allowing the system to allocate memory in smaller, non-adjacent blocks when a single large contiguous block is not feasible or desired.

Highlights

  • New Configuration Option: Introduced a contiguous_alloc parameter across various memory allocator configurations and implementations to control memory allocation strategy.
  • Non-Contiguous Allocation Support: Enabled non-contiguous memory allocation within the TensorMemoryAllocator's batched_allocate method, allowing individual allocations instead of a single large block.
  • Command-Line Argument: Exposed a new command-line argument, --l1-contiguous-alloc, to allow users to easily configure the contiguous allocation behavior for L1 memory.
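The contiguous/non-contiguous switch in `batched_allocate` can be sketched roughly as follows. This is a simplified model, not the actual `TensorMemoryAllocator` code; the helper names `_allocate_one` and `_allocate_contiguous`, and the rollback-on-partial-failure behavior, are assumptions for illustration.

```python
from typing import List, Optional

class SketchAllocator:
    """Simplified model of the contiguous_alloc switch in batched_allocate
    (illustration only, not LMCache's implementation)."""

    def __init__(self, contiguous_alloc: bool = True):
        self.contiguous_alloc = contiguous_alloc

    def _allocate_one(self, size: int):
        # Placeholder for a single allocation from a free list;
        # returns None on failure in the real system.
        return ("buf", size)

    def _allocate_contiguous(self, sizes: List[int]):
        # Old behavior: one big block covering the whole batch,
        # schematically returned as views into the same block.
        block = self._allocate_one(sum(sizes))
        return None if block is None else [block] * len(sizes)

    def batched_allocate(self, sizes: List[int]) -> Optional[list]:
        if self.contiguous_alloc:
            return self._allocate_contiguous(sizes)
        # New behavior: allocate each object individually so scattered
        # free space can still be used when no large hole exists.
        objs = []
        for size in sizes:
            obj = self._allocate_one(size)
            if obj is None:
                # Roll back already-allocated objects on partial failure
                # (whether the real code rolls back is an assumption here).
                return None
            objs.append(obj)
        return objs
```

The trade-off the review discusses is visible in the loop: non-contiguous mode issues one allocator call per object rather than a single call per batch.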


Changelog
  • docs/source/mp/configuration.rst
    • Documented the new --l1-contiguous-alloc command-line argument.
  • lmcache/v1/config.py
    • Added a contiguous_alloc boolean configuration option with a default of True.
  • lmcache/v1/distributed/config.py
    • Added contiguous_alloc field to L1MemoryManagerConfig.
    • Integrated --l1-contiguous-alloc as a new command-line argument.
    • Passed the l1_contiguous_alloc argument to the L1MemoryManagerConfig constructor.
  • lmcache/v1/distributed/memory_manager.py
    • Modified create_memory_allocator to pass the contiguous_alloc parameter to LazyMemoryAllocator and MixedMemoryAllocator.
    • Added logging to indicate the contiguous_alloc setting for both allocator types.
  • lmcache/v1/lazy_memory_allocator.py
    • Added contiguous_alloc parameter to the LazyMemoryAllocator constructor and passed it to the underlying AddressManager.
  • lmcache/v1/memory_management.py
    • Added contiguous_alloc parameter to TensorMemoryAllocator's constructor and stored it.
    • Implemented logic in TensorMemoryAllocator.batched_allocate to perform individual allocations when contiguous_alloc is False.
    • Updated MixedMemoryAllocator to pass contiguous_alloc to its TensorMemoryAllocator instance.
  • lmcache/v1/storage_backend/local_cpu_backend.py
    • Passed the contiguous_alloc configuration to the LazyMemoryAllocator and MixedMemoryAllocator constructors during initialization.
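A boolean flag like `--l1-contiguous-alloc` with a `True` default could be wired up with `argparse` roughly as below. This is a hedged sketch; the actual option wiring in `lmcache/v1/distributed/config.py` may differ, and the `str2bool` helper is an assumption.

```python
import argparse

def str2bool(s: str) -> bool:
    # Hypothetical helper: accept common truthy spellings on the CLI.
    return s.lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser()
parser.add_argument(
    "--l1-contiguous-alloc",
    type=str2bool,
    default=True,
    help="Allocate batched L1 memory as one contiguous block (default: True).",
)

# argparse maps the dashed flag to the attribute l1_contiguous_alloc.
args = parser.parse_args(["--l1-contiguous-alloc", "false"])
print(args.l1_contiguous_alloc)  # False
```

Omitting the flag keeps the previous contiguous behavior, matching the `True` default documented in the changelog.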

@gemini-code-assist (Contributor, Bot) left a comment


Code Review

The pull request introduces a new contiguous_alloc configuration option, defaulting to True, which controls whether batched memory allocations are performed as a single contiguous block or as individual allocations. The change updates configuration files, argument parsing, memory manager constructors, and AddressManager's batched_allocate method, which now handles non-contiguous allocation by iterating over individual allocations. Review feedback suggests improving the docstring for contiguous_alloc in L1MemoryManagerConfig for consistency with the CLI documentation, and raises a potential performance concern about the iterative approach to non-contiguous batched allocation, recommending exploration of more optimized individual allocation strategies.

Comment thread lmcache/v1/distributed/config.py Outdated
Comment thread lmcache/v1/memory_management.py Outdated
@maobaolong (Collaborator) left a comment


LGTM, Thanks for the fix.

@chunxiaozheng chunxiaozheng requested a review from ApostaC March 13, 2026 12:39
@chunxiaozheng (Collaborator, Author) commented:

@ApostaC could you help take a look?

@ApostaC (Contributor) left a comment


I like the implementation in the AddressManager. My proposal is to make it the default and completely discard the previous "contiguous" mode. That way we don't need to change the config and CLI, or pass the new argument all the way down to the address manager.

Additionally, please add unit tests for AddressManager.batched_allocate.

Comment thread lmcache/v1/memory_management.py Outdated
Comment thread lmcache/v1/memory_management.py Outdated
Signed-off-by: idellzheng <idellzheng@tencent.com>
@ApostaC ApostaC added the mp Buildkite trigger for multi-processing mode test label Mar 18, 2026
@ApostaC (Contributor) left a comment


Please fix the issue in the logger output. Otherwise LGTM!

Comment thread lmcache/v1/distributed/memory_manager.py Outdated
Comment thread lmcache/v1/distributed/memory_manager.py Outdated
@chunxiaozheng (Collaborator, Author) commented:

@ApostaC Thanks for your review. I have updated the PR; could you take another look?

Signed-off-by: idellzheng <idellzheng@tencent.com>
@chunxiaozheng chunxiaozheng added the full Run comprehensive tests on this PR label Mar 19, 2026
@ApostaC (Contributor) left a comment


LGTM!

@ApostaC ApostaC merged commit 7e2857d into LMCache:dev Mar 19, 2026
26 of 28 checks passed
hyunyul-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Mar 20, 2026

* Support non-contiguous alloc in MemoryAllocator
* optimize lock cost

Signed-off-by: idellzheng <idellzheng@tencent.com>

realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 21, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 25, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026