
Support non-contiguous alloc in MemoryAllocator #2767

Merged
ApostaC merged 7 commits into LMCache:dev from chunxiaozheng:mp-memory-allocate on Mar 19, 2026

Conversation

@chunxiaozheng (Collaborator) commented on Mar 13, 2026

Support non-contiguous alloc in MemoryAllocator.

As running time increases, the last block of each request is always the first to be evicted, which eventually makes it impossible to allocate a large contiguous region.
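The fragmentation problem described above can be seen with a toy free-list model (illustration only, not LMCache code): evicting the last block of every request leaves many small scattered holes, so a later large contiguous request fails even though enough total space is free.

```python
# Toy model of the fragmentation problem (not LMCache code).
# A 100-unit pool is fully used by 10 requests of 10 units each;
# evicting the last 2-unit block of every request leaves ten
# scattered 2-unit holes.

def can_alloc_contiguous(free_ranges, size):
    """Return True if any single free range can hold `size` units."""
    return any(end - start >= size for start, end in free_ranges)

def total_free(free_ranges):
    """Total free space across all ranges."""
    return sum(end - start for start, end in free_ranges)

free_ranges = [(i * 10 + 8, i * 10 + 10) for i in range(10)]

print(total_free(free_ranges))                # 20 units free in total,
print(can_alloc_contiguous(free_ranges, 16))  # but no single 16-unit hole: False
print(can_alloc_contiguous(free_ranges, 2))   # while 2-unit allocations still fit: True
```

With non-contiguous allocation, the 20 free units could still serve a batched request as ten individual 2-unit allocations instead of failing.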

@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the memory allocation system by introducing a new configuration option, contiguous_alloc, which allows users to specify whether memory should be allocated contiguously or non-contiguously. This provides greater flexibility in memory management strategies, enabling optimization for different workloads and potentially improving resource utilization by allowing the system to allocate memory in smaller, non-adjacent blocks when a single large contiguous block is not feasible or desired.

Highlights

  • New Configuration Option: Introduced a contiguous_alloc parameter across various memory allocator configurations and implementations to control memory allocation strategy.
  • Non-Contiguous Allocation Support: Enabled non-contiguous memory allocation within the TensorMemoryAllocator's batched_allocate method, allowing individual allocations instead of a single large block.
  • Command-Line Argument: Exposed a new command-line argument, --l1-contiguous-alloc, to allow users to easily configure the contiguous allocation behavior for L1 memory.
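The contiguous/non-contiguous switch in `batched_allocate` can be sketched roughly as follows. This is a simplified model, not the actual `TensorMemoryAllocator` code; the helper names `_allocate_one` and `_allocate_contiguous`, and the rollback-on-partial-failure behavior, are assumptions for illustration.

```python
from typing import List, Optional

class SketchAllocator:
    """Simplified model of the contiguous_alloc switch in batched_allocate
    (illustration only, not LMCache's implementation)."""

    def __init__(self, contiguous_alloc: bool = True):
        self.contiguous_alloc = contiguous_alloc

    def _allocate_one(self, size: int):
        # Placeholder for a single allocation from a free list;
        # returns None on failure in the real system.
        return ("buf", size)

    def _allocate_contiguous(self, sizes: List[int]):
        # Old behavior: one big block covering the whole batch,
        # schematically returned as views into the same block.
        block = self._allocate_one(sum(sizes))
        return None if block is None else [block] * len(sizes)

    def batched_allocate(self, sizes: List[int]) -> Optional[list]:
        if self.contiguous_alloc:
            return self._allocate_contiguous(sizes)
        # New behavior: allocate each object individually so scattered
        # free space can still be used when no large hole exists.
        objs = []
        for size in sizes:
            obj = self._allocate_one(size)
            if obj is None:
                # Roll back already-allocated objects on partial failure
                # (whether the real code rolls back is an assumption here).
                return None
            objs.append(obj)
        return objs
```

The trade-off the review discusses is visible in the loop: non-contiguous mode issues one allocator call per object rather than a single call per batch.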


Changelog
  • docs/source/mp/configuration.rst
    • Documented the new --l1-contiguous-alloc command-line argument.
  • lmcache/v1/config.py
    • Added a contiguous_alloc boolean configuration option with a default of True.
  • lmcache/v1/distributed/config.py
    • Added contiguous_alloc field to L1MemoryManagerConfig.
    • Integrated --l1-contiguous-alloc as a new command-line argument.
    • Passed the l1_contiguous_alloc argument to the L1MemoryManagerConfig constructor.
  • lmcache/v1/distributed/memory_manager.py
    • Modified create_memory_allocator to pass the contiguous_alloc parameter to LazyMemoryAllocator and MixedMemoryAllocator.
    • Added logging to indicate the contiguous_alloc setting for both allocator types.
  • lmcache/v1/lazy_memory_allocator.py
    • Added contiguous_alloc parameter to the LazyMemoryAllocator constructor and passed it to the underlying AddressManager.
  • lmcache/v1/memory_management.py
    • Added contiguous_alloc parameter to TensorMemoryAllocator's constructor and stored it.
    • Implemented logic in TensorMemoryAllocator.batched_allocate to perform individual allocations when contiguous_alloc is False.
    • Updated MixedMemoryAllocator to pass contiguous_alloc to its TensorMemoryAllocator instance.
  • lmcache/v1/storage_backend/local_cpu_backend.py
    • Passed the contiguous_alloc configuration to the LazyMemoryAllocator and MixedMemoryAllocator constructors during initialization.
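A boolean flag like `--l1-contiguous-alloc` with a `True` default could be wired up with `argparse` roughly as below. This is a hedged sketch; the actual option wiring in `lmcache/v1/distributed/config.py` may differ, and the `str2bool` helper is an assumption.

```python
import argparse

def str2bool(s: str) -> bool:
    # Hypothetical helper: accept common truthy spellings on the CLI.
    return s.lower() in ("true", "1", "yes")

parser = argparse.ArgumentParser()
parser.add_argument(
    "--l1-contiguous-alloc",
    type=str2bool,
    default=True,
    help="Allocate batched L1 memory as one contiguous block (default: True).",
)

# argparse maps the dashed flag to the attribute l1_contiguous_alloc.
args = parser.parse_args(["--l1-contiguous-alloc", "false"])
print(args.l1_contiguous_alloc)  # False
```

Omitting the flag keeps the previous contiguous behavior, matching the `True` default documented in the changelog.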

@gemini-code-assist (Contributor, Bot) left a comment


Code Review

The pull request introduces a new contiguous_alloc configuration option, defaulting to True, which controls whether batched memory allocations are performed as a single contiguous block or as individual allocations. The change updates configuration files, argument parsing, memory manager constructors, and AddressManager's batched_allocate method, which now handles non-contiguous allocation by iterating over individual allocations. Review feedback suggests improving the docstring for contiguous_alloc in L1MemoryManagerConfig for consistency with the CLI documentation, and raises a potential performance concern about the iterative approach to non-contiguous batched allocation, recommending exploration of more optimized individual allocation strategies.

Comment thread lmcache/v1/distributed/config.py Outdated
Comment thread lmcache/v1/memory_management.py Outdated
@maobaolong (Collaborator) left a comment


LGTM, Thanks for the fix.

@chunxiaozheng chunxiaozheng requested a review from ApostaC March 13, 2026 12:39
@chunxiaozheng (Collaborator, Author) commented:

@ApostaC could you help take a look?

@ApostaC (Contributor) left a comment


I like the implementation in the AddressManager. My proposal is to make it the default and completely discard the previous "contiguous" mode. That way we don't need to change the config and CLI, or pass the new argument all the way down to the address manager.

Additionally, please add unit tests for AddressManager.batched_allocate.

Comment thread lmcache/v1/memory_management.py Outdated
Comment thread lmcache/v1/memory_management.py Outdated
Signed-off-by: idellzheng <idellzheng@tencent.com>
@ApostaC ApostaC added the mp Buildkite trigger for multi-processing mode test label Mar 18, 2026
@ApostaC (Contributor) left a comment


Please fix the issue in the logger output. Otherwise LGTM!

Comment thread lmcache/v1/distributed/memory_manager.py Outdated
Comment thread lmcache/v1/distributed/memory_manager.py Outdated
@chunxiaozheng (Collaborator, Author) commented:

@ApostaC Thanks for your review. I have updated the PR; could you take another look?

Signed-off-by: idellzheng <idellzheng@tencent.com>
@chunxiaozheng chunxiaozheng added the full Run comprehensive tests on this PR label Mar 19, 2026
@ApostaC (Contributor) left a comment


LGTM!

@ApostaC ApostaC merged commit 7e2857d into LMCache:dev Mar 19, 2026
26 of 28 checks passed
hyunyul-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Mar 20, 2026

* Support non-contiguous alloc in MemoryAllocator
* optimize lock cost

Signed-off-by: idellzheng <idellzheng@tencent.com>

realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 21, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 25, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026