[1/N] Support NIXL-based L2 storage in MP mode#2664
Conversation
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the LMCache's distributed storage capabilities by introducing support for NIXL as an L2 storage backend. It refines memory management by enabling lazy initialization for paged allocators, ensuring that memory is allocated efficiently and correctly based on actual KV cache requirements. These changes facilitate more robust and performant L2 caching in multi-process setups, particularly when dealing with large-scale language models. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for NIXL-based L2 storage in multi-process mode, including a new NIXL L2 adapter, configuration options, and updates to the memory management system for paged memory allocation. However, several security issues were identified, including potential Denial of Service vulnerabilities due to unvalidated input in the ZMQ server and mismatched configuration keys, as well as insecure file creation in the NIXL storage adapter that could lead to data corruption or symlink attacks. Additionally, the review noted areas for improvement such as silently ignored exceptions, opportunities to use zip(..., strict=True) for robustness, simplification of a configuration class with a dataclass, and a misleading docstring in the memory management code. Addressing these points will significantly improve the robustness, security, and maintainability of the new code.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
| for i in range(num_pages): | ||
| filename = f"obj_{i}_{uuid.uuid4().hex[0:4]}.bin" | ||
| tmp_path = os.path.join(file_path, filename) | ||
| fd = os.open(tmp_path, flags) | ||
| fds.append(fd) |
There was a problem hiding this comment.
The IBM storage team complained about this this morning. Let's see how we can improve this.
There was a problem hiding this comment.
Persistence will be handled in followup prs
| for i, key in enumerate(keys): | ||
| if (obj := self._memory_objects.get(key)) is None: |
There was a problem hiding this comment.
Does this mean we will never have a "real" lookup in the storage, but just rely on in-proc states?
There was a problem hiding this comment.
Currently we are doing in-process lookup
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com>
ApostaC
left a comment
There was a problem hiding this comment.
Some minor comments. Otherwise LGTM.
| self.nixl_agent.deregister_memory(self.mem_reg_descs) | ||
|
|
||
|
|
||
| class NixlStoreL2Adapter(L2AdapterInterface): |
There was a problem hiding this comment.
Can we add the intialization code into l2_adapters/__init__.py?
There was a problem hiding this comment.
Oh right, the code for init is already merged. Updated
| raise RuntimeError("NIXL transfer failed") | ||
|
|
||
| def get_storage_indices(self, num_objs: int) -> list[int]: | ||
| return self.pool.batched_allocate(num_objs) |
There was a problem hiding this comment.
We do allocate here, but where do we free?
There was a problem hiding this comment.
Will be handled in the eviction pr. Feels we might need to reuse the eviction code across different adapters and also l1 memory.
| except Exception: | ||
| success = False | ||
|
|
||
| with self._lock: | ||
| for key, storage_obj in zip(keys, storage_objs, strict=False): | ||
| self._memory_objects[key] = storage_obj | ||
|
|
||
| self._completed_store_tasks[task_id] = success |
There was a problem hiding this comment.
If the success = False, do we want to update the self._memory_obj? Will this cause an incorrect lookup/load?
There was a problem hiding this comment.
Yeah the logic here is a bit confused. Transfer errors and allocation failures are handled in the same way. Updated the logic here.
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn>
* initial nixl support in mp Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Update lmcache/v1/memory_management.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * fix misc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix wrong mem allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * not using paged allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add init code Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * address several comments Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add design doc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * use lazy import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * remove get vm space Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* initial nixl support in mp Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Update lmcache/v1/memory_management.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * fix misc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix wrong mem allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * not using paged allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add init code Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * address several comments Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add design doc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * use lazy import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * remove get vm space Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: shaoxiawjc <wjc2800@163.com>
* initial nixl support in mp Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Update lmcache/v1/memory_management.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * fix misc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix wrong mem allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * not using paged allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add init code Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * address several comments Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add design doc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * use lazy import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * remove get vm space Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Aaron Wu <aaron.wu@dell.com>
* initial nixl support in mp Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Update lmcache/v1/memory_management.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * fix misc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix wrong mem allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * not using paged allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add init code Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * address several comments Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add design doc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * use lazy import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * remove get vm space Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* initial nixl support in mp Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * Update lmcache/v1/memory_management.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * Update lmcache/v1/distributed/l2_adapters/nixl_store_l2_adapter.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> * fix misc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix wrong mem allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * not using paged allocator Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add init code Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * address several comments Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * add design doc Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * fix tests Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * use lazy import Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> * remove get vm space Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> --------- Signed-off-by: YaoJiayi <120040070@link.cuhk.edu.cn> Signed-off-by: Jiayi Yao <82156730+YaoJiayi@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
What this PR does / why we need it:
Initial support of Nixl-based L2 adapter.
TODOs:
Special notes for your reviewers:
If applicable: