feat(disk): support multi-path local disk backend for multi-device I/O#2801
feat(disk): support multi-path local disk backend for multi-device I/O#2801DongDongJu merged 1 commit intoLMCache:devfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant enhancement to the local disk backend by enabling support for multiple storage paths. This allows users to leverage several independent disk devices, such as multiple NVMe drives, to distribute KV cache files. The primary goal is to boost aggregate disk I/O bandwidth, which is crucial for performance-sensitive applications. The changes include robust parsing of multi-path configurations, intelligent handling of varying filesystem block sizes for optimal direct I/O, and a deterministic strategy for distributing cache entries across the available paths. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces multi-path support for the local disk backend, allowing KV cache files to be distributed across multiple storage devices for increased I/O bandwidth. The changes include updates to the configuration parsing, the local disk backend logic to handle multiple paths and heterogeneous block sizes, and comprehensive documentation and tests for the new feature.
My review identifies two high-severity issues. First, the configuration parser for disk paths is too strict and could lead to runtime errors with valid path formats. Second, the key distribution logic uses Python's built-in hash() function, which is not deterministic across process restarts, potentially leading to cache invalidation. I've provided suggestions to address both of these points for improved robustness and correctness.
| if re.match(r"file://(.*)/", part): | ||
| parsed.append(part[7:]) | ||
| else: | ||
| parsed.append(part) |
There was a problem hiding this comment.
The regex r"file://(.*)/" used to parse local_disk paths is too strict as it requires a trailing slash on file:// URIs. For example, file:///path/to/cache would not be parsed correctly, and the path would be passed as file:///path/to/cache to filesystem functions, which would fail. This is also inconsistent with the example in docs/source/api_reference/configurations.rst which does not use a trailing slash.
Using startswith("file://") would be more robust and handle both cases (with and without a trailing slash).
| if re.match(r"file://(.*)/", part): | |
| parsed.append(part[7:]) | |
| else: | |
| parsed.append(part) | |
| if part.startswith("file://"): | |
| parsed.append(part[7:]) | |
| else: | |
| parsed.append(part) |
| Absolute file path for the cached chunk. | ||
| """ | ||
| filename = key.to_string().replace("/", "-") + ".pt" | ||
| idx = hash(filename) % len(self.paths) |
There was a problem hiding this comment.
The built-in hash() function is not suitable for persistent caches as its output is not guaranteed to be stable across different Python versions or even different process invocations due to hash randomization (enabled by default). This means that after a restart, keys might be mapped to different paths, leading to cache misses and effectively losing cached data until it's evicted and rewritten.
For deterministic distribution, a stable hashing algorithm from the hashlib module should be used. You will need to add import hashlib at the top of the file.
| idx = hash(filename) % len(self.paths) | |
| idx = int(__import__('hashlib').md5(filename.encode()).hexdigest(), 16) % len(self.paths) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c8a80177e3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if path.startswith(disk_dir): | ||
| return bs |
There was a problem hiding this comment.
Use directory-boundary matching for block size lookup
_get_block_size uses path.startswith(disk_dir), which can select the wrong filesystem entry when one configured cache path is a prefix of another (for example, /mnt/nvme1 vs /mnt/nvme10). In that case reads/writes for files under the longer path can get the shorter path’s block size, and with use_odirect=True this can lead to misaligned direct I/O and runtime EINVAL failures. This was introduced with multi-path support and affects any deployment using overlapping path prefixes.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30b364dee3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| parsed: list[str] = [] | ||
| for part in raw_parts: | ||
| if re.match(r"file://(.*)/", part): |
There was a problem hiding this comment.
Strip file:// URIs without requiring trailing slash
_parse_local_disk only removes the file:// prefix when a path matches file://(.*)/, so valid file URIs like file:///mnt/nvme0 (no trailing /) are left unchanged. In the new multi-path flow this causes LocalDiskBackend to treat entries as literal filesystem paths (file:///...) and create/write under the wrong directory tree instead of the intended mount points, which breaks configurations that omit trailing slashes.
Useful? React with 👍 / 👎.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
can you add multipath support first with simple round robin routing? and then we can add extra modules to match workers with disk paths / smarter load balancing later? |
d386614 to
8b21b63
Compare
|
@cursor review |
| assert result == "/mnt/nvme0/,/mnt/nvme1/" | ||
|
|
||
| def test_empty_string(self): | ||
| assert _parse_local_disk("") is None |
There was a problem hiding this comment.
Tests directly test private function _parse_local_disk
Low Severity
TestParseLocalDisk imports and directly tests _parse_local_disk, a private (underscore-prefixed) module-level function from config_base. Project rules state tests must verify the public interface and docstring contract, not implementation details, and explicitly prohibit tests for private methods. This parsing behavior can be tested through the public LMCacheEngineConfig.from_defaults or from_env API instead.
Additional Locations (1)
Triggered by project rule: LMCache Code Review Style Guide
It sounds make sense to me, lets just do the file level interleaving among the folders. |
sure. this is introducing "by_gpu" sharding. same like we did for NIXL #2418 and GDS #2922. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ae805f9. Configure here.
Allow `local_disk` to accept comma-separated paths (e.g. "/mnt/nvme0/,/mnt/nvme1/") to use multiple NVMe devices. Each GPU worker selects one path at init time via the `local_disk_path_sharding` strategy (currently only "by_gpu": device_id % num_paths), matching the GDS backend approach LMCache#2817 and NIXL approach LMCache#2418. - Path selected once in __init__; _key_to_path, write_file, read_file unchanged from upstream - _parse_local_disk now uses startswith("file://") instead of regex, fixing file:// URIs without a trailing slash - All directories created at startup - Added local_disk_path_sharding config field (default: "by_gpu") - Added tests and updated docs Before this change the only way to increase performance was to use any of the linux multi-pathing technologies to aggregate IOs. Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
| os.makedirs(self.path) | ||
| logger.info(f"Created local disk cache directory: {self.path}") | ||
|
|
||
| # Multi-path support: parse comma-separated paths and select one |
There was a problem hiding this comment.
ideally you create a new module called PathSharder that takes the csv as input and outputs a single path. the policy should be in there
There was a problem hiding this comment.
yes, this is a good idea. thank you for suggestion. I will introduce PathSharder and use it in all 3 storages: local, gds and nixl. This will remove duplication and create nicer abstraction for the sharding strategy
sammshen
left a comment
There was a problem hiding this comment.
LGTM, approving first but please address the serious code modularity comment
DongDongJu
left a comment
There was a problem hiding this comment.
LGTM, left just one comment for the community.
|
|
||
| local_disk: "/mnt/nvme0/kvcache/,/mnt/nvme1/kvcache/" | ||
| local_disk_path_sharding: "by_gpu" | ||
| max_local_disk_size: 20.0 |
There was a problem hiding this comment.
max_local_disk_size is kinda vague in here.
we should make better naming or param with multi folder structure.
LMCache#2801) feat(disk): support multi-path local disk backend with path sharding Allow `local_disk` to accept comma-separated paths (e.g. "/mnt/nvme0/,/mnt/nvme1/") to use multiple NVMe devices. Each GPU worker selects one path at init time via the `local_disk_path_sharding` strategy (currently only "by_gpu": device_id % num_paths), matching the GDS backend approach LMCache#2817 and NIXL approach LMCache#2418. - Path selected once in __init__; _key_to_path, write_file, read_file unchanged from upstream - _parse_local_disk now uses startswith("file://") instead of regex, fixing file:// URIs without a trailing slash - All directories created at startup - Added local_disk_path_sharding config field (default: "by_gpu") - Added tests and updated docs Before this change the only way to increase performance was to use any of the linux multi-pathing technologies to aggregate IOs. Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>


Allow
local_diskto accept comma-separated paths (e.g. "/mnt/nvme0/,/mnt/nvme1/") so KV cache files are hash-distributed across multiple devices, increasing aggregate disk bandwidth.What this PR does / why we need it:
Special notes for your reviewers:
If applicable:
Note
Medium Risk
Changes disk cache initialization and path selection logic, which can affect where KV files are written/read and O_DIRECT alignment behavior. Risk is moderate due to new config surface (
local_disk_path_sharding) and device-id parsing, but behavior is backward-compatible for single-path setups and covered by added tests.Overview
Enables multi-device local disk offloading by allowing
local_diskto be a comma-separated list of cache directories and addinglocal_disk_path_sharding(defaultby_gpu) to deterministically choose a per-worker path based on GPU device index.Updates
_parse_local_diskto strip optionalfile://prefixes per entry and normalize multi-path values, adjustsLocalDiskBackendto create all configured directories and select the active path for I/O, and adds unit tests plus docs describing multi-path configuration and sharding behavior.Reviewed by Cursor Bugbot for commit 3101789. Bugbot is set up for automated code reviews on this repo. Configure here.