Skip to content

refactor: extract PathSharder module for shared multi-path selection#2982

Merged
sammshen merged 1 commit intoLMCache:devfrom
glimchb:sharding
Apr 14, 2026
Merged

refactor: extract PathSharder module for shared multi-path selection#2982
sammshen merged 1 commit intoLMCache:devfrom
glimchb:sharding

Conversation

@glimchb
Copy link
Copy Markdown
Contributor

@glimchb glimchb commented Apr 8, 2026

Move the duplicated path-sharding logic (CSV parsing, device ID extraction, by_gpu strategy) from LocalDiskBackend and GdsBackend into a new PathSharder class in path_sharder.py.

Both backends now delegate to PathSharder, keeping the policy in one place. The _resolve_device_id helper handles cuda:N, bare cuda, and cpu devices gracefully.

Generated with Devin

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Touches storage backend initialization and path selection logic for both LocalDiskBackend and GdsBackend, which could change which directory is used (and how invalid configs fail) in multi-GPU/multi-path deployments.

Overview
Refactors multi-path cache directory selection by extracting the shared CSV parsing + by-GPU sharding logic into a new PathSharder utility and switching both GdsBackend and LocalDiskBackend to use it.

Standardizes device-id resolution (handles cuda:N, bare cuda, and cpu) and changes unsupported sharding configuration failures from AssertionError to ValueError, with updated/new unit tests covering selection, directory creation, and edge cases.

Reviewed by Cursor Bugbot for commit 1923414. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new PathSharder utility to centralize path-sharding logic for multi-path storage backends, refactoring GdsBackend and LocalDiskBackend to use this shared component. The review identified a potential bug in the _resolve_device_id helper function regarding "cpu" device handling and robustness against malformed device strings, for which a code suggestion was provided.

Comment thread lmcache/v1/storage_backend/path_sharder.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit e3fd684. Configure here.

Comment thread lmcache/v1/storage_backend/path_sharder.py Outdated
Comment thread tests/v1/storage_backend/test_path_sharder.py Outdated
@DongDongJu
Copy link
Copy Markdown
Collaborator

Hello @glimchb, Thanks for the hard work. Could you check bot review?
Otherwise, functionally LGTM!

Move the duplicated path-sharding logic (CSV parsing, device ID
extraction, by_gpu strategy) from LocalDiskBackend and GdsBackend
into a new PathSharder class in path_sharder.py.

Both backends now delegate to PathSharder, keeping the policy in
one place.  The _resolve_device_id helper handles cuda:N, bare
cuda, and cpu devices gracefully.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
@DongDongJu DongDongJu added the full Run comprehensive tests on this PR label Apr 13, 2026
Copy link
Copy Markdown
Collaborator

@DongDongJu DongDongJu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
I left one nit question.


def test_gds_path_sharding_default(self, temp_gds_path, async_loop):
"""Default gds_path_sharding is 'by_gpu'."""
"""Default gds_path_sharding is 'by_gpu' (backend inits OK)."""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "backend inits OK" meaning?

Copy link
Copy Markdown
Contributor

@sammshen sammshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sammshen sammshen merged commit 3c9a29c into LMCache:dev Apr 14, 2026
38 checks passed
@glimchb glimchb deleted the sharding branch April 14, 2026 03:43
ekaynar pushed a commit to ekaynar/LMCache that referenced this pull request Apr 15, 2026
…MCache#2982)

Move the duplicated path-sharding logic (CSV parsing, device ID
extraction, by_gpu strategy) from LocalDiskBackend and GdsBackend
into a new PathSharder class in path_sharder.py.

Both backends now delegate to PathSharder, keeping the policy in
one place.  The _resolve_device_id helper handles cuda:N, bare
cuda, and cpu devices gracefully.

Generated with [Devin](https://cli.devin.ai/docs)

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Co-authored-by: Devin <noreply@cognition.ai>
ekaynar added a commit to ekaynar/LMCache that referenced this pull request Apr 15, 2026
- Remove validate_nixl_path method from NixlStorageConfig
- Update NixlFilePool to accept PathSharder instance
- Update createPool to use PathSharder with buffer_device
- Update tests to use PathSharder directly

This aligns with PR LMCache#2982 which centralized path sharding logic.
ekaynar added a commit to ekaynar/LMCache that referenced this pull request Apr 15, 2026
- Remove validate_nixl_path method from NixlStorageConfig
- Update NixlFilePool to accept PathSharder instance
- Update createPool to use PathSharder with buffer_device
- Update tests to use PathSharder directly

This aligns with PR LMCache#2982 which centralized path sharding logic.
ekaynar added a commit to ekaynar/LMCache that referenced this pull request Apr 15, 2026
- Remove validate_nixl_path method from NixlStorageConfig
- Update NixlFilePool to accept PathSharder instance
- Update createPool to use PathSharder with buffer_device
- Update tests to use PathSharder directly

This aligns with PR LMCache#2982 which centralized path sharding logic.

Signed-off-by: Ugur Kaynar <ugur.kaynar@dell.com>
ftian1 pushed a commit to ftian1/LMCache that referenced this pull request Apr 20, 2026
…MCache#2982)

Move the duplicated path-sharding logic (CSV parsing, device ID
extraction, by_gpu strategy) from LocalDiskBackend and GdsBackend
into a new PathSharder class in path_sharder.py.

Both backends now delegate to PathSharder, keeping the policy in
one place.  The _resolve_device_id helper handles cuda:N, bare
cuda, and cpu devices gracefully.

Generated with [Devin](https://cli.devin.ai/docs)

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Co-authored-by: Devin <noreply@cognition.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

full Run comprehensive tests on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants