feat(gds): add gds_path_sharding config for multi-path strategy#2922
feat(gds): add gds_path_sharding config for multi-path strategy#2922deng451e merged 1 commit intoLMCache:devfrom
Conversation
e68a869 to
3f09875
Compare
There was a problem hiding this comment.
Code Review
This pull request refactors the GDS (GPU Direct Storage) configuration by renaming cufile_buffer_size to gds_buffer_size and introducing explicit fields for use_gds, gds_backend, and gds_path_sharding. These changes are reflected across the documentation, configuration schema, and storage backend implementation. Feedback was provided regarding a violation of the style guide's rule against cross-class private member access when checking if a configuration key was explicitly set by the user.
| user_set_keys: set[str] = getattr(config, "_user_set_keys", set()) | ||
| use_gds_explicitly_set = "use_gds" in user_set_keys |
There was a problem hiding this comment.
Accessing the _user_set_keys attribute directly violates the style guide's rule against accessing private members (_-prefixed attributes) across class boundaries. This creates a tight coupling between GdsBackend and the internal implementation of LMCacheEngineConfig, making the code less maintainable.
A better approach would be to add a public method to LMCacheEngineConfig to encapsulate this logic. This can be done by defining a helper function and adding it to the namespace_extras dictionary in create_config_class in lmcache/v1/config.py:
# In lmcache/v1/config.py
def _is_user_set(self, key: str) -> bool:
"""Check if a configuration key was explicitly set by the user."""
return key in self._user_set_keys
# ... later, in create_config_class call:
LMCacheEngineConfig = create_config_class(
# ...
namespace_extras={
"is_user_set": _is_user_set,
# ... other methods
},
)Then, you could use it here like this:
| user_set_keys: set[str] = getattr(config, "_user_set_keys", set()) | |
| use_gds_explicitly_set = "use_gds" in user_set_keys | |
| use_gds_explicitly_set = config.is_user_set("use_gds") |
References
- The code accesses a private attribute
_user_set_keysof theLMCacheEngineConfigobject from within theGdsBackendclass. This violates the style guide rule on line 27: 'No private member access (_-prefixed attributes) across class boundaries'. (link)
3f09875 to
3434b0a
Compare
Add a top-level `gds_path_sharding` config field (default: "by_gpu") that controls how GPUs are assigned to storage paths when multiple comma-separated paths are provided in `gds_path`. This replaces the previously hardcoded by_gpu logic with an explicit, extensible setting. Currently only "by_gpu" is supported (selects path via `device_id % num_paths`); unsupported values raise AssertionError. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <noreply@cognition.ai> Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
3434b0a to
2a90190
Compare
|
@cursor review |
sure @DongDongJu. The performance numbers already committed in the |
What this PR does / why we need it:
Add a top-level
gds_path_shardingconfig field (default: "by_gpu")that controls how GPUs are assigned to storage paths when multiple
comma-separated paths are provided in
gds_path. This replaces thepreviously hardcoded by_gpu logic with an explicit, extensible setting.
Currently only "by_gpu" is supported (selects path via
device_id % num_paths); unsupported values raise AssertionError.Special notes for your reviewers:
If applicable:
Note
Medium Risk
Adds a new top-level config/env var that affects GDS path selection and introduces runtime validation that will assert on unsupported values, potentially breaking misconfigured deployments.
Overview
Makes multi-path GDS selection explicitly configurable via new
gds_path_sharding(env:LMCACHE_GDS_PATH_SHARDING) with default"by_gpu", replacing previously hardcoded GPU-ID modulo behavior with a validated strategy.Updates GDS backend initialization to read/validate the sharding strategy (currently only
by_gpusupported) and extends unit tests and docs to cover the default, explicit setting, and failure on unsupported values.Written by Cursor Bugbot for commit 2a90190. This will update automatically on new commits. Configure here.