[MP][Feat] Add cache_salt to ObjectKey for cache isolation #3042
royyhuang merged 8 commits into LMCache:dev
Adds a ``cache_salt: str = ""`` field to ``ObjectKey`` and ``IPCCacheEngineKey`` so same-content/different-user cache entries produce distinct keys. This is the data-model piece of the per-user L2 cache quota feature (PR2 of 6). PR1a (LMCache#3029) wired cache_salt through the adapter surface but left it a no-op until the data model landed; this PR makes it load-bearing.

Key changes:
- ``ObjectKey`` and ``IPCCacheEngineKey`` gain a ``cache_salt`` field participating in eq/hash. ``__post_init__`` on both rejects ``@`` in the salt; this is the invariant the L2 adapter parsers rely on.
- ``ipc_key_to_object_keys(ipc_key, chunk_hashes)`` reads ``ipc_key.cache_salt`` internally. No separate parameter, so callers cannot accidentally drop the salt.
- Native connector, FS adapter Python, and FS adapter C++ all use a leading ``@@`` marker to disambiguate salted from legacy keys. Legacy files and wire payloads (empty salt) remain readable, so existing stored data is preserved.
- msgspec encodes dataclasses as maps, so an old 7-field ``IPCCacheEngineKey`` payload decodes on new code with ``cache_salt`` defaulting to ``""``. New-on-old works because msgspec ignores unknown map keys by default.

Tests: 32 new cases covering serialization roundtrip (native + FS), validation (``@`` rejection), isolation semantics (eq/hash per salt), msgspec wire compat (old→new, new→new), and the FS C++ parser's legacy/salted/``@``-in-model paths.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
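The map-mode compatibility argument above can be illustrated with a small stand-in. This is a sketch only: it uses stdlib ``json`` in place of msgspec, and ``OldKey``, ``NewKey``, ``encode``, and ``decode`` are hypothetical names with two fields instead of the real seven. It shows why appending a defaulted field to a map-encoded record is compatible in both directions.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class OldKey:
    # Hypothetical pre-change key: no cache_salt field.
    model_name: str
    chunk_hash: str


@dataclass(frozen=True)
class NewKey:
    # Hypothetical post-change key: cache_salt appended with a default.
    model_name: str
    chunk_hash: str
    cache_salt: str = ""


def encode(key) -> bytes:
    """Encode a dataclass as a map (field name -> value)."""
    return json.dumps(asdict(key)).encode()


def decode(payload: bytes, cls):
    """Decode a map into cls, ignoring map keys that cls does not declare."""
    data = json.loads(payload)
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in known})


old_payload = encode(OldKey("llama", "abc123"))
new_payload = encode(NewKey("llama", "abc123", cache_salt="user-a"))

# Old payload on new code: the missing field takes its default.
upgraded = decode(old_payload, NewKey)
# New payload on old code: the unknown "cache_salt" key is dropped.
downgraded = decode(new_payload, OldKey)
```

In this sketch the new-on-old direction decodes successfully but discards the salt, so a reader on the old schema cannot tell salted and un-salted entries apart.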
Code Review
This pull request introduces per-user cache isolation by adding a cache_salt field to ObjectKey and IPCCacheEngineKey. The L2 adapters (filesystem and native) and the C++ FS connector are updated to handle a new salted serialization format prefixed with @@, ensuring backward compatibility with legacy keys. Validation logic is implemented to prevent the use of the @ character within the salt. Additionally, the design documentation and vLLM integration are updated to support automatic salt propagation, and extensive unit tests are added to verify isolation and wire compatibility. I have no feedback to provide.
Replaces ``L2AdapterInterface.get_usage() -> tuple[float, float]`` with a structured ``AdapterUsage`` dataclass (aggregate + per-user byte counts) and moves byte accounting into the base class. Adapters pass ``max_capacity_bytes`` to ``super().__init__()`` and fire ``_notify_keys_stored(keys, sizes)`` / ``_notify_keys_deleted(keys, sizes)``; the base class maintains both ``_total_bytes_used`` and ``_per_user_size_bytes`` (keyed by ``ObjectKey.cache_salt`` from PR LMCache#3042) under a dedicated ``_usage_lock``. Existing LRU eviction behavior is unchanged: this is purely an internal refactor that sets up PR5's per-user quota policy. PR3 in a 6-PR series.

Key changes:
- ``AdapterUsage`` dataclass with ``total_bytes_used``, ``total_capacity_bytes``, ``per_user_bytes`` (read-only ``MappingProxyType`` snapshot), and a ``usage_fraction`` property that preserves the legacy ``-1.0`` "no eviction signal" sentinel.
- ``L2AdapterInterface.supports_eviction`` property: ``True`` only when ``max_capacity_bytes > 0``. ``StorageManager`` skips creating an ``L2AdapterEvictionState`` for adapters that don't support eviction (logs a warning if ``eviction_config`` was nevertheless set).
- Underflow clamp: ``_notify_keys_deleted`` clamps ``_total_bytes_used`` to 0 with a warning if a buggy caller would drive it negative. Without the clamp the ``-1`` sentinel would silently disable eviction forever.
- Native connector preserves listener notification on duplicate stores (passes ``size=0``) so LRU policies still ``move_to_end`` on re-store.
- Each adapter retains whatever per-key bookkeeping it needs internally (mock keeps ``_current_size_bytes`` for the within-batch capacity gate; native connector keeps ``_key_sizes`` so the demux thread can look up sizes at delete time).
- ``NixlObjPool.get_usage`` renamed to ``get_slot_usage`` to avoid colliding with the byte-based adapter ``get_usage``.

Tests: 24 new across two files. ``test_l2_adapter_base.py`` covers ``AdapterUsage`` semantics, ``supports_eviction``, base-class accounting (aggregate + per-user, bucket cleanup at zero, underflow clamp, ``MappingProxyType`` immutability, listener-only ``size=0`` notify, concurrent thread safety) and ``test_mock_l2_adapter.py`` adds an end-to-end test that ``cache_salt`` flows from a real adapter store through to ``AdapterUsage.per_user_bytes``.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
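The accounting described in this commit message could look roughly like the sketch below. It makes several assumptions: ``UsageTracker`` is a hypothetical stand-in for the base class, the notify methods take salts directly rather than ``ObjectKey`` objects, ``usage_fraction`` is assumed to return ``-1.0`` whenever no capacity is configured, and the warning log on underflow is omitted.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, List, Mapping


@dataclass(frozen=True)
class AdapterUsage:
    total_bytes_used: int
    total_capacity_bytes: int
    per_user_bytes: Mapping[str, int]

    @property
    def usage_fraction(self) -> float:
        # Assumption: -1.0 ("no eviction signal") when no capacity is set.
        if self.total_capacity_bytes <= 0:
            return -1.0
        return self.total_bytes_used / self.total_capacity_bytes


class UsageTracker:
    """Hypothetical stand-in for the accounting half of the adapter base class."""

    def __init__(self, max_capacity_bytes: int = 0) -> None:
        self._max_capacity_bytes = max_capacity_bytes
        self._total_bytes_used = 0
        self._per_user_size_bytes: Dict[str, int] = {}
        self._usage_lock = threading.Lock()

    @property
    def supports_eviction(self) -> bool:
        return self._max_capacity_bytes > 0

    def notify_keys_stored(self, salts: List[str], sizes: List[int]) -> None:
        with self._usage_lock:
            for salt, size in zip(salts, sizes):
                if size <= 0:
                    continue  # size=0 is a listener-only notification
                self._total_bytes_used += size
                current = self._per_user_size_bytes.get(salt, 0)
                self._per_user_size_bytes[salt] = current + size

    def notify_keys_deleted(self, salts: List[str], sizes: List[int]) -> None:
        with self._usage_lock:
            for salt, size in zip(salts, sizes):
                remaining = self._per_user_size_bytes.get(salt, 0) - size
                if remaining > 0:
                    self._per_user_size_bytes[salt] = remaining
                else:
                    # Drop empty buckets so the snapshot stays small.
                    self._per_user_size_bytes.pop(salt, None)
                # Underflow clamp: never let the aggregate go negative.
                self._total_bytes_used = max(0, self._total_bytes_used - size)

    def get_usage(self) -> AdapterUsage:
        with self._usage_lock:
            # Copy first so later updates do not show through the proxy.
            snapshot = MappingProxyType(dict(self._per_user_size_bytes))
            return AdapterUsage(
                self._total_bytes_used, self._max_capacity_bytes, snapshot
            )
```

Copying the dict before wrapping it in ``MappingProxyType`` matters: a proxy over the live dict would be read-only for the caller but would still change underneath them as stores and deletes arrive.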
ApostaC
left a comment
Three of the comments were generated by Claude Code. The rest are my minor comments.
Addresses review feedback on LMCache#3042:
- cache_salt must not contain '/', '\', or NUL (in addition to '@'). The FS adapter embeds the salt into filenames: '/' would inject directory separators, '\' is a Windows path separator, and NUL terminates C strings in the C++ connector.
- Max length capped at 128 chars to stay within NAME_MAX (255) after model_name, kv_rank, chunk_hash, and the .data extension are added.
- Updated example external connector template (examples/lmc_external_native_connector) to handle the @@ salted key format, matching the in-tree csrc/storage_backends/fs/connector.cpp.
- Added tests for /, \, NUL rejection and length cap on both ObjectKey and IPCCacheEngineKey.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
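A minimal sketch of the salt validation listed above, assuming a frozen dataclass. ``SaltedKey`` and its two other fields are hypothetical and much smaller than the real ``ObjectKey``; only the forbidden characters and the 128-character cap come from the commit message.

```python
from dataclasses import dataclass

# Characters the commit message lists as forbidden in cache_salt.
_FORBIDDEN_SALT_CHARS = ("@", "/", "\\", "\x00")
_MAX_SALT_LEN = 128


@dataclass(frozen=True)
class SaltedKey:
    """Hypothetical, trimmed-down stand-in for ObjectKey."""

    model_name: str
    chunk_hash: str
    cache_salt: str = ""

    def __post_init__(self) -> None:
        for ch in _FORBIDDEN_SALT_CHARS:
            if ch in self.cache_salt:
                raise ValueError(f"cache_salt must not contain {ch!r}")
        if len(self.cache_salt) > _MAX_SALT_LEN:
            raise ValueError(
                f"cache_salt longer than {_MAX_SALT_LEN} characters"
            )


# Same content, different salt: distinct keys for eq and hash.
alice = SaltedKey("llama", "abc123", cache_salt="alice")
bob = SaltedKey("llama", "abc123", cache_salt="bob")
```

Because the dataclass is frozen and ``cache_salt`` is an ordinary field, it participates in the generated ``__eq__`` and ``__hash__`` without extra code.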
All new writes use the unified @@-prefixed format for both salted and
unsalted keys:
@@<cache_salt>@<model>@<rank>@<hash>
For empty cache_salt this becomes @@@model@rank@hash. The legacy
3-field format (no @@ prefix) is still readable by all parsers for
backward compatibility but emits a deprecation warning on the FS
adapter read path. Re-storing a legacy entry automatically migrates
it to the new format.
This simplifies parsing — the @@ prefix is always present so there
is no need to disambiguate based on field count (model_name may
contain @ and would otherwise be ambiguous).
Updated Python (native_connector + fs_l2_adapter), C++ (in-tree
fs/connector.cpp), and the example external connector template.
Tests updated: serialization format assertions reflect the @@@
prefix for empty salt, and roundtrip tests verify the parser still
handles legacy filenames.
Signed-off-by: royyhuang <roy.y.huang@gmail.com>
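The ``@@``-prefixed format from this commit can be parsed along the following lines. This is a sketch under stated assumptions: ``to_prefixed`` and ``parse_key`` are hypothetical helpers, rank and hash are treated as plain strings that never contain ``@``, and the deprecation warning on the legacy path is left out.

```python
from typing import Optional, Tuple

# (cache_salt, model, rank, chunk_hash)
Fields = Tuple[str, str, str, str]


def to_prefixed(salt: str, model: str, rank: str, chunk_hash: str) -> str:
    """Serialize to @@<cache_salt>@<model>@<rank>@<hash>."""
    return f"@@{salt}@{model}@{rank}@{chunk_hash}"


def parse_key(s: str) -> Optional[Fields]:
    if s.startswith("@@"):
        # The salt cannot contain '@', so it ends at the first '@'.
        salt, sep, rest = s[2:].partition("@")
        if not sep:
            return None
    else:
        # Legacy 3-field key: treated as an empty salt.
        salt, rest = "", s
    # model may contain '@', so take rank and hash from the right.
    parts = rest.rsplit("@", 2)
    if len(parts) != 3:
        return None
    model, rank, chunk_hash = parts
    return salt, model, rank, chunk_hash
```

Splitting the salt from the left and the rank and hash from the right is what lets ``model`` contain ``@`` without ambiguity in this layout.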
Aligns the example external native connector template with the in-tree csrc/storage_backends/fs/connector.cpp convention from the previous commit. Previously the example only emitted the @@ prefix when cache_salt was non-empty, leaving empty-salt keys in the legacy 3-field format. Now all files on disk use the unified @@-prefixed format (empty salt produces @@@model@...), and legacy inputs without @@ are still accepted on the read path for backward compatibility. Third-party connectors cloned from this template will produce migration-compatible filenames out of the box.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
Supersedes the prior ``@@``-prefix design (commit 5afd493). The new format appends ``cache_salt`` as a trailing 4th field only when set, so un-salted keys use the original 3-field shape unchanged:

Unsalted: <model>@<rank>@<hash>   (== pre-cache_salt format)
Salted  : <model>@<rank>@<hash>@<salt>

Benefits over ``@@`` prefix:
- **No migration needed** for existing un-salted cache directories or remote stores: the on-disk/wire shape is bit-identical to before. The migration helper script is removed accordingly.
- **Simpler parser**: one ``split('@')`` dispatched by field count (3 or 4). No prefix check, no rsplit.
- **Cleaner filenames**: no triple-``@`` prefix for empty salt.

Disambiguation now relies on forbidding ``@`` in ``model_name`` (added to ``ObjectKey.__post_init__`` alongside the existing ``cache_salt`` invariants). HuggingFace model IDs use ``-_./`` so this rejects nothing that occurs in practice.

Files updated:
- lmcache/v1/distributed/api.py (reject '@' in model_name)
- lmcache/v1/distributed/l2_adapters/fs_l2_adapter.py
- lmcache/v1/distributed/l2_adapters/native_connector_l2_adapter.py
- csrc/storage_backends/fs/connector.{cpp,h}
- examples/lmc_external_native_connector/csrc/connector.cpp
- tests/v1/distributed/test_{fs_l2_adapter_keys,native_connector_l2_adapter}.py
- docs/design/.../l2_per_user_quota.md

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
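The trailing-salt format reduces to one split and a field-count dispatch. The sketch below assumes hypothetical helpers ``key_to_string`` and ``string_to_key`` that work on plain strings; rejecting an empty fourth field is a choice made here, not something the commit states.

```python
from typing import Optional, Tuple

# (model, rank, chunk_hash, cache_salt)
Fields = Tuple[str, str, str, str]


def key_to_string(model: str, rank: str, chunk_hash: str, salt: str = "") -> str:
    # The parser relies on '@' never appearing in model or salt.
    if "@" in model or "@" in salt:
        raise ValueError("'@' is reserved as the field separator")
    base = f"{model}@{rank}@{chunk_hash}"
    # Un-salted keys keep the original 3-field shape unchanged.
    return f"{base}@{salt}" if salt else base


def string_to_key(s: str) -> Optional[Fields]:
    parts = s.split("@")
    if len(parts) == 3:
        model, rank, chunk_hash = parts
        return model, rank, chunk_hash, ""
    if len(parts) == 4 and parts[3]:
        model, rank, chunk_hash, salt = parts
        return model, rank, chunk_hash, salt
    # Too few or too many fields, or an empty trailing salt.
    return None
```

The price of the simpler parser is the stricter invariant: once ``model_name`` may not contain ``@``, the field count alone identifies the format.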
@sammshen Can you take a quick look at this file?
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 62a6125.
``ObjectKey.__post_init__`` now validates ``model_name`` and ``cache_salt`` invariants (no ``@``, no path separators, length cap) and raises ``ValueError`` on violation. The previous ``_filename_to_object_key`` constructed the ``ObjectKey`` outside the try/except block, so a stray file on disk with a forbidden character in its parsed fields would propagate an uncaught exception — callers (e.g. startup directory scans) expect ``None`` per the function's contract. Move the ``ObjectKey(...)`` construction inside the try block so the ``ValueError`` is caught and ``None`` returned. Added a test exercising the over-length-salt path. Signed-off-by: royyhuang <roy.y.huang@gmail.com>
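The fix described here amounts to constructing the key inside the ``try`` block. A minimal sketch, assuming a hypothetical ``Key`` dataclass and ``filename_to_key`` parser; the hex rank encoding and the ``.data`` suffix handling are illustrative guesses, not the adapter's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Key:
    """Hypothetical stand-in for ObjectKey with two of its invariants."""

    model_name: str
    kv_rank: int
    chunk_hash: str
    cache_salt: str = ""

    def __post_init__(self) -> None:
        if "@" in self.model_name:
            raise ValueError("model_name must not contain '@'")
        if len(self.cache_salt) > 128:
            raise ValueError("cache_salt too long")


def filename_to_key(filename: str) -> Optional[Key]:
    suffix = ".data"
    if not filename.endswith(suffix):
        return None
    parts = filename[: -len(suffix)].split("@")
    if len(parts) == 3:
        parts.append("")  # un-salted file: empty salt
    if len(parts) != 4:
        return None
    model, rank, chunk_hash, salt = parts
    try:
        # Both int() and Key(...) can raise ValueError; keeping the
        # construction inside the try honors the "return None" contract.
        return Key(model, int(rank, 16), chunk_hash, salt)
    except ValueError:
        return None
```

With this shape a directory scan can skip stray files by checking for ``None`` instead of wrapping every call in its own exception handler.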

Summary
Adds a ``cache_salt: str = ""`` field to ``ObjectKey`` and ``IPCCacheEngineKey`` so same-content / different-user cache entries produce distinct keys. This is the data-model piece of the per-user L2 cache quota feature (PR2 of a 6-PR series). #3029 wired ``cache_salt`` through the MP adapter surface but left it a no-op until the data model landed; this PR makes it load-bearing.

Stacks on top of #3029 and #3032 (both already merged). No migration is required: un-salted traffic (the default) produces bit-identical wire keys and filenames to the pre-cache_salt format.
Changes
Data model (``lmcache/v1/distributed/api.py``, ``.../multiprocess/custom_types.py``)
- ``ObjectKey`` gains ``cache_salt: str = ""`` as an identity field (participates in ``eq``/``hash``). ``__post_init__`` validates:
  - ``model_name`` must not contain ``@`` (used as field separator in serialized keys)
  - ``cache_salt`` must not contain ``@``, ``/``, ``\``, or NUL
  - ``cache_salt`` length ≤ 128 (leaves headroom for ``NAME_MAX=255`` on FS filenames)
- ``IPCCacheEngineKey`` gains ``cache_salt`` at the end (msgspec map-mode wire compat; old 7-field payloads still decode).
- ``ipc_key_to_object_keys(ipc_key, chunk_hashes)`` reads ``ipc_key.cache_salt`` directly: no separate parameter, so callers can't silently drop the salt.
- ``vllm_multi_process_adapter.py``: scheduler + worker ``_create_key()`` now forward ``cache_salt`` to ``IPCCacheEngineKey``.

Serialization format (trailing-salt, no marker)
Both the native connector wire format and the FS adapter filename format use ``@`` as the field separator with ``cache_salt`` as an optional trailing field:

Unsalted: <model>@<rank>@<hash>
Salted  : <model>@<rank>@<hash>@<salt>

- Parsing is a single ``split('@')`` plus dispatch on field count (3 or 4). No ``@@`` prefix, no rsplit, no marker byte; it relies on the ``model_name`` / ``cache_salt`` "no ``@``" invariants instead.
- Implemented in ``_object_key_to_string`` (native connector), ``_object_key_to_filename`` / ``_filename_to_object_key`` (FS adapter), ``FSConnector::key_to_filename`` (C++ in-tree), and ``ExampleFSConnector::safe_filename`` (external connector template).

Backward compatibility
- ``IPCCacheEngineKey``: map-mode encoding means old 7-field payloads decode on new code with ``cache_salt`` defaulting to ``""``. Covered by an explicit wire-compat test.

Test plan
- ``test_fs_l2_adapter_keys.py`` (new): filename roundtrip parametrized over 2 model_name values × 3 cache_salt values; explicit 3-field and 4-field format tests; rejection tests for ``.txt``, too-few / too-many field counts
- ``test_native_connector_l2_adapter.py``: ``_object_key_to_string`` format tests for unsalted / salted shape; salt-inequality + hash separation; model_name ``@`` rejection; salt ``@`` / ``/`` / ``\`` / NUL rejection + length cap
- ``test_custom_types.py``: msgspec wire roundtrip with + without salt
- ``tests/v1/distributed/`` + ``tests/v1/multiprocess/`` suites: 698 passed, 26 skipped (GPU-heavy tests excluded)
- ``lmcache_fs.so`` rebuilt; unsalted key produces ``<safe_model>@0xrank@hash.data``, salted key produces ``<safe_model>@0xrank@hash@salt.data``

Related
- ``docs/design/v1/distributed/l2_adapters/l2_per_user_quota.md`` (updated in this PR)

Note
Medium Risk
Touches cache-key identity, serialization, and filesystem/IPC parsing paths; bugs could cause cache misses, collisions, or inability to read existing entries, though unsalted backward-compat is explicitly preserved and covered by tests.
Overview
Adds ``cache_salt`` as a first-class identity field on both ``ObjectKey`` and ``IPCCacheEngineKey``, with validation and msgspec wire-compat defaults, so same-content entries from different users no longer collide.

Updates key serialization across the native connector wire format, FS adapter filenames, and the in-tree/external C++ FS connectors to support an optional trailing ``@<cache_salt>`` field while keeping the unsalted 3-field format byte-identical for backward compatibility.

Plumbs ``cache_salt`` through the vLLM MP adapter when constructing ``IPCCacheEngineKey``, changes ``ipc_key_to_object_keys()`` to read the salt directly from the IPC key (removing the risk of callers dropping it), updates the design doc accordingly, and adds unit tests covering roundtrips, validation, and wire compatibility.

Reviewed by Cursor Bugbot for commit e1f9ef1. Bugbot is set up for automated code reviews on this repo.