
[Feat] Add cache_salt parameter to MP adapter interfaces#3029

Merged
royyhuang merged 5 commits into LMCache:dev from royyhuang:feat/l2-per-user-quota-pr1
Apr 15, 2026

Conversation

Contributor

@royyhuang royyhuang commented Apr 14, 2026

Summary

Changed methods

Scheduler adapter (LMCacheMPSchedulerAdapter):

  • maybe_submit_lookup_request(request_id, token_ids, cache_salt="")
  • free_lookup_locks(..., cache_salt="")
  • _create_key(..., cache_salt="")

Worker adapter (LMCacheMPWorkerAdapter):

  • submit_store_request(request_id, op, event, cache_salt="")
  • submit_retrieve_request(request_id, op, event, cache_salt="")
  • batched_submit_store_requests(request_ids, ops, event, cache_salts=None)
  • batched_submit_retrieve_requests(request_ids, ops, event, cache_salts=None)
  • _create_key(..., cache_salt="")

cache_salt is accepted but not yet forwarded to IPCCacheEngineKey — that field will be added in a follow-up PR.
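As a rough sketch of the behavior-preserving wiring (method names follow the summary above, but the bodies are illustrative, not the actual LMCache implementation):

```python
# Illustrative sketch only; the real LMCacheMPWorkerAdapter logic is more involved.
class WorkerAdapterSketch:
    def submit_store_request(self, request_id, op, event, cache_salt=""):
        # The salt is accepted and threaded through to key construction...
        return self._create_key(request_id, cache_salt=cache_salt)

    def _create_key(self, request_id, cache_salt=""):
        # ...but not yet forwarded to IPCCacheEngineKey, so keys are
        # identical with or without a salt (behavior-preserving).
        return f"key:{request_id}"
```

Because `_create_key` ignores the salt for now, existing callers see identical keys whether or not they pass `cache_salt`.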

Test plan

  • All 441 distributed tests pass
  • pre-commit clean
  • No behavioral change (all defaults are empty string)

Context

PR1a of the per-user L2 quota feature. See design doc: docs/design/l2_adapters/l2_per_user_quota.md


Note

Low Risk
Low risk interface expansion with default values, intended to be behavior-preserving; main risk is downstream callers needing to pass/ignore the new optional args correctly.

Overview
Introduces a behavior-preserving interface extension to the vLLM multi-process adapter by adding optional cache_salt parameters to scheduler lookup/free-lock calls and worker store/retrieve APIs (including batched variants), threading the value through to key construction for future per-user cache isolation.

Adds a detailed design doc (docs/design/l2_adapters/l2_per_user_quota.md) describing the upcoming per-user quota + eviction approach and the planned cross-component changes.

Reviewed by Cursor Bugbot for commit 829689a.

Add cache_salt="" default parameter to scheduler and worker adapter
methods to prepare for per-user cache isolation. This is a no-op
change — all defaults are empty string, preserving existing behavior.

Scheduler adapter:
- maybe_submit_lookup_request: add cache_salt param
- free_lookup_locks: add cache_salt param
- _create_key: add cache_salt param (accepted, not yet forwarded)

Worker adapter:
- submit_store_request: add cache_salt param
- submit_retrieve_request: add cache_salt param
- batched_submit_store_requests: add cache_salts list param
- batched_submit_retrieve_requests: add cache_salts list param
- _create_key: add cache_salt param (accepted, not yet forwarded)

cache_salt will be forwarded to IPCCacheEngineKey once that field is
added in a follow-up PR.

Signed-off-by: royyhuang <royyhuang@gmail.com>
Signed-off-by: royyhuang <roy.y.huang@gmail.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces cache_salt (per-user isolation salt) across various methods in the VLLMMultiProcessAdapter, including lookup, store, retrieve, and their batched counterparts. Feedback suggests using strict=True in the zip calls within the batched store and retrieve methods to ensure that the input lists (request_ids, ops, and cache_salts) are of equal length, preventing potential silent truncation.

    if cache_salts is None:
        cache_salts = [""] * len(request_ids)
    for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):
        self.submit_store_request(request_id, op, event)
medium

Using strict=True in the zip call ensures that request_ids, ops, and cache_salts have the same length. This prevents silent truncation of the batch if there is a mismatch between these lists, which is particularly important now that a third list (cache_salts) is involved in the iteration.

Suggested change
for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):
for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=True):

    if cache_salts is None:
        cache_salts = [""] * len(request_ids)
    for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):
        self.submit_retrieve_request(request_id, op, event)
medium

Using strict=True in the zip call ensures that request_ids, ops, and cache_salts have the same length. This prevents silent truncation of the batch if there is a mismatch between these lists, ensuring all requests in the batch are processed as expected.

Suggested change
for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):
for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=True):


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 656976a.

Comment thread lmcache/integration/vllm/vllm_multi_process_adapter.py
Signed-off-by: royyhuang <roy.y.huang@gmail.com>
@royyhuang royyhuang requested a review from ApostaC as a code owner April 14, 2026 22:05

@sammshen sammshen left a comment


LGTM!

@royyhuang royyhuang added the "full" label (Run comprehensive tests on this PR) Apr 14, 2026

@ApostaC ApostaC left a comment


LGTM!

Comment thread lmcache/integration/vllm/vllm_multi_process_adapter.py
Signed-off-by: royyhuang <roy.y.huang@gmail.com>
@royyhuang royyhuang enabled auto-merge (squash) April 15, 2026 00:35
@royyhuang royyhuang merged commit e69b684 into LMCache:dev Apr 15, 2026
38 of 39 checks passed
royyhuang added a commit to royyhuang/LMCache that referenced this pull request Apr 15, 2026
Adds a ``cache_salt: str = ""`` field to ``ObjectKey`` and
``IPCCacheEngineKey`` so same-content/different-user cache entries
produce distinct keys. This is the data-model piece of the per-user
L2 cache quota feature (PR2 of 6). PR1a (LMCache#3029) wired cache_salt
through the adapter surface but left it a no-op until the data model
landed; this PR makes it load-bearing.

Key changes:

- ``ObjectKey`` and ``IPCCacheEngineKey`` gain ``cache_salt`` field
  participating in eq/hash. ``__post_init__`` on both rejects ``@`` in
  the salt — this is the invariant the L2 adapter parsers rely on.
- ``ipc_key_to_object_keys(ipc_key, chunk_hashes)`` reads
  ``ipc_key.cache_salt`` internally. No separate parameter, so callers
  cannot accidentally drop the salt.
- Native connector, FS adapter Python, and FS adapter C++ all use a
  leading ``@@`` marker to disambiguate salted from legacy keys. Legacy
  files and wire payloads (empty salt) remain readable — existing
  stored data is preserved.
- msgspec encodes dataclasses as maps, so an old 7-field
  ``IPCCacheEngineKey`` payload decodes on new code with ``cache_salt``
  defaulting to ``""``. New-on-old works because msgspec ignores
  unknown map keys by default.

Tests: 32 new cases covering serialization roundtrip (native + FS),
validation (``@`` rejection), isolation semantics (eq/hash per salt),
msgspec wire compat (old→new, new→new), and the FS C++ parser's
legacy/salted/``@``-in-model paths.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
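The wire-compatibility argument in the commit above can be mimicked with a stdlib-only analog: per the commit message, msgspec's map encoding gives the same old-on-new (missing field falls back to the default) and new-on-old (unknown keys ignored) behavior. The names below are illustrative stand-ins, not the real LMCache types:

```python
import dataclasses
import json

@dataclasses.dataclass
class IPCKeyV2:  # illustrative stand-in for IPCCacheEngineKey
    model_name: str
    kv_rank: int
    cache_salt: str = ""  # new field; old payloads omit it

def decode_key(payload: bytes) -> IPCKeyV2:
    data = json.loads(payload)
    known = {f.name for f in dataclasses.fields(IPCKeyV2)}
    # drop unknown map keys and let dataclass defaults cover missing ones
    return IPCKeyV2(**{k: v for k, v in data.items() if k in known})

# old payload on new code: cache_salt defaults to ""
old_payload = json.dumps({"model_name": "llama", "kv_rank": 0}).encode()
assert decode_key(old_payload).cache_salt == ""

# payload with extra keys: unknown fields are simply ignored
new_payload = json.dumps(
    {"model_name": "llama", "kv_rank": 0, "cache_salt": "u1", "extra": 1}
).encode()
assert decode_key(new_payload).cache_salt == "u1"
```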
royyhuang added a commit that referenced this pull request Apr 17, 2026
* [Feat] Add cache_salt to ObjectKey for per-user cache isolation


* [Fix] Expand cache_salt validation: reject /, \, NUL; add length cap

Addresses review feedback on #3042:

- cache_salt must not contain '/', '\', or NUL (in addition to '@').
  The FS adapter embeds salt into filenames — '/' would inject directory
  separators; '\' is a Windows path separator; NUL terminates C strings
  in the C++ connector.
- Max length capped at 128 chars to stay within NAME_MAX (255) after
  model_name, kv_rank, chunk_hash, and the .data extension are added.
- Updated example external connector template
  (examples/lmc_external_native_connector) to handle the @@ salted
  key format, matching the in-tree csrc/storage_backends/fs/connector.cpp.
- Added tests for /, \, NUL rejection and length cap on both ObjectKey
  and IPCCacheEngineKey.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
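A minimal sketch of the validation described above (the class name is hypothetical; in LMCache the checks live on ObjectKey and IPCCacheEngineKey):

```python
import dataclasses

_FORBIDDEN = {"@", "/", "\\", "\x00"}  # key delimiter, path separators, C-string NUL
_MAX_SALT_LEN = 128  # headroom under NAME_MAX once model/rank/hash/.data are appended

@dataclasses.dataclass(frozen=True)
class SaltedKeySketch:  # illustrative, not the real ObjectKey
    chunk_hash: str
    cache_salt: str = ""

    def __post_init__(self):
        if any(c in _FORBIDDEN for c in self.cache_salt):
            raise ValueError("cache_salt must not contain '@', '/', '\\', or NUL")
        if len(self.cache_salt) > _MAX_SALT_LEN:
            raise ValueError("cache_salt longer than 128 characters")
```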

* [Refactor] Always use @@-prefixed key format; deprecate legacy

All new writes use the unified @@-prefixed format for both salted and
unsalted keys:

    @@<cache_salt>@<model>@<rank>@<hash>

For empty cache_salt this becomes @@@model@rank@hash. The legacy
3-field format (no @@ prefix) is still readable by all parsers for
backward compatibility but emits a deprecation warning on the FS
adapter read path. Re-storing a legacy entry automatically migrates
it to the new format.

This simplifies parsing — the @@ prefix is always present so there
is no need to disambiguate based on field count (model_name may
contain @ and would otherwise be ambiguous).

Updated Python (native_connector + fs_l2_adapter), C++ (in-tree
fs/connector.cpp), and the example external connector template.
Tests updated: serialization format assertions reflect the @@@
prefix for empty salt, and roundtrip tests verify the parser still
handles legacy filenames.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
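Under the stated invariant that cache_salt never contains '@', the unified format parses unambiguously even when model_name contains '@'; a sketch of the convention (helper names hypothetical, not the in-tree parser):

```python
def format_key(cache_salt: str, model: str, rank: str, chunk_hash: str) -> str:
    # Unified format: @@<cache_salt>@<model>@<rank>@<hash>
    # (an empty salt yields @@@model@rank@hash)
    return f"@@{cache_salt}@{model}@{rank}@{chunk_hash}"

def parse_key(key: str) -> tuple:
    if key.startswith("@@"):
        # The salt contains no '@', so the first '@' after the prefix ends it;
        # rank and hash are the last two fields, model keeps any inner '@'.
        salt, rest = key[2:].split("@", 1)
        model, rank, chunk_hash = rest.rsplit("@", 2)
        return salt, model, rank, chunk_hash
    # Legacy 3-field format: model@rank@hash, implicit empty salt.
    model, rank, chunk_hash = key.rsplit("@", 2)
    return "", model, rank, chunk_hash
```

Round-tripping an empty-salt key gives the `@@@model@rank@hash` shape described above, and a legacy filename without the `@@` prefix still parses to an empty salt.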

* [Refactor] Example connector: always emit @@-prefixed filename

Aligns the example external native connector template with the
in-tree csrc/storage_backends/fs/connector.cpp convention from the
previous commit.

Previously the example only emitted the @@ prefix when cache_salt
was non-empty, leaving empty-salt keys in the legacy 3-field format.
Now all files on disk use the unified @@-prefixed format (empty
salt produces @@@model@...), and legacy inputs without @@ are still
accepted on the read path for backward compatibility.

Third-party connectors cloned from this template will produce
migrate-compatible filenames out of the box.

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

---------

Signed-off-by: royyhuang <roy.y.huang@gmail.com>
ftian1 pushed a commit to ftian1/LMCache that referenced this pull request Apr 20, 2026
* [Feat] Add cache_salt parameter to MP adapter interfaces


* add per user l2 usage design docs

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

* address comments

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

---------

Signed-off-by: royyhuang <royyhuang@gmail.com>
Signed-off-by: royyhuang <roy.y.huang@gmail.com>

Labels

full Run comprehensive tests on this PR
