[Feat] Add cache_salt parameter to MP adapter interfaces by royyhuang · Pull Request #3029 · LMCache/LMCache

royyhuang · 2026-04-14T21:52:47Z

Summary

Add cache_salt="" default parameter to all scheduler and worker adapter methods in the MP multi-process adapter
This is a no-op change — all defaults are empty string, preserving existing behavior. The actual data model changes will be made after interface changes are merged on vLLM side [KVConnector][LMCache] Propagate cache_salt through MP connector for per-user cache isolation vllm-project/vllm#39837.
Prepares the adapter interface for per-user cache isolation (follow-up PRs will populate cache_salt from vLLM's Request.cache_salt)

Changed methods

Scheduler adapter (LMCacheMPSchedulerAdapter):

maybe_submit_lookup_request(request_id, token_ids, cache_salt="")
free_lookup_locks(..., cache_salt="")
_create_key(..., cache_salt="")

Worker adapter (LMCacheMPWorkerAdapter):

submit_store_request(request_id, op, event, cache_salt="")
submit_retrieve_request(request_id, op, event, cache_salt="")
batched_submit_store_requests(request_ids, ops, event, cache_salts=None)
batched_submit_retrieve_requests(request_ids, ops, event, cache_salts=None)
_create_key(..., cache_salt="")

cache_salt is accepted but not yet forwarded to IPCCacheEngineKey — that field will be added in a follow-up PR.

Test plan

All 441 distributed tests pass
pre-commit clean
No behavioral change (all defaults are empty string)

Context

PR1a of the per-user L2 quota feature. See design doc: docs/design/l2_adapters/l2_per_user_quota.md

Note

Low Risk
Low risk interface expansion with default values, intended to be behavior-preserving; main risk is downstream callers needing to pass/ignore the new optional args correctly.

Overview
Introduces a behavior-preserving interface extension to the vLLM multi-process adapter by adding optional cache_salt parameters to scheduler lookup/free-lock calls and worker store/retrieve APIs (including batched variants), threading the value through to key construction for future per-user cache isolation.

Adds a detailed design doc (docs/design/l2_adapters/l2_per_user_quota.md) describing the upcoming per-user quota + eviction approach and the planned cross-component changes.

^{Reviewed by Cursor Bugbot for commit 829689a. Bugbot is set up for automated code reviews on this repo. Configure here.}

Add cache_salt="" default parameter to scheduler and worker adapter methods to prepare for per-user cache isolation. This is a no-op change — all defaults are empty string, preserving existing behavior. Scheduler adapter: - maybe_submit_lookup_request: add cache_salt param - free_lookup_locks: add cache_salt param - _create_key: add cache_salt param (accepted, not yet forwarded) Worker adapter: - submit_store_request: add cache_salt param - submit_retrieve_request: add cache_salt param - batched_submit_store_requests: add cache_salts list param - batched_submit_retrieve_requests: add cache_salts list param - _create_key: add cache_salt param (accepted, not yet forwarded) cache_salt will be forwarded to IPCCacheEngineKey once that field is added in a follow-up PR. Signed-off-by: royyhuang <royyhuang@gmail.com> Signed-off-by: royyhuang <roy.y.huang@gmail.com>

gemini-code-assist

Code Review

This pull request introduces cache_salt (per-user isolation salt) across various methods in the VLLMMultiProcessAdapter, including lookup, store, retrieve, and their batched counterparts. Feedback suggests using strict=True in the zip calls within the batched store and retrieve methods to ensure that the input lists (request_ids, ops, and cache_salts) are of equal length, preventing potential silent truncation.

gemini-code-assist · 2026-04-14T21:54:45Z

-            self.submit_store_request(request_id, op, event)
+        if cache_salts is None:
+            cache_salts = [""] * len(request_ids)
+        for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):


Using strict=True in the zip call ensures that request_ids, ops, and cache_salts have the same length. This prevents silent truncation of the batch if there is a mismatch between these lists, which is particularly important now that a third list (cache_salts) is involved in the iteration.

Suggested change

for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):

for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=True):

gemini-code-assist · 2026-04-14T21:54:45Z

-            self.submit_retrieve_request(request_id, op, event)
+        if cache_salts is None:
+            cache_salts = [""] * len(request_ids)
+        for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):


Using strict=True in the zip call ensures that request_ids, ops, and cache_salts have the same length. This prevents silent truncation of the batch if there is a mismatch between these lists, ensuring all requests in the batch are processed as expected.

Suggested change

for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):

for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=True):

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 656976a. Configure here.}

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

sammshen

LGTM!

ApostaC

LGTM!

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

…ota-pr1' into feat/l2-per-user-quota-pr1

Adds a ``cache_salt: str = ""`` field to ``ObjectKey`` and ``IPCCacheEngineKey`` so same-content/different-user cache entries produce distinct keys. This is the data-model piece of the per-user L2 cache quota feature (PR2 of 6). PR1a (LMCache#3029) wired cache_salt through the adapter surface but left it a no-op until the data model landed; this PR makes it load-bearing. Key changes: - ``ObjectKey`` and ``IPCCacheEngineKey`` gain ``cache_salt`` field participating in eq/hash. ``__post_init__`` on both rejects ``@`` in the salt — this is the invariant the L2 adapter parsers rely on. - ``ipc_key_to_object_keys(ipc_key, chunk_hashes)`` reads ``ipc_key.cache_salt`` internally. No separate parameter, so callers cannot accidentally drop the salt. - Native connector, FS adapter Python, and FS adapter C++ all use a leading ``@@`` marker to disambiguate salted from legacy keys. Legacy files and wire payloads (empty salt) remain readable — existing stored data is preserved. - msgspec encodes dataclasses as maps, so an old 7-field ``IPCCacheEngineKey`` payload decodes on new code with ``cache_salt`` defaulting to ``""``. New-on-old works because msgspec ignores unknown map keys by default. Tests: 32 new cases covering serialization roundtrip (native + FS), validation (``@`` rejection), isolation semantics (eq/hash per salt), msgspec wire compat (old→new, new→new), and the FS C++ parser's legacy/salted/``@``-in-model paths. Signed-off-by: royyhuang <roy.y.huang@gmail.com>

@model

* [Feat] Add cache_salt to ObjectKey for per-user cache isolation Adds a ``cache_salt: str = ""`` field to ``ObjectKey`` and ``IPCCacheEngineKey`` so same-content/different-user cache entries produce distinct keys. This is the data-model piece of the per-user L2 cache quota feature (PR2 of 6). PR1a (#3029) wired cache_salt through the adapter surface but left it a no-op until the data model landed; this PR makes it load-bearing. Key changes: - ``ObjectKey`` and ``IPCCacheEngineKey`` gain ``cache_salt`` field participating in eq/hash. ``__post_init__`` on both rejects ``@`` in the salt — this is the invariant the L2 adapter parsers rely on. - ``ipc_key_to_object_keys(ipc_key, chunk_hashes)`` reads ``ipc_key.cache_salt`` internally. No separate parameter, so callers cannot accidentally drop the salt. - Native connector, FS adapter Python, and FS adapter C++ all use a leading ``@@`` marker to disambiguate salted from legacy keys. Legacy files and wire payloads (empty salt) remain readable — existing stored data is preserved. - msgspec encodes dataclasses as maps, so an old 7-field ``IPCCacheEngineKey`` payload decodes on new code with ``cache_salt`` defaulting to ``""``. New-on-old works because msgspec ignores unknown map keys by default. Tests: 32 new cases covering serialization roundtrip (native + FS), validation (``@`` rejection), isolation semantics (eq/hash per salt), msgspec wire compat (old→new, new→new), and the FS C++ parser's legacy/salted/``@``-in-model paths. Signed-off-by: royyhuang <roy.y.huang@gmail.com> * [Fix] Expand cache_salt validation: reject /, \, NUL; add length cap Addresses review feedback on #3042: - cache_salt must not contain '/', '\', or NUL (in addition to '@'). The FS adapter embeds salt into filenames — '/' would inject directory separators; '\' is a Windows path separator; NUL terminates C strings in the C++ connector. - Max length capped at 128 chars to stay within NAME_MAX (255) after model_name, kv_rank, chunk_hash, and the .data extension are added. - Updated example external connector template (examples/lmc_external_native_connector) to handle the @@ salted key format, matching the in-tree csrc/storage_backends/fs/connector.cpp. - Added tests for /, \, NUL rejection and length cap on both ObjectKey and IPCCacheEngineKey. Signed-off-by: royyhuang <roy.y.huang@gmail.com> * [Refactor] Always use @@-prefixed key format; deprecate legacy All new writes use the unified @@-prefixed format for both salted and unsalted keys: @@<cache_salt>@<model>@<rank>@<hash> For empty cache_salt this becomes @@@model@rank@hash. The legacy 3-field format (no @@ prefix) is still readable by all parsers for backward compatibility but emits a deprecation warning on the FS adapter read path. Re-storing a legacy entry automatically migrates it to the new format. This simplifies parsing — the @@ prefix is always present so there is no need to disambiguate based on field count (model_name may contain @ and would otherwise be ambiguous). Updated Python (native_connector + fs_l2_adapter), C++ (in-tree fs/connector.cpp), and the example external connector template. Tests updated: serialization format assertions reflect the @@@ prefix for empty salt, and roundtrip tests verify the parser still handles legacy filenames. Signed-off-by: royyhuang <roy.y.huang@gmail.com> * [Refactor] Example connector: always emit @@-prefixed filename Aligns the example external native connector template with the in-tree csrc/storage_backends/fs/connector.cpp convention from the previous commit. Previously the example only emitted the @@ prefix when cache_salt was non-empty, leaving empty-salt keys in the legacy 3-field format. Now all files on disk use the unified @@-prefixed format (empty salt produces @@@model@...), and legacy inputs without @@ are still accepted on the read path for backward compatibility. Third-party connectors cloned from this template will produce migrate-compatible filenames out of the box. Signed-off-by: royyhuang <roy.y.huang@gmail.com> --------- Signed-off-by: royyhuang <roy.y.huang@gmail.com>

* [Feat] Add cache_salt parameter to MP adapter interfaces Add cache_salt="" default parameter to scheduler and worker adapter methods to prepare for per-user cache isolation. This is a no-op change — all defaults are empty string, preserving existing behavior. Scheduler adapter: - maybe_submit_lookup_request: add cache_salt param - free_lookup_locks: add cache_salt param - _create_key: add cache_salt param (accepted, not yet forwarded) Worker adapter: - submit_store_request: add cache_salt param - submit_retrieve_request: add cache_salt param - batched_submit_store_requests: add cache_salts list param - batched_submit_retrieve_requests: add cache_salts list param - _create_key: add cache_salt param (accepted, not yet forwarded) cache_salt will be forwarded to IPCCacheEngineKey once that field is added in a follow-up PR. Signed-off-by: royyhuang <royyhuang@gmail.com> Signed-off-by: royyhuang <roy.y.huang@gmail.com> * add per user l2 usage design docs Signed-off-by: royyhuang <roy.y.huang@gmail.com> * address comments Signed-off-by: royyhuang <roy.y.huang@gmail.com> --------- Signed-off-by: royyhuang <royyhuang@gmail.com> Signed-off-by: royyhuang <roy.y.huang@gmail.com>

royyhuang requested review from YaoJiayi, deng451e, maobaolong and sammshen as code owners April 14, 2026 21:52

gemini-code-assist Bot reviewed Apr 14, 2026

View reviewed changes

cursor Bot reviewed Apr 14, 2026

View reviewed changes

Comment thread lmcache/integration/vllm/vllm_multi_process_adapter.py

royyhuang mentioned this pull request Apr 14, 2026

[KVConnector][LMCache] Propagate cache_salt through MP connector for per-user cache isolation vllm-project/vllm#39837

Merged

3 tasks

add per user l2 usage design docs

b51ec9f

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

royyhuang requested a review from ApostaC as a code owner April 14, 2026 22:05

sammshen approved these changes Apr 14, 2026

View reviewed changes

royyhuang added the full Run comprehensive tests on this PR label Apr 14, 2026

Merge branch 'dev' into feat/l2-per-user-quota-pr1

3b1e53a

ApostaC approved these changes Apr 15, 2026

View reviewed changes

Comment thread lmcache/integration/vllm/vllm_multi_process_adapter.py

royyhuang added 2 commits April 15, 2026 00:34

address comments

2e859d6

Signed-off-by: royyhuang <roy.y.huang@gmail.com>

Merge remote-tracking branch 'refs/remotes/origin/feat/l2-per-user-qu…

829689a

…ota-pr1' into feat/l2-per-user-quota-pr1

royyhuang enabled auto-merge (squash) April 15, 2026 00:35

royyhuang merged commit e69b684 into LMCache:dev Apr 15, 2026
38 of 39 checks passed

royyhuang mentioned this pull request Apr 15, 2026

[MP][Feat] Add cache_salt to ObjectKey for cache isolation #3042

Merged

4 tasks

ruizhang0101 mentioned this pull request Apr 20, 2026

[MP][Feat] Tag MP metrics and spans with cache_salt for per-tenant attribution #3089

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feat] Add cache_salt parameter to MP adapter interfaces#3029

[Feat] Add cache_salt parameter to MP adapter interfaces#3029
royyhuang merged 5 commits intoLMCache:devfrom
royyhuang:feat/l2-per-user-quota-pr1

royyhuang commented Apr 14, 2026 •

edited by cursor Bot

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 14, 2026

Uh oh!

gemini-code-assist Bot Apr 14, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

sammshen left a comment

Uh oh!

ApostaC left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=False):
	for request_id, op, salt in zip(request_ids, ops, cache_salts, strict=True):

Conversation

royyhuang commented Apr 14, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changed methods

Test plan

Context

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sammshen left a comment

Choose a reason for hiding this comment

Uh oh!

ApostaC left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

royyhuang commented Apr 14, 2026 •

edited by cursor Bot

Loading