[CLI]Add long-doc-permutator CLI bench workload by deng451e · Pull Request #2937 · LMCache/LMCache

deng451e · 2026-04-02T21:40:53Z

Add long-doc-permutator workload to lmcache bench engine

Stress-tests blended KV cache reuse by sending all N! permutations of N synthetic context documents, exercising five axes: context boundary mixing, eviction,
chunk homogeneity, prefix domination, and concurrency.
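
To make the request count concrete, here is a minimal sketch of enumerating every ordering of N documents behind a shared system prompt. The helper name `build_prompts`, the `doc-i` placeholder documents, and the separator are illustrative assumptions, not the workload's actual code.

```python
import itertools
import math


def build_prompts(system_prompt: str, documents: list[str], sep: str = "\n\n") -> list[str]:
    """Return one prompt per ordering of `documents` (N! prompts in total)."""
    return [
        sep.join([system_prompt, *ordering])
        for ordering in itertools.permutations(documents)
    ]


# Three placeholder documents give 3! = 6 distinct prompts.
docs = [f"doc-{i}" for i in range(3)]
prompts = build_prompts("SYSTEM", docs)
expected_count = math.factorial(len(docs))
```

Every prompt shares the same system-prompt prefix, while the document chunks appear at different offsets in each ordering. That is the situation in which position-independent (blended) reuse matters.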

Benchmark script ported from @sammshen PR #2885.

Changes

long_doc_permutator.py — new workload + CLI wiring (--ldp-* flags, workload factory)

Note

Medium Risk
Adds a new async workload with its own concurrency and request-dispatch loop, plus new CLI/interactive config surface; issues would primarily affect benchmark execution and resource usage rather than core runtime logic.

Overview
Adds a new long-doc-permutator benchmark workload to lmcache bench engine, which generates synthetic long contexts and sends multiple permutations of them (with configurable context count/length, system prompt length, permutation count, and in-flight concurrency).
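
One common way to cap in-flight requests in an async dispatch loop is an `asyncio.Semaphore`. The sketch below assumes a hypothetical `dispatch` helper and a fake `send` coroutine. It is not the PR's dispatch loop, only an illustration of the concurrency limit described above.

```python
import asyncio


async def dispatch(prompts, send, max_inflight):
    """Send every prompt, keeping at most `max_inflight` requests active."""
    sem = asyncio.Semaphore(max_inflight)

    async def one(prompt):
        async with sem:  # blocks while max_inflight requests are active
            return await send(prompt)

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


async def _demo():
    active = 0
    peak = 0

    async def fake_send(prompt):
        # Stand-in for an HTTP request; tracks peak concurrency.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return prompt.upper()

    results = await dispatch(["a", "b", "c", "d", "e"], fake_send, max_inflight=2)
    return results, peak


results, peak = asyncio.run(_demo())
```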

Wires the workload into the CLI and interactive config schema via new --ldp-* flags and factory dispatch, and updates CSV export to auto-create the output directory before writing results. Includes a comprehensive new test suite covering config validation, prompt/permutation generation, dispatch behavior, and reproducibility.
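
Auto-creating the output directory is typically a single `mkdir(parents=True, exist_ok=True)` call before the file is opened. A minimal sketch, assuming a hypothetical `export_csv` helper and made-up column names (the PR's real function and schema are not shown on this page):

```python
import csv
import tempfile
from pathlib import Path


def export_csv(path, rows, header):
    """Write rows to `path`, creating missing parent directories first."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    with out.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return out


# Usage: the nested "results/run1" directory does not exist beforehand.
with tempfile.TemporaryDirectory() as tmp:
    target = export_csv(
        Path(tmp) / "results" / "run1" / "ldp.csv",
        rows=[[0, 1.5]],
        header=["request_id", "latency_s"],
    )
    written = target.read_text()
```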

Written by Cursor Bugbot for commit 1356112. This will update automatically on new commits.

gemini-code-assist

Code Review

This pull request introduces the long-doc-permutator workload to the engine benchmark, designed to stress-test KV cache reuse through document permutations. It also refactors the lmcache query CLI command to be self-contained, integrating RequestSender and implementing an automatic fallback to the completions endpoint when chat templates are missing. Feedback focuses on optimizing memory usage for large permutation sets, ensuring proper resource lifecycle management by removing a redundant run override, and refining exception handling for optional dependency imports.
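
On the memory point: N! grows quickly (10 documents already give 3,628,800 orderings), so materializing every permutation as a list can be costly. One possible approach, sketched here with hypothetical helper names and not necessarily what the PR adopted, is to consume `itertools.permutations` lazily or to draw seeded random orderings.

```python
import itertools
import random


def first_k_permutations(items, k):
    """Lazily yield at most k orderings without building all N! tuples."""
    return itertools.islice(itertools.permutations(items), k)


def sampled_permutations(items, k, seed=0):
    """Yield k seeded random orderings (repeats are possible)."""
    rng = random.Random(seed)
    pool = list(items)
    for _ in range(k):
        yield tuple(rng.sample(pool, len(pool)))


head = list(first_k_permutations(range(10), 3))
run_a = list(sampled_permutations(range(5), 4, seed=1))
run_b = list(sampled_permutations(range(5), 4, seed=1))
```

The lazy variant always starts from the lexicographically first orderings, which keep long common prefixes. The sampled variant spreads orderings out, and a fixed seed keeps runs reproducible.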

sammshen

LGTM! Small comment on the long-doc-permutator.

cursor · 2026-04-02T23:18:10Z

+def _is_missing_chat_template_error(error: str) -> bool:
+    """Return whether an error indicates missing tokenizer chat template."""
+    normalized = error.lower()
+    return "chat template" in normalized and "tokenizer" in normalized


Chat template error detection is too narrow for fallback

Medium Severity

_is_missing_chat_template_error requires both "chat template" and "tokenizer" in the error string, but the old _missing_chat_template matched on "chat template" alone (plus several other patterns). Common vLLM/engine errors like "No chat template found" or "This model does not have a chat template" lack "tokenizer", so the automatic retry from chat to completions mode won't trigger, causing queries to fail unnecessarily.

Additional Locations (1)

lmcache/cli/commands/query.py#L454-L460
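
The gap described in this comment can be reproduced with the two example messages it quotes. In the sketch below, `narrow_match` mirrors the predicate from the diff. `broad_match` is a hypothetical relaxation that matches on "chat template" alone. It is not the old `_missing_chat_template`, which the comment says also checked several other patterns.

```python
def narrow_match(error: str) -> bool:
    """Predicate from the diff: needs both substrings."""
    normalized = error.lower()
    return "chat template" in normalized and "tokenizer" in normalized


def broad_match(error: str) -> bool:
    """Hypothetical relaxed check: 'chat template' alone is enough."""
    return "chat template" in error.lower()


# Example messages quoted in the review comment.
samples = [
    "No chat template found",
    "This model does not have a chat template",
]
narrow_hits = [narrow_match(s) for s in samples]
broad_hits = [broad_match(s) for s in samples]
```

With the narrow predicate neither sample is recognized, so the retry from chat to completions mode would not fire. The relaxed check catches both, at the cost of also matching unrelated errors that merely mention a chat template.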

ApostaC

Update to the lmcache bench looks good to me in general! Please see other comments below.

ApostaC · 2026-04-03T00:51:01Z

+                "long-doc-permutator",
+                "Permutations of context documents (stress-tests blended KV reuse)",


The current name and description are not very clear.

My proposal for the description: "Query the same set of long documents with different system prompts"

No good ideas for the name. WDYT?

Oh, I got it wrong! Is it something like querying the same set of long documents in different orders?

Yes, just updated it to "Query the same set of long documents with different orders" to make it less confusing.

ApostaC · 2026-04-03T00:51:41Z

+    ConfigItem(
+        key="ldp_vocab_size",
+        display_name="Vocabulary size",
+        description=(
+            "Pool size for context word generation. "
+            "Smaller values increase chunk hash collision risk."
+        ),
+        input_type="int",
+        default=8000,
+        condition=_workload_is("long-doc-permutator"),
+        phase=PHASE_WORKLOAD,
+    ),


I don't think we need to expose this to users. This can be purely internal and hard-coded.

Changed it to be hard-coded.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

ApostaC

LGTM!

deng451e requested a review from sammshen April 2, 2026 21:41

deng451e marked this pull request as ready for review April 2, 2026 21:41

gemini-code-assist Bot reviewed Apr 2, 2026

Comment thread lmcache/cli/commands/bench/engine_bench/workloads/long_doc_permutator.py

Comment thread lmcache/cli/commands/bench/engine_bench/workloads/long_doc_permutator.py

Comment thread lmcache/cli/commands/query.py Outdated

cursor Bot reviewed Apr 2, 2026

Comment thread lmcache/cli/commands/bench/engine_bench/workloads/long_doc_permutator.py Outdated

deng451e requested a review from ApostaC April 2, 2026 21:48

sammshen added the "full" label (Run comprehensive tests on this PR) Apr 2, 2026

sammshen approved these changes Apr 2, 2026

sammshen mentioned this pull request Apr 2, 2026

[Bench]: Add Blend V2 Stress Test script #2885

Closed

cursor Bot reviewed Apr 2, 2026

ApostaC reviewed Apr 3, 2026

cursor Bot reviewed Apr 3, 2026

Comment thread lmcache/cli/commands/bench/engine_bench/workloads/__init__.py

deng451e force-pushed the migrate_bench_to_cli branch 2 times, most recently from 962b2a0 to 5284f7d, April 3, 2026 01:49

deng451e and others added 2 commits April 3, 2026 01:53

add new workload to cli bench

253838c

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: deng451e <838677410@qq.com>

Revert query-command.md and prompt.py to dev state

cdcc318

Signed-off-by: deng451e <838677410@qq.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: deng451e <838677410@qq.com>

deng451e force-pushed the migrate_bench_to_cli branch from 5284f7d to cdcc318, April 3, 2026 01:53

fix issue

1356112

Signed-off-by: deng451e <838677410@qq.com>

cursor Bot reviewed Apr 3, 2026

Comment thread lmcache/cli/commands/bench/engine_bench/workloads/long_doc_permutator.py

ApostaC approved these changes Apr 3, 2026

ApostaC merged commit 6ceed5e into LMCache:dev Apr 3, 2026
35 checks passed