[Bench]: Add Blend V2 Stress Test script #2885
Conversation
Add long_doc_permutator.py and README for stress testing the blend v2 server across 5 axes: context boundaries, eviction, chunk homogeneity, prefix domination, and concurrency.
Code Review
This pull request introduces a new benchmark tool, long_doc_permutator.py, and its accompanying documentation to stress test the Blend Server V2 implementation. The tool evaluates performance across several axes, including context boundaries, eviction, chunk homogeneity, prefix domination, and concurrency. The review feedback identifies several violations of the repository's style guide, specifically regarding missing type hints and docstrings for new functions, as well as improper import practices and path construction logic.
```python
# ---------------------------------------------------------------------------
def write_resp(text: str):
```
The function write_resp is missing a return type hint and a docstring. According to the repository style guide (lines 24 and 25), all new public functions must have type hints and docstrings. Please add the -> None return type and a docstring explaining the function's purpose.
```diff
-def write_resp(text: str):
+def write_resp(text: str) -> None:
```
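For reference, a version that satisfies both the type-hint and docstring rules at once. The body below is reconstructed from the diff context quoted elsewhere in this review, so treat it as a sketch rather than the script's exact code:

```python
import sys

OUTPUT_FILE = None  # set from CLI args elsewhere in the script

def write_resp(text: str) -> None:
    """Append `text` to OUTPUT_FILE when one is configured, else write to stdout."""
    if OUTPUT_FILE:
        with open(OUTPUT_FILE, "a") as f:
            f.write(text)
    else:
        sys.stdout.write(text)
```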
```python
# ---------------------------------------------------------------------------
def relative_time(df: pd.DataFrame, start_time: float):
```
The function relative_time is missing a return type hint and a docstring, violating the repository style guide (lines 24 and 25). Please add the -> None return type and a docstring.
```diff
-def relative_time(df: pd.DataFrame, start_time: float):
+def relative_time(df: pd.DataFrame, start_time: float) -> None:
```
```python
def print_results(df: pd.DataFrame, wall_time: float, label: str):
```
The function print_results is missing a return type hint and a docstring, violating the repository style guide (lines 24 and 25). Please add the -> None return type and a docstring.
```diff
-def print_results(df: pd.DataFrame, wall_time: float, label: str):
+def print_results(df: pd.DataFrame, wall_time: float, label: str) -> None:
```
```python
def plot_ttft_distribution(df: pd.DataFrame, filename: str = "ttft_distribution.png"):
```
The function plot_ttft_distribution is missing a return type hint, violating the repository style guide (line 24). Please add the -> None return type.
```diff
-def plot_ttft_distribution(df: pd.DataFrame, filename: str = "ttft_distribution.png"):
+def plot_ttft_distribution(df: pd.DataFrame, filename: str = "ttft_distribution.png") -> None:
```
References
- All new functions must have type hints for arguments and return values. (link)
```python
# ---------------------------------------------------------------------------
async def main(args):
```
The main function is missing type hints for its args parameter and its return value, as well as a docstring. This violates the repository style guide (lines 24 and 25). Please type args as argparse.Namespace, add the -> None return type, and include a docstring.
```diff
-async def main(args):
+async def main(args: argparse.Namespace) -> None:
```
```python
parser.add_argument(
    "--lmcache-workers",
    type=int,
    default=4,
```
The function create_argument_parser is missing a return type hint and a docstring, violating the repository style guide (lines 24 and 25). Please add the -> argparse.ArgumentParser return type and a docstring.
Suggested signature:

```python
def create_argument_parser() -> argparse.ArgumentParser:
```

```python
script_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, script_dir)
# Third Party
from parse_lmcache_log import parse_log
from parse_lmcache_log import report as cache_report
```
Modifying sys.path at runtime to handle imports is fragile and goes against best practices. It also violates the project's import ordering conventions (style guide line 28), which require imports to be at the top of the file. The comment # Third Party is also incorrect for this local import.
Please refactor this to use a standard import mechanism. If parse_lmcache_log is a sibling script, consider making this directory a package or adjusting PYTHONPATH externally when running the benchmark.
References
- Imports should be ordered: Standard / Third Party / First Party / Local, and placed at the top of the file. (link)
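One sys.path-free alternative, shown here only as a sketch (the `load_local_module` helper and its placement are illustrative, not a pattern the repository mandates), resolves the sibling file explicitly:

```python
import importlib.util
import pathlib
import sys

def load_local_module(name: str, directory: pathlib.Path):
    """Load `name`.py from `directory` without mutating sys.path."""
    spec = importlib.util.spec_from_file_location(name, directory / f"{name}.py")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before exec so self-imports resolve
    spec.loader.exec_module(module)
    return module

# Hypothetical usage at the top of the benchmark script:
# helpers = load_local_module("parse_lmcache_log",
#                             pathlib.Path(__file__).resolve().parent)
# parse_log, cache_report = helpers.parse_log, helpers.report
```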
```python
if args.output:
    if args.output_dir and args.output_dir != ".":
        OUTPUT_FILE = os.path.join(args.output_dir, args.output)
    else:
        OUTPUT_FILE = args.output
```
The logic for constructing the OUTPUT_FILE path can be simplified. os.path.join handles the case where args.output_dir is . correctly. You can reduce these lines to a single os.path.join call inside the if args.output: block for better readability.
```python
if args.output:
    OUTPUT_FILE = os.path.join(args.output_dir, args.output)
```
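For illustration (POSIX paths assumed), joining with a `.` directory still names the same file as the bare filename, so the special-case branch buys nothing:

```python
import os.path

# "." as the directory yields "./results.csv", which refers to the same
# file as the bare "results.csv" the special-case branch would produce.
assert os.path.join(".", "results.csv") == "./results.csv"
assert os.path.join("runs/today", "results.csv") == "runs/today/results.csv"
```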
JiwaniZakir left a comment
In generate_vocab_pool, the pool variable is typed as set[str] to avoid duplicates, but the uniqueness is already guaranteed structurally: the suffix f"{word}{len(pool)}" uses the current pool size as a counter, so every generated word is inherently unique regardless of the random base. Using a set here adds unnecessary overhead for large --vocab-size values; a list with a simple counter would be both cleaner and more efficient.
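A counter-indexed list makes that uniqueness invariant explicit. A minimal sketch — the base-word generation here is invented for illustration and will differ from the script's:

```python
import random

def generate_vocab_pool(vocab_size: int, seed: int = 0) -> list[str]:
    """Return `vocab_size` unique synthetic words.

    The positional index suffix alone guarantees uniqueness, so no set
    membership checks are needed; the random base word only adds variety.
    """
    rng = random.Random(seed)
    bases = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
    return [f"{rng.choice(bases)}{i}" for i in range(vocab_size)]
```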
The sentinel value --max-inflight-requests 0 meaning "flood all requests" is a subtle footgun — zero conventionally reads as "no concurrency allowed" rather than "unlimited." A value of -1 (or a dedicated --flood flag) would align better with common CLI conventions and avoid confusion when users scan the argument help text.
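A sketch of the `-1` convention; only the flag name comes from the script, and the semaphore wiring is illustrative:

```python
import argparse
import asyncio

parser = argparse.ArgumentParser()
parser.add_argument(
    "--max-inflight-requests",
    type=int,
    default=32,
    help="Max concurrent requests; -1 floods all requests at once.",
)

def make_limiter(max_inflight: int) -> asyncio.Semaphore:
    """Return a concurrency limiter; -1 means effectively unbounded."""
    if max_inflight == -1:
        return asyncio.Semaphore(1_000_000)  # practically no limit
    if max_inflight <= 0:
        raise ValueError("use -1 for unlimited, or a positive limit")
    return asyncio.Semaphore(max_inflight)
```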
One missing stress axis worth considering: the README documents five axes but there's no test scenario combining a very small --vocab-size (e.g., 6) with a high --num-permutations to simultaneously stress both chunk collision and eviction, which seems like the most adversarial real-world case for the rolling-hash logic.
Cursor Bugbot has reviewed your changes and found 4 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
```python
with open(OUTPUT_FILE, "a") as f:
    f.write(text)
else:
    sys.stdout.write(text)
```
Multiple public functions missing return type hints
Low Severity
Several new public functions lack return type hints: write_resp, relative_time, print_results, main, and create_argument_parser. The project's coding conventions require all functions to have type hints for arguments and return values. main also lacks a type hint for its args parameter.
Additional Locations (2)
Triggered by project rule: LMCache Code Review Style Guide
```python
if len(ok) > 0:
    total_tokens = ok["prompt_tokens"].sum() + ok["completion_tokens"].sum()
    print(f" Throughput : {len(ok) / wall_time:.2f} req/s")
    print(f" Throughput : {total_tokens / wall_time:.2f} tok/s")
```
Multiple public functions missing docstrings
Low Severity
Several new public functions lack docstrings: write_resp, relative_time, print_results, main, and create_argument_parser. The project's coding conventions require all public functions to have docstrings covering what the function does, its arguments, and return values.
Additional Locations (2)
Triggered by project rule: LMCache Code Review Style Guide
```python
sys.path.insert(0, script_dir)
# Third Party
from parse_lmcache_log import parse_log
from parse_lmcache_log import report as cache_report
```
Local import mislabeled as third-party
Low Severity
The parse_lmcache_log imports are labeled with a # Third Party section comment, but this is a local module loaded via sys.path.insert. Per the project's import ordering convention (Standard / Third Party / First Party / Local), this should use a # Local comment instead.
Triggered by project rule: LMCache Code Review Style Guide
```python
"max": float(s.max()),
"p95": float(s.quantile(0.95)),
"p99": float(s.quantile(0.99)),
"std": float(s.std()),
```
NaN from single-element std produces invalid JSON output
Low Severity
When exactly one request succeeds (e.g. --num-permutations 1), s.std() with default ddof=1 returns NaN. This NaN propagates through ttft_stats into the summary dict, and json.dumps(summary) emits a NaN literal, which is not valid JSON per the spec. Strict JSON parsers (e.g. jq) will reject the output.
Additional Locations (1)
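One way to guard the stats dict is shown below without pandas — a sketch only; the real fix could equally pass `ddof=0` or clamp NaN just before serialization:

```python
import json
import math

def safe_std(values: list[float], ddof: int = 1) -> float:
    """Sample standard deviation that returns 0.0 instead of NaN when
    there are too few values (mirrors pandas' default ddof=1 behavior)."""
    n = len(values)
    if n <= ddof:
        return 0.0
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - ddof)
    return math.sqrt(var)

# json.dumps emits a bare NaN literal for float("nan"), which jq and other
# strict parsers reject; allow_nan=False raises instead of emitting it.
assert json.dumps(float("nan")) == "NaN"
json.dumps({"std": safe_std([5.0])}, allow_nan=False)  # safe: std is 0.0
```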
now in #2937


Note
Medium Risk
Adds a new async benchmark script that can generate high request volume/concurrency against a Blend v2 server and write artifacts; misuse or missing optional dependencies (e.g., LMCache log parser/module) could cause runtime failures or heavy load.
Overview
Adds a new `benchmarks/blend_v2` permutation-based stress test to exercise Blend v2 KV reuse across context boundary orderings, eviction pressure, chunk-hash collision risk, prefix-dominated prompts, and concurrency.

Introduces `long_doc_permutator.py`, which generates synthetic system prompts/contexts, enumerates or samples context permutations, drives async streaming chat-completion requests with configurable concurrency, and writes results/plots plus a combined `summary.txt` (optionally attempting to parse an `--lmcache-log`). A new README documents the 5 stress axes and provides runnable example configurations.

Written by Cursor Bugbot for commit 968a94e. This will update automatically on new commits. Configure here.