[CI] Add CI test for CB #2900

Merged
sammshen merged 4 commits into LMCache:dev from deng451e:ci_blend_test on Mar 31, 2026

@deng451e (Collaborator) commented Mar 29, 2026

What this PR does / why we need it:
• Adds CI coverage for CacheBlend (CB) to prevent regressions.
• Adds a shuffled document benchmark to test non-prefix chunk matching and recomputation.
Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added

Note

Medium Risk
Medium risk because it introduces new CI orchestration (GPU pods, dynamic ports, background processes) and new proxy/benchmark code that could make CI flaky, but it doesn’t change production runtime paths.

Overview
Adds a new Buildkite Blend (CacheBlend) CI job that runs on a 2×GPU K8s pod (tensormesh/cacheblend:latest) and uploads artifacts (build_*.log).

Introduces setup-blend-env.sh to do per-job setup: GPU health precheck, create/reuse /workspace/.venv, install the latest vLLM nightly wheel (with a stable fallback), install LMCache editable into both the image venv and test venv, and pin tiktoken encodings via a local TIKTOKEN_ENCODINGS_BASE.

Adds a Blend run harness (run.sh + scripts/run-blend-test.sh) that starts the LMCache blend server, launches configurable pools of prefiller/decoder vLLM instances with dynamic free-port selection and GPU assignment, runs a new FastAPI disagg proxy (proxy.py) that waits on KV-cache telemetry before forwarding to decoders, then executes a new shuffled multi-doc QA benchmark (benchmarks/multi_doc_qa/shuffle_doc_qa.py) and fails the job if logs contain error/traceback/fatal patterns.
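
The final gate described above, failing the job when logs contain error, traceback, or fatal patterns, can be sketched as follows. This is a minimal illustration rather than the PR's actual check: the regular expression, the function names, and the use of the `build_*.log` glob for scanning are assumptions, since the summary does not show the real patterns or which files are scanned.

```python
import re
from pathlib import Path

# Assumed pattern: the PR summary only says "error/traceback/fatal patterns".
FAILURE_PATTERN = re.compile(r"\b(error|traceback|fatal)\b", re.IGNORECASE)


def find_failure_lines(log_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that match a failure pattern."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if FAILURE_PATTERN.search(line):
            hits.append((lineno, line))
    return hits


def scan_log_dir(log_dir: str, glob: str = "build_*.log") -> dict[str, list[tuple[int, str]]]:
    """Scan every matching log file; the result is empty when all logs are clean."""
    failures = {}
    for path in sorted(Path(log_dir).glob(glob)):
        hits = find_failure_lines(path.read_text(errors="replace"))
        if hits:
            failures[path.name] = hits
    return failures
```

A CI wrapper could exit non-zero whenever `scan_log_dir` returns a non-empty dict. A pattern this broad also matches benign lines that merely mention the word "error", so a real gate may need an allowlist.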

Written by Cursor Bugbot for commit b64bc64.

@deng451e deng451e changed the title from "[CI] Add CI for CacheBlend" to "[CI] Add CI test for CacheBlend" Mar 29, 2026
@deng451e deng451e requested review from ApostaC, YaoJiayi and sammshen and removed request for YaoJiayi March 29, 2026 07:31
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py Fixed
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py Fixed
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py Fixed
@gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces a new integration test suite for CacheBlend, featuring Buildkite pipeline definitions, environment setup scripts, a disaggregated prefill/decode proxy server, and a multi-document QA benchmark. The reviewer identified several improvement opportunities, including replacing a text-based path reference with a proper symbolic link, avoiding the ':latest' tag in CI images for reproducibility, and adhering to the repository's style guide regarding import placement. Further suggestions include refactoring the proxy server to avoid global variables and magic numbers, and making hardcoded tool paths in shell scripts configurable to reduce brittleness.

Comment thread proxy.py Outdated
Comment thread .buildkite/k3_tests/blend/pipeline.yml
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py
Comment thread .buildkite/k3_tests/blend/scripts/proxy.py
Comment thread .buildkite/k3_tests/blend/scripts/run-blend-test.sh
@deng451e deng451e force-pushed the ci_blend_test branch 2 times, most recently from 93ef4a1 to 7b4044c Compare March 29, 2026 07:49
Signed-off-by: deng451e <838677410@qq.com>
@deng451e deng451e changed the title from "[CI] Add CI test for CacheBlend" to "[CI] Add CI test for CB" Mar 29, 2026
Signed-off-by: deng451e <838677410@qq.com>
@deng451e deng451e requested a review from YaoJiayi March 30, 2026 22:56
@deng451e deng451e marked this pull request as ready for review March 31, 2026 01:29
@@ -0,0 +1,289 @@
# SPDX-License-Identifier: Apache-2.0
Review comment (Contributor):

feel free to take from: #2885

Signed-off-by: deng451e <838677410@qq.com>
Signed-off-by: deng451e <838677410@qq.com>
@cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

"${TEST_PYTHON}" benchmarks/multi_doc_qa/shuffle_doc_qa.py \
--num-documents "${SHUFFLE_NUM_DOCUMENTS}" \
--document-length "${SHUFFLE_DOCUMENT_LENGTH}" \
--output-len "${SHUFFLE_OUTPUT_LEN}"; then

Missing wait for proxy server before benchmark starts

High Severity

The CacheBlend proxy is started in the background (step 5) but the benchmark (step 6) runs immediately without waiting for the proxy to become ready. The benchmark's first action is client.models.list() which hits the proxy's /v1/models endpoint — if the proxy hasn't finished starting (uvicorn + lifespan), this fails with a connection error. A wait_for_server "$SERVICE_PORT" call is needed between steps 5 and 6, consistent with how all vLLM instances are awaited in step 4.
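
The readiness wait suggested above can be sketched as a polling loop. The PR's `wait_for_server` is a shell helper whose body is not shown here, so the Python function below is only an analogue under assumptions: the parameter names, default timeouts, and the TCP-connect probe are illustrative choices, not the harness's real implementation.

```python
import socket
import time
from typing import Callable


def port_is_open(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False


def wait_for_server(
    port: int,
    timeout_sec: float = 60.0,
    interval_sec: float = 0.5,
    probe: Callable[[int], bool] = port_is_open,
) -> bool:
    """Poll `probe(port)` until it succeeds or `timeout_sec` elapses."""
    deadline = time.monotonic() + timeout_sec
    while True:
        if probe(port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)
```

A successful TCP connect only shows that the port accepts connections. Passing a probe that issues a real request to `/v1/models` would be a stricter check that the proxy has finished starting.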

echo "[FAIL] shuffle_doc_qa exceeded BENCHMARK_TIMEOUT_SEC=${BENCHMARK_TIMEOUT_SEC}s"
else
echo "[FAIL] shuffle_doc_qa exited with code ${rc}"
fi

Timeout exit code never captured due to bash $? semantics

Medium Severity

rc=$? on line 296 is inside the then block of if ! timeout ...; then. In bash, after if ! cmd, $? reflects the negated result (always 0 when the body is entered), not the original exit code from timeout. So rc is always 0, the [[ "$rc" -eq 124 ]] check on line 297 can never be true, and the timeout-specific diagnostic message is dead code. On a real timeout, the misleading message "exited with code 0" is shown instead.
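
The `$?` behavior described above can be reproduced with a small driver that runs two shell snippets. This is a sketch, not the PR's script: `sh -c 'exit 124'` stands in for a `timeout` invocation that hit its limit, and the `cmd || rc=$?` form is one possible fix among several.

```python
import shutil
import subprocess

# Prefer bash, as in the PR's scripts; fall back to sh, which negates the same way.
SHELL = shutil.which("bash") or "sh"

# Mirrors the reported bug: inside `if ! cmd; then`, $? holds the negated status.
BUGGY = """
if ! sh -c 'exit 124'; then
  rc=$?
  echo "rc=$rc"
fi
"""

# One possible fix: capture the status before any negation happens.
FIXED = """
rc=0
sh -c 'exit 124' || rc=$?
if [ "$rc" -ne 0 ]; then
  echo "rc=$rc"
fi
"""


def run_shell(script: str) -> str:
    """Run a shell snippet and return its trimmed stdout."""
    result = subprocess.run(
        [SHELL, "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


print(run_shell(BUGGY))  # prints "rc=0"
print(run_shell(FIXED))  # prints "rc=124"
```

The `cmd || rc=$?` form also stays safe under `set -e`, because a failing command on the left side of `||` does not abort the script.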

)

for chunk in chat_completion:
chunk_message = parse_chunk_output(chunk.choices[0])

Missing empty chunk.choices guard causes potential IndexError

Medium Severity

chunk.choices[0] is accessed without first checking that choices is non-empty. Streaming chat completions can yield chunks with an empty choices list (e.g., usage-reporting chunks), which would cause an IndexError. The sibling multi_doc_qa.py in the same directory explicitly guards against this with if not chunk.choices: continue.
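
The guard mentioned above can be sketched with stand-in types. The dataclasses below are hypothetical and only mimic the `choices[0].delta.content` shape of a streaming chat completion chunk. `collect_stream_text` is likewise an illustrative name, and it reads `delta.content` directly because the body of `parse_chunk_output` is not shown in this thread.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Delta:
    content: Optional[str] = None


@dataclass
class Choice:
    delta: Delta


@dataclass
class Chunk:
    # Usage-reporting chunks may arrive with an empty choices list.
    choices: list = field(default_factory=list)


def collect_stream_text(stream: Iterable[Chunk]) -> str:
    """Concatenate streamed content, skipping chunks that carry no choices."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # the guard: avoids IndexError on chunk.choices[0]
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)
```

With this guard, a stream that interleaves an empty-`choices` chunk between content chunks is consumed without raising.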

@sammshen (Contributor) left a comment

LGTM!

@sammshen sammshen enabled auto-merge (squash) March 31, 2026 22:58
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Mar 31, 2026
@sammshen sammshen merged commit e0a3325 into LMCache:dev Mar 31, 2026
35 checks passed
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
* [CI] ci blend test

Signed-off-by: deng451e <838677410@qq.com>

* correct path

Signed-off-by: deng451e <838677410@qq.com>

* fix gpt oss encoder issue

Signed-off-by: deng451e <838677410@qq.com>

* update model path

Signed-off-by: deng451e <838677410@qq.com>

---------

Signed-off-by: deng451e <838677410@qq.com>