Skip to content

[CLI] Implement lmcache ping subcommand#2859

Merged
maobaolong merged 2 commits intoLMCache:devfrom
Oasis-Git:ter
Mar 24, 2026
Merged

[CLI] Implement lmcache ping subcommand#2859
maobaolong merged 2 commits intoLMCache:devfrom
Oasis-Git:ter

Conversation

@Oasis-Git
Copy link
Copy Markdown
Member

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new ping subcommand to the lmcache CLI, enabling users to perform liveness checks on both the LMCache and vLLM servers. This enhancement provides a standardized and efficient way to verify the operational status and responsiveness of these critical components, offering immediate feedback on their health through HTTP-based checks and round-trip time measurements. The implementation includes detailed error reporting and flexible URL configuration, significantly improving diagnostic capabilities for the system.

Highlights

  • New lmcache ping subcommand: A new CLI subcommand lmcache ping has been implemented to perform pure liveness checks for both the LMCache server (kvcache) and the vLLM server (engine).
  • HTTP-based health checks: The ping command utilizes HTTP GET requests to specific health endpoints (/api/healthcheck for kvcache and /health for engine) to determine server status.
  • Round-trip time measurement: The command measures and reports the network round-trip time, excluding local Python overhead, providing a clear indication of responsiveness.
  • Default URL configuration: Default URLs are provided for both kvcache (http://localhost:8080) and engine (http://localhost:8000), simplifying usage.
  • Comprehensive error handling: The command includes robust error handling for various HTTP statuses and connection issues, reporting 'OK' or 'FAIL' with detailed messages and appropriate exit codes.
  • Documentation and testing: A new design document for lmcache ping has been added, and extensive unit tests cover the ping helper and PingCommand's end-to-end functionality, including default URL resolution and error scenarios.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Oasis-Git Oasis-Git added the full Run comprehensive tests on this PR label Mar 24, 2026
@Oasis-Git Oasis-Git requested a review from KuntaiDu March 24, 2026 03:18
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the lmcache ping subcommand, complete with a design document, implementation, and corresponding tests. The overall implementation is well-structured and aligns with the provided design. My review focuses on a few areas for improvement related to code modularity, style guide adherence, and test robustness. Specifically, I've suggested moving a shared utility function to a dedicated module to improve architecture, refactoring a helper function to enhance code clarity, and strengthening tests to more thoroughly validate behavior. I've also pointed out a minor style guide violation regarding import placement.


# First Party
from lmcache.cli.commands.base import BaseCommand
from lmcache.cli.commands.describe import normalize_url
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To improve modularity and avoid coupling between CLI command modules, the normalize_url function should be moved from describe.py to a shared utility file (e.g., lmcache/cli/utils.py). The design document docs/design/cli/ping.md also suggests this as a preferred option. This would make normalize_url reusable by other commands in the future without creating dependencies between them.

References
  1. The style guide emphasizes architectural soundness and modularity. Creating a dependency between two command modules (ping and describe) for a shared utility function goes against this principle. (link)

Comment on lines +52 to +67
start = time.monotonic()
try:
with urllib.request.urlopen(
urllib.request.Request(url), timeout=timeout
) as resp:
rtt_ms = (time.monotonic() - start) * 1000
if resp.status == 200:
return ("OK", rtt_ms, None)
return ("FAIL", rtt_ms, f"HTTP {resp.status}")
except urllib.error.HTTPError as exc:
rtt_ms = (time.monotonic() - start) * 1000
return ("FAIL", rtt_ms, f"HTTP {exc.code}: {exc.reason}")
except (urllib.error.URLError, OSError) as exc:
rtt_ms = (time.monotonic() - start) * 1000
reason = getattr(exc, "reason", str(exc))
return ("FAIL", rtt_ms, f"Cannot connect to {url}: {reason}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The calculation of rtt_ms is repeated in the try block and in each except block. This can be refactored to remove code duplication and improve readability by calculating the round-trip time once after the try...except block.

Suggested change
start = time.monotonic()
try:
with urllib.request.urlopen(
urllib.request.Request(url), timeout=timeout
) as resp:
rtt_ms = (time.monotonic() - start) * 1000
if resp.status == 200:
return ("OK", rtt_ms, None)
return ("FAIL", rtt_ms, f"HTTP {resp.status}")
except urllib.error.HTTPError as exc:
rtt_ms = (time.monotonic() - start) * 1000
return ("FAIL", rtt_ms, f"HTTP {exc.code}: {exc.reason}")
except (urllib.error.URLError, OSError) as exc:
rtt_ms = (time.monotonic() - start) * 1000
reason = getattr(exc, "reason", str(exc))
return ("FAIL", rtt_ms, f"Cannot connect to {url}: {reason}")
start = time.monotonic()
status, error = "OK", None
try:
with urllib.request.urlopen(
urllib.request.Request(url), timeout=timeout
) as resp:
if resp.status != 200:
status, error = "FAIL", f"HTTP {resp.status}"
except urllib.error.HTTPError as exc:
status, error = "FAIL", f"HTTP {exc.code}: {exc.reason}"
except (urllib.error.URLError, OSError) as exc:
reason = getattr(exc, "reason", str(exc))
status, error = "FAIL", f"Cannot connect to {url}: {reason}"
rtt_ms = (time.monotonic() - start) * 1000
return status, rtt_ms, error
References
  1. The style guide emphasizes code cleanliness. Refactoring to remove duplicated logic improves maintainability and readability. (link)

Comment thread tests/cli/test_ping.py
Comment on lines +108 to +109
# Standard
from unittest.mock import patch
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The import from unittest.mock import patch is located inside the test_ok method. According to the repository style guide, imports should be grouped at the top of the file. Please move this import to the Standard import section at the top of the file. This also applies to the same import in TestPingCommandEngine.test_ok on line 170.

References
  1. According to the repository style guide, imports should be grouped at the top of the file in the order: Standard, Third Party, First Party, Local. (link)

Comment thread tests/cli/test_ping.py
Comment on lines +219 to +220
with pytest.raises(SystemExit):
cmd.execute(FakeArgs())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test correctly verifies that SystemExit is raised, but it doesn't confirm that the command attempted to connect to the correct default URL. The test can be strengthened by capturing stderr and asserting that the error message contains the expected default URL (http://localhost:8080/api/healthcheck). You'll need to ensure from unittest.mock import patch and import io are at the top of the file.

Suggested change
with pytest.raises(SystemExit):
cmd.execute(FakeArgs())
with pytest.raises(SystemExit), patch(
"sys.stderr", new_callable=io.StringIO
) as mock_stderr:
cmd.execute(FakeArgs())
assert "http://localhost:8080/api/healthcheck" in mock_stderr.getvalue()
References
  1. The style guide states that tests should verify the public interface and docstring contract. The current test does not fully verify the contract that a specific default URL is used. (link)

Comment thread tests/cli/test_ping.py
Comment on lines +232 to +233
with pytest.raises(SystemExit):
cmd.execute(FakeArgs())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to test_kvcache_default, this test should be strengthened to verify that the correct default URL for the engine (http://localhost:8000/health) was used. This can be done by capturing stderr and checking the contents of the error message. You'll need to ensure from unittest.mock import patch and import io are at the top of the file.

Suggested change
with pytest.raises(SystemExit):
cmd.execute(FakeArgs())
with pytest.raises(SystemExit), patch(
"sys.stderr", new_callable=io.StringIO
) as mock_stderr:
cmd.execute(FakeArgs())
assert "http://localhost:8000/health" in mock_stderr.getvalue()
References
  1. The style guide states that tests should verify the public interface and docstring contract. The current test does not fully verify the contract that a specific default URL is used. (link)

Copy link
Copy Markdown
Contributor

@KuntaiDu KuntaiDu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Collaborator

@maobaolong maobaolong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maobaolong maobaolong enabled auto-merge (squash) March 24, 2026 07:56
@maobaolong maobaolong merged commit ad15a3b into LMCache:dev Mar 24, 2026
34 of 35 checks passed
@Oasis-Git Oasis-Git deleted the ter branch March 24, 2026 17:54
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 26, 2026
fix

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Co-authored-by: Kuntai Du <kuntai@uchicago.edu>
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026
fix

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Co-authored-by: Kuntai Du <kuntai@uchicago.edu>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
fix

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Co-authored-by: Kuntai Du <kuntai@uchicago.edu>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
fix

Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
Co-authored-by: Kuntai Du <kuntai@uchicago.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

full Run comprehensive tests on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants