
[refactor]: clean up the messy LMCacheManager #2683

Merged
sammshen merged 9 commits into LMCache:dev from sammshen:cleanup-manager
Mar 20, 2026

Conversation

@sammshen
Contributor

@sammshen sammshen commented Mar 4, 2026

Introduce a new abstraction, ServiceFactory, that every northbound integration can implement to get all of the benefits of the LMCacheManager.


Note

Medium Risk
Moderate refactor of service initialization/shutdown paths for both vLLM and standalone modes; behavioral differences in which components/health monitors are created per role could affect runtime lifecycle and lookup operations.

Overview
Refactors LMCacheManager to be integration-agnostic by introducing a BaseServiceFactory abstraction that owns component creation (engine/metadata, lookup client/server, offload server, internal API server, runtime plugins, and health monitor).

Adds VllmServiceFactory and StandaloneServiceFactory implementations and updates the vLLM adapter and standalone manager to construct the manager via these factories. Health-monitor setup (including Prometheus/periodic-thread metrics wiring) is centralized in BaseServiceFactory, and vLLM-specific helpers (MLA validation, speculative draft-layer calculation, DP-rank0 detection) are moved into integration/vllm/utils.py.

Updates the internal lookup API to infer role from lmcache_engine_metadata.role (instead of an adapter role field), and adjusts lookup recreate behavior to error when the relevant lookup component is absent. Tests are updated to mock factories/metadata accordingly.
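The factory-injection pattern described above can be sketched as follows. The class and method names follow the PR summary, but the bodies here are illustrative assumptions, not the actual implementation:

```python
# Illustrative sketch of factory injection: the manager asks an injected
# factory for its components instead of constructing them itself.
from typing import Any, Optional


class BaseServiceFactory:
    """Abstract interface the manager depends on (names per PR summary)."""

    def get_or_create_metadata(self) -> Optional[dict]:
        raise NotImplementedError

    def get_or_create_lmcache_engine(self) -> Optional[Any]:
        raise NotImplementedError


class LMCacheManager:
    """Integration-agnostic: knows only the factory interface."""

    def __init__(self, service_factory: BaseServiceFactory) -> None:
        self._metadata = service_factory.get_or_create_metadata()
        self._engine = service_factory.get_or_create_lmcache_engine()


class StubFactory(BaseServiceFactory):
    """Stands in for VllmServiceFactory / StandaloneServiceFactory."""

    def get_or_create_metadata(self) -> Optional[dict]:
        return {"role": "scheduler"}

    def get_or_create_lmcache_engine(self) -> Optional[Any]:
        return None  # e.g. a scheduler role that creates no engine


manager = LMCacheManager(StubFactory())
```

Each integration supplies its own factory, so integration-specific logic stays out of the manager entirely.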

Written by Cursor Bugbot for commit 01cb891.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly refactors the LMCacheManager by introducing a ServiceFactory abstraction. This new design pattern decouples the manager from the specifics of how its internal components are created and configured, allowing for cleaner integration with different serving engines like vLLM and a more streamlined standalone mode. The change improves modularity, maintainability, and extensibility by centralizing component creation logic within dedicated factory classes.

Highlights

  • New Abstraction: BaseServiceFactory: Introduced a new abstract class, BaseServiceFactory, which defines a standardized interface for creating LMCache service components. This allows different serving engine integrations (e.g., vLLM) to implement their own component creation logic.
  • LMCacheManager Refactor: The LMCacheManager has been significantly refactored to delegate the creation and management of its internal components (engine, lookup client/server, offload server, API server, plugin launcher, health monitor) to an injected ServiceFactory instance. This change enhances modularity and reduces the manager's direct responsibility for component instantiation.
  • vLLM Integration via VllmServiceFactory: A concrete implementation, VllmServiceFactory, was added to encapsulate all vLLM-specific logic for creating LMCache components. This includes determining metadata, MLA configuration, and role-based component instantiation (scheduler, worker, DP rank 0).
  • Standalone Integration via StandaloneServiceFactory: A new StandaloneServiceFactory was introduced to provide LMCache components for the standalone mode, simplifying the StandaloneLMCacheManager by removing vLLM-specific dependencies from its initialization.
  • Code Relocation and Decoupling: Logic related to MLA configuration validation (validate_mla_config) and draft layer calculation (calculate_draft_layers) was moved from LMCacheManager to lmcache/integration/vllm/utils.py, further decoupling the manager from vLLM-specific details.
  • Test Updates: Unit tests for LMCacheManager were updated to reflect the new ServiceFactory dependency, utilizing mock factories to ensure proper testing of component creation and lifecycle management.
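The mock-factory approach from the last bullet can be sketched with `unittest.mock`. The factory method names mirror the PR's `BaseServiceFactory`; how the real tests wire the mock into the manager is an assumption here:

```python
# Minimal sketch: a MagicMock stands in for a concrete service factory,
# letting a test control exactly which components "exist".
from unittest.mock import MagicMock

factory = MagicMock()
factory.get_or_create_metadata.return_value = {"role": "worker"}
factory.maybe_create_health_monitor.return_value = None

# A test would pass `factory` to the unit under test, then assert on
# both the returned components and which creation hooks were called.
metadata = factory.get_or_create_metadata()
monitor = factory.maybe_create_health_monitor(lmcache_manager=None)

factory.get_or_create_metadata.assert_called_once()
```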


Changelog
  • lmcache/integration/base_service_factory.py
    • Added BaseServiceFactory abstract class to define the interface for creating LMCache service components.
  • lmcache/integration/vllm/utils.py
    • Added utility functions validate_mla_config, calculate_draft_layers, and is_dp_rank0 for vLLM-specific configurations.
  • lmcache/integration/vllm/vllm_service_factory.py
    • Added VllmServiceFactory to encapsulate vLLM-specific logic for creating LMCache service components.
  • lmcache/integration/vllm/vllm_v1_adapter.py
    • Updated the LMCacheConnectorV1Impl to instantiate and use the new VllmServiceFactory for LMCacheManager initialization.
  • lmcache/v1/manager.py
    • Refactored LMCacheManager to delegate component creation to an injected BaseServiceFactory instance, removing direct component initialization logic.
  • lmcache/v1/standalone/manager.py
    • Updated StandaloneLMCacheManager to utilize the new StandaloneServiceFactory for component creation, simplifying its initialization.
  • lmcache/v1/standalone/standalone_service_factory.py
    • Added StandaloneServiceFactory to provide LMCache service components specifically for standalone mode.
  • tests/v1/test_manager.py
    • Updated unit tests for LMCacheManager to align with the new ServiceFactory abstraction and moved validation/calculation tests to vLLM utils.

@sammshen sammshen mentioned this pull request Mar 4, 2026
12 tasks
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a significant and valuable refactoring by abstracting the creation of LMCache service components into a ServiceFactory, successfully decoupling LMCacheManager from vLLM-specific implementation details for better maintainability. However, a critical security concern exists: the refactoring continues to instantiate an InternalAPIServer that lacks authentication and binds to all network interfaces by default. This poses a significant risk if enabled in production, exposing internal management interfaces. While the code is well-structured, addressing this security vulnerability is paramount, and I also have a couple of suggestions to further improve the new abstractions.

Comment thread lmcache/v1/manager.py Outdated
Comment on lines +107 to +109
self._api_server: Optional[InternalAPIServer] = (
service_factory.maybe_create_internal_api_server(lmcache_manager=self)
)
Contributor


Severity: medium (security)

The InternalAPIServer is instantiated here without any authentication or authorization mechanism. By default, this server binds to 0.0.0.0 (as defined in the configuration), which exposes internal management and monitoring APIs to the network. This could allow unauthorized users to access sensitive cache statistics or perform management operations if the server is enabled by an administrator. It is recommended to implement authentication for the API server and change the default binding to 127.0.0.1.
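The mitigation the review suggests can be sketched like this. The default-host constant, environment-variable name, and helper are hypothetical, not LMCache's actual configuration surface:

```python
# Hedged sketch of the suggested hardening: bind the internal API to
# loopback by default and require a shared token. All names here are
# illustrative assumptions.
import hmac
import os

DEFAULT_API_HOST = "127.0.0.1"  # instead of the all-interfaces 0.0.0.0


def is_authorized(request_token: str) -> bool:
    """Constant-time check of a request token against a server token."""
    expected = os.environ.get("LMCACHE_INTERNAL_API_TOKEN")
    if not expected:
        return False  # fail closed when no token is configured
    return hmac.compare_digest(request_token, expected)
```

`hmac.compare_digest` avoids timing side channels, and failing closed means an unconfigured server rejects all requests rather than accepting them.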

Contributor Author


ok

Comment on lines +29 to +69
class BaseServiceFactory:
"""Abstract base for creating LMCache service components.

Subclasses must implement all methods to provide the appropriate
components for their serving engine integration.
"""

def get_or_create_metadata(self) -> Optional["LMCacheMetadata"]:
raise NotImplementedError

def get_or_create_lmcache_engine(self) -> Optional["LMCacheEngine"]:
raise NotImplementedError

def maybe_create_lookup_client(self) -> Optional["LookupClientInterface"]:
raise NotImplementedError

def maybe_create_prometheus_logger(self) -> Optional["PrometheusLogger"]:
raise NotImplementedError

def maybe_create_lookup_server(
self,
) -> Optional[Union["LMCacheLookupServer", "LMCacheAsyncLookupServer"]]:
raise NotImplementedError

def maybe_create_offload_server(self) -> Optional["ZMQOffloadServer"]:
raise NotImplementedError

def maybe_create_runtime_plugin_launcher(
self,
) -> Optional["RuntimePluginLauncher"]:
raise NotImplementedError

def maybe_create_internal_api_server(
self, lmcache_manager: "LMCacheManager"
) -> Optional["InternalAPIServer"]:
raise NotImplementedError

def maybe_create_health_monitor(
self, lmcache_manager: "LMCacheManager"
) -> Optional["HealthMonitor"]:
raise NotImplementedError
Contributor


Severity: medium

To make the abstract nature of BaseServiceFactory more explicit and to enforce implementation of its methods in subclasses at compile/import time rather than at runtime, it's a good practice to use Python's abc module. This involves adding from abc import ABC, abstractmethod at the top of the file, inheriting from abc.ABC, and decorating abstract methods with @abstractmethod.

class BaseServiceFactory(ABC):
    """Abstract base for creating LMCache service components.

    Subclasses must implement all methods to provide the appropriate
    components for their serving engine integration.
    """

    @abstractmethod
    def get_or_create_metadata(self) -> Optional["LMCacheMetadata"]:
        raise NotImplementedError

    @abstractmethod
    def get_or_create_lmcache_engine(self) -> Optional["LMCacheEngine"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_lookup_client(self) -> Optional["LookupClientInterface"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_prometheus_logger(self) -> Optional["PrometheusLogger"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_lookup_server(
        self,
    ) -> Optional[Union["LMCacheLookupServer", "LMCacheAsyncLookupServer"]]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_offload_server(self) -> Optional["ZMQOffloadServer"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_runtime_plugin_launcher(
        self,
    ) -> Optional["RuntimePluginLauncher"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_internal_api_server(
        self, lmcache_manager: "LMCacheManager"
    ) -> Optional["InternalAPIServer"]:
        raise NotImplementedError

    @abstractmethod
    def maybe_create_health_monitor(
        self, lmcache_manager: "LMCacheManager"
    ) -> Optional["HealthMonitor"]:
        raise NotImplementedError

Contributor Author


ok

logger = init_logger(__name__)


class VllmServiceFactory(BaseServiceFactory):
Contributor


Severity: medium

The VllmServiceFactory class inherits from BaseServiceFactory but does not implement the maybe_create_prometheus_logger method. While this doesn't cause an error currently because BaseServiceFactory doesn't use abc.ABC, it's good practice to implement all methods from the base abstract class for completeness and to avoid potential runtime errors if the base class changes. Since PrometheusLogger is created on-demand in other methods, you can implement this method to simply return None.

    def maybe_create_prometheus_logger(self) -> Optional["PrometheusLogger"]:
        # PrometheusLogger is created on-demand within other components
        # (e.g., engine creation for scheduler, health monitor setup).
        return None

Contributor Author


ok

@sammshen sammshen requested a review from maobaolong March 4, 2026 08:40
@sammshen
Contributor Author

sammshen commented Mar 6, 2026

@maobaolong would you like to take a look? This is part of what we discussed previously.

Collaborator

@maobaolong maobaolong left a comment


I'm not sure the instance_id values match; I found there are several ways to set it:

  • ENGINE_NAME = "vllm-instance"
  • ENGINE_NAME = "LMCacheEngine"
  • _config.lmcache_instance_id

Could you help double-check? @sammshen

Samuel Shen and others added 5 commits March 17, 2026 04:30
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Samuel Shen <slshen@tensormesh.ai>
@sammshen
Contributor Author

@maobaolong @chunxiaozheng fixed! PTAL

Collaborator

@maobaolong maobaolong left a comment


@sammshen LGTM, thanks for this big refactor.

@sammshen
Contributor Author

sammshen commented Mar 17, 2026

@codex review

@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@sammshen
Contributor Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6639a3f547


Comment thread lmcache/v1/manager.py Outdated
Comment on lines +83 to +87
try:
self._init_components()
self._lmcache_engine_metadata: Optional[LMCacheMetadata] = (
service_factory.get_or_create_metadata()
)
self._lmcache_engine: Optional[LMCacheEngine] = (


P1: Initialize component fields before factory creation

All component attributes are now assigned only inside the try block, so if any early factory call fails (for example get_or_create_metadata), later methods like post_init() and stop_services() can dereference missing attributes (self._lmcache_engine, self._offload_server, etc.) and raise AttributeError instead of staying in the intended degraded mode. This turns a recoverable init failure into follow-on crashes during lifecycle handling.

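The fix Codex describes can be sketched as follows: assign every component field a safe default before the factory calls, so lifecycle methods see `None` rather than a missing attribute after a partial init. Names here are illustrative:

```python
# Sketch: pre-initialize component fields so a failed factory call
# leaves the manager in a degraded-but-consistent state.
from typing import Any, Optional


class Manager:
    def __init__(self, factory: Any) -> None:
        # Defaults first: later lifecycle methods can rely on these
        # attributes existing even if __init__ fails partway through.
        self._lmcache_engine: Optional[Any] = None
        self._offload_server: Optional[Any] = None
        try:
            self._lmcache_engine = factory.get_or_create_lmcache_engine()
            self._offload_server = factory.maybe_create_offload_server()
        except Exception:
            pass  # degraded mode: fields exist and are None

    def stop_services(self) -> None:
        # No AttributeError even after a failed __init__.
        if self._offload_server is not None:
            self._offload_server.close()


class FailingFactory:
    def get_or_create_lmcache_engine(self) -> None:
        raise RuntimeError("init failed")

    def maybe_create_offload_server(self) -> None:
        return None


m = Manager(FailingFactory())
m.stop_services()  # safe: attributes were pre-initialized
```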

Comment on lines +140 to +142
local_worker_id, local_world_size = calculate_local_rank_and_world_size(
self.vllm_config
)


P2: Avoid GPU probing for scheduler-only metadata path

get_or_create_metadata() now unconditionally calls calculate_local_rank_and_world_size, which relies on GPU/XPU platform detection and can raise when the scheduler process is CPU-only. In the scheduler-without-bypass flow, no engine is created, so forcing device probing here can fail LMCache initialization in control-plane scheduler deployments that previously only needed metadata.

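One possible guard for this scheduler-only path is sketched below. The real `calculate_local_rank_and_world_size` probes GPU/XPU platforms; it is stubbed here to simulate a CPU-only host, and the guard logic is a hypothetical fix, not the merged one:

```python
# Hypothetical guard: skip device probing on the metadata-only
# scheduler path, where no engine (and no GPU) is needed.
from typing import Tuple


def calculate_local_rank_and_world_size() -> Tuple[int, int]:
    # Stand-in for the real device-probing helper on a CPU-only host.
    raise RuntimeError("no GPU available")


def metadata_ranks(role: str, creates_engine: bool) -> Tuple[int, int]:
    if role == "scheduler" and not creates_engine:
        # Control-plane process: metadata only, no device probing.
        return 0, 1
    return calculate_local_rank_and_world_size()
```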

Signed-off-by: Samuel Shen <slshen@tensormesh.ai>
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@sammshen
Contributor Author

@cursor review

Comment thread lmcache/v1/manager.py
Comment thread lmcache/v1/standalone/standalone_service_factory.py Outdated
Signed-off-by: Samuel Shen <slshen@tensormesh.ai>
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

Comment thread lmcache/v1/internal_api_server/vllm/lookup_api.py Outdated
Signed-off-by: Samuel Shen <slshen@tensormesh.ai>
@sammshen sammshen enabled auto-merge (squash) March 19, 2026 04:44
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Mar 19, 2026
Comment thread lmcache/v1/manager.py
return {"error": "only supported for scheduler role"}

if self._lookup_client is None:
return {"error": "lookup client not available"}


Recreate lookup methods fail after partial failure

Medium Severity

recreate_lookup_client now guards on self._lookup_client is None instead of the previous role-based check. Since close_lookup_client sets self._lookup_client = None before create_lookup_client runs, if the create step fails, the client stays None and all subsequent recreate_lookup_client calls return an error, preventing recovery. The same issue affects recreate_lookup_server. The old role-based guard always allowed retrying regardless of current state.

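A retry-friendly recreate, per this issue, gates on the role (a stable property) rather than on the possibly-None client, so a failed create can be retried. This sketch uses illustrative names, not the actual manager code:

```python
# Sketch: role-based guard keeps recreate retryable even after a
# partial failure left the client as None.
from typing import Callable, Optional


class Manager:
    def __init__(self, role: str, make_client: Callable[[], object]) -> None:
        self.role = role
        self._make_client = make_client
        self._lookup_client: Optional[object] = None

    def recreate_lookup_client(self) -> dict:
        if self.role != "scheduler":
            return {"error": "only supported for scheduler role"}
        self._lookup_client = None  # close/clear any old client
        try:
            self._lookup_client = self._make_client()
        except Exception as e:
            return {"error": str(e)}  # client stays None; retry allowed
        return {"status": "ok"}


attempts = {"n": 0}


def flaky_client() -> object:
    # Fails once, then succeeds, to exercise the retry path.
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("transient failure")
    return object()


m = Manager("scheduler", flaky_client)
first = m.recreate_lookup_client()   # create fails, client stays None
second = m.recreate_lookup_client()  # retry succeeds
```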


@abstractmethod
def maybe_create_prometheus_logger(self) -> Optional["PrometheusLogger"]:
raise NotImplementedError


Abstract method maybe_create_prometheus_logger is never called

Low Severity

maybe_create_prometheus_logger is declared as an @abstractmethod in BaseServiceFactory and implemented in both VllmServiceFactory and StandaloneServiceFactory, but LMCacheManager never calls it. This forces every new BaseServiceFactory subclass to implement a method that is never used, adding unnecessary interface burden.



@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.

There are 5 total unresolved issues (including 2 from previous reviews).


def maybe_create_health_monitor(
self, lmcache_manager: "LMCacheManager"
) -> Optional[HealthMonitor]:
raise NotImplementedError


Public abstract methods missing docstrings in BaseServiceFactory

Medium Severity

Multiple new public abstract methods in BaseServiceFactory lack docstrings: get_or_create_metadata, get_or_create_lmcache_engine, maybe_create_lookup_client, maybe_create_prometheus_logger, maybe_create_lookup_server, maybe_create_offload_server, maybe_create_runtime_plugin_launcher, maybe_create_internal_api_server, and maybe_create_health_monitor. Per project conventions, all new public functions require docstrings. These are the core contract methods for the new abstraction.


Triggered by project rule: LMCache Code Review Style Guide



def is_dp_rank0(vllm_config: "VllmConfig") -> bool:
return vllm_config.parallel_config.data_parallel_rank_local == 0


New public function is_dp_rank0 missing docstring

Low Severity

The new public function is_dp_rank0 lacks a docstring. Per project conventions, all new public functions require docstrings describing what the function does, its arguments, and its return value.


Triggered by project rule: LMCache Code Review Style Guide
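A docstring along the lines the style guide asks for might look like this; the wording is a suggestion, and the config argument is duck-typed here rather than vLLM's actual `VllmConfig` type:

```python
# Suggested docstring for the flagged helper (wording is illustrative).
def is_dp_rank0(vllm_config) -> bool:
    """Return True if this process is local data-parallel rank 0.

    Args:
        vllm_config: vLLM configuration object exposing
            ``parallel_config.data_parallel_rank_local``.

    Returns:
        True when the local data-parallel rank is 0, else False.
    """
    return vllm_config.parallel_config.data_parallel_rank_local == 0
```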

return None

def maybe_create_runtime_plugin_launcher(self):
return None


StandaloneServiceFactory overrides lack return type hints

Low Severity

Several overriding methods in StandaloneServiceFactory (maybe_create_lookup_client, maybe_create_prometheus_logger, maybe_create_lookup_server, maybe_create_offload_server, maybe_create_runtime_plugin_launcher) are missing return type annotations. Per project conventions, all new functions must have type hints for arguments and return values.


Triggered by project rule: LMCache Code Review Style Guide

@sammshen sammshen merged commit f97cdfe into LMCache:dev Mar 20, 2026
27 checks passed
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
* new service factory

Signed-off-by: Samuel Shen <slshen@uchciago.edu>

* new service factory

Signed-off-by: Samuel Shen <slshen@uchciago.edu>

* remove all roles

Signed-off-by: Samuel Shen <slshen@uchciago.edu>

* fix UT

Signed-off-by: Samuel Shen <slshen@uchciago.edu>

* add back vllm instance

Signed-off-by: Samuel Shen <slshen@tensormesh.ai>

* address codex comments

Signed-off-by: Samuel Shen <slshen@tensormesh.ai>

* address cursor comments

Signed-off-by: Samuel Shen <slshen@tensormesh.ai>

* fix role in lookup api

Signed-off-by: Samuel Shen <slshen@tensormesh.ai>

---------

Signed-off-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Samuel Shen <slshen@tensormesh.ai>
Co-authored-by: Samuel Shen <slshen@uchciago.edu>
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 21, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 25, 2026
deng451e pushed a commit to deng451e/LMCache that referenced this pull request Mar 27, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
Labels

full Run comprehensive tests on this PR
