Add scheduler instance_id and model_name to L0 KV lifecycle tracking #3043
ApostaC merged 2 commits into LMCache:dev from
Conversation
Code Review
This pull request introduces multi-instance and multi-model support for GPU KV cache block lifecycle tracking by adding instance_id and model_name attributes to observability events and metrics. Key changes include updating the shadow map in l0_lifecycle.py to use composite keys and modifying the BlockAllocationRecord to carry instance information. Feedback suggests that server.py should group records by instance_id before publishing events to ensure correct model attribution in mixed batches, and recommends using explicit type hints for BlockAllocationRecord instead of object to improve type safety and maintainability.
```diff
         # Look up model_name from the first record's instance_id.
         model_name = ""
         if records:
             meta = self.gpu_context_meta.get(records[0].instance_id)
             model_name = meta[0] if meta else ""
         self._event_bus.publish(
             Event(
                 event_type=EventType.MP_VLLM_BLOCK_ALLOCATION,
-                metadata={"records": records},
+                metadata={"model_name": model_name, "records": records},
             )
         )
```
Looking up model_name from only the first record in the batch is potentially incorrect. In a multi-instance or multi-model scenario, a single batch of records can contain entries from different instance_ids belonging to different models, which would lead to incorrect metric attribution. Also ensure that all relevant metadata, such as MemoryFormat (fmt), is retrieved and passed correctly during batched allocation so that multi-processing modes remain correct. Consider grouping the records by instance_id and publishing a separate event for each unique instance.
Suggested change:

```diff
-        # Look up model_name from the first record's instance_id.
-        model_name = ""
-        if records:
-            meta = self.gpu_context_meta.get(records[0].instance_id)
-            model_name = meta[0] if meta else ""
-        self._event_bus.publish(
-            Event(
-                event_type=EventType.MP_VLLM_BLOCK_ALLOCATION,
-                metadata={"model_name": model_name, "records": records},
-            )
-        )
+        by_instance = {}
+        for record in records:
+            by_instance.setdefault(record.instance_id, []).append(record)
+        for instance_id, inst_records in by_instance.items():
+            meta = self.gpu_context_meta.get(instance_id)
+            model_name = meta[0] if meta else ""
+            self._event_bus.publish(
+                Event(
+                    event_type=EventType.MP_VLLM_BLOCK_ALLOCATION,
+                    metadata={"model_name": model_name, "records": inst_records},
+                )
+            )
```
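The grouping pattern from the suggestion can be sketched as a self-contained snippet; `Record` here is a hypothetical stand-in for `BlockAllocationRecord`, since only `instance_id` matters for the grouping itself:

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical stand-in for BlockAllocationRecord; only instance_id
# is needed to demonstrate the per-instance grouping.
@dataclass
class Record:
    instance_id: int
    block_id: int


def group_by_instance(records):
    """Split a mixed batch of records into one list per instance_id."""
    by_instance = defaultdict(list)
    for record in records:
        by_instance[record.instance_id].append(record)
    return dict(by_instance)


# A batch mixing records from two instances stays correctly attributed.
batch = [Record(1, 10), Record(2, 20), Record(1, 11)]
groups = group_by_instance(batch)
```

Each group can then be published as its own event, so the `model_name` looked up per `instance_id` is guaranteed to match every record in that event.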
References
- When performing batched memory allocation, ensure the MemoryFormat (fmt) is retrieved from the cache and passed to the allocator for correctness in multi-processing modes.
```python
def _process_record(self, model_name: str, record: object, now: float) -> None:
    """Process a single BlockAllocationRecord."""
    req_id: str = record.req_id  # type: ignore[attr-defined]
    block_ids: list[int] = record.new_block_ids  # type: ignore[attr-defined]
    token_ids: list[int] = record.new_token_ids  # type: ignore[attr-defined]
    instance_id: int = record.instance_id  # type: ignore[attr-defined]
```
The record parameter is typed as object, which forces the use of type: ignore[attr-defined] when accessing its fields. Per the Repository Style Guide (line 24), all new/modified functions should have proper type hints. Please import BlockAllocationRecord from lmcache.v1.multiprocess.custom_types and use it as the type hint for the record parameter to improve maintainability and type safety.
References
- All new functions have type hints (arguments + return values)
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit c0f9e17.
```python
    is unhealthy the report is silently dropped.

    Args:
        instance_id: The GPU instance ID (scheduler/worker identity).
```
"GPU instance ID" -> "scheduler instance ID" (this is the scheduler's identity, not a GPU's).
- Adapter sends os.getpid() and self.model_name in report_block_allocations — no vLLM change needed
- Protocol: [int, str, list[BlockAllocationRecord]]
- Server passes instance_id and model_name to EventBus
- L0LifecycleSubscriber keys shadow map by (instance_id, block_id)
- Emit OTel histograms with instance_id and model_name attributes for per-instance, per-model Prometheus metric slicing
- Update EVENTS.md, METRICS.md, and observability.rst docs
- Add test verifying OTel attributes on histogram data points

Signed-off-by: yuwei <yuwei@dev.local>
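The commit message gives the new wire payload shape as `[int, str, list[BlockAllocationRecord]]`. A hedged sketch of building and unpacking that shape, with plain dicts standing in for records and no assumptions about LMCache's actual serialization layer:

```python
import os

# Sketch of the payload shape described in the commit message:
# [instance_id: int, model_name: str, records: list].
# The real serialization format is whatever LMCache's multiprocess
# layer uses; plain dicts stand in for BlockAllocationRecord here.


def build_payload(model_name, records):
    """Adapter side: send the current PID as the scheduler instance_id."""
    return [os.getpid(), model_name, records]


def unpack_payload(payload):
    """Server side: positional unpack of the three-element message."""
    instance_id, model_name, records = payload
    return instance_id, model_name, records


payload = build_payload("llama-3-8b", [{"req_id": "r1", "new_block_ids": [0, 1]}])
instance_id, model_name, records = unpack_payload(payload)
```

Because the list is positional, both ends must agree on the three-element layout; that is the version-skew risk the "Medium Risk" note below calls out.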

Note

Medium Risk: Changes the wire protocol and handler signature for `REPORT_BLOCK_ALLOCATION`, so mismatched client/server versions could break block-allocation reporting; metric attribute additions and shadow-map keying also affect lifecycle metric cardinality and behavior.

Overview

Adds `instance_id` and `model_name` propagation to `MP_VLLM_BLOCK_ALLOCATION` from the vLLM adapter through the multiprocess protocol/server event metadata, so L0 lifecycle tracking can distinguish blocks across multiple scheduler instances. Updates `L0LifecycleSubscriber` to key its shadow state by `(instance_id, block_id)` and to emit eviction/reuse histograms with `instance_id`/`model_name` OTel attributes for Prometheus slicing; tests and observability docs are updated accordingly, including a new assertion that histogram data points carry these attributes.

Reviewed by Cursor Bugbot for commit 2343215. Bugbot is set up for automated code reviews on this repo.
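The composite `(instance_id, block_id)` keying that the overview describes can be sketched with a minimal shadow map. This is a hypothetical simplification (the real `L0LifecycleSubscriber` tracks more state and emits OTel histograms); it only shows why the composite key prevents cross-instance collisions on the same block ID:

```python
# Minimal sketch of a shadow map keyed by (instance_id, block_id):
# record when a block was allocated, and compute its lifetime when freed.
# Assumed semantics; the real subscriber also records histograms with
# instance_id/model_name attributes.
class ShadowMap:
    def __init__(self):
        # (instance_id, block_id) -> allocation timestamp
        self._alloc_ts = {}

    def on_allocate(self, instance_id, block_id, now):
        self._alloc_ts[(instance_id, block_id)] = now

    def on_free(self, instance_id, block_id, now):
        """Return the block's lifetime, or None if it was never tracked."""
        ts = self._alloc_ts.pop((instance_id, block_id), None)
        return None if ts is None else now - ts


shadow = ShadowMap()
shadow.on_allocate(instance_id=1, block_id=42, now=100.0)
# Same block_id on a different scheduler instance: no collision.
shadow.on_allocate(instance_id=2, block_id=42, now=105.0)
lifetime = shadow.on_free(1, 42, now=110.0)
```

With a plain `block_id` key, the second allocation above would have overwritten the first instance's entry and skewed its eviction/reuse histograms.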