[CONTRIBUTION]: For SGLang, with multi-nodes and multiple data parallels, kv events gather is insufficient #5097

@huitianbai

Description

Type of Change

Bug fix

Problem Statement

In the current implementation of the SGLang backend, only the main worker is set up for KV events.
Only one ZmqKvEventPublisher is created, and it can only receive DP rank 0's KV events.
As a result, routing a request based on prefix caching always fails.

Proposed Solution

First, SGLang should change how it decides enable_kv_cache_events. Currently, in "scheduler.py", it is:
"
self.enable_kv_cache_events = bool(
    server_args.kv_events_config and tp_rank == 0
)
"

The fix would look like:
"
self.enable_kv_cache_events = bool(
    server_args.kv_events_config and tp_rank % (self.tp_size // self.dp_size) == 0
)
"

With this condition, the first TP rank of every DP group enables KV events, not only global TP rank 0.

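To make the effect of the two conditions concrete, here is a small standalone sketch that lists which TP ranks would enable KV events under each one. It is not SGLang code: the function names are made up for illustration, kv_events_config is assumed to be set, and tp_size=8 with dp_size=4 is an arbitrary layout in which tp_size is divisible by dp_size.

```python
def enabled_before(tp_rank: int) -> bool:
    # Current condition: only global TP rank 0 enables KV events.
    return tp_rank == 0


def enabled_after(tp_rank: int, tp_size: int, dp_size: int) -> bool:
    # Proposed condition: the first TP rank of each DP group enables KV events.
    # Assumes tp_size is a multiple of dp_size.
    ranks_per_dp_group = tp_size // dp_size
    return tp_rank % ranks_per_dp_group == 0


# Illustrative layout only: 8 TP ranks split into 4 DP groups of 2 ranks each.
tp_size, dp_size = 8, 4
before = [r for r in range(tp_size) if enabled_before(r)]
after = [r for r in range(tp_size) if enabled_after(r, tp_size, dp_size)]
print("before:", before)
print("after:", after)
```

In this layout the current condition selects a single rank, while the proposed one selects one rank per DP group, which is what the router needs in order to see KV events from every DP rank.
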
Launch scripts:
node 0: --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5557"}'
node 1: --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://{node0_ip}:5557"}'

Dynamo should add an option for ZmqKvEventPublisher to either bind to or connect to an address. The SGLang main node can then create a ZmqKvEventPublisher for every DP rank (see the vLLM backend for reference).
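A rough sketch of what such a bind-or-connect option could look like is shown below. It is not the Dynamo API: SocketMode, KvEventSource, attach, and RecordingSocket are hypothetical names, and the endpoints and port numbers are illustrative. The socket is duck-typed so the example runs without pyzmq; in practice it would be a ZMQ socket, which exposes bind() and connect() methods.

```python
from dataclasses import dataclass
from enum import Enum


class SocketMode(Enum):
    BIND = "bind"        # listen on the endpoint; peers connect to us
    CONNECT = "connect"  # dial an endpoint that a peer has already bound


@dataclass(frozen=True)
class KvEventSource:
    """Hypothetical per-DP-rank listener config (not an existing Dynamo type)."""
    dp_rank: int
    endpoint: str
    mode: SocketMode


def attach(sock, source: KvEventSource) -> None:
    """Attach a socket-like object to its endpoint according to the mode."""
    if source.mode is SocketMode.BIND:
        sock.bind(source.endpoint)
    else:
        sock.connect(source.endpoint)


class RecordingSocket:
    """Stand-in for a real socket so the sketch runs without pyzmq."""

    def __init__(self) -> None:
        self.calls = []

    def bind(self, endpoint: str) -> None:
        self.calls.append(("bind", endpoint))

    def connect(self, endpoint: str) -> None:
        self.calls.append(("connect", endpoint))


# Illustrative values only: one listener per DP rank, each with its own mode.
sources = [
    KvEventSource(dp_rank=0, endpoint="tcp://127.0.0.1:5557", mode=SocketMode.CONNECT),
    KvEventSource(dp_rank=1, endpoint="tcp://*:5558", mode=SocketMode.BIND),
]
sock = RecordingSocket()
for source in sources:
    attach(sock, source)
print(sock.calls)
```

Keeping the mode as an explicit per-rank setting, rather than inferring it from the endpoint string, lets the main node choose independently for local and remote DP ranks.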

Estimated PR Size

S (11-50 lines)

Files/Components Affected

components/src/dynamo/sglang/publisher.py
components/src/dynamo/sglang/request_handlers/handler_base.py
lib/bindings/python/rust/llm/kv.rs
lib/llm/src/kv_router/publisher.rs

0001-kv-event-publisher.patch

Metadata

Labels

contribution-request: External contributor proposing to implement a change
