Type of Change
Bug fix
Problem Statement
In the current implementation of the SGLang backend, only the main worker is set up for KV events.
In addition, only one ZmqKvEventPublisher is created, and it can only receive KV events from DP rank 0.
As a result, routing a request with prefix caching always fails.
Proposed Solution
First, SGLang should change how it computes enable_kv_cache_events. In scheduler.py it is currently:

    self.enable_kv_cache_events = bool(
        server_args.kv_events_config and tp_rank == 0
    )

It should instead enable KV events on the first TP rank of each DP group:

    self.enable_kv_cache_events = bool(
        server_args.kv_events_config and tp_rank % (self.tp_size // self.dp_size) == 0
    )
Launch scripts:
node 0: --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://*:5557"}'
node 1: --kv-events-config '{"publisher":"zmq","topic":"kv-events","endpoint":"tcp://{node0_ip}:5557"}'
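To illustrate what the proposed rank condition changes, here is a minimal sketch that lists which TP ranks would publish KV events under it. The helper names (publishes_kv_events, publishing_ranks) are hypothetical and not part of SGLang. The sketch assumes tp_size is divisible by dp_size and leaves out the kv_events_config check, which is unchanged by the fix.

```python
def publishes_kv_events(tp_rank: int, tp_size: int, dp_size: int) -> bool:
    """Proposed condition: the first TP rank of each DP group publishes."""
    # Assumption: each DP group owns tp_size // dp_size consecutive TP ranks.
    ranks_per_dp_group = tp_size // dp_size
    return tp_rank % ranks_per_dp_group == 0


def publishing_ranks(tp_size: int, dp_size: int) -> list[int]:
    """All TP ranks that would enable KV cache events."""
    return [r for r in range(tp_size) if publishes_kv_events(r, tp_size, dp_size)]


if __name__ == "__main__":
    # With 8 TP ranks split into 2 DP groups, ranks 0 and 4 publish.
    print(publishing_ranks(tp_size=8, dp_size=2))  # [0, 4]
    # With a single DP group this reduces to the old `tp_rank == 0` check.
    print(publishing_ranks(tp_size=8, dp_size=1))  # [0]
```

With dp_size equal to 1 the condition matches the existing behavior, so the change should only affect deployments with more than one DP rank.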
Second, Dynamo should add an option for ZmqKvEventPublisher to either bind to or connect to an address. The SGLang main node should then build a ZmqKvEventPublisher for every DP rank (see the vLLM backend for reference).
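The two launch scripts differ only in the endpoint host: node 0 uses the ZMQ wildcard host, which is only valid when binding, while node 1 names node 0's address. The sketch below shows one hypothetical way to classify an endpoint as bind or connect from that difference. The function zmq_socket_mode is an illustration only, not an existing Dynamo or SGLang API, and the IP address in the usage lines is a stand-in for {node0_ip}. The proposal itself asks for an explicit option rather than inference from the endpoint.

```python
def zmq_socket_mode(endpoint: str) -> str:
    """Return "bind" for a wildcard-host endpoint, otherwise "connect".

    Hypothetical helper: a ZMQ TCP endpoint with host "*" can only be
    bound, so any other host is treated as a remote address to connect to.
    """
    _scheme, _sep, address = endpoint.partition("://")
    host, _colon, _port = address.rpartition(":")
    return "bind" if host == "*" else "connect"


if __name__ == "__main__":
    print(zmq_socket_mode("tcp://*:5557"))         # bind
    print(zmq_socket_mode("tcp://10.0.0.1:5557"))  # connect
```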
Estimated PR Size
S (11-50 lines)
Files/Components Affected
components/src/dynamo/sglang/publisher.py
components/src/dynamo/sglang/request_handlers/handler_base.py
lib/bindings/python/rust/llm/kv.rs
lib/llm/src/kv_router/publisher.rs
0001-kv-event-publisher.patch