[Core][Hybrid allocator + kv connector 1/n] Enable hybrid allocator + KV cache connector#25712
Conversation
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
There was a problem hiding this comment.
Code Review
This pull request successfully enables the use of the hybrid allocator with the KV cache connector by removing the explicit restriction and adding the necessary logic to handle multiple KV cache groups. The changes are well-structured, introducing a SupportsHMA interface to check for compatibility. My review focuses on improving code quality and performance. I've identified an opportunity to refactor duplicated code for better maintainability and two instances where an expensive deepcopy operation can be replaced with a more efficient shallow copy, which should improve initialization performance.
|
@NickLucche @njhill This is the refactored version of #25363 , PTAL |
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
|
This pull request has merge conflicts that must be resolved before it can be |
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Follow on from vllm-project#25712 `VllmConfig` is explicitly designed as a dataclass containing user-provided configuration and model metadata. It is a global configuration object that lives throughout the entire engine lifetime and is meant to be immutable after `__post_init__()`. `KVCacheConfig` is worker-specific, runtime-computed state. It has limited lifetime, and its purpose is limited to initializing the KV Cache in the model runner. Even if we add KV cache hints to model config.json in future, this would be parsed into `ModelConfig`, used as input to the `get_kv_cache_configs()` computation, and the resulting `KVCacheConfig` would still be runtime state. We are currently creating per-worker copies of VllmConfig in order to attach the runtime `KVCacheConfig` state. But instead we should just explicitly pass `KVCacheConfig` to the connector. Make sure to handle backwards compatibility for external connector implementations (loaded via module path) that have the old style constructor signature. Signed-off-by: Mark McLoughlin <markmc@redhat.com>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
…atures without `KVCacheConfig` (#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See #25712 and #27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
… KV cache connector (vllm-project#25712) Signed-off-by: KuntaiDu <kuntai@uchicago.edu> Signed-off-by: Kuntai Du <kuntai@uchicago.edu>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
…atures without `KVCacheConfig` (vllm-project#39832) The v0.12.0 release contained initial support for HMA in KV Connectors. As part of these changes, a KVCacheConfig argument was added to KV connector constructors. Backwards compatibility support for out-of-tree connectors was included in this change, with a very prominent warning. See vllm-project#25712 and vllm-project#27887. Since the warning has been around for over 5 months, we can safely remove the support of it. Signed-off-by: yewentao256 <zhyanwentao@126.com>
Purpose
Refactor of #25363 . This PR enables the combination of hybrid allocator + KV cache connector in a backward-compatible way.
Test Script
Test Result
Success.
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.