Modify README to include info on loading LLaMA #18

Merged
zhuohan123 merged 1 commit into main from llama-readme
Mar 31, 2023

Conversation

@zhuohan123
Member

No description provided.

@zhuohan123 zhuohan123 requested a review from WoosukKwon March 31, 2023 17:04
@zhuohan123 zhuohan123 merged commit e3f00d1 into main Mar 31, 2023
@WoosukKwon WoosukKwon deleted the llama-readme branch March 31, 2023 17:29
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
luo-cheng2021 pushed a commit to luo-cheng2021/vllm that referenced this pull request Apr 1, 2024
vLLM with OpenVINO usage instruction
z103cb referenced this pull request in z103cb/opendatahub_vllm Apr 22, 2024
This PR implements a subset of the metrics from the TGIS image. I tried
to make sure that everything from our current ops dashboard is
supported. These are:

- tgi_tokenize_request_tokens 
- tgi_tokenize_request_input_count 
- tgi_request_input_count 
- tgi_request_failure 
- tgi_request_queue_duration 
- tgi_queue_size 
- tgi_batch_current_size 
- tgi_batch_inference_duration 
- tgi_request_input_length 
- tgi_request_generated_tokens

---------

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
fxmarty pushed a commit to fxmarty/vllm-public that referenced this pull request May 31, 2024
make the vllm setup mode configurable and make install mode as defaul…
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
@alixiaodi alixiaodi mentioned this pull request Aug 2, 2024
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 2, 2025
* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 3, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updaed

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* clean up justfile, examples

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* More cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* more cleanup, precommit fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* More cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* run_accuracy_test.sh UX

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* squash warnings

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* pre-commit

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Add get_finished to base kv connector

Signed-off-by: mgoin <mgoin64@gmail.com>

* revert test.txt

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* review comments

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 4, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updaed

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatrd

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updatted

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* revert

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* more spurious changes

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>

* Update vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py

Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
robertgshaw2-redhat referenced this pull request in robertgshaw2-redhat/vllm May 6, 2025
* [Update] LMcache connector v1 implementation

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [Add] examples for disaggregated prefill

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* [add] extra information about evns

Signed-off-by: ApostaC <yihua98@uchicago.edu>

* Initial stubs for P/D scheduling changes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Updates

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Rs branch (#3)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Rs branch (#5)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Remove Unneeded Arguments (#7)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Improve disagg-example.sh (#8)

- fix spelling
- CUDA_VISIBLE_DEVICES should be set externally

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added connector

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* update

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* remove

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* seems to load properly

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Revert "updated"

This reverts commit 97316d9.

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* added

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* diffs for local dev on macos

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updaed

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Checkpoint.

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Cleanup

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated on scheduler side

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Hacking away

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* cleanup

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* ensure request removed from running list

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Runs E2E. Garbage output. Crashes on 2nd request

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* rename files

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* updated

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* update

Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>

* Second request no longer crashes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Remove gpu_model_runner hacks

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Clean up Justfile

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* [Bugfix] Stale finished requests in EMPTY_MODEL_RUNNER_OUTPUT

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* justfile edits

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Update

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes - lm_eval gsm8k has correctness

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* "just delete the assert"

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* fixup precommit issues

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* updated (#12)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Add Accuracy Test (#13)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Preemption Bugfixes (#15)

* stash fixed double free issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fixed issue

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated (#16)

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Fix Bad Merge | Fix Memory Leak in Upstream (#18)

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* fix merge

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

---------

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* cleanup code

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* stash

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* revert

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* more spurious changes

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* updated

Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>

* Support MLA in NIXL connector

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* WIP adding tests

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* wip

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

* Fixes

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

---------

Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ApostaC <yihua98@uchicago.edu>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: rshaw@neuralmagic.com <robertgshaw2@gmail.com>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
zyongye pushed a commit to zyongye/vllm that referenced this pull request Aug 5, 2025
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
zyongye pushed a commit to zyongye/vllm that referenced this pull request Aug 6, 2025
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
inkcherry pushed a commit to inkcherry/vllm that referenced this pull request Nov 6, 2025
yma11 pushed a commit to yma11/vllm that referenced this pull request Nov 14, 2025
Signed-off-by: chaojun-zhang <chaojun.zhang@intel.com>
tjtanaa pushed a commit to tjtanaa/vllm that referenced this pull request Jan 29, 2026
…ut-request-input-processor-patch

[Inputs, Engine] Add Omni model components and input processing for hidden states support
yuezhu1 pushed a commit to yuezhu1/vllm that referenced this pull request Mar 30, 2026
…-project#7)

Adds five new fields to LoRAConfig in vllm/config/lora.py to support
runtime dynamic resizing of GPU LoRA adapter slots:

  - min_loras (int, ge=1): floor for dynamic slot shrinking
  - dynamic_lora_slots (bool): enables automatic watermark-driven scaling
  - lora_mem_high_watermark (float, 0<x<1): scale-down threshold
  - lora_mem_low_watermark (float, 0<x<1): scale-up threshold
  - lora_slot_resize_cooldown_s (float, ge=0): anti-thrash cooldown

Cross-field validation added to _validate_lora_config():
  - min_loras <= max_loras (when dynamic_lora_slots=True)
  - lora_mem_low_watermark < lora_mem_high_watermark (when dynamic=True)

Field-level bounds (ge/gt/lt) enforced by Pydantic at construction time.
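
The validation rules described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: it swaps Pydantic for a standard-library dataclass so it runs without dependencies, the class name `DynamicLoRAConfigSketch` is hypothetical, and the default values for the watermarks and cooldown are placeholders, since the commit message does not state the actual defaults.

```python
from dataclasses import dataclass


@dataclass
class DynamicLoRAConfigSketch:
    """Illustrative stand-in for the new LoRAConfig fields (not vLLM code)."""

    max_loras: int = 1
    min_loras: int = 1
    dynamic_lora_slots: bool = False
    # Placeholder defaults; the real values are not given in the commit message.
    lora_mem_high_watermark: float = 0.9
    lora_mem_low_watermark: float = 0.7
    lora_slot_resize_cooldown_s: float = 0.0

    def __post_init__(self) -> None:
        # Field-level bounds (expressed as ge/gt/lt constraints in Pydantic).
        if self.min_loras < 1:
            raise ValueError("min_loras must be >= 1")
        for name in ("lora_mem_high_watermark", "lora_mem_low_watermark"):
            value = getattr(self, name)
            if not 0.0 < value < 1.0:
                raise ValueError(f"{name} must be in (0, 1), got {value}")
        if self.lora_slot_resize_cooldown_s < 0:
            raise ValueError("lora_slot_resize_cooldown_s must be >= 0")

        # Cross-field checks apply only when dynamic resizing is enabled.
        if self.dynamic_lora_slots:
            if self.min_loras > self.max_loras:
                raise ValueError("min_loras must be <= max_loras")
            if self.lora_mem_low_watermark >= self.lora_mem_high_watermark:
                raise ValueError("low watermark must be < high watermark")
```

In this sketch, `DynamicLoRAConfigSketch(max_loras=2, min_loras=4)` constructs without error because the cross-field checks are skipped while `dynamic_lora_slots` is False, whereas the same arguments with `dynamic_lora_slots=True` raise `ValueError`.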

dynamic_lora_slots added to compute_hash() as it affects the CudaGraph
specialization path (disables LoRA cudagraph when True, see issue vllm-project#14).
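
To illustrate why a flag that changes graph capture belongs in the hash, here is a small sketch under stated assumptions. The helper `config_hash` and its choice of factors are hypothetical and are not vLLM's `compute_hash()` implementation; the only point is that two configs differing in `dynamic_lora_slots` should not produce the same hash.

```python
import hashlib


def config_hash(max_loras: int, dynamic_lora_slots: bool) -> str:
    """Hash the settings that affect graph capture (hypothetical helper)."""
    # Every setting that changes the compiled/captured graph is a factor.
    factors = [max_loras, dynamic_lora_slots]
    return hashlib.sha256(str(factors).encode()).hexdigest()


# Same slot count, different flag: the hashes must differ.
static_hash = config_hash(max_loras=4, dynamic_lora_slots=False)
dynamic_hash = config_hash(max_loras=4, dynamic_lora_slots=True)
```

If the flag were left out of the factors, a cache keyed on this hash could reuse an artifact captured under the other setting.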

All new fields default to safe values so existing configs are unaffected
when dynamic_lora_slots=False (the default).

Includes 16 unit tests in tests/lora/test_lora_config_dynamic.py
covering defaults, valid configs, all validation error paths, and
compute_hash() behavior.

Closes vllm-project#7
Closes vllm-project#18

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>