Add a script for serving experiments & Collect system stats in scheduler #30

Merged
WoosukKwon merged 36 commits into main from experiment on Apr 12, 2023

@WoosukKwon
Collaborator

Example usage:

  • Generating a single completion: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n1 1.0
  • Generating two completions in parallel: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n2 1.0
  • Generating two completions with beam search: python benchmark/benchmark_text_completion.py --dataset alpaca_opt_text_completion.pkl --model facebook/opt-13b --request-rate 1.0 --duration 3600 --n2-beam 1.0
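
The commands above drive the server at a fixed --request-rate for --duration seconds. The PR description does not show how the script spaces requests, so the following is only a minimal sketch under two assumptions: arrivals follow a Poisson process, and the values passed to --n1, --n2 and --n2-beam are proportions of the request mix. The function names are hypothetical and are not taken from benchmark_text_completion.py.

```python
import random


def sample_arrival_times(request_rate, duration, seed=0):
    """Return arrival timestamps in seconds within [0, duration).

    Assumption: arrivals form a Poisson process, so the gaps between
    requests are exponentially distributed with mean 1 / request_rate.
    """
    rng = random.Random(seed)
    arrivals = []
    t = rng.expovariate(request_rate)
    while t < duration:
        arrivals.append(t)
        t += rng.expovariate(request_rate)
    return arrivals


def sample_request_kinds(num_requests, mix, seed=0):
    """Pick a request kind for each request according to `mix`.

    Assumption: `mix` maps a kind (e.g. "n1", "n2", "n2-beam") to its
    share of the workload, mirroring the 1.0 passed on the command line.
    """
    rng = random.Random(seed)
    kinds = list(mix)
    weights = [mix[kind] for kind in kinds]
    return rng.choices(kinds, weights=weights, k=num_requests)


# Mirrors the first command: 1 request/s for one hour, all single completions.
arrivals = sample_arrival_times(request_rate=1.0, duration=3600.0)
kinds = sample_request_kinds(len(arrivals), {"n1": 1.0})
print(f"{len(arrivals)} requests scheduled, kinds: {sorted(set(kinds))}")
```

Under these assumptions, a mix such as {"n1": 0.5, "n2": 0.5} would split the same arrival schedule evenly between single and parallel completions.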

WoosukKwon requested a review from zhuohan123 on April 6, 2023 09:46
slyalin pushed a commit to slyalin/vllm that referenced this pull request Apr 19, 2024
…ce_artifacts

Revert "Produce artifacts for bare metal installation in Dockerfile.openvino"
dtrifiro pushed a commit to dtrifiro/vllm that referenced this pull request May 21, 2024
This PR logs all errors that occur during validation or generation
for a request, as TGIS does.

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
z103cb pushed a commit to dtrifiro/vllm that referenced this pull request May 21, 2024
…ensions

Dockerfile.ubi: get rid of prebuilt-wheel stage
tianyil1 pushed a commit to tianyil1/vllm that referenced this pull request Jun 5, 2024
…um_wa

WA: Disable cumsum in HPU _prepare_prompt
fxmarty pushed a commit to fxmarty/vllm-public that referenced this pull request Jun 12, 2024
@alixiaodi alixiaodi mentioned this pull request Aug 2, 2024
wuhuikx pushed a commit to wuhuikx/vllm that referenced this pull request Mar 27, 2025
Some PRs for plugin support have not been merged into vllm yet. This PR adds a
monkey patch to vllm-ascend so that vllm-ascend works with vllm directly.

The patch code should be removed once vllm supports the related
functionality natively.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
zyongye added a commit to zyongye/vllm that referenced this pull request Aug 5, 2025
Signed-off-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: simon-mo <xmo@berkeley.edu>
zyongye added a commit to zyongye/vllm that referenced this pull request Aug 6, 2025
Signed-off-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: simon-mo <xmo@berkeley.edu>
heheda12345 pushed a commit to heheda12345/vllm that referenced this pull request Sep 29, 2025
inkcherry pushed a commit to inkcherry/vllm that referenced this pull request Nov 6, 2025
dik654 pushed a commit to dik654/vllm-for-study that referenced this pull request Nov 18, 2025
New Industry Use Cases (vllm-project#21-30):
- vllm-project#21 Game Development: AI game testing + balance tuning
- vllm-project#22 Construction: Vision AI safety inspection
- vllm-project#23 Agriculture/Smart Farm: Crop monitoring + pest detection
- vllm-project#24 Government/Public: Document automation + citizen services
- vllm-project#25 Energy/Utilities: Grid monitoring + anomaly detection
- vllm-project#26 Environment/Sustainability: Carbon tracking + ESG reporting
- vllm-project#27 Fashion/Apparel: Trend analysis + inventory optimization
- vllm-project#28 Sports/Fitness: Performance analytics + tactical analysis
- vllm-project#29 Automotive/Mobility: Autonomous driving simulation
- vllm-project#30 Space/Aerospace: Satellite image analysis

Advanced Architecture Patterns:
1. Event-Driven Pattern: Webhook → Event Bus → Agent triggers
2. Streaming Pattern: Large dataset processing with chunking
3. Batch Processing Pattern: Celery-based parallel processing
4. Circuit Breaker Pattern: Fault tolerance + auto recovery
5. CQRS + Event Sourcing: Command/Query separation
6. Saga Pattern: Distributed transaction management

Guide now covers:
- 30+ industry-specific MCP implementations
- 6 production-ready architecture patterns
- Real-world scalability solutions
- Enterprise integration strategies
- Total: 8,672 lines (from 7,249)
soodoshll pushed a commit to soodoshll/vllm that referenced this pull request Jan 30, 2026
* add implementation

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* add impl

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* add flashinfer

* fix tp

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* Temporary change for ViT

* fix workspace_buffer device.

* change max_seqlen to 128k.

* remove duplicate multiplier.

* fix accuracy and refactor

* more fix

* change dockerfile

* format

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* fix version

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change python version

* remove qwen25 transformer support

* change dockerfile

* add build versions

* change version

* change version

* change

* change

* change

* change

* change

* build image

* change back

* change to 10.0f

* fix fi import

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change to build in dev image

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change location

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change location

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change cubin and jitcache to wheels

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* change

Signed-off-by: Max Hu <hyoung2991@gmail.com>

* add comment

Signed-off-by: Max Hu <hyoung2991@gmail.com>

---------

Signed-off-by: Max Hu <hyoung2991@gmail.com>
Co-authored-by: Anerudhan Gopal <agopal@nvidia.com>
Co-authored-by: Baorun Mu <bmu@nvidia.com>
chopper0126 pushed a commit to chopper0126/vllm that referenced this pull request Feb 2, 2026
add multistream and core limitation of communication stream