Pinned

- flash-attention (Public, forked from ROCm/flash-attention): Fast and memory-efficient exact attention. Python.
- snowflakedb/ArcticInference (Public): vLLM plugin for high-throughput, low-latency inference.
- sgl-project/sglang (Public): SGLang is a high-performance serving framework for large language models and multimodal models.

