- Quincy, WA
Highlights
- Pro
Pinned
nvfp4-vllm (Public)
Scripts for quantizing HuggingFace models to NVFP4 (4-bit) and serving them with vLLM on NVIDIA Blackwell GPUs. Includes an interactive chat client for testing. Confirmed working on RTX PRO 6000 Bl…
Python





