From CUDA Graph Crash to Tensor Core: Tracing a vLLM INT8 Bug

01. Intro While deploying an INT8-quantized LLM with vLLM on an NVIDIA A100 GPU, I ran into this error during startup (pytorch/ao#2376): RuntimeError: self.size(0) needs to be greater than 16, but g...

Apr 27, 2026 Hardware & Kernels

Recovery Manual for Uncontrollable

Recently, I suffered from menhera (mental health issues) caused by an uncontrollable situation. As a reminder to myself, I would like to record an “escape manual”. 01. Cure the mind with physical distance first Escap...

Apr 25, 2026 Diary

Hello....

I've started a blog. Nice to meet you. Most posts will be technical write-ups of things I’ve read, implemented, or debugged.

Apr 10, 2026 Diary
