Popular repositories
-
hermes-agent (Public, forked from NousResearch/hermes-agent)
The agent that grows with you
Python
-
turboquant-xpu (Public)
TurboQuant KV-cache compression ported to Intel Arc B580 (XPU) via Triton, with a pure PyTorch fallback path. The Triton kernel port is pending a fix for the tl.gather materialization bug.
Python
-
turboquant (Public, forked from OnlyTerp/turboquant)
First open-source implementation of Google TurboQuant (ICLR 2026) -- near-optimal KV cache compression for LLM inference. 5x compression with near-zero quality loss.
Python