Hi 👋, I'm GentleCold
- 🎓 Pursuing a master's degree at ECNU.
- 🧐 Interested in LLM inference acceleration and KV cache systems.
- ⚡️ Working on GPU/SSD offloading and distributed cache sharing.
- 🐧 Using Arch Linux btw.
- DaseR: RAG-native KV cache service for LLM inference.
- pegaflow: high-performance KV cache storage with GPU offloading, SSD caching, and RDMA-based sharing.
- LMCache: exploring KV cache reuse and offloading for LLM serving.
- nano-vllm: learning and experimenting with compact vLLM-style inference systems.

