Excited to share @Standard_Kernel's seed round and some reflections on what we’ve learned about kernel generation and what we believe is next. Grateful to our amazing team, supporters, and the broader community pushing this space forward.
New blog post from NVIDIA: LLM-generated GPU kernels show speedups over FlexAttention and achieve 100% numerical correctness on 🌽KernelBench Level 1.
Excited to share what friends and I have been working on at @Standard_Kernel
We've raised from General Catalyst (@generalcatalyst), Felicis (@felicis), and a group of exceptional angels.
We have some great H100 BF16 kernels in pure CUDA+PTX, featuring:
- Matmul 102%-105% perf
✨ New blog post 👀: We have some very fast AI-generated kernels, produced with a simple test-time-only search. They perform close to, or in some cases even beat, the standard expert-optimized production kernels shipped in PyTorch. (1/6)
[🔗 link in final post]
Kernels are the kernel of deep learning.
🙃...but writing kernels sucks.
Can LLMs help? 🤔
Introducing 🌽 KernelBench (Preview), a new coding benchmark designed to evaluate the ability of LLMs to generate ⚡️efficient💨 GPU kernels for optimizing neural network performance.
graduated from MIT with a B.S. and M.Eng. in computer science! deeply grateful to everyone who made the past four years an incredibly enjoyable experience, and i am excited to continue my pursuit in performance engineering and machine learning systems beyond college #mit2023
We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett
KernelBench v0.1 is out, featuring:
- A guideline on analyzing the validity of results and ruling out physically impossible performance claims.
- Support for randomized testing beyond normal distributions.
- Fixed problem sizes and improved numerics.
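
To make the "physically impossible performance claims" item concrete, here is a minimal sketch of one possible sanity check, a roofline-style lower bound on runtime. It is not taken from the KernelBench guideline. The function names and the peak throughput and bandwidth figures are hypothetical placeholders; substitute the numbers for your own GPU.

```python
def min_runtime_seconds(flops, bytes_moved, peak_flops_per_s, peak_bytes_per_s):
    """Roofline lower bound: a kernel cannot finish faster than the slower of
    its compute limit and its memory-traffic limit allows."""
    compute_bound = flops / peak_flops_per_s
    memory_bound = bytes_moved / peak_bytes_per_s
    return max(compute_bound, memory_bound)


def is_physically_plausible(measured_s, flops, bytes_moved,
                            peak_flops_per_s, peak_bytes_per_s):
    """Return False when a measured time beats the hardware lower bound."""
    bound = min_runtime_seconds(flops, bytes_moved,
                                peak_flops_per_s, peak_bytes_per_s)
    return measured_s >= bound


# Illustrative workload: a square BF16 matmul, C = A @ B, with n = 4096.
n = 4096
flops = 2 * n**3                 # one multiply and one add per inner-product term
bytes_moved = 3 * n * n * 2      # read A and B, write C, 2 bytes per BF16 element

# Assumed placeholder hardware peaks, NOT vendor-verified figures.
PEAK_FLOPS = 1.0e15              # FLOP/s
PEAK_BW = 3.0e12                 # bytes/s

bound = min_runtime_seconds(flops, bytes_moved, PEAK_FLOPS, PEAK_BW)
claim_too_fast = is_physically_plausible(1e-5, flops, bytes_moved, PEAK_FLOPS, PEAK_BW)
claim_ok = is_physically_plausible(2e-4, flops, bytes_moved, PEAK_FLOPS, PEAK_BW)
```

A reported time below `bound` usually points to a measurement or correctness problem (for example, missing synchronization or a kernel that skips work) rather than a real speedup.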