If you are an AI startup blocked on GPUs, send me a note.
At Lamini, we figured out how to use AMD GPUs, which gives us a relatively large supply compared to the rest of the market.
Greg Diamos
809 posts
I build AI supercomputers
Joined April 2013
- Lamini now supports JSON output at full speed. We guarantee that your LLM produces a valid JSON object according to your spec. Works for any LLM we support, e.g. Mistral 7B, shown below. API Docs: lamini-ai.github.io/API_v2/complet…
- Replying to @remusrisnov: IMO, ROCm is 90% of the way there, but the market treats it like it is 0% of the way there. We filled in the 90%->100% gaps for LLM finetuning and inference with a team of a few good hackers.
- New LLMs for the new year! On my vacation I streamed a 5-hour walkthrough of building an optimized medical LLM from scratch using an AMD supercomputer, from a beach in Hawaii. All of the code is open source! github.com/lamini-ai/lami… youtu.be/Xkzd_YNbWmc?fe… via @YouTube
- I'm so excited to release two giant speech datasets today: mlcommons.org/en/peoples-spe… mlcommons.org/en/multilingua… Both come with clean CC licenses that allow academic and commercial use! Special shout out to Daniel Galvez, Mark Mazumder, and the whole team who put in a huge effort to create these.
- Here's a new earnings call Q&A dataset I made with Lamini. It has millions of questions and answers generated by a Lamini LLM reading earnings call transcripts. huggingface.co/datasets/lamin… Still uploading: 10k finished so far, should have about 1M by tonight.
- Excited to release 1.109 billion times faster LLM switching using a PEFT cache in HBM. See the blog for details of how it works. Finding an opportunity for a 1 billion times speedup suggests that we are just scratching the surface of fine tuning custom LLMs.
  Quoted post: Training multiple LLMs taking forever? 😤 Costing you a fortune? 💸 Enter PEFT! Get ready to multiply!! 🚀 1000 models, just 1 machine! 🤖 3 months of training -> 3 milliseconds ⚡️ Just one API call, load and train with Lamini! 👉 lamini.ai/blog/one-billi… 👀 youtube.com/shorts/7X8fSSe…
- Hiring a world class ML Engineer at Lamini. Drop me a note if you want to push model accuracy to the limit on unlimited AMD compute.
- This is the paper that convinced me: proceedings.mlr.press/v28/coates13.p… It showed that a Frankenstein CUDA cluster could beat a 10,000-CPU MapReduce cluster.
  Quoted post: # CUDA/C++ origins of Deep Learning. Fun fact: many people might have heard about the ImageNet / AlexNet moment of 2012, and the deep learning revolution it started. en.wikipedia.org/wiki/AlexNet What's maybe a bit less known is that the code backing this winning submission to the
- Nice work replicating this. The recipe is out. AMD GPUs work. lamini.ai/blog/lamini-ll…
  Quoted post: I've been working on my AI servers in my garage. (8x AMD Instinct MI100) Yesterday I got the first server running and inferencing with oobabooga. Now duplicating the OS drive to get the 2nd server running. Then once InfiniBand is set up I will start working on getting
- We are hiring an HPC (MPI / OpenAI Triton) Engineer at Lamini. Apply here: jobs.lever.co/laminiai/af688… We are inventing and building the largest AMD LLM training system in the world. Join us in strongly scaling to 1000s of GPUs and beyond.
- Hit me up if you need GPUs for finetuning LLMs. We are bringing online more capacity together with AMD. Start training on AMD in 3 lines of code.

      pip install lamini

      from lamini import LlamaV2Runner
      model = LlamaV2Runner()
      model.load_data(path=...)
      model.train(args=...)

  Quoted post: Love working with @LaminiAI and @realSharonZhou making LLMs easy and accessible for all on @AMD @AMDInstinct GPUs! So cool what can be done with @LaminiAI LLM Superstations!!
- I remember the same time; it wasn’t luck. My first exposure to the CUDA vision was from John Nickolls. He very clearly saw it as becoming the dominant form of computing.
  Quoted post: I worked at Intel on Larrabee applications in 2007. Then I went to NVIDIA to work on ML in 2008. So I was there at both places at that time and I can say: NVIDIA's dominance didn't come from luck. It came from vision and execution. Which Intel lacked.
- We used a supercomputer to perform the largest study to date of how deep learning scales up with more data and faster computers. It turns out to be simple and predictable across diverse applications. research.baidu.com/deep-learning-…
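
A rough sketch of what "simple and predictable" scaling could look like in practice. It assumes the learning curve follows a power law, error = alpha * size ** beta. That functional form is an assumption here, not something stated in the post. The function names (`fit_power_law`, `predict_error`) and the data points are made up for illustration and are not results from the study.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = alpha * size ** beta by least squares in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(error) vs log(size).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


def predict_error(alpha, beta, size):
    """Extrapolate the fitted curve to a new training-set size."""
    return alpha * size ** beta


# Synthetic, made-up points generated from error = 5.0 * size ** -0.3.
sizes = [1e3, 1e4, 1e5, 1e6]
errors = [5.0 * s ** -0.3 for s in sizes]

alpha, beta = fit_power_law(sizes, errors)
print(f"alpha={alpha:.3f}, beta={beta:.3f}")
print(f"extrapolated error at 1e7 examples: {predict_error(alpha, beta, 1e7):.4f}")
```

If real measurements fall on a straight line in log-log space, a fit like this lets you estimate how much more data you would need to hit a target error before spending the compute.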













