Long context, test-time compute, and end-to-end reinforcement learning to build a superhuman coding agent (that then builds the rest of AGI for us). Join us: magic.dev
LTM-2-Mini is our first model with a 100 million token context window. That’s 10 million lines of code, or 750 novels.
Full blog: magic.dev/blog/100m-toke…
Evals, efficiency, and more ↓
Excited to announce we’re building an Applied Team focused on post-training. Come explore what's possible with our new (and still unreleased) LTM2 models and their 100M token context window. Apply here: magic.dev/careers/5652b4…
Very excited to welcome @nvidia as Magic's latest investor! With their support, we’re looking forward to scaling long context and inference-time compute.
With context solved, we now focus on unbounded inference-time compute as the next (and potentially last) breakthrough we believe is needed to build reliable AGI.
Imagine if you could spend $100 and 10 minutes on one task and reliably get a great pull request for an entire
Our LTM (Long Term Memory) mechanism needs >1,000x less compute and memory than Llama 3.1 405B’s attention. Llama 3.1 would need 638 H100s *per user* to store a 100M token KV cache. LTM needs a small fraction of one.
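A rough back-of-the-envelope check of the KV cache figure, as a sketch only. The layer count, KV head count, head dimension, and bf16 storage are assumptions taken from the publicly released Llama 3.1 405B configuration, not from the post, and the 80 GB of usable memory per H100 is also an assumption. Under those assumptions the estimate lands close to the quoted 638; the exact number depends on how much memory per GPU is counted.

```python
import math

# Assumed Llama 3.1 405B attention shape (public model config, not stated in the post).
NUM_LAYERS = 126
NUM_KV_HEADS = 8        # grouped-query attention
HEAD_DIM = 128
BYTES_PER_VALUE = 2     # assumption: cache stored in bf16

# Assumption: every byte of an 80 GB H100 is available for the cache.
H100_MEMORY_BYTES = 80 * 10**9


def kv_cache_bytes_per_token(layers=NUM_LAYERS, kv_heads=NUM_KV_HEADS,
                             head_dim=HEAD_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store one token's keys and values across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value  # 2 = K and V


def gpus_needed(context_tokens, gpu_bytes=H100_MEMORY_BYTES):
    """GPUs required just to hold the KV cache for one user's context."""
    total_bytes = context_tokens * kv_cache_bytes_per_token()
    return math.ceil(total_bytes / gpu_bytes)


if __name__ == "__main__":
    tokens = 100_000_000
    print(kv_cache_bytes_per_token())  # 516096 bytes per token
    print(gpus_needed(tokens))         # 646 under the assumptions above
```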
SSMs, RNNs, and RAG all exploit weaknesses in evals like
Needle in a Haystack tests
The tech report also details a number of microbenchmark “needle in a haystack” tests (modeled after @GregKamradt’s github.com/gkamradt/LLMTe…) that probe the model’s ability to retrieve specific information from its context.
For text, Gemini 1.5 Pro
I have been continuously in awe of the brilliance, tenacity, and kindness of @EricSteinb and the small but mighty team at Magic.dev. So much so that we've decided to invest $100m!
If you're interested in building the future, please do reach out to me or the team!
Magic.dev has trained a groundbreaking model with many millions of tokens of context that performed far better in our evals than anything we've tried before.
They're using it to build an advanced AI programmer that can reason over your entire codebase and the
I love my team a lot and sometimes it’s stressful but life has never been so fulfilling. If you want to build AGI on a small team of people who care a lot with thousands of GPUs, please apply :)
We've raised $117M from @natfriedman and others to build an AI software engineer.
Code generation is both a product and a path to AGI, requiring new algorithms, lots of CUDA, frontier-scale training, RL, and a new UI.
We are hiring!
5M tokens of context. Let that sink in.
Yes, there are caveats. But consider what's to come:
- Entire codebases in prompts
- Novel-length spec docs as instructions
- k-shots where k = 10K
- Few-shots where each "shot" is 50K LoC → diff
Those who declared the imminent death of
Meet LTM-1: LLM with *5,000,000 prompt tokens*
That's ~500k lines of code or ~5k files, enough to fully cover most repositories.
LTM-1 is a prototype of a neural network architecture we designed for giant context windows.
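The size comparisons in the LTM-1 and LTM-2-Mini posts imply the same rule of thumb: about 10 tokens per line of code, and (from the LTM-1 figures) about 100 lines per file. A minimal sketch of that conversion follows. The function name and default ratios are illustrative, derived only from the numbers quoted in the posts; real repositories will vary.

```python
def context_capacity(context_tokens, tokens_per_line=10, lines_per_file=100):
    """Estimate how many lines and files of code fit in a context window.

    Defaults are the ratios implied by the posts (5M tokens ~ 500k lines ~ 5k files);
    they are rough averages, not measured values.
    """
    lines = context_tokens // tokens_per_line
    files = lines // lines_per_file
    return lines, files


if __name__ == "__main__":
    print(context_capacity(5_000_000))    # LTM-1: (500000, 5000)
    print(context_capacity(100_000_000))  # LTM-2-Mini: (10000000, 100000)
```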