(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models.
ai.meta.com/research/publi…
Gabriel Synnaeve
Nerd & Dad.
RL & CodeGen research since before it was cool.
- We've just released MusicGen, and there is a @huggingface demo now. Here is a thread of me playing with it right now. huggingface.co/spaces/faceboo… A 🧵👇
- This is an excellent history of LLMs, doesn't miss seminal papers I know. Reminds you we're standing on the shoulders of giants, and giants are still being born today. gregorygundersen.com/blog/2025/10/0…
- Reinforcement learning with execution feedback (RLEF). Lots of sweat went into this one, but what works in principle works in practice: for code generation we can turn compute into training data: arxiv.org/abs/2410.02089 This works for LLMs, but will lead to world models.
- Everything I know in RL in one tweet: exploration>exploitation, easy to leverage off-policy positive rewards, hard to leverage off-policy negative rewards, update the policy often, focus on throughput, self-play or find asymmetric grounding, clip everything but check statistics.
- Multi-token prediction models are here
- Want to do research in code generation with LLMs and wonky deep learning from the 90s? We're recruiting one Master student (M2) intern for 2025 at FAIR Paris in my team metacareers.com/jobs/106871446…
- The wav2letter Santa has brought 50k hours of read speech in 8 languages in CC-BY 4.0:
  - dataset: openslr.org/94/
  - paper: arxiv.org/abs/2012.03411
  - pretrained models: github.com/facebookresear…
- To all the defeatists who think there is nothing else but scale:
  - 5 years between Self-Attention Is All You Need and FlashAttention
  - Transformers still require warmup.
  Researchers: get back to work! The future is bright :)
- Do you need to quantize models? Try diffq, `pip install diffq` and
- Replying to @syhw: 4/ Here is an example of the Code World Model tracing the execution of the piece of code counting the "r"s in "strawberry". Think of it like a neural `pdb` that you can set to any initial frame state, and that reasoning can query as a tool in token space.
- Happy to be releasing Code Llama! We've built it on Llama 2 and improved it for code use cases. In particular it supports infilling out of the box, and was trained with sequences up to 16k tokens. Looking forward to what the community will build with it! 1/7
- Replying to @syhw: 2/ When humans plan, we imagine the possible outcomes of different actions. When we reason about code we simulate part of its execution in our head. The current generation of LLMs struggles to do this. What kind of research will an explicitly trained code world model enable?
- Flashlight's v0.3 release: a lightweight, modern C++ deep learning autograd-based library with SOTA models in speech recognition, language modeling, and vision: github.com/flashlight/fla… dataloading/model/training/docs to follow [1/5]



