1/?) As promised to Sander Dieleman (@sedielem), we’re excited to finally share:
Towards Closing the Autoregressive Gap in Language Modeling via Entropy-Gated Continuous Bitstream Diffusion
We show that continuous diffusion can achieve very strong language modeling performance















