We're growing rapidly at @RadicalNumerics and scaling our core teams. Join us in building the next generation of scientific world models.
We're hiring across a few roles, each with significant ownership and cross-functional scope:
- Member of Technical Staff, Post-Training
Radical Numerics
Systems, scaling and architecture for next-gen scientific world models.
- Scaling scientific world models requires co-designing architectures, training objectives, and numerics. Today, we share the first posts in our series on low-precision pretraining, starting with NVIDIA's NVFP4 recipe for stable 4-bit training. Part 1: radicalnumerics.ai/blog/nvfp4-par… Part
- Radical Numerics reposted: Excited to share Phalanx, our new layer for sequence modeling! Each block communicates with its neighbor, like the shield cover of a neighboring hoplite. Phalanx can replace sliding window attention and trains faster than optimized baselines while maintaining quality.
- Replying to @RadicalNumerics: More on Phalanx and our research kernel library. Blog: radicalnumerics.ai/blog/phalanx; Code: github.com/RadicalNumeric…; Report: radicalnumerics.ai/assets/phalanx…
- Sliding window attention (SWA) is powering frontier hybrid models for efficiency. Is there something better? Introducing Phalanx, a faster and better quality drop-in replacement for sliding window attention (SWA). Phalanx is a new family of hardware and numerics-aware windowed
- Radical Numerics reposted: We just released the largest open-source diffusion language model (RND1). RND1 is important to me on a personal level: it symbolizes our commitment to open-source exploration of radically different designs for AI at scale, including training objectives, architectures, and domains. There is
- Replying to @RadicalNumerics: We’re also hiring aggressively. Reach out if you’re interested in building automated research environments and agents. (AI researchers and SWEs; pre/mid/post training, architecture design, kernels, lots of backend system design, and automation.) Our team is behind the tech for
- Replying to @RadicalNumerics: More on RND1 models. Blog: radicalnumerics.ai/blog/rnd1; Code: github.com/RadicalNumeric…; Report: radicalnumerics.ai/assets/rnd1_re…; Weights: huggingface.co/radicalnumeric…
- Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) with a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to