How AI models are built, what sets each generation apart, and where their architectures lead – from frontier LLMs to world models, traced in context.
AI 101

13 min read
Jun 4, 2026
NVIDIA Cosmos is a platform of world foundation models (WFMs) for Physical AI: video curation, tokenizer, diffusion and autoregressive WFMs, and guardrails explained. Plus a 2026 update: Cosmos 3, an omnimodel world model.

AI 101

13 min read
Apr 8, 2026
Gemma 4 runs locally via Ollama with zero API cost. Full architecture breakdown — attention mix, MoE, per-layer embeddings — and why OpenClaw users are switching from Claude.

AI 101

12 min read
Mar 18, 2026
Nemotron Coalition is NVIDIA's bet on open frontier AI — with Mistral, Cursor, Black Forest Labs and others. How Nemotron 3 works and who holds power.


AI 101

15 min read
Jan 21, 2026
What are VLA models? Learn Vision-Language-Action architecture, key systems (π0, Helix, SmolVLA) & the leap to VLA+. Deep dive.


AI 101

11 min read
Nov 19, 2025
LeJEPA by Yann LeCun: provably stable self-supervised learning without heuristics. SIGReg, isotropic Gaussian embeddings & world models explained.


AI 101

9 min read
Nov 12, 2025
Tracing the rise of China’s agentic intelligence strategy – from Kimi’s early vision to today’s open-source breakthrough


AI 101

13 min read
Oct 8, 2025
A glimpse at Code World Model, PSI, and others – redefining how models catch the world in their nets

Concepts

11 min read
Sep 17, 2025
Everything you need to know about models that defend AI today


Concepts

10 min read
Aug 27, 2025
Explore how rethinking world model building patterns can turn our vision upside down and lead to a new Physical, Agentic, and Nested (PAN) system

AI 101

12 min read
Aug 6, 2025
What is GPT-OSS? OpenAI's open-weight MoE models explained: architecture, Ollama setup, memory requirements, and benchmarks vs DeepSeek & Qwen3.



AI 101

4 min read
Jul 23, 2025
A refresher on Smol and Qwen models, Liquid Foundation Models with the latest Hyena Edge, and the legendary BERT

AI 101

14 min read
Jun 18, 2025
Reasoning models use chain-of-thought to solve complex problems. Compare o1, DeepSeek-R1 & QwQ — and learn when to use each.


AI 101

11 min read
May 28, 2025
What is BERT, how does it work, and why does it matter in 2026? Covers MLM, NSP, fine-tuning, RoBERTa, DistilBERT, ModernBERT, NeoBERT, and ConstBERT

AI 101

11 min read
Apr 30, 2025
We discuss a new wave of architectures from Liquid AI – built from first principles, optimized for real hardware, and challenging the Transformer playbook with smarter, leaner models

AI 101

12 min read
Apr 9, 2025
A deep dive into the history and current advancements in world models and why they are an important puzzle piece for the future of AI

AI 101

9 min read
Mar 19, 2025
We discuss the timeline of Qwen models, focusing on their agentic capabilities and how they compete with other models, and explore what the Qwen-Agent framework is and how you can use it

AI 101

12 min read
Feb 26, 2025
SmolLM2, SmolVLM, and SmolVLM2 explained: how Hugging Face trains small language models on curated datasets and multi-stage pipelines

AI 101

6 min read
Nov 27, 2024
LLaVA-o1 reasons step-by-step through 4 structured stages & stage-level beam search. How it works, benchmarks vs GPT-4o-mini & Gemini-1.5-pro, and where it falls short.

AI 101

7 min read
Nov 6, 2024
Trace Mistral AI's roadmap from Mistral 7B to Mixtral 8×7B and Ministral. Architecture, benchmarks, and edge computing use cases.

AI 101

6 min read
Oct 16, 2024
Explore how OpenAI made Whisper, its automatic speech recognition (ASR) model, multilingual and multitask

AI 101

5 min read
Sep 25, 2024
OLMoE: open-source sparse Mixture-of-Experts with 1B active and 7B total parameters. How it works, how it was trained, and why it matters for open-source AI.


AI 101

8 min read
Aug 28, 2024
We discuss the innovation proposed by the DeepSeek team and how it improves the models' performance, and dive into the models' architectures and implementation


Turing Post is an AI newsletter for engineers, researchers, founders, and technical managers who want to understand how machine learning and AI actually work.
Built on more than two decades in tech and seven years focused on AI, we track the research that matters, the systems being built, and the ideas shaping the field, from LLMs and AI agents to JEPA, world models, retrieval, inference, evaluation, AI infrastructure, and agentic workflows.
Join 115,000+ professionals who rely on Turing Post for precise, grounded analysis of AI’s past, present, and future.