Models

How AI models are built, what sets each generation apart, and where their architectures lead – from frontier LLMs to world models, traced in context.

AI 101

13 min read

Jun 4, 2026

What is Cosmos World Foundation Model Platform?

NVIDIA Cosmos is a platform of world foundation models (WFMs) for Physical AI. Video curation, the tokenizer, diffusion and autoregressive WFMs, and guardrails explained. Plus the 2026 update: the Cosmos 3 omnimodel world model.

Alyona Vert.

AI 101

13 min read

Apr 8, 2026

AI 101: Gemma 4 with OpenClaw: Architecture, Setup, and Why Developers Are Switching

Gemma 4 runs locally via Ollama with zero API cost. Full architecture breakdown — attention mix, MoE, per-layer embeddings — and why OpenClaw users are switching from Claude.

Alyona Vert.

AI 101

12 min read

Mar 18, 2026

Nemotron 3 and the Surprising Coalition Building New AI in the Open

Nemotron Coalition is NVIDIA's bet on open frontier AI — with Mistral, Cursor, Black Forest Labs and others. How Nemotron 3 works and who holds power.

Alyona Vert., +1

AI 101

15 min read

Jan 21, 2026

VLA Models Explained: Architecture, Types & the Leap to VLA+

What are VLA models? Learn Vision-Language-Action architecture, key systems (π0, Helix, SmolVLA) & the leap to VLA+. Deep dive.

Alyona Vert., +1

AI 101

11 min read

Nov 19, 2025

LeJEPA: Provable Self-Supervised Learning Without Heuristics

LeJEPA by Yann LeCun: provably stable self-supervised learning without heuristics. SIGReg, isotropic Gaussian embeddings & world models explained.

Alyona Vert., +1

AI 101

9 min read

Nov 12, 2025

Kimi K2 Thinking: Inside Moonshot AI's Agentic Reasoning Model

Tracing the rise of China’s agentic intelligence strategy – from Kimi’s early vision to today’s open-source breakthrough

Alyona Vert., +1

AI 101

13 min read

Oct 8, 2025

AI 101: What's New in World Models?

A glimpse at Code World Model, PSI, and others – redefining how models catch the world in their nets

Alyona Vert.

Concepts

11 min read

Sep 17, 2025

What are Guardian Models?

Everything you need to know about models that defend AI today

Ksenia Se, +1

Concepts

10 min read

Aug 27, 2025

What is PAN? How to Build a Better World Model?

Explore how rethinking world model building patterns can turn our vision upside down and lead to a new Physical, Agentic, and Nested (PAN) system

Alyona Vert.

AI 101

12 min read

Aug 6, 2025

AI 101: Everything You Need to Know about GPT OSS

What is GPT-OSS? OpenAI's open-weight MoE models explained: architecture, Ollama setup, memory requirements, and benchmarks vs DeepSeek & Qwen3.

Alyona Vert., +2

Global AI Affairs

13 min read

Jul 30, 2025

Breakdown: Kimi K2, DeepSeek-R1, Qwen3 (+Coder), and GLM-4.5

Kimi K2, DeepSeek-R1, Qwen3, and GLM-4.5 compared on benchmarks and agentic use cases. Which Chinese open-source model leads in reasoning and coding in 2026?

Alyona Vert., +1

AI 101

4 min read

Jul 23, 2025

4 Outstanding Families of Models You Must Know About

Refreshing Smol and Qwen models, Liquid Foundation Models with latest Hyena Edge, and legendary BERT

Alyona Vert.

AI 101

14 min read

Jun 18, 2025

Reasoning Models Explained: o1, DeepSeek-R1 & Beyond

Reasoning models use chain-of-thought to solve complex problems. Compare o1, DeepSeek-R1 & QwQ — and learn when to use each.

Ksenia Se, +1

AI 101

11 min read

May 28, 2025

Decoding BERT: From Original NLP Game-Changer to Today's Efficient AI (feat. ConstBERT)

What is BERT, how does it work, and why does it matter in 2026? Covers MLM, NSP, fine-tuning, RoBERTa, DistilBERT, ModernBERT, NeoBERT, and ConstBERT

Alyona Vert.

AI 101

11 min read

Apr 30, 2025

Can Liquid Models Beat Transformers? Meet Hyena Edge – the Newest Member of the LFM Family

We discuss a new wave of architectures from Liquid AI – built from first principles, optimized for real hardware, and challenging the Transformer playbook with smarter, leaner models

Alyona Vert.

AI 101

12 min read

Apr 9, 2025

What are World Models?

A deep dive into the history and current advancements in world models and why they are an important puzzle piece for the future of AI

Alyona Vert.

AI 101

9 min read

Mar 19, 2025

What is Qwen-Agent framework? Inside the Qwen family

We discuss the timeline of Qwen models, focusing on their agentic capabilities and how they compete with other models, and explore what the Qwen-Agent framework is and how you can use it

Alyona Vert.

AI 101

12 min read

Feb 26, 2025

Inside the family of Smol models

SmolLM2, SmolVLM, and SmolVLM2 explained: how Hugging Face trains small language models on curated datasets and multi-stage pipelines

Alyona Vert.

AI 101

6 min read

Nov 27, 2024

LLaVA-o1: Step-by-Step Visual Reasoning VLM Explained

LLaVA-o1 reasons step-by-step through 4 structured stages & stage-level beam search. How it works, benchmarks vs GPT-4o-mini & Gemini-1.5-pro, and where it falls short.

Alyona Vert.

AI 101

7 min read

Nov 6, 2024

Inside Les Ministraux: Mistral's Small Model Strategy

Trace Mistral AI's roadmap from Mistral 7B to Mixtral 8×7B and Ministral. Architecture, benchmarks, and edge computing use cases.

Alyona Vert.

AI 101

6 min read

Oct 16, 2024

Whisper Model Explained: OpenAI's Open-Source Speech Recognition

Explore how OpenAI made their automatic speech recognition (ASR) model multilingual and multitasking

Alyona Vert.

AI 101

5 min read

Sep 25, 2024

What is OLMoE?

OLMoE: open-source sparse Mixture-of-Experts with 1B active and 7B total parameters. How it works, how it was trained, and why it matters for open-source AI.

Alyona Vert., +1

AI 101

8 min read

Aug 28, 2024

Inside DeepSeek Models

We discuss the innovations proposed by the DeepSeek team and how they improve the models' performance, then dive into the models' architectures and implementation

Ksenia Se, +1

AI 101

11 min read

Jun 12, 2024

What Is JEPA? Joint Embedding Predictive Architecture

JEPA explained: Yann LeCun's Joint Embedding Predictive Architecture for world modeling. Covers I-JEPA, V-JEPA, MC-JEPA, architecture & key concepts.

Valeriia Kuka

AI 101

5 min read

May 29, 2024

What is Mamba?

Mamba is a selective SSM that processes sequences in linear time — no attention needed. How it works, how it compares to Transformers, and why it matters.

Ksenia Se, +1