Blog

Blog

SMG: The Case for Disaggregating CPU from GPU in LLM Serving

How It Started: Hitting the GIL Wall at Scale. We've been running production model serving…

Simo Lin, Chang Su, and Keyang Ru, members of LightSeek Foundation | April 30, 2026

Blog

Introducing AutoSP

¹ SSAIL Lab, University of Illinois Urbana-Champaign, ² Anyscale, ³ Snowflake. TL;DR: AutoSP automatically converts…

Ahan Gupta¹, Zhihao Wang¹, Neel Dani¹, Masahiro Tanaka², Olatunji Ruwase³, Minjia Zhang¹ | April 29, 2026

Blog, Case Studies

IBM Research uses vLLM at the heart of its RITS Platform

TL;DR: vLLM has been critical to democratizing access to our research community to the latest…

PyTorch Foundation | April 24, 2026

Blog

Optimizing Effective Training Time for Meta’s Internal Recommendation/Ranking Workloads

Motivation and Introduction: Across the industry, teams training and serving large AI models face aggressive…

Ruilin Chen, Yuzhen Huang, Hang Qi, Mingming Ding, Damian Reeves, Boris Sarana, Kevin Tang, Satendra Gera, Gagan Jain, Sahil Shah, Oguz Ulgen, Mayank Garg, Meet Vadakkanchery, James March, Sophie Lin, Wei Sun | April 17, 2026

Announcements, Blog

PyTorch Conference Europe 2026: A Landmark Moment for Open Source AI in Paris

The first-ever PyTorch Conference Europe, held April 7-8, 2026, brought together more than 600 researchers, developers,…

PyTorch Foundation | April 15, 2026

Blog

Faster Diffusion on Blackwell: MXFP8 and NVFP4 with Diffusers and TorchAO

Diffusion models for image and video generation have been surging in popularity, delivering super-realistic visual…

Vasiliy Kuznetsov (Meta) and Sayak Paul (Hugging Face) | April 8, 2026

Announcements, Blog, Press Release

PyTorch Foundation Announces Safetensors as Newest Contributed Project to Secure AI Model Execution

Safetensors is welcomed into the PyTorch Foundation to secure model distribution and build trusted agentic…

PyTorch Foundation | April 8, 2026

Blog

Monarch: an API to your supercomputer

Getting distributed training jobs to run on huge clusters is hard! This is especially true…

The PyTorch Team at Meta | April 8, 2026

Blog

SOTA Normalization Performance with torch.compile

Introduction: Normalization methods (LayerNorm/RMSNorm) are foundational in deep learning and are used to normalize values…

Shunting Zhang, Paul Zhang, Elias Ellison, Markus Hoehnerbach, Jason Ansel, Natalia Gimelshein | April 8, 2026

Announcements, Blog

ExecuTorch Becomes a Part of PyTorch Core to Expand On-Device Inference Capabilities

Today, we’re excited to share that ExecuTorch is becoming a part of PyTorch Core. ExecuTorch…

PyTorch Foundation | April 7, 2026

Announcements, Blog, Press Release

PyTorch Foundation Welcomes Helion as a Foundation-Hosted Project to Standardize Open, Portable, and Accessible AI Kernel Authoring

Helion joins community of leading open source AI projects to simplify kernel development across the…

PyTorch Foundation | April 7, 2026

Blog

Generating State-of-the-Art GEMMs with TorchInductor’s CuteDSL backend

Introduction: TorchInductor currently supports three autotuning backends for matrix multiplications: Triton, CUTLASS (C++), and cuBLAS.…

Nikhil Patel, Michael Lazos, Driss Guessous, Elias Ellison, Meta | April 7, 2026

Announcements

RSVP for the 2026 PyTorch Docathon

We're excited to announce that the 2026 PyTorch Docathon will take place May 5-19! This…

Team PyTorch | April 3, 2026

Announcements

Call for Proposals Open for PyTorch Conference North America 2026

Submit a Session Proposal or Register Now to secure Super Early Bird pricing for PyTorch…

PyTorch Foundation | April 2, 2026

Announcements, Blog

PyTorch Ecosystem Landscape Welcomes PhysicsNeMo, Unsloth, ONNX, and KTransformers

The PyTorch Ecosystem Working Group is happy to welcome several new projects to the PyTorch…

PyTorch Ecosystem Working Group | April 2, 2026

Blog

Flight Recorder: A New Lens for Understanding NCCL Watchdog Timeouts

If you’ve ever trained a large AI model and had it fail with an error…

Phillip Liu, Uttam Thakore, Junjie Wang, Justin Yang | March 25, 2026

Blog

Enabling Up to 41% Faster Pre-training: MXFP8 and DeepEP for DeepSeek-V3 on B200 with TorchTitan

TL;DR: In a joint effort between PyTorch and Nebius, we enabled training DeepSeek-V3 Mixture-of-Experts models…

PyTorch and Nebius (Hooman Ramezani) Teams | March 25, 2026

Blog

PyTorch 2.11 Release Blog

We are excited to announce the release of PyTorch® 2.11 (release notes)! The PyTorch 2.11…

PyTorch Foundation | March 23, 2026

Blog

PyTorch 2.10+TorchAO: Powering AIPC scenarios on Intel® Core™ Ultra Series 3 processors

Overview: We are excited to introduce the highlights of Intel® Core™ Ultra Series 3 processors…

Intel PyTorch and Client AI SW team | March 20, 2026

Blog

TorchSpec: Speculative Decoding Training at Scale

Introduction: Over the past year, large language models have rapidly expanded in both scale and…

TorchSpec team, Mooncake team | March 19, 2026
