PyTorch | Microsoft Open Source Blog

•

July 7, 2025

•

7 min read

Optimizing memory usage in large language models fine-tuning with KAITO: Best practices from Phi-3

The Cloud Native team at Azure is working to make AI on Kubernetes more cost-effective and approachable for a broader range of users.

•

February 6, 2024

•

7 min read

On-Device Training: Training a model in browser

Continuing the ONNXRuntime On-Device Training blog series, we are introducing ONNX Runtime Training for Web.

News
Python

•

August 1, 2023

•

6 min read

Introducing ONNX Script: Authoring ONNX with the ease of Python

ONNX Script is a new open-source library for directly authoring ONNX models in Python.

•

June 26, 2023

•

4 min read

Olive: A user-friendly toolchain for hardware-aware model optimization

Introducing Olive, an easy-to-use toolchain for optimizing models with hardware awareness. With Olive, you don't need to be an expert to explore diverse hardware optimization toolchains.

•

May 2, 2022

•

5 min read

Optimizing and deploying transformer INT8 inference with ONNX Runtime-TensorRT on NVIDIA GPUs

Mohit Ayani, Solutions Architect, NVIDIA Shang Zhang, Senior AI Developer Technology Engineer, NVIDIA Jay Rodge, Product Marketing Manager-AI, NVIDIA Transformer-based models have revolutionized the natural language processing (NLP) domain.

•

April 19, 2022

•

8 min read

Scaling-up PyTorch inference: Serving billions of daily NLP inferences with ONNX Runtime

Scale, performance, and efficient deployment of state-of-the-art Deep Learning models are ubiquitous challenges as applied machine learning grows across the industry.

•

March 21, 2022

•

6 min read

Supporting efficient large model training on AMD Instinct™ GPUs with DeepSpeed

This post was co-authored by Jithun Nair and Aswin Mathews, members of technical staff at AMD. In recent years, large-scale deep learning models have demonstrated impressive capabilities, excelling at tasks across natural language processing, computer vision, and speech domains.

•

August 4, 2021

•

6 min read

Introducing Distributed Data Parallel support on PyTorch Windows

Model training has been and will be in the foreseeable future one of the most frustrating things machine learning developers face. It takes quite a long time and people can’t really do anything about it.

•

July 13, 2021

•

3 min read

Accelerate PyTorch training with torch-ort

With a simple change to your PyTorch training script, you can now speed up training large language models with torch_ort.ORTModule, running on the target hardware of your choice. Training deep learning models requires ever-increasing compute and memory resources. Today we release torch_ort.

•

July 13, 2021

•

4 min read

ONNX Runtime release 1.8.1 previews support for accelerated training on AMD GPUs with the AMD ROCm™ Open Software Platform

This post was co-authored by Jeff Daily, a Principal Member of Technical Staff, Deep Learning Software for AMD. ONNX Runtime is an open-source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms.

•

June 30, 2021

•

7 min read

Journey to optimize large scale transformer model inference with ONNX Runtime

With its resource-efficient and high-performance nature, ONNX Runtime helped us meet the need of deploying a large-scale multi-layer generative transformer model for code, a.k.a., GPT-C, to empower IntelliCode with the whole line of code completion suggestions in Visual Studio and Visual Studio Code.

PyTorch

•

May 25, 2021

•

3 min read

Delivering reliable production experiences with PyTorch Enterprise on Microsoft Azure

At Microsoft, we use PyTorch to power products such as Bing and Azure Cognitive Services and we actively contribute to several PyTorch open-source projects, including PyTorch Profiler, ONNX Runtime, DeepSpeed, and more. Today, we’re announcing a new initiative in collaboration with Facebook—the PyTorch Enterprise Support Program. This new program enables service providers to develop and offer tailored enterprise-grade support to their customers.

•

October 12, 2020

•

2 min read

Introducing ONNX Runtime mobile – a reduced size, high performance package for edge devices

ONNX Runtime is an open source project that is designed to accelerate machine learning across a wide range of frameworks, operating systems, and hardware platforms. Today, we are excited to announce ONNX Runtime release v1.5 as part of our AI at Scale initiative.