GEMM — Triton vs PyTorch Benchmark

Hey guys, not an expert here. This is an educational benchmarking tool for exploring how different parameters affect matrix multiplication (GEMM) performance, comparing a custom Triton kernel against PyTorch's built-in GEMM. It was evaluated on an NVIDIA B200.

The Triton GEMM kernel was mostly taken from the official Triton matrix multiplication tutorial: https://triton-lang.org/main/getting-started/tutorials/03-matrix-multiplication.html#sphx-glr-getting-started-tutorials-03-matrix-multiplication-py

Features

Compare Triton vs PyTorch GEMM throughput
Toggle Tensor Core usage, tiling, and pipeline stages
Enable fused bias + activation inside the kernel
Visualize results in a Streamlit dashboard (GFLOP/s, latency, relative error)
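
For reference, here is a minimal sketch of how the dashboard metrics listed above can be defined. The README does not say exactly how the tool computes them, so the helper names (`gemm_gflops`, `max_relative_error`) and the choice of a max-norm relative error are assumptions made for illustration, not the repository's actual code.

```python
def gemm_gflops(m, n, k, latency_ms):
    """Throughput in GFLOP/s for one (m x k) @ (k x n) product."""
    # Each of the m*n outputs needs k multiplies and k adds.
    flops = 2.0 * m * n * k
    return flops / (latency_ms * 1e-3) / 1e9


def max_relative_error(candidate, reference, eps=1e-12):
    """Largest absolute difference, scaled by the largest reference magnitude."""
    # Assumed definition: the tool may use a different norm.
    max_diff = max(abs(c - r) for c, r in zip(candidate, reference))
    max_ref = max(abs(r) for r in reference)
    return max_diff / (max_ref + eps)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only (not measured results).
    print(f"{gemm_gflops(4096, 4096, 4096, latency_ms=1.0):.1f} GFLOP/s")
    print(max_relative_error([1.0, 2.0, 3.9], [1.0, 2.0, 4.0]))
```

Both helpers take plain Python numbers and flat sequences, so they can be checked by hand before being applied to real timings.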

Parameters that impact performance

Precision: float16, bfloat16, float32
Tile sizes: BLOCK_M, BLOCK_N, BLOCK_K
Compute scheduling: num_warps, num_stages, GROUP_M
Fusion: Bias and activation fused into the GEMM kernel
Reuse / grouped mapping / pipelining: improve cache locality and overlap memory loads with compute
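
To make the GROUP_M "grouped mapping" idea concrete, the sketch below reproduces in plain Python the program-ID arithmetic used by the Triton matrix multiplication tutorial that this kernel is based on. Whether the kernel in this repository uses exactly the same form is an assumption, and the function names (`grouped_tile`, `launch_order`) are hypothetical. It runs on the CPU and only shows which output tile each program ID would handle.

```python
def cdiv(a, b):
    """Ceiling division, like tl.cdiv in Triton."""
    return (a + b - 1) // b


def grouped_tile(pid, M, N, BLOCK_M, BLOCK_N, GROUP_M):
    """Map a linear program ID to a (tile_row, tile_col) output tile."""
    num_pid_m = cdiv(M, BLOCK_M)          # tiles along M
    num_pid_n = cdiv(N, BLOCK_N)          # tiles along N
    num_pid_in_group = GROUP_M * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * GROUP_M
    # The last group may hold fewer than GROUP_M rows of tiles.
    group_size_m = min(num_pid_m - first_pid_m, GROUP_M)
    pid_m = first_pid_m + (pid % num_pid_in_group) % group_size_m
    pid_n = (pid % num_pid_in_group) // group_size_m
    return pid_m, pid_n


def launch_order(M, N, BLOCK_M, BLOCK_N, GROUP_M):
    """Tiles in the order that consecutive program IDs would visit them."""
    total = cdiv(M, BLOCK_M) * cdiv(N, BLOCK_N)
    return [grouped_tile(p, M, N, BLOCK_M, BLOCK_N, GROUP_M) for p in range(total)]


if __name__ == "__main__":
    # GROUP_M=1 degenerates to row-major order; larger values walk down
    # a band of GROUP_M tile rows before moving to the next tile column.
    print(launch_order(512, 512, 128, 128, GROUP_M=1)[:4])
    print(launch_order(512, 512, 128, 128, GROUP_M=2)[:4])
```

With grouping, consecutive program IDs touch fewer distinct row blocks of A and column blocks of B, which is why the tutorial describes it as improving L2 cache reuse.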

Run

python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
streamlit run gemm_benchmark_tritonvstorch.py

What it should look like

See Image_Example.png in the repository for an example.

License

Released under the MIT License; see LICENSE.
