GPU Cloud Provider Benchmarking Toolkit — a collection of scripts to benchmark and compare GPU cloud providers across networking, compute, storage, and ML workloads.
| # | Benchmark | Script | What it measures |
|---|---|---|---|
| 1 | System Info | cluster_sysinfo.sh | GPU, CPU, memory, NVLink, InfiniBand, NUMA topology |
| 2 | Network Speed | speed_test.sh | Download/upload bandwidth, ping latency |
| 3 | PyTorch Install | pytorch_install_bench.sh | Time to install PyTorch via uv and pip (cold cache) |
| 4 | HF Download | hf_download_bench.sh | HuggingFace model download throughput |
| 5 | Container Pull | container_pull_bench.sh | Docker/Podman image pull time (NGC PyTorch) |
| 6 | GPU Compute | gpu_compute_bench.sh | FP32/FP16/BF16 matmul TFLOPS, memory bandwidth |
| 7 | Storage I/O | storage_io_bench.sh | Sequential read/write, random 4K IOPS |
| 8 | LLM Inference | llm_inference_bench.sh | TTFT, TPOT, TBT, throughput via vLLM |
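
The LLM inference metrics in the table are listed by acronym only. As a rough illustration, the sketch below derives them from per-token arrival timestamps using the commonly used definitions: TTFT (time to first token), TPOT (time per output token after the first), and TBT (time between consecutive tokens). The function name `latency_metrics` and its inputs are hypothetical, and `llm_inference_bench.sh` may compute these values differently.

```python
from statistics import mean


def latency_metrics(request_start, token_times):
    """Derive latency metrics from token arrival timestamps (in seconds).

    Hypothetical helper: `token_times` must be in increasing order and
    `request_start` is the time the request was sent.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")

    ttft = token_times[0] - request_start   # time to first token
    total = token_times[-1] - request_start  # end-to-end latency
    # Gaps between consecutive tokens (time between tokens, TBT).
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    # TPOT averages the decode phase over the tokens after the first one.
    tpot = (total - ttft) / (len(token_times) - 1) if gaps else 0.0

    return {
        "ttft_s": ttft,
        "tpot_s": tpot,
        "mean_tbt_s": mean(gaps) if gaps else 0.0,
        "max_tbt_s": max(gaps) if gaps else 0.0,
        "tokens_per_s": len(token_times) / total if total > 0 else float("inf"),
    }
```

For a single request, the mean TBT and TPOT coincide. They differ once results are aggregated across requests, which is one reason to report both.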
# Clone
git clone https://github.com/chloepilonv/neocloud-bench.git
cd neocloud-bench
# (Optional) Set up HF token for download benchmark
cp .env.example .env
# Edit .env with your HuggingFace token
# Run a benchmark
cd scripts/
./speed_test.sh my_provider 3
./cluster_sysinfo.sh my_provider
./gpu_compute_bench.sh my_provider 3
# Results are saved to results/{benchmark_name}/

All benchmark scripts follow the same pattern:
./scripts/{benchmark}.sh <provider_name> [num_runs]

- provider_name: a label for the cloud provider being tested (e.g., provider_a)
- num_runs: number of iterations (default: 3)

Results are written as CSV files to results/{benchmark_name}/.
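
Because every benchmark writes CSV files into its own subdirectory, the results can be collected generically. The following is a minimal sketch that assumes only the `results/{benchmark_name}/*.csv` layout described above and that each file has a header row. The column names are not documented here, so the helper (a hypothetical `load_results`) returns rows as plain dictionaries without interpreting them.

```python
import csv
from pathlib import Path


def load_results(results_dir="results"):
    """Return {benchmark_name: [row_dict, ...]} for every CSV found.

    Hypothetical helper: assumes one subdirectory per benchmark and a
    header row in each CSV file.
    """
    rows_by_benchmark = {}
    for csv_path in sorted(Path(results_dir).glob("*/*.csv")):
        with csv_path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
        # The parent directory name is the benchmark name.
        rows_by_benchmark.setdefault(csv_path.parent.name, []).extend(rows)
    return rows_by_benchmark
```

Calling `load_results()` from the repo root should then give one entry per benchmark that has been run.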
System Info: no runs parameter, just collects hardware/software inventory.
./scripts/cluster_sysinfo.sh provider_a

Network Speed: uses speedtest-cli (auto-installs if missing).
./scripts/speed_test.sh provider_a 5

PyTorch Install: benchmarks cold-cache install time with both uv and pip.
./scripts/pytorch_install_bench.sh provider_a 3

HuggingFace Download: requires HF_TOKEN in .env for gated models.
./scripts/hf_download_bench.sh provider_a 3

Container Pull: requires Docker or Podman.
./scripts/container_pull_bench.sh provider_a 3

GPU Compute: requires PyTorch with CUDA.
./scripts/gpu_compute_bench.sh provider_a 3

Storage I/O: uses dd for sequential I/O and fio for random I/O (optional).
./scripts/storage_io_bench.sh provider_a 3

LLM Inference: requires vllm to be installed. The model is configurable via the MODEL env var.
MODEL=Qwen/Qwen2.5-Coder-14B ./scripts/llm_inference_bench.sh provider_a 3

pip install -r requirements.txt
# Run from repo root
python analysis/plot_speed_test.py
python analysis/plot_pytorch_install.py
python analysis/plot_hf_download.py
python analysis/plot_container_pull.py
python analysis/plot_gpu_compute.py
python analysis/plot_storage_io.py
python analysis/plot_llm_inference.py

python analysis/generate_report.py

Generates analysis/output/report_{timestamp}.md and report_{timestamp}.html.
The HTML report is self-contained with embedded Plotly charts, sortable tables, and a clean design.
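
The individual plot scripts can also be run in one pass rather than one by one. The sketch below is a hypothetical convenience wrapper, not part of the toolkit: it assumes the scripts match `analysis/plot_*.py`, take no required arguments, and are run from the repo root.

```python
import subprocess
import sys
from pathlib import Path


def plot_commands(analysis_dir="analysis"):
    """Build one command per plot_*.py script, using the current interpreter."""
    scripts = sorted(Path(analysis_dir).glob("plot_*.py"))
    return [[sys.executable, str(script)] for script in scripts]


def run_all_plots(analysis_dir="analysis"):
    """Run every plot script in turn and stop at the first failure."""
    for command in plot_commands(analysis_dir):
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(command, check=True)
```

Using `sys.executable` keeps the scripts on the same interpreter (and virtual environment) that has the packages from requirements.txt installed.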
The examples/ directory contains sanitized sample data with realistic but fabricated numbers. Use it to test the analysis scripts:
python analysis/generate_report.py --results-dir examples/results/neocloud-bench/
scripts/                # Benchmark scripts (bash)
analysis/               # Analysis & plotting scripts (python)
  generate_report.py    # Unified report generator
  plot_*.py             # Individual benchmark chart generators
results/                # Benchmark output (gitignored)
examples/results/       # Sanitized example data
requirements.txt        # Python dependencies
.env.example            # Template for secrets
- Bash, Python 3.8+
- Per-benchmark: see individual script headers
- For analysis: pip install -r requirements.txt
- For LLM inference: pip install vllm
- Fork the repo
- Create a feature branch
- Run your benchmarks and verify CSV output
- Submit a PR
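
For the "verify CSV output" step, a quick structural check can catch empty or malformed files before a PR is opened. This is a minimal sketch with a hypothetical `check_csv` helper. It only checks that a file has a header, at least one data row, and a consistent field count, since the expected columns for each benchmark are not specified here.

```python
import csv
from pathlib import Path


def check_csv(path):
    """Return a list of problems found in one CSV file (empty list means none)."""
    with Path(path).open(newline="") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return ["file is empty"]

    problems = []
    header, data = rows[0], rows[1:]
    if not data:
        problems.append("header only, no data rows")
    # Row numbers count CSV records, with the header as row 1.
    for row_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, got {len(row)}"
            )
    return problems
```

Running it over every file matched by `results/*/*.csv` and printing any non-empty result is usually enough for a pre-PR sanity check.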
MIT