PROJECT: Performance Benchmarks Suite for Numba, Dask, JAX (CPU/GPU) #264
Overview
Build a comprehensive benchmark suite for the computational libraries commonly used in QuantEcon projects, to help define guidelines for optimal workflow and program structure.
Motivation
Different computational backends (Numba, Dask, JAX CPU, JAX GPU) have different performance characteristics and overheads:
- JAX GPU has kernel launch overheads, making it only worthwhile for problems above a certain size
- Numba has JIT compilation overhead on first call
- Dask has scheduling overhead that may not pay off for small datasets
- JAX CPU vs GPU crossover points vary by operation type
Currently, guidance on when to use each backend is based on rules of thumb. A systematic benchmarking suite would provide data-driven recommendations.
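As a concrete illustration of the first-call overhead mentioned above, a minimal stdlib-only timing harness can separate the first call (which, for Numba or JAX, includes compilation/tracing) from steady-state calls. The `workload` function here is a placeholder standing in for a JIT-compiled kernel; note that accurate JAX timing would additionally need `.block_until_ready()` because JAX dispatches asynchronously.

```python
import time

def time_first_vs_steady(fn, *args, repeats=5):
    """Time the first call separately from subsequent (warm) calls.

    For JIT backends such as Numba or JAX, the first call includes
    compilation/tracing overhead; later calls reflect steady-state cost.
    """
    t0 = time.perf_counter()
    fn(*args)
    first = time.perf_counter() - t0

    warm = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        warm.append(time.perf_counter() - t0)
    return first, min(warm)

# Placeholder workload standing in for a JIT-compiled kernel.
def workload(n):
    return sum(i * i for i in range(n))

first, steady = time_first_vs_steady(workload, 100_000)
```

Reporting the first call and the steady-state minimum separately lets the suite quantify compilation overhead rather than fold it into the mean.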
Proposed Features
1. Benchmark Categories
- Array operations: matrix multiplication, element-wise ops, reductions
- Numerical algorithms: linear solvers, optimization, root finding
- Monte Carlo simulations: varying sample sizes
- Dynamic programming: value function iteration, policy iteration
- Common QuantEcon patterns: Markov chains, asset pricing, etc.
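To sketch the Monte Carlo category, here is a minimal stdlib-only benchmark that times a pi estimator across sample sizes; in the actual suite each backend would supply its own implementation of the same kernel, and the names here are illustrative, not a proposed API.

```python
import random
import time

def mc_pi(n, seed=0):
    """Estimate pi by sampling n points in the unit square."""
    rng = random.Random(seed)
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(n))
    return 4.0 * hits / n

def bench(fn, *args):
    """Return wall-clock seconds for one call to fn."""
    t0 = time.perf_counter()
    fn(*args)
    return time.perf_counter() - t0

# Time the same kernel at several sample sizes.
sizes = [1_000, 10_000, 100_000]
timings = {n: bench(mc_pi, n) for n in sizes}
```

The same size-sweep pattern applies to the other categories (matmul dimensions, state-space sizes for value function iteration, and so on).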
2. Problem Size Scaling
For each benchmark, test across problem sizes to identify:
- Minimum problem size where GPU becomes beneficial
- Crossover points between backends
- Memory bandwidth vs compute bound regimes
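Given per-size timings for two backends, finding the crossover point is a simple scan. A minimal sketch, with made-up timings purely for illustration:

```python
def crossover_size(sizes, cpu_times, gpu_times):
    """Return the smallest measured problem size at which the GPU beats
    the CPU, or None if it never does within the measured range."""
    for n, t_cpu, t_gpu in zip(sizes, cpu_times, gpu_times):
        if t_gpu < t_cpu:
            return n
    return None

# Illustrative (made-up) timings in seconds:
sizes = [100, 500, 1000, 2000, 4000]
cpu   = [0.001, 0.020, 0.15, 1.2, 9.5]
gpu   = [0.005, 0.022, 0.06, 0.11, 0.3]
crossover_size(sizes, cpu, gpu)  # → 1000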
3. Automated Results Updates
- CI workflow that runs benchmarks on schedule (weekly/monthly)
- Triggered on new releases of key libraries (JAX, Numba, etc.)
- Results published to a dashboard or static site
- Historical tracking to show performance changes over versions
4. Guidelines Generation
- Automatically generate recommendations based on benchmark results
- "Use JAX GPU when matrix size > N" type guidance
- Integration with lecture documentation
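The guidance strings above could be generated mechanically from the measured crossover points. A hypothetical sketch (function name and wording are placeholders, not a settled format):

```python
def make_recommendation(op_name, crossover, unit="matrix size"):
    """Turn a measured crossover point into a one-line recommendation."""
    if crossover is None:
        return f"{op_name}: CPU preferred across all measured sizes."
    return f"{op_name}: use JAX GPU when {unit} > {crossover}."

make_recommendation("matmul", 1000)
# → "matmul: use JAX GPU when matrix size > 1000."
```

Emitting these as plain text or markdown would make it easy to splice them into lecture documentation during a docs build.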
Example Output
Matrix Multiplication Crossover Points (JAX 0.4.x, CUDA 12.x, A100)
-------------------------------------------------------------------
Size < 500x500: CPU faster (GPU overhead dominates)
Size 500x500 - 2000x2000: comparable performance
Size > 2000x2000: GPU 10-100x faster
Recommendation: Use JAX GPU for matrices larger than 1000x1000
Technical Approach
- Use pytest-benchmark or similar for consistent measurement
- Run on standardized hardware (GitHub Actions runners, cloud GPU instances)
- Store results in structured format (JSON/Parquet)
- Generate reports and visualizations automatically
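For the structured storage point, one flat record per observation (e.g. JSON Lines) keeps results easy to append from CI and easy to query later. A minimal sketch; the field names are an assumption, not a fixed schema:

```python
import datetime
import json
import platform

def result_record(benchmark, backend, size, seconds, library_versions):
    """One benchmark observation in a flat, easily queryable shape."""
    return {
        "benchmark": benchmark,            # e.g. "matmul"
        "backend": backend,                # e.g. "numba", "jax-gpu"
        "size": size,                      # problem size parameter
        "seconds": seconds,                # measured wall-clock time
        "library_versions": library_versions,
        "python": platform.python_version(),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

rec = result_record("matmul", "jax-gpu", 2048, 0.013, {"jax": "0.4.26"})
line = json.dumps(rec)  # append one JSON line per observation
```

Flat records like this convert directly to Parquet via pandas/pyarrow once historical volume makes JSON scans slow.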
Related
- QuantEcon lectures on JAX: https://jax.quantecon.org
- Numba lectures: https://python-programming.quantecon.org
- Google JAX benchmarks: https://github.com/google/jax/tree/main/benchmarks
Questions to Explore
- What hardware configurations should we benchmark on?
- Should this be a standalone repo or part of an existing project?
- How to handle GPU availability in CI (cost considerations)?
- What's the update frequency that balances freshness vs compute cost?