
PROJECT: Performance Benchmarks Suite for Numba, Dask, JAX (CPU/GPU) #264

@mmcky

Overview

Build a comprehensive benchmark suite for the computational libraries commonly used in QuantEcon projects, to help define guidelines for optimal workflow and program structure.

Motivation

Different computational backends (Numba, Dask, JAX CPU, JAX GPU) have different performance characteristics and overheads:

  • JAX GPU has kernel launch overheads, making it only worthwhile for problems above a certain size
  • Numba has JIT compilation overhead on first call
  • Dask has scheduling overhead that may not pay off for small datasets
  • JAX CPU vs GPU crossover points vary by operation type

Currently, guidance on when to use each backend is based on rules of thumb. A systematic benchmarking suite would provide data-driven recommendations.
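As a minimal illustration of why these overheads matter for measurement, the sketch below separates a function's first call (which would include one-time costs such as JIT compilation under Numba or tracing under JAX) from its warm steady-state time. It uses plain NumPy so it runs anywhere; the helper name is hypothetical, not part of any existing suite.

```python
import timeit

import numpy as np


def time_first_vs_warm(fn, *args, repeats=5):
    """Time the first call separately from subsequent warm calls.

    For JIT-compiled backends the first call includes compilation,
    so reporting only warm times (or only cold times) is misleading.
    """
    first = timeit.timeit(lambda: fn(*args), number=1)
    warm = min(timeit.timeit(lambda: fn(*args), number=1)
               for _ in range(repeats))
    return first, warm


# Toy example: NumPy has no JIT, so first and warm times are comparable;
# with a Numba- or JAX-compiled function the gap would be large.
x = np.random.default_rng(0).standard_normal(500)
first, warm = time_first_vs_warm(np.sort, x)
```

Taking the minimum over several warm repeats (rather than the mean) is a common choice for suppressing scheduler noise in micro-benchmarks.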

Proposed Features

1. Benchmark Categories

  • Array operations: matrix multiplication, element-wise ops, reductions
  • Numerical algorithms: linear solvers, optimization, root finding
  • Monte Carlo simulations: varying sample sizes
  • Dynamic programming: value function iteration, policy iteration
  • Common QuantEcon patterns: Markov chains, asset pricing, etc.
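One possible structure for organizing these categories is a simple registry that maps a category/name path to a parameterized benchmark callable, so the runner can enumerate and scale every benchmark uniformly. The registry, decorator, and naming scheme below are a sketch under assumed conventions, not an existing API.

```python
import numpy as np

# Hypothetical registry: each entry maps "category/name" to a callable
# taking a problem size n, so a runner can iterate over all benchmarks.
BENCHMARKS = {}


def register(name):
    """Decorator that records a benchmark under the given path."""
    def wrap(fn):
        BENCHMARKS[name] = fn
        return fn
    return wrap


@register("array/matmul")
def bench_matmul(n):
    # Dense matrix multiplication at size n x n.
    a = np.random.default_rng(0).standard_normal((n, n))
    return a @ a


@register("monte_carlo/pi")
def bench_mc_pi(n):
    # Estimate pi from n uniform draws in the unit square.
    rng = np.random.default_rng(0)
    pts = rng.random((n, 2))
    return 4 * np.mean((pts ** 2).sum(axis=1) < 1)
```

A runner could then loop over `BENCHMARKS.items()` and a grid of sizes without knowing anything about individual benchmarks.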

2. Problem Size Scaling

For each benchmark, test across problem sizes to identify:

  • Minimum problem size where GPU becomes beneficial
  • Crossover points between backends
  • Memory-bandwidth-bound vs compute-bound regimes
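Given per-backend timings over a size grid, the crossover point can be read off as the smallest size where one backend first beats the other. The helper and the synthetic timings below (a fixed launch overhead plus a faster per-element cost for the "GPU") are illustrative assumptions, not measured data.

```python
def crossover_size(sizes, cpu_times, gpu_times):
    """Return the smallest problem size at which the GPU timing beats
    the CPU timing, or None if it never does."""
    for n, c, g in zip(sizes, cpu_times, gpu_times):
        if g < c:
            return n
    return None


# Synthetic timings: the GPU pays a fixed launch overhead but has a
# much lower per-element cost, so it wins only at larger sizes.
sizes = [100, 500, 1000, 2000, 4000]
cpu = [n ** 2 * 1e-9 for n in sizes]           # hypothetical CPU scaling
gpu = [5e-4 + n ** 2 * 1e-10 for n in sizes]   # overhead + faster compute
```

With these toy numbers the crossover lands at n = 1000; real crossovers would come from the measured results.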

3. Automated Results Updates

  • CI workflow that runs benchmarks on schedule (weekly/monthly)
  • Triggered on new releases of key libraries (JAX, Numba, etc.)
  • Results published to a dashboard or static site
  • Historical tracking to show performance changes over versions

4. Guidelines Generation

  • Automatically generate recommendations based on benchmark results
  • "Use JAX GPU when matrix size > N" type guidance
  • Integration with lecture documentation
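Turning measured crossover points into the "Use JAX GPU when matrix size > N" style of guidance could be as simple as templating over the results. The function below is a sketch with placeholder names and thresholds, not generated from real benchmark output.

```python
def make_recommendation(op_name, crossover, unit="x"):
    """Render a human-readable rule from a measured crossover point.

    `crossover` is the smallest problem size at which the GPU wins;
    None means the CPU was faster at every tested size.
    """
    if crossover is None:
        return f"{op_name}: CPU is faster at all tested sizes."
    return (f"{op_name}: use JAX GPU when size > "
            f"{crossover}{unit}{crossover}")


msg = make_recommendation("matmul", 1000)
```

Emitting these strings from the same pipeline that stores the raw results keeps the lecture-facing guidance in sync with the data.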

Example Output

Matrix Multiplication Crossover Points (JAX 0.4.x, CUDA 12.x, A100)
-------------------------------------------------------------------
Size < 500x500:          CPU faster (GPU overhead dominates)
Size 500x500-2000x2000:  Similar performance
Size > 2000x2000:        GPU 10-100x faster

Recommendation: Use JAX GPU for matrices larger than 1000x1000

Technical Approach

  • Use pytest-benchmark or similar for consistent measurement
  • Run on standardized hardware (GitHub Actions runners, cloud GPU instances)
  • Store results in structured format (JSON/Parquet)
  • Generate reports and visualizations automatically
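For the structured-storage point, one lightweight option is a flat, schema-stable record per observation, appended as JSON lines and later loaded into a dataframe for reporting. The field names below are an assumed schema, offered only as a starting point.

```python
import datetime
import json
import platform


def record_result(name, size, backend, seconds):
    """One benchmark observation as a flat dict suitable for a
    JSON-lines log; hardware/version fields support historical tracking."""
    return {
        "benchmark": name,
        "size": size,
        "backend": backend,
        "seconds": seconds,
        "python": platform.python_version(),
        "timestamp": datetime.datetime.now(
            datetime.timezone.utc).isoformat(),
    }


row = record_result("matmul", 1024, "jax-cpu", 0.012)
line = json.dumps(row)  # append this line to results.jsonl
```

Keeping every run append-only (rather than overwriting a summary) is what makes the version-over-version tracking in section 3 possible.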

Related

Questions to Explore

  1. What hardware configurations should we benchmark on?
  2. Should this be a standalone repo or part of an existing project?
  3. How should we handle GPU availability in CI, given cost considerations?
  4. What's the update frequency that balances freshness vs compute cost?
