Rework CI benchmark suite for per-stage granularity and I/O elimination

## Current State

**What's measured:**
- `bundle` benchmark: end-to-end `bundler.generate()` (scan + link + generate combined)
- `scan` benchmark: `bundler.scan()` (module loading + parsing + AST scanning)
- Test cases: threejs, rome_ts, multi-duplicated-symbols, with sourcemap/minify variants
- Micro-benchmarks: sourcemap joining, string concatenation

**What's NOT measured independently:**
- Link stage (symbol binding, tree shaking, import/export resolution, cross-module optimization)
- Generate stage (code splitting, cross-chunk linking, chunk rendering, minification)

## Problems

### 1. I/O dominates measurements

Both the `bundle` and `scan` benchmarks read hundreds of files from disk. This introduces noise on real machines (disk cache variability) and makes instruction counts in CodSpeed unreliable (I/O syscalls). The Node.js CI benchmark has a 110% alert threshold, meaning regressions of up to 10% go undetected.

### 2. No per-stage granularity

The bundler pipeline has three distinct stages (scan, link, generate), but only the end-to-end run and the scan stage are benchmarked. When someone optimizes tree shaking or code generation, the improvement is diluted across the full pipeline and is often invisible.

### 3. Recent optimizations were undetectable

Examples from commit history:
- `HashMap → IndexVec/IndexBitSet` for symbol tracking (link stage)
- Flag-based convergence in `include_statements` (tree shaking, link stage)
- String operation fast paths (generate stage)
- `IndexBitSet` for skipped plugins checking (link stage)
- Path allocation avoidance (scan stage, but diluted by I/O)

These are all sub-stage optimizations that get lost in end-to-end numbers.
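
To make the dilution concrete, here is a minimal sketch of the arithmetic. The stage share and speedup used in the usage note are made-up illustrative numbers, not measurements from this repository, and both function names are hypothetical.

```rust
/// Fractional reduction in end-to-end time when a stage that accounts for
/// `stage_share` of total time becomes `stage_reduction` faster.
/// Both arguments are fractions in the range 0.0..=1.0.
pub fn end_to_end_reduction(stage_share: f64, stage_reduction: f64) -> f64 {
    stage_share * stage_reduction
}

/// True if a change of size `change` exceeds the alert `threshold`
/// (for example 0.10 for a 110% alert threshold).
pub fn is_detectable(change: f64, threshold: f64) -> bool {
    change > threshold
}
```

For example, if the link stage were 20% of an end-to-end run and an optimization made it 15% faster, `end_to_end_reduction(0.20, 0.15)` gives roughly 0.03, a 3% change that sits well inside a 10% threshold and inside typical I/O noise. Measured against a link-only benchmark, the same change would show up as the full 15%.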

### 4. Link and generate stages are pure computation but aren't benchmarked independently

The link stage takes `NormalizedScanStageOutput` and does symbol resolution, tree shaking, etc. entirely in memory (zero I/O). The generate stage is also mostly I/O-free. Yet neither is benchmarked independently, so we miss the chance for noise-free measurements.
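
One way to benchmark a pure-computation stage on its own is to build its input outside the timed region and time only the stage call. The sketch below uses only the standard library; `time_stage`, `FakeScanOutput`, `fake_scan`, and `fake_link` are hypothetical stand-ins and not the real rolldown APIs.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times only `stage`. `setup` rebuilds a fresh input for every sample and is
/// excluded from the measurement. Returns the median of `samples` runs.
pub fn time_stage<I, O>(
    samples: usize,
    mut setup: impl FnMut() -> I,
    mut stage: impl FnMut(I) -> O,
) -> Duration {
    assert!(samples > 0, "need at least one sample");
    let mut timings = Vec::with_capacity(samples);
    for _ in 0..samples {
        let input = setup(); // untimed: e.g. run scan to produce the link input
        let start = Instant::now();
        let output = black_box(stage(black_box(input)));
        timings.push(start.elapsed());
        drop(output); // untimed: teardown happens after the clock is read
    }
    timings.sort();
    timings[samples / 2]
}

/// Hypothetical stand-in for scan output (the real type is `NormalizedScanStageOutput`).
pub struct FakeScanOutput {
    pub symbols: Vec<u32>,
}

/// Hypothetical stand-in for the scan stage; the real one does I/O and parsing.
pub fn fake_scan() -> FakeScanOutput {
    FakeScanOutput { symbols: (0..1_000).rev().collect() }
}

/// Hypothetical stand-in for the link stage: in-memory work only.
pub fn fake_link(mut input: FakeScanOutput) -> usize {
    input.symbols.sort_unstable();
    input.symbols.len()
}
```

Calling `time_stage(50, fake_scan, fake_link)` would report the median time of the link stand-in alone. In a Criterion-based benchmark, `Bencher::iter_batched` offers the same separation between setup and the measured routine, so a real implementation would likely use that instead of a hand-rolled timer.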

### 5. Bundler is hardcoded to `OsFileSystem`

`ScanStage`, `SharedResolver`, `Bundle`, and `prepare_build_context()` all use `OsFileSystem` directly. A `MemoryFileSystem` implementation already exists in `crates/rolldown_fs/src/memory.rs` (feature-gated behind `memory`), but it can't be used because the entry points aren't generic over the `FileSystem` trait.
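
As a rough illustration of what "generic over the `FileSystem` trait" could look like, here is a self-contained sketch. The trait shape, `OsFs`, `MemoryFs`, and `load_sources` are all assumptions made for this example; the actual trait and `MemoryFileSystem` in `rolldown_fs` may have different names and signatures.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical minimal trait; the real `FileSystem` trait may look different.
pub trait FileSystem {
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
}

/// Disk-backed implementation, standing in for `OsFileSystem`.
pub struct OsFs;

impl FileSystem for OsFs {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

/// In-memory implementation, standing in for `MemoryFileSystem`.
#[derive(Default)]
pub struct MemoryFs {
    files: HashMap<PathBuf, String>,
}

impl MemoryFs {
    pub fn add_file(&mut self, path: impl Into<PathBuf>, source: impl Into<String>) {
        self.files.insert(path.into(), source.into());
    }
}

impl FileSystem for MemoryFs {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        self.files.get(path).cloned().ok_or_else(|| {
            io::Error::new(io::ErrorKind::NotFound, path.display().to_string())
        })
    }
}

/// Hypothetical entry point that is generic over the file system, so a
/// benchmark can pass `MemoryFs` while production code passes `OsFs`.
/// Returns the total number of source bytes loaded.
pub fn load_sources<F: FileSystem>(fs: &F, paths: &[&Path]) -> io::Result<usize> {
    let mut total = 0;
    for path in paths {
        total += fs.read_to_string(path)?.len();
    }
    Ok(total)
}
```

With this shape, a benchmark would preload fixture files into the in-memory implementation once and run the scan without touching disk. Static dispatch through a type parameter keeps the production path free of dynamic-dispatch overhead, although a `dyn FileSystem` object would be a reasonable alternative if generics spread too far through the codebase.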

### 6. No benchmark cases targeting specific optimization patterns

The suite has no heavy tree-shaking scenario (large library, small subset used), no heavy code-splitting scenario (many entry points, shared dependencies), and no deep re-export chain scenario (barrel files).
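
Targeted cases like these could be generated synthetically rather than checked in. The sketch below produces `(path, source)` pairs for each of the three scenarios; the function names, file names, and generated JavaScript are all hypothetical and only meant to show the shape of such fixtures.

```rust
/// Large library, small subset used: `lib.js` exports `exports` functions
/// and the entry imports only the first `used` of them.
pub fn tree_shaking_fixture(exports: usize, used: usize) -> Vec<(String, String)> {
    assert!(used <= exports, "cannot use more exports than exist");
    let lib: String = (0..exports)
        .map(|i| format!("export function f{i}() {{ return {i}; }}\n"))
        .collect();
    let names: Vec<String> = (0..used).map(|i| format!("f{i}")).collect();
    let list = names.join(", ");
    let entry = format!("import {{ {list} }} from './lib.js';\nconsole.log({list});\n");
    vec![("/lib.js".to_string(), lib), ("/entry.js".to_string(), entry)]
}

/// Many entry points that all import one shared dependency.
pub fn code_splitting_fixture(entries: usize) -> Vec<(String, String)> {
    let mut files = vec![(
        "/shared.js".to_string(),
        "export const shared = 'shared';\n".to_string(),
    )];
    for i in 0..entries {
        files.push((
            format!("/entry{i}.js"),
            format!("import {{ shared }} from './shared.js';\nconsole.log(shared, {i});\n"),
        ));
    }
    files
}

/// Deep re-export chain: `depth` barrel files, each re-exporting the previous one.
pub fn barrel_chain_fixture(depth: usize) -> Vec<(String, String)> {
    let mut files = vec![(
        "/leaf.js".to_string(),
        "export const value = 1;\n".to_string(),
    )];
    let mut prev = "./leaf.js".to_string();
    for level in 0..depth {
        files.push((format!("/barrel{level}.js"), format!("export * from '{prev}';\n")));
        prev = format!("./barrel{level}.js");
    }
    files.push((
        "/entry.js".to_string(),
        format!("import {{ value }} from '{prev}';\nconsole.log(value);\n"),
    ));
    files
}
```

The generated pairs could be loaded into an in-memory file system before the benchmark starts, which keeps fixture size a parameter of the benchmark rather than a property of checked-in files.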

## Key Code References

- `crates/bench/benches/bundle.rs` — current end-to-end bundle benchmark
- `crates/bench/benches/scan.rs` — current scan-only benchmark
- `crates/rolldown/src/bundle/bundle.rs:235-247` — `bundle_up()` showing scan → link → generate pipeline
- `crates/rolldown/src/stages/link_stage/mod.rs:197` — `LinkStage::link()` (pure computation)
- `crates/rolldown/src/stages/generate_stage/mod.rs:82` — `GenerateStage::generate()`
- `crates/rolldown_fs/src/memory.rs` — existing `MemoryFileSystem` implementation
- `crates/rolldown/src/utils/prepare_build_context.rs` — where `OsFileSystem` is hardcoded
