perf(semantic): pre-reserve unresolved_references using Stats::references by Dunqing · Pull Request #22580 · oxc-project/oxc

Dunqing · 2026-05-19T11:59:30Z

Summary

The unresolved_references Vec<(Ident, ReferenceId)> inside SemanticBuilder grows from cap = 0 via the default doubling policy — that's ~13 reallocations to reach the ~5k references of a moderately-sized TypeScript file (193 KB binder.ts), copying a cumulative ~16k entries (~250 KB of memory traffic) just to grow the buffer.

Stats already knows the exact reference count up front (via Stats::count or passed in by the caller via SemanticBuilder::with_stats). Pre-reserving with stats.references eliminates all the growth reallocations.

8-line internal change. No public API impact.

Benchmark

`cargo bench --bench semantic` (local Criterion)

Bench	main	this PR	Δ
`RadixUI.jsx` (3 KB)	3.155 µs	3.046 µs	−3.5%
`react.development.js` (72 KB)	136.47 µs	133.78 µs	−2.0%
`cal.com.tsx` (1 MB TSX)	2.7414 ms	2.7064 ms	−1.3%
`binder.ts` (193 KB TS)	314.90 µs	309.46 µs	−1.7%

All Criterion intervals non-overlapping.

CodSpeed (precise, instruction-count based, from #22571 which tested the same commit)

Bench	Δ
`pipeline[binder.ts]`	−4.34%
`semantic[binder.ts]`	−9.75%
`semantic[react.development.js]`	−5.06%

CodSpeed's instruction-count measurement is more sensitive to allocator-call elimination than wall-clock Criterion, so the numbers are larger but consistent in direction.

Why `reserve_exact` (not `reserve`)

stats.references is the exact total push count (asserted in debug builds via Stats::assert_accurate). No need for amortized growth headroom.
Matches the existing precedent in Scoping::reserve which uses reserve_exact on arena-allocated structures.
Sub-100% reserves (tested 50%) were empirically worse than no reserve — the single resulting realloc is a large-block alloc that misses the size-class fast paths the small doubling reallocs hit. So 100% is the optimum.

Test Plan

cargo test -p oxc_semantic --lib --tests — 71 tests pass
cargo bench --bench semantic — wins above
CodSpeed re-run on commit (positive results above)

AI disclosure: drafted with Claude Code, reviewed manually.

codspeed-hq · 2026-05-19T12:06:36Z

Merging this PR will improve performance by 6.29%

⚡ 3 improved benchmarks
✅ 45 untouched benchmarks
⏩ 3 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	Simulation	`pipeline[binder.ts]`	13.9 ms	13.4 ms	+4.08%
⚡	Simulation	`semantic[react.development.js]`	1.1 ms	1.1 ms	+5.17%
⚡	Simulation	`semantic[binder.ts]`	2.9 ms	2.6 ms	+9.71%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing claude/perf-reserve-unresolved-refs (2dced86) with main (ae98296)}

3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

Boshen · 2026-05-19T13:53:34Z

Merge activity

May 19, 1:53 PM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
May 19, 1:53 PM UTC: Boshen added this pull request to the Graphite merge queue.
May 19, 1:59 PM UTC: Merged by the Graphite merge queue.

Boshen

Nice!

…nces (#22580) ## Summary The `unresolved_references` `Vec<(Ident, ReferenceId)>` inside `SemanticBuilder` grows from `cap = 0` via the default doubling policy — that's ~13 reallocations to reach the ~5k references of a moderately-sized TypeScript file (193 KB `binder.ts`), copying a cumulative ~16k entries (~250 KB of memory traffic) just to grow the buffer. `Stats` already knows the exact reference count up front (via `Stats::count` or passed in by the caller via `SemanticBuilder::with_stats`). Pre-reserving with `stats.references` eliminates all the growth reallocations. 8-line internal change. No public API impact. ## Benchmark ### `cargo bench --bench semantic` (local Criterion) | Bench | main | this PR | Δ | |---|---|---|---| | `RadixUI.jsx` (3 KB) | 3.155 µs | 3.046 µs | **−3.5%** | | `react.development.js` (72 KB) | 136.47 µs | 133.78 µs | **−2.0%** | | `cal.com.tsx` (1 MB TSX) | 2.7414 ms | 2.7064 ms | **−1.3%** | | `binder.ts` (193 KB TS) | 314.90 µs | 309.46 µs | **−1.7%** | All Criterion intervals non-overlapping. ### CodSpeed (precise, instruction-count based, from #22571 which tested the same commit) | Bench | Δ | |---|---| | `pipeline[binder.ts]` | **−4.34%** | | `semantic[binder.ts]` | **−9.75%** | | `semantic[react.development.js]` | **−5.06%** | CodSpeed's instruction-count measurement is more sensitive to allocator-call elimination than wall-clock Criterion, so the numbers are larger but consistent in direction. ## Why `reserve_exact` (not `reserve`) - `stats.references` is the exact total push count (asserted in debug builds via `Stats::assert_accurate`). No need for amortized growth headroom. - Matches the existing precedent in `Scoping::reserve` which uses `reserve_exact` on arena-allocated structures. - Sub-100% reserves (tested 50%) were empirically *worse* than no reserve — the single resulting realloc is a large-block alloc that misses the size-class fast paths the small doubling reallocs hit. So 100% is the optimum. ## Test Plan - [x] `cargo test -p oxc_semantic --lib --tests` — 71 tests pass - [x] `cargo bench --bench semantic` — wins above - [x] CodSpeed re-run on commit (positive results above) AI disclosure: drafted with Claude Code, reviewed manually.

## Summary Adds `kitchen-sink.tsx` — a comprehensive synthetic TypeScript+JSX fixture maintained at [oxc-project/benchmark-files](https://github.com/oxc-project/benchmark-files) — to both `TestFiles::minimal()` (bench input set) and `TestFiles::complicated()` (alloc-tracking input set). The existing files in each set are untouched; this is a strict append. ## Why The existing bench input set didn't reliably surface general-purpose perf wins above the ~1-2% measurement noise floor: - #22580 (semantic pre-reserve) — visible because `binder.ts` exercises it - #22594 (formatter buffer) — visible - #22596 (minifier `try_fold_concat`) — **not visible** on the old set - #22599 (semantic resolve-refs no-temp-Vec) — **not visible** - #22603 (semantic var-hoist SmallVec) — **not visible** The kitchen-sink fixes that by exercising every AST node, every transformer plugin, every minifier optimization opportunity, and every semantic step in one large file. Verified by re-benching #22596 against this fixture: **minifier mean −1.5%, min −3.7%** — above noise, signal confirmed. ## Fixture stats (cross-checked locally) | Metric | Value | |---|---| | Source size | 21,117 lines / 732.90 kB | | AST nodes | ~133,000 | | Scopes | ~4,750 | | Symbols | ~7,000 | | Resolved references | ~16,000 | | Semantic diagnostics | 0 errors / 0 warnings | ## Snap baselines `tasks/track_memory_allocations/allocs_*.snap` updated with the kitchen-sink row across all 5 pipelines (parser / semantic / transformer / minifier / formatter). Future PRs that change allocation behavior on this fixture will produce a snap diff in CI. ## Bench-cleaner fix `tasks/benchmark/benches/lexer.rs`'s `SourceCleaner` was missing `visit_ts_template_literal_type` — TypeScript type-level template literals (e.g. `` `${T}-${U}` `` in conditional / mapped types) are syntactically identical to value-level template literals, so the bench-mode lexer (without parser context) cannot distinguish them. Without the cleaner converting them to plain strings, kitchen-sink's type-level templates caused the lexer bench to swallow ~1 KB spans as a single `TemplateHead` and produce spurious "Unterminated string" / "Invalid Unicode escape" errors. One-line fix to mirror the existing `visit_template_literal` handling. AI disclosure: drafted with Claude Code, reviewed manually.

…ec (#22599) ## Summary `SemanticBuilder::resolve_references_for_current_scope` is the early-resolution hook fired at function / arrow / catch boundaries — it resolves the references collected since a checkpoint against the outer scope chain so that e.g. `function f(x = outerRef) {}` correctly binds `outerRef` to the outer scope rather than the function body. The old implementation copied the pending slice into a fresh `Vec` on every call: ```rust let refs = self.unresolved_references.slice_from(checkpoint).to_vec(); self.unresolved_references.truncate(checkpoint); for (name, reference_id) in refs { ... } ``` The `to_vec()` is a fresh heap allocation, needed because `walk_up_resolve_reference` takes `&mut self` and would otherwise alias the borrow on the underlying `Vec<(Ident, ReferenceId)>`. Rewrite in-place with a retain-style write cursor — each `(Ident, ReferenceId)` is read by **value** (both are `Copy`), which detaches it from the borrow on the inner Vec. Unresolved refs are compacted forward; resolved refs are dropped via the final `truncate`. No temporary `Vec`, no allocation, **no `unsafe`** — safe bounds-checked indexing via new `get` / `set` methods on `UnresolvedReferences`. ```rust let mut write_idx = checkpoint; for read_idx in checkpoint..end { let (name, reference_id) = self.unresolved_references.get(read_idx); if !self.walk_up_resolve_reference(name, reference_id) { if write_idx != read_idx { self.unresolved_references.set(write_idx, name, reference_id); } write_idx += 1; } } self.unresolved_references.truncate(write_idx); ``` ## Allocation impact `cargo allocs`, `allocs_semantic.snap` (baseline = parent commit, after var-hoist landed): | File | Size | Sys allocs before | Sys allocs after | Δ | |---|---|---|---|---| | `checker.ts` | 2.92 MB | 2,309 | **53** | **−2,256 (−98%)** | | `App.tsx` | 415 kB | 174 | **35** | **−139 (−80%)** | | `binder.ts` | 193 kB | 196 | **19** | **−177 (−90%)** | | `kitchen-sink.tsx` | 733 kB | 1,320 | **290** | **−1,030 (−78%)** | | `pdf.mjs` | 567 kB | 1,366 | 1,355 | −11 | | `antd.js` | 6.69 MB | 1,072 | 1,072 | 0 (ES5 bundled code rarely closes over outer scope from function params) | | `RadixUI.jsx` | 2.5 kB | 10 | 10 | 0 | ## How I found this After [#22580](#22580) and [#22590](#22590) the remaining semantic sys allocs were unexplained. Ran a backtrace-capturing `System` allocator wrapper on `checker.ts` and aggregated allocations by call stack — the overwhelming majority of captured allocations converged on a single site: ``` oxc_semantic::builder::SemanticBuilder::resolve_references_for_current_scope oxc_semantic::builder::SemanticBuilder as ...::Visit::visit_function ... ``` Tracing back to the `slice_from(checkpoint).to_vec()` line and confirming that the entire allocation pattern disappears with the in-place rewrite. ## Why this matters for downstream consumers `resolve_references_for_current_scope` fires once per function / arrow / catch — for normal TS / TSX code that's hundreds to thousands of calls per build. Each call was a fresh heap allocation; eliminating them removes that allocator pressure entirely. For rolldown's preprocessing pipeline (which builds semantic 3-4 times per file across hundreds of files per bundle), the per-build savings compound. ## No behavior change The in-place algorithm preserves the original semantics exactly: `unresolved_references[..checkpoint]` is untouched, and `[checkpoint..]` is the list of refs that didn't resolve, in their original order. Same as the old code. ## Test Plan - [x] `cargo test -p oxc_semantic --lib --tests` — pass - [x] `cargo allocs` — semantic snapshot updated to reflect the reductions AI disclosure: drafted with Claude Code, reviewed manually.

## Summary When the minifier folds `.concat()` calls into template literals via `try_fold_concat` in `peephole/replace_known_methods.rs`, the original implementation built two short-lived intermediate buffers per call: - `quasi_strs: Vec<Cow<'a, str>>` — fresh `std::Vec` allocation per call. - A `String` inside each `Cow::Owned` — created when `Cow::to_mut()` cloned a `Borrowed` into an `Owned`, then grown via doubling as more arg strings were pushed in. For files with many `.concat()` calls — common in bundled ES5 libraries where template literals weren't available — both structures were freshly allocated thousands of times per minify, with the `String` growing via doubling each time. This PR replaces those per-call buffers with a single reusable scratch `String` held on `MinifierState`, and constructs `TemplateElement` AST nodes directly into the arena `Vec` as args are drained. The intermediate `Vec<Cow<'a, str>>` is gone entirely. After the first few folds the scratch's capacity stabilizes and subsequent folds are amortized zero std-heap allocs. The state machine simplifies in the process: the `pushed_quasi: bool` flag is gone. Its invariant ("scratch holds an in-progress quasi") is naturally maintained because every expression flush clears the scratch, so a back-to-back expression's flush produces the required empty separator quasi without a branch. ## Allocation impact `cargo allocs`, `allocs_minifier.snap` (baseline = parent commit, after resolve-refs and var-hoist landed): | File | Size | Sys allocs before | Sys allocs after | Sys reallocs before | Sys reallocs after | |---|---|---|---|---|---| | `antd.js` | 6.69 MB ES5 | **4,652** | **3,084** (`−1,568`, `−33.7%`) | **1,622** | **53** (`−1,569`) | | Other tracked files | various | unchanged | unchanged | unchanged | unchanged | Only `antd.js` shows a change because it's the only tracked file that exercises `.concat()` folding heavily. ES5 bundled code uses `.concat()` instead of template literals; antd.js has thousands of these. The headline `Sys reallocs` improvement (`1,622` → `53`) comes from eliminating the per-call `String` doubling. The headline `Sys allocs` improvement (`4,652` → `3,084`) comes from eliminating the per-call `Vec<Cow<'a, str>>` allocation. ## Commits This PR contains two commits, kept separate for review history: 1. **`perf(minifier): pre-size buffers in try_fold_concat to eliminate growth reallocs`** — the original tactical fix that pre-sized the doomed-to-be-removed intermediate buffers. Dropped sys reallocs first but didn't touch sys allocs. 2. **`perf(minifier): drop per-call buffers in try_fold_concat`** — the root-cause refactor described above. Removes the intermediate buffers entirely. Cuts both columns. Once the second commit lands, the pre-sizing in the first becomes dead-code (the buffers it pre-sized no longer exist). They're kept as separate commits so the diff against `main` is reviewable — feel free to squash on merge. ## How I found this After #22580, profiling on `antd.js` minifier showed it had unusually high sys reallocs vs other tracked files. I wrote a backtrace-capturing `System` allocator wrapper and ran it on the minify path. The top realloc sites all converged on `try_fold_concat` — the overwhelming majority of captured reallocs came from this one function. After landing the pre-size fix, code review pushed for a root-cause fix that also addressed the `Sys allocs` column, which this refactor delivers. ## Timing The minifier bench suite (originally) didn't include antd.js or other `.concat()`-heavy fixtures, so bench numbers locally were flat (no regression, no obvious improvement). With kitchen-sink in the bench input set (#22609), CodSpeed shows: **minifier mean −1.5%, min −3.7%** on kitchen-sink. ## Verification - `cargo test -p oxc_minifier` — pass. - `cargo minsize` — no size regressions (`minsize.snap` byte-identical). - `cargo allocs` — only `antd.js` row changed in `allocs_minifier.snap`. - `cargo fmt -p oxc_minifier` clean. AI disclosure: drafted with Claude Code, reviewed manually.

### 🚀 Features - e857b0c napi/minify: Expose legalComments option and result (#20370) (Boshen) - 661132d parser: More friendly error messages for rest assignment target and rest binding element (#22719) (sapphi-red) - ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option for nullable-union design:type (#22266) (Kyle Cannon) ### 🐛 Bug Fixes - e1d064e transformer/class-properties: Reparent lifted private method helpers (#22716) (Cameron) - 4ac0fca minifier: Preserve `0 && (module.exports = { ... })` cjs-module-lexer hint (#22729) (Dunqing) - 40ff611 minifier: Mark peephole loop changed when dropping dead-after-throw statement (#22722) (Dunqing) - 2f7b210 codegen: Emit pife-arrow/function leading comments inside the wrap (#22720) (Dunqing) - e184f74 parser: Improve invalid `import` property access diagnostic (#22693) (camc314) - 7baed9c transformer/private-method: Clear inherited strict flags (#22508) (camc314) - a9ad27e parser: Keep annotation comments leading without preceding newline (#22711) (Dunqing) - 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after peephole inlining (#22595) (Dunqing) - 07afbb6 minifier: Drop empty-body IIFE wrapper when called with arguments (#22589) (Dunqing) - fa7c463 semantic: Correct TS enum member symbol spans (#22689) (camc314) - 26b9396 semantic: Resolve parameter decorators outside parameter scope (#22623) (camc314) - b284045 parser: Switch to module goal eagerly on `export` (#22684) (Boshen) - dfa931d semantic: Propagate unresolved auto-increment enum value instead of defaulting to 0 (#22646) (Dunqing) - 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T> in decorator metadata (#22265) (Kyle Cannon) - e421ef0 transformer/legacy-decorator: Return runtime binding for design:type (#22640) (Dunqing) - d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects comments (#22525) (Dunqing) - 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547) (Dunqing) - 917da24 parser: Apply PURE comment through member-access chains (#22566) (Dunqing) - a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings (#22551) (Dunqing) ### ⚡ Performance - 2f623b0 semantic: Skip unresolved checks for re-exports (#22660) (camc314) - 0d9553d semantic: Early-exit `check_object_expression` for objects with <2 properties (#22668) (Dunqing) - d721ad9 semantic: Use direct grandparent lookup for TS type parameters (#22658) (camc314) - 0aff288 semantic: Reorder numeric literal strict mode checks (#22657) (camc314) - 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314) - e32acd8 semantic: Reorder identifier ambient binding check (#22653) (camc314) - 09fe178 semantic: Reorder ident reference strict mode check (#22652) (camc314) - 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663) (camc314) - 82f9662 parser: Check identifier kind before context flag (#22662) (camc314) - d7cd951 parser: Fast path identifier parsing and inline operator helpers (#22650) (Boshen) - 7b84314 semantic: Use direct byte access for numeric leading-zero check (#22642) (camc314) - 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314) - 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596) (Dunqing) - 4f289f1 semantic: Resolve_references_for_current_scope without a temp Vec (#22599) (Dunqing) - e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603) (Dunqing) - 8ff8674 semantic: Early return if `excess` is `0` in `Stats::increase_by` (#22616) (camc314) - 7a4120e semantic: Pre-reserve unresolved_references using Stats::references (#22580) (Dunqing) Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>

github-actions Bot added the A-semantic Area - Semantic label May 19, 2026

Dunqing mentioned this pull request May 19, 2026

perf(semantic): pre-reserve ClassTable per-class IndexVecs using Stats::classes #22581

Closed

3 tasks

Boshen added the 0-merge Merge with Graphite Merge Queue label May 19, 2026

Boshen approved these changes May 19, 2026

View reviewed changes

graphite-app Bot force-pushed the claude/perf-reserve-unresolved-refs branch from 2dced86 to 7a4120e Compare May 19, 2026 13:54

graphite-app Bot merged commit 7a4120e into main May 19, 2026
29 checks passed

graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label May 19, 2026

graphite-app Bot deleted the claude/perf-reserve-unresolved-refs branch May 19, 2026 13:59

oxc-guard Bot mentioned this pull request May 26, 2026

release(crates): oxc v0.133.0 #22734

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf(semantic): pre-reserve unresolved_references using Stats::references#22580

perf(semantic): pre-reserve unresolved_references using Stats::references#22580
graphite-app[bot] merged 1 commit into
mainfrom
claude/perf-reserve-unresolved-refs

Dunqing commented May 19, 2026

Uh oh!

codspeed-hq Bot commented May 19, 2026 •

edited

Loading

Uh oh!

Boshen commented May 19, 2026 •

edited by graphite-app Bot

Loading

Uh oh!

Boshen left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Dunqing commented May 19, 2026

Summary

Benchmark

cargo bench --bench semantic (local Criterion)

CodSpeed (precise, instruction-count based, from #22571 which tested the same commit)

Why reserve_exact (not reserve)

Test Plan

Uh oh!

codspeed-hq Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will improve performance by 6.29%

Performance Changes

Footnotes

Uh oh!

Boshen commented May 19, 2026 • edited by graphite-app Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merge activity

Uh oh!

Boshen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`cargo bench --bench semantic` (local Criterion)

Why `reserve_exact` (not `reserve`)

codspeed-hq Bot commented May 19, 2026 •

edited

Loading

Boshen commented May 19, 2026 •

edited by graphite-app Bot

Loading