perf(semantic): pre-reserve unresolved_references using Stats::references#22580
Merged
Merged
Conversation
Merging this PR will improve performance by 6.29%
Performance Changes
Tip Curious why this is faster? Comment Comparing Footnotes
|
3 tasks
Member
Merge activity
|
…nces (#22580) ## Summary The `unresolved_references` `Vec<(Ident, ReferenceId)>` inside `SemanticBuilder` grows from `cap = 0` via the default doubling policy — that's ~13 reallocations to reach the ~5k references of a moderately-sized TypeScript file (193 KB `binder.ts`), copying a cumulative ~16k entries (~250 KB of memory traffic) just to grow the buffer. `Stats` already knows the exact reference count up front (via `Stats::count` or passed in by the caller via `SemanticBuilder::with_stats`). Pre-reserving with `stats.references` eliminates all the growth reallocations. 8-line internal change. No public API impact. ## Benchmark ### `cargo bench --bench semantic` (local Criterion) | Bench | main | this PR | Δ | |---|---|---|---| | `RadixUI.jsx` (3 KB) | 3.155 µs | 3.046 µs | **−3.5%** | | `react.development.js` (72 KB) | 136.47 µs | 133.78 µs | **−2.0%** | | `cal.com.tsx` (1 MB TSX) | 2.7414 ms | 2.7064 ms | **−1.3%** | | `binder.ts` (193 KB TS) | 314.90 µs | 309.46 µs | **−1.7%** | All Criterion intervals non-overlapping. ### CodSpeed (precise, instruction-count based, from #22571 which tested the same commit) | Bench | Δ | |---|---| | `pipeline[binder.ts]` | **−4.34%** | | `semantic[binder.ts]` | **−9.75%** | | `semantic[react.development.js]` | **−5.06%** | CodSpeed's instruction-count measurement is more sensitive to allocator-call elimination than wall-clock Criterion, so the numbers are larger but consistent in direction. ## Why `reserve_exact` (not `reserve`) - `stats.references` is the exact total push count (asserted in debug builds via `Stats::assert_accurate`). No need for amortized growth headroom. - Matches the existing precedent in `Scoping::reserve` which uses `reserve_exact` on arena-allocated structures. - Sub-100% reserves (tested 50%) were empirically *worse* than no reserve — the single resulting realloc is a large-block alloc that misses the size-class fast paths the small doubling reallocs hit. So 100% is the optimum. ## Test Plan - [x] `cargo test -p oxc_semantic --lib --tests` — 71 tests pass - [x] `cargo bench --bench semantic` — wins above - [x] CodSpeed re-run on commit (positive results above) AI disclosure: drafted with Claude Code, reviewed manually.
2dced86 to
7a4120e
Compare
This was referenced May 19, 2026
graphite-app Bot
pushed a commit
that referenced
this pull request
May 20, 2026
## Summary Adds `kitchen-sink.tsx` — a comprehensive synthetic TypeScript+JSX fixture maintained at [oxc-project/benchmark-files](https://github.com/oxc-project/benchmark-files) — to both `TestFiles::minimal()` (bench input set) and `TestFiles::complicated()` (alloc-tracking input set). The existing files in each set are untouched; this is a strict append. ## Why The existing bench input set didn't reliably surface general-purpose perf wins above the ~1-2% measurement noise floor: - #22580 (semantic pre-reserve) — visible because `binder.ts` exercises it - #22594 (formatter buffer) — visible - #22596 (minifier `try_fold_concat`) — **not visible** on the old set - #22599 (semantic resolve-refs no-temp-Vec) — **not visible** - #22603 (semantic var-hoist SmallVec) — **not visible** The kitchen-sink fixes that by exercising every AST node, every transformer plugin, every minifier optimization opportunity, and every semantic step in one large file. Verified by re-benching #22596 against this fixture: **minifier mean −1.5%, min −3.7%** — above noise, signal confirmed. ## Fixture stats (cross-checked locally) | Metric | Value | |---|---| | Source size | 21,117 lines / 732.90 kB | | AST nodes | ~133,000 | | Scopes | ~4,750 | | Symbols | ~7,000 | | Resolved references | ~16,000 | | Semantic diagnostics | 0 errors / 0 warnings | ## Snap baselines `tasks/track_memory_allocations/allocs_*.snap` updated with the kitchen-sink row across all 5 pipelines (parser / semantic / transformer / minifier / formatter). Future PRs that change allocation behavior on this fixture will produce a snap diff in CI. ## Bench-cleaner fix `tasks/benchmark/benches/lexer.rs`'s `SourceCleaner` was missing `visit_ts_template_literal_type` — TypeScript type-level template literals (e.g. `` `${T}-${U}` `` in conditional / mapped types) are syntactically identical to value-level template literals, so the bench-mode lexer (without parser context) cannot distinguish them. Without the cleaner converting them to plain strings, kitchen-sink's type-level templates caused the lexer bench to swallow ~1 KB spans as a single `TemplateHead` and produce spurious "Unterminated string" / "Invalid Unicode escape" errors. One-line fix to mirror the existing `visit_template_literal` handling. AI disclosure: drafted with Claude Code, reviewed manually.
graphite-app Bot
pushed a commit
that referenced
this pull request
May 21, 2026
…ec (#22599) ## Summary `SemanticBuilder::resolve_references_for_current_scope` is the early-resolution hook fired at function / arrow / catch boundaries — it resolves the references collected since a checkpoint against the outer scope chain so that e.g. `function f(x = outerRef) {}` correctly binds `outerRef` to the outer scope rather than the function body. The old implementation copied the pending slice into a fresh `Vec` on every call: ```rust let refs = self.unresolved_references.slice_from(checkpoint).to_vec(); self.unresolved_references.truncate(checkpoint); for (name, reference_id) in refs { ... } ``` The `to_vec()` is a fresh heap allocation, needed because `walk_up_resolve_reference` takes `&mut self` and would otherwise alias the borrow on the underlying `Vec<(Ident, ReferenceId)>`. Rewrite in-place with a retain-style write cursor — each `(Ident, ReferenceId)` is read by **value** (both are `Copy`), which detaches it from the borrow on the inner Vec. Unresolved refs are compacted forward; resolved refs are dropped via the final `truncate`. No temporary `Vec`, no allocation, **no `unsafe`** — safe bounds-checked indexing via new `get` / `set` methods on `UnresolvedReferences`. ```rust let mut write_idx = checkpoint; for read_idx in checkpoint..end { let (name, reference_id) = self.unresolved_references.get(read_idx); if !self.walk_up_resolve_reference(name, reference_id) { if write_idx != read_idx { self.unresolved_references.set(write_idx, name, reference_id); } write_idx += 1; } } self.unresolved_references.truncate(write_idx); ``` ## Allocation impact `cargo allocs`, `allocs_semantic.snap` (baseline = parent commit, after var-hoist landed): | File | Size | Sys allocs before | Sys allocs after | Δ | |---|---|---|---|---| | `checker.ts` | 2.92 MB | 2,309 | **53** | **−2,256 (−98%)** | | `App.tsx` | 415 kB | 174 | **35** | **−139 (−80%)** | | `binder.ts` | 193 kB | 196 | **19** | **−177 (−90%)** | | `kitchen-sink.tsx` | 733 kB | 1,320 | **290** | **−1,030 (−78%)** | | `pdf.mjs` | 567 kB | 1,366 | 1,355 | −11 | | `antd.js` | 6.69 MB | 1,072 | 1,072 | 0 (ES5 bundled code rarely closes over outer scope from function params) | | `RadixUI.jsx` | 2.5 kB | 10 | 10 | 0 | ## How I found this After [#22580](#22580) and [#22590](#22590) the remaining semantic sys allocs were unexplained. Ran a backtrace-capturing `System` allocator wrapper on `checker.ts` and aggregated allocations by call stack — the overwhelming majority of captured allocations converged on a single site: ``` oxc_semantic::builder::SemanticBuilder::resolve_references_for_current_scope oxc_semantic::builder::SemanticBuilder as ...::Visit::visit_function ... ``` Tracing back to the `slice_from(checkpoint).to_vec()` line and confirming that the entire allocation pattern disappears with the in-place rewrite. ## Why this matters for downstream consumers `resolve_references_for_current_scope` fires once per function / arrow / catch — for normal TS / TSX code that's hundreds to thousands of calls per build. Each call was a fresh heap allocation; eliminating them removes that allocator pressure entirely. For rolldown's preprocessing pipeline (which builds semantic 3-4 times per file across hundreds of files per bundle), the per-build savings compound. ## No behavior change The in-place algorithm preserves the original semantics exactly: `unresolved_references[..checkpoint]` is untouched, and `[checkpoint..]` is the list of refs that didn't resolve, in their original order. Same as the old code. ## Test Plan - [x] `cargo test -p oxc_semantic --lib --tests` — pass - [x] `cargo allocs` — semantic snapshot updated to reflect the reductions AI disclosure: drafted with Claude Code, reviewed manually.
graphite-app Bot
pushed a commit
that referenced
this pull request
May 21, 2026
## Summary
When the minifier folds `.concat()` calls into template literals via `try_fold_concat` in `peephole/replace_known_methods.rs`, the original implementation built two short-lived intermediate buffers per call:
- `quasi_strs: Vec<Cow<'a, str>>` — fresh `std::Vec` allocation per call.
- A `String` inside each `Cow::Owned` — created when `Cow::to_mut()` cloned a `Borrowed` into an `Owned`, then grown via doubling as more arg strings were pushed in.
For files with many `.concat()` calls — common in bundled ES5 libraries where template literals weren't available — both structures were freshly allocated thousands of times per minify, with the `String` growing via doubling each time.
This PR replaces those per-call buffers with a single reusable scratch `String` held on `MinifierState`, and constructs `TemplateElement` AST nodes directly into the arena `Vec` as args are drained. The intermediate `Vec<Cow<'a, str>>` is gone entirely. After the first few folds the scratch's capacity stabilizes and subsequent folds are amortized zero std-heap allocs.
The state machine simplifies in the process: the `pushed_quasi: bool` flag is gone. Its invariant ("scratch holds an in-progress quasi") is naturally maintained because every expression flush clears the scratch, so a back-to-back expression's flush produces the required empty separator quasi without a branch.
## Allocation impact
`cargo allocs`, `allocs_minifier.snap` (baseline = parent commit, after resolve-refs and var-hoist landed):
| File | Size | Sys allocs before | Sys allocs after | Sys reallocs before | Sys reallocs after |
|---|---|---|---|---|---|
| `antd.js` | 6.69 MB ES5 | **4,652** | **3,084** (`−1,568`, `−33.7%`) | **1,622** | **53** (`−1,569`) |
| Other tracked files | various | unchanged | unchanged | unchanged | unchanged |
Only `antd.js` shows a change because it's the only tracked file that exercises `.concat()` folding heavily. ES5 bundled code uses `.concat()` instead of template literals; antd.js has thousands of these.
The headline `Sys reallocs` improvement (`1,622` → `53`) comes from eliminating the per-call `String` doubling. The headline `Sys allocs` improvement (`4,652` → `3,084`) comes from eliminating the per-call `Vec<Cow<'a, str>>` allocation.
## Commits
This PR contains two commits, kept separate for review history:
1. **`perf(minifier): pre-size buffers in try_fold_concat to eliminate growth reallocs`** — the original tactical fix that pre-sized the doomed-to-be-removed intermediate buffers. Dropped sys reallocs first but didn't touch sys allocs.
2. **`perf(minifier): drop per-call buffers in try_fold_concat`** — the root-cause refactor described above. Removes the intermediate buffers entirely. Cuts both columns.
Once the second commit lands, the pre-sizing in the first becomes dead-code (the buffers it pre-sized no longer exist). They're kept as separate commits so the diff against `main` is reviewable — feel free to squash on merge.
## How I found this
After #22580, profiling on `antd.js` minifier showed it had unusually high sys reallocs vs other tracked files. I wrote a backtrace-capturing `System` allocator wrapper and ran it on the minify path. The top realloc sites all converged on `try_fold_concat` — the overwhelming majority of captured reallocs came from this one function. After landing the pre-size fix, code review pushed for a root-cause fix that also addressed the `Sys allocs` column, which this refactor delivers.
## Timing
The minifier bench suite (originally) didn't include antd.js or other `.concat()`-heavy fixtures, so bench numbers locally were flat (no regression, no obvious improvement). With kitchen-sink in the bench input set (#22609), CodSpeed shows: **minifier mean −1.5%, min −3.7%** on kitchen-sink.
## Verification
- `cargo test -p oxc_minifier` — pass.
- `cargo minsize` — no size regressions (`minsize.snap` byte-identical).
- `cargo allocs` — only `antd.js` row changed in `allocs_minifier.snap`.
- `cargo fmt -p oxc_minifier` clean.
AI disclosure: drafted with Claude Code, reviewed manually.
Dunqing
added a commit
that referenced
this pull request
May 26, 2026
### 🚀 Features - e857b0c napi/minify: Expose legalComments option and result (#20370) (Boshen) - 661132d parser: More friendly error messages for rest assignment target and rest binding element (#22719) (sapphi-red) - ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option for nullable-union design:type (#22266) (Kyle Cannon) ### 🐛 Bug Fixes - e1d064e transformer/class-properties: Reparent lifted private method helpers (#22716) (Cameron) - 4ac0fca minifier: Preserve `0 && (module.exports = { ... })` cjs-module-lexer hint (#22729) (Dunqing) - 40ff611 minifier: Mark peephole loop changed when dropping dead-after-throw statement (#22722) (Dunqing) - 2f7b210 codegen: Emit pife-arrow/function leading comments inside the wrap (#22720) (Dunqing) - e184f74 parser: Improve invalid `import` property access diagnostic (#22693) (camc314) - 7baed9c transformer/private-method: Clear inherited strict flags (#22508) (camc314) - a9ad27e parser: Keep annotation comments leading without preceding newline (#22711) (Dunqing) - 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after peephole inlining (#22595) (Dunqing) - 07afbb6 minifier: Drop empty-body IIFE wrapper when called with arguments (#22589) (Dunqing) - fa7c463 semantic: Correct TS enum member symbol spans (#22689) (camc314) - 26b9396 semantic: Resolve parameter decorators outside parameter scope (#22623) (camc314) - b284045 parser: Switch to module goal eagerly on `export` (#22684) (Boshen) - dfa931d semantic: Propagate unresolved auto-increment enum value instead of defaulting to 0 (#22646) (Dunqing) - 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T> in decorator metadata (#22265) (Kyle Cannon) - e421ef0 transformer/legacy-decorator: Return runtime binding for design:type (#22640) (Dunqing) - d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects comments (#22525) (Dunqing) - 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547) (Dunqing) - 917da24 parser: Apply PURE comment through member-access chains (#22566) (Dunqing) - a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings (#22551) (Dunqing) ### ⚡ Performance - 2f623b0 semantic: Skip unresolved checks for re-exports (#22660) (camc314) - 0d9553d semantic: Early-exit `check_object_expression` for objects with <2 properties (#22668) (Dunqing) - d721ad9 semantic: Use direct grandparent lookup for TS type parameters (#22658) (camc314) - 0aff288 semantic: Reorder numeric literal strict mode checks (#22657) (camc314) - 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314) - e32acd8 semantic: Reorder identifier ambient binding check (#22653) (camc314) - 09fe178 semantic: Reorder ident reference strict mode check (#22652) (camc314) - 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663) (camc314) - 82f9662 parser: Check identifier kind before context flag (#22662) (camc314) - d7cd951 parser: Fast path identifier parsing and inline operator helpers (#22650) (Boshen) - 7b84314 semantic: Use direct byte access for numeric leading-zero check (#22642) (camc314) - 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314) - 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596) (Dunqing) - 4f289f1 semantic: Resolve_references_for_current_scope without a temp Vec (#22599) (Dunqing) - e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603) (Dunqing) - 8ff8674 semantic: Early return if `excess` is `0` in `Stats::increase_by` (#22616) (camc314) - 7a4120e semantic: Pre-reserve unresolved_references using Stats::references (#22580) (Dunqing) Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
unresolved_referencesVec<(Ident, ReferenceId)>insideSemanticBuildergrows fromcap = 0via the default doubling policy — that's ~13 reallocations to reach the ~5k references of a moderately-sized TypeScript file (193 KBbinder.ts), copying a cumulative ~16k entries (~250 KB of memory traffic) just to grow the buffer.Statsalready knows the exact reference count up front (viaStats::countor passed in by the caller viaSemanticBuilder::with_stats). Pre-reserving withstats.referenceseliminates all the growth reallocations.8-line internal change. No public API impact.
Benchmark
cargo bench --bench semantic(local Criterion)RadixUI.jsx(3 KB)react.development.js(72 KB)cal.com.tsx(1 MB TSX)binder.ts(193 KB TS)All Criterion intervals non-overlapping.
CodSpeed (precise, instruction-count based, from #22571 which tested the same commit)
pipeline[binder.ts]semantic[binder.ts]semantic[react.development.js]CodSpeed's instruction-count measurement is more sensitive to allocator-call elimination than wall-clock Criterion, so the numbers are larger but consistent in direction.
Why
reserve_exact(notreserve)stats.referencesis the exact total push count (asserted in debug builds viaStats::assert_accurate). No need for amortized growth headroom.Scoping::reservewhich usesreserve_exacton arena-allocated structures.Test Plan
cargo test -p oxc_semantic --lib --tests— 71 tests passcargo bench --bench semantic— wins aboveAI disclosure: drafted with Claude Code, reviewed manually.