perf(semantic): avoid heap alloc for var hoist scope ids #22603
Merged
Conversation
Merging this PR will not alter performance
This was referenced May 20, 2026
graphite-app (bot) pushed a commit that referenced this pull request on May 20, 2026:

## Summary

Adds `kitchen-sink.tsx`, a comprehensive synthetic TypeScript+JSX fixture maintained at [oxc-project/benchmark-files](https://github.com/oxc-project/benchmark-files), to both `TestFiles::minimal()` (bench input set) and `TestFiles::complicated()` (alloc-tracking input set). The existing files in each set are untouched; this is a strict append.

## Why

The existing bench input set didn't reliably surface general-purpose perf wins above the ~1-2% measurement noise floor:

- #22580 (semantic pre-reserve): visible because `binder.ts` exercises it
- #22594 (formatter buffer): visible
- #22596 (minifier `try_fold_concat`): **not visible** on the old set
- #22599 (semantic resolve-refs no-temp-Vec): **not visible**
- #22603 (semantic var-hoist SmallVec): **not visible**

The kitchen-sink fixes that by exercising every AST node, every transformer plugin, every minifier optimization opportunity, and every semantic step in one large file. Verified by re-benching #22596 against this fixture: **minifier mean −1.5%, min −3.7%**, which is above noise, so the signal is confirmed.

## Fixture stats (cross-checked locally)

| Metric | Value |
|---|---|
| Source size | 21,117 lines / 732.90 kB |
| AST nodes | ~133,000 |
| Scopes | ~4,750 |
| Symbols | ~7,000 |
| Resolved references | ~16,000 |
| Semantic diagnostics | 0 errors / 0 warnings |

## Snap baselines

`tasks/track_memory_allocations/allocs_*.snap` updated with the kitchen-sink row across all 5 pipelines (parser / semantic / transformer / minifier / formatter). Future PRs that change allocation behavior on this fixture will produce a snap diff in CI.

## Bench-cleaner fix

`tasks/benchmark/benches/lexer.rs`'s `SourceCleaner` was missing `visit_ts_template_literal_type`. TypeScript type-level template literals (e.g. `` `${T}-${U}` `` in conditional / mapped types) are syntactically identical to value-level template literals, so the bench-mode lexer (without parser context) cannot distinguish them. Without the cleaner converting them to plain strings, kitchen-sink's type-level templates caused the lexer bench to swallow ~1 KB spans as a single `TemplateHead` and produce spurious "Unterminated string" / "Invalid Unicode escape" errors. One-line fix to mirror the existing `visit_template_literal` handling.

AI disclosure: drafted with Claude Code, reviewed manually.
camc314 reviewed on May 20, 2026
camc314 reviewed on May 21, 2026
camc314 approved these changes on May 21, 2026
## Summary
`VariableDeclarator::bind` created a fresh `Vec<ScopeId>` for every `var` declarator (`binder.rs:48`) to collect ancestor scope ids on the path to the enclosing var / function scope. Nesting depth from a declarator to its hoist target is small in practice (1–5 scopes), so the Vec almost never grew past its first 16-byte chunk, but the first push always allocates.
Replace it with `SmallVec<[ScopeId; 8]>`, whose inline storage covers typical depths without touching the heap.
```diff
-let mut var_scope_ids = vec![];
+let mut var_scope_ids: SmallVec<[ScopeId; 8]> = SmallVec::new();
```
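To illustrate the idea behind inline storage, here is a minimal std-only sketch. `InlineIds` is a hypothetical toy type written for this note, not the `smallvec` implementation and not code from this PR; it only shows that pushes stay in a fixed-size buffer until capacity is exceeded, and spill to a heap `Vec` afterwards.

```rust
/// Toy stand-in for `SmallVec<[u32; N]>` (hypothetical, for illustration only):
/// holds up to `N` ids inline and moves to a heap-backed `Vec` once full.
enum InlineIds<const N: usize> {
    Inline { buf: [u32; N], len: usize },
    Heap(Vec<u32>),
}

impl<const N: usize> InlineIds<N> {
    fn new() -> Self {
        InlineIds::Inline { buf: [0; N], len: 0 }
    }

    fn push(&mut self, id: u32) {
        match self {
            // Room left in the inline buffer: no heap traffic at all.
            InlineIds::Inline { buf, len } if *len < N => {
                buf[*len] = id;
                *len += 1;
            }
            // Inline buffer is full: copy everything to the heap once.
            InlineIds::Inline { buf, len } => {
                let mut spilled = Vec::with_capacity(*len * 2 + 1);
                spilled.extend_from_slice(&buf[..*len]);
                spilled.push(id);
                *self = InlineIds::Heap(spilled);
            }
            // Already spilled: behave like a normal Vec.
            InlineIds::Heap(v) => v.push(id),
        }
    }

    fn spilled(&self) -> bool {
        matches!(self, InlineIds::Heap(_))
    }

    fn as_slice(&self) -> &[u32] {
        match self {
            InlineIds::Inline { buf, len } => &buf[..*len],
            InlineIds::Heap(v) => v.as_slice(),
        }
    }
}

fn main() {
    let mut ids: InlineIds<8> = InlineIds::new();
    for id in 0..8 {
        ids.push(id);
    }
    println!("after 8 pushes, spilled = {}", ids.spilled());
    ids.push(8);
    println!("after 9 pushes, spilled = {}, len = {}", ids.spilled(), ids.as_slice().len());
}
```

With `N = 8`, the typical 1–5 collected scope ids never reach the spill branch; deeper nesting still works, just with one heap allocation.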
## Allocation snapshot impact
Measured against the kitchen-sink baseline (parent of this commit). Sys allocs:
| File | Pipeline | Before | After | Δ |
|---|---|---|---|---|
| `antd.js` (6.69 MB ES5, var-heavy) | semantic | 2,455 | **1,072** | **−1,383 (−56%)** |
| `pdf.mjs` (567 kB) | semantic | 1,401 | 1,366 | −35 |
| `kitchen-sink.tsx` (733 kB) | semantic | 1,326 | 1,320 | −6 |
| `antd.js` | minifier (runs semantic internally) | 7,624 | **4,652** | **−2,972 (−39%)** |
| Other files | semantic | unchanged | unchanged | — |
For ES5-bundled code like antd.js, this single line accounted for a clear majority (1,383 of 2,455, about 56%) of the heap allocations in `SemanticBuilder::build`.
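As a quick cross-check of the Δ column, the sketch below recomputes the absolute and relative changes from the Before / After counts in the table. The helper name `alloc_delta` is made up for this example; the input numbers are copied from the table above.

```rust
/// Returns (absolute change, percent change rounded to the nearest integer).
/// Hypothetical helper, used only to re-derive the table's delta column.
fn alloc_delta(before: i64, after: i64) -> (i64, i64) {
    let delta = after - before;
    let pct = (delta as f64 / before as f64 * 100.0).round() as i64;
    (delta, pct)
}

fn main() {
    // (label, before, after) taken from the allocation snapshot table.
    let rows = [
        ("antd.js / semantic", 2455, 1072),
        ("pdf.mjs / semantic", 1401, 1366),
        ("kitchen-sink.tsx / semantic", 1326, 1320),
        ("antd.js / minifier", 7624, 4652),
    ];
    for (label, before, after) in rows {
        let (delta, pct) = alloc_delta(before, after);
        println!("{label}: {delta} allocations ({pct}%)");
    }
}
```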
## How I found it
Backtrace-attributed allocation profiling of a fresh `SemanticBuilder::build` against `antd.js`: 1,383 of 2,459 captured allocations converged on one `Vec::push` site inside `VariableDeclarator::bind`'s var-hoist branch. Every visit path (`block_statement`, `if_statement`, `for_statement_init`, `function_body`, ...) eventually called through the same `VariableDeclarator::bind` → `Vec::push` line, so a single change here zeros them all out.
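The profiler used for this PR is not shown here. As a much simpler stand-in, the sketch below uses a counting global allocator (std only) to observe the behavior described above: an empty `Vec` costs nothing, the first `push` allocates, and a fixed-size stack buffer never does. All names (`CountingAlloc`, `allocs_during`, `collect_with_vec`, `collect_inline`) are hypothetical and exist only for this example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::hint::black_box;

thread_local! {
    // Per-thread allocation counter; const-initialised so touching it never allocates.
    static ALLOCS: Cell<usize> = const { Cell::new(0) };
}

/// Forwards to the system allocator and counts every `alloc` call on this thread.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let _ = ALLOCS.try_with(|c| c.set(c.get() + 1));
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of heap allocations performed on this thread while `f` runs.
fn allocs_during<R>(f: impl FnOnce() -> R) -> usize {
    let before = ALLOCS.with(|c| c.get());
    black_box(f());
    ALLOCS.with(|c| c.get()) - before
}

/// Mimics the old pattern: a fresh `Vec`, one push per collected ancestor.
fn collect_with_vec(depth: u32) -> Vec<u32> {
    let mut ids = Vec::new();
    for id in 0..depth {
        ids.push(black_box(id));
    }
    ids
}

/// Mimics inline storage with a plain stack array of 8 slots.
fn collect_inline(depth: usize) -> ([u32; 8], usize) {
    let mut buf = [0u32; 8];
    let n = depth.min(8);
    for i in 0..n {
        buf[i] = black_box(i as u32);
    }
    (buf, n)
}

fn main() {
    println!("Vec, depth 0: {}", allocs_during(|| collect_with_vec(0)));
    println!("Vec, depth 3: {}", allocs_during(|| collect_with_vec(3)));
    println!("inline, depth 3: {}", allocs_during(|| collect_inline(3)));
}
```

A real investigation would attribute each allocation to a backtrace; this counter only confirms where the per-declarator cost comes from.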
## Why this matters for downstream consumers
Rolldown rebuilds `Scoping` 3–4× per file across hundreds of files per bundle. Files containing legacy / bundled / `var`-heavy JS (very common in real-world npm packages) hit this path on every declarator. Cutting the binder's per-declarator heap allocation directly shrinks the per-rebuild alloc cost.
## Why `SmallVec<[ScopeId; 8]>`
- `ScopeId` is `u32` (4 bytes), so 8 inline slots = 32 bytes plus 8 bytes header / discriminant; cheap on a stack frame.
- Real-world depth distribution: top-level var = 0 ancestors collected; var inside `for { if { … } }` ≈ 2–4; arbitrarily deep nesting is rare and falls back to heap correctly.
- `SmallVec` is already in the workspace dependencies; pulled in here for the first time in `oxc_semantic`.
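The depth figures above can be pictured with a small model of the hoisting walk. This is a sketch under stated assumptions: the `Scope` struct, the `is_var_scope` flag, and `ancestors_to_hoist_target` are hypothetical and do not mirror `oxc_semantic`'s actual data structures; whether the real binder includes or excludes the starting scope is also an assumption here.

```rust
/// Hypothetical, simplified scope record (not the oxc_semantic representation).
#[derive(Clone, Copy)]
struct Scope {
    parent: Option<usize>,
    /// True for scopes that `var` hoists to (program / function scopes).
    is_var_scope: bool,
}

/// Collects the scope ids walked from `start` up to, but excluding,
/// the nearest enclosing var scope.
fn ancestors_to_hoist_target(scopes: &[Scope], start: usize) -> Vec<usize> {
    let mut collected = Vec::new();
    let mut current = start;
    while !scopes[current].is_var_scope {
        collected.push(current);
        match scopes[current].parent {
            Some(parent) => current = parent,
            None => break,
        }
    }
    collected
}

/// Models `for (...) { if (...) { var x; } }` at the top level:
/// scope 0 = program, scope 1 = `for`, scope 2 = `if` block.
fn example_scopes() -> Vec<Scope> {
    vec![
        Scope { parent: None, is_var_scope: true },
        Scope { parent: Some(0), is_var_scope: false },
        Scope { parent: Some(1), is_var_scope: false },
    ]
}

fn main() {
    let scopes = example_scopes();
    // A top-level `var` collects nothing, so nothing is ever pushed.
    println!("top-level: {:?}", ancestors_to_hoist_target(&scopes, 0));
    // A `var` nested in for + if collects two intermediate scopes.
    println!("nested: {:?}", ancestors_to_hoist_target(&scopes, 2));
}
```

In this model the nested case collects two ids, well under the eight inline slots chosen in the PR.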
## Test plan
- [x] `cargo test -p oxc_semantic --lib --tests` — pass
- [x] `cargo clippy -p oxc_semantic --release` — clean
- [x] `cargo run -p oxc_track_memory_allocations` — snapshots updated to reflect the new numbers
AI disclosure: drafted with Claude Code, reviewed manually.
Dunqing added a commit that referenced this pull request on May 26, 2026:

### 🚀 Features

- e857b0c napi/minify: Expose legalComments option and result (#20370) (Boshen)
- 661132d parser: More friendly error messages for rest assignment target and rest binding element (#22719) (sapphi-red)
- ee659b6 transformer/legacy-decorator: Add `strictNullChecks` option for nullable-union design:type (#22266) (Kyle Cannon)

### 🐛 Bug Fixes

- e1d064e transformer/class-properties: Reparent lifted private method helpers (#22716) (Cameron)
- 4ac0fca minifier: Preserve `0 && (module.exports = { ... })` cjs-module-lexer hint (#22729) (Dunqing)
- 40ff611 minifier: Mark peephole loop changed when dropping dead-after-throw statement (#22722) (Dunqing)
- 2f7b210 codegen: Emit pife-arrow/function leading comments inside the wrap (#22720) (Dunqing)
- e184f74 parser: Improve invalid `import` property access diagnostic (#22693) (camc314)
- 7baed9c transformer/private-method: Clear inherited strict flags (#22508) (camc314)
- a9ad27e parser: Keep annotation comments leading without preceding newline (#22711) (Dunqing)
- 9ea4d64 minifier: Re-evaluate pure/no-side-effects flags after peephole inlining (#22595) (Dunqing)
- 07afbb6 minifier: Drop empty-body IIFE wrapper when called with arguments (#22589) (Dunqing)
- fa7c463 semantic: Correct TS enum member symbol spans (#22689) (camc314)
- 26b9396 semantic: Resolve parameter decorators outside parameter scope (#22623) (camc314)
- b284045 parser: Switch to module goal eagerly on `export` (#22684) (Boshen)
- dfa931d semantic: Propagate unresolved auto-increment enum value instead of defaulting to 0 (#22646) (Dunqing)
- 69a6ba6 transformer/legacy-decorator: Emit Array for ReadonlyArray<T> in decorator metadata (#22265) (Kyle Cannon)
- e421ef0 transformer/legacy-decorator: Return runtime binding for design:type (#22640) (Dunqing)
- d61e1d7 codegen: Preserve verbatim text of pure/no-side-effects comments (#22525) (Dunqing)
- 702b14e minifier: Preserve IIFE structure in DCE-only mode (#22547) (Dunqing)
- 917da24 parser: Apply PURE comment through member-access chains (#22566) (Dunqing)
- a069b1c codegen: Preserve quotes for cjs-module-lexer equality strings (#22551) (Dunqing)

### ⚡ Performance

- 2f623b0 semantic: Skip unresolved checks for re-exports (#22660) (camc314)
- 0d9553d semantic: Early-exit `check_object_expression` for objects with <2 properties (#22668) (Dunqing)
- d721ad9 semantic: Use direct grandparent lookup for TS type parameters (#22658) (camc314)
- 0aff288 semantic: Reorder numeric literal strict mode checks (#22657) (camc314)
- 4d5ddb1 semantic: Reorder binding identifier checks (#22656) (camc314)
- e32acd8 semantic: Reorder identifier ambient binding check (#22653) (camc314)
- 09fe178 semantic: Reorder ident reference strict mode check (#22652) (camc314)
- 4b6add2 semantic: Avoid duplicate ident clone for bindings (#22663) (camc314)
- 82f9662 parser: Check identifier kind before context flag (#22662) (camc314)
- d7cd951 parser: Fast path identifier parsing and inline operator helpers (#22650) (Boshen)
- 7b84314 semantic: Use direct byte access for numeric leading-zero check (#22642) (camc314)
- 0345a31 semantic: Pre-size class elements hash map (#22618) (camc314)
- 04d3065 minifier: Drop per-call buffers in try_fold_concat (#22596) (Dunqing)
- 4f289f1 semantic: Resolve_references_for_current_scope without a temp Vec (#22599) (Dunqing)
- e862c15 semantic: Avoid heap alloc for var hoist scope ids (#22603) (Dunqing)
- 8ff8674 semantic: Early return if `excess` is `0` in `Stats::increase_by` (#22616) (camc314)
- 7a4120e semantic: Pre-reserve unresolved_references using Stats::references (#22580) (Dunqing)

Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>