perf(common): inline declared_symbols with SmallVec#9920
Merged
> [!NOTE]
> Base PR #9919 (pack `TaggedSymbolRef` to 8 bytes) is now merged into `main`; this PR contains only the `Vec` → `SmallVec` change on top of it.

### What

Switch `StmtInfo::declared_symbols` storage from `Vec<TaggedSymbolRef>` to `SmallVec<[TaggedSymbolRef; 2]>`, keeping up to two declared symbols inline and only spilling to the heap beyond that.

### Why

The overwhelming majority of statements declare zero, one, or two top-level symbols, but `Vec` rounds its first allocation up to capacity 4 (an 8-byte element falls in the `MIN_NON_ZERO_CAP = 4` bucket): a ~3.7× over-allocation that heap-allocates for nearly every statement.

Because `TaggedSymbolRef` is packed to 8 bytes (the base PR), `SmallVec<[_; 2]>` is **24 bytes, identical to `Vec`**, so `StmtInfo` does not grow. A `const` assertion pins `size_of::<DeclaredSymbols>() == size_of::<Vec<TaggedSymbolRef>>()`.

### Impact (threejs10x)

Measured two independent ways; both agree.

**1. Per-site: exact final heap state of every container.** Instrumented `LinkStage::link()` to walk every `StmtInfo.declared_symbols` after a real threejs10x link and read each one's `len()` / `spilled()` / `capacity()`.

| storage | heap allocations | live heap bytes |
|---|---|---|
| `Vec` | 24061 | 793,952 B (~775 KB) |
| `SmallVec<[_; 2]>` | 180 | 29,760 B (~29 KB) |

**~99% of these allocations elided.** Length histogram (`len: count`): `{0: 7920, 1: 23531, 2: 350, 3: 40, 4: 10, 5: 10, 6: 10, 9: 30, 10: 30, 14: 10, 17: 10, 20: 10, 27: 10, 68: 10}`. Non-empty = 24061; `len ≥ 3` (what still spills with inline cap 2) = 180. (Large buckets are multiples of 10: threejs10x = 10× three.js.)

**2. Whole-program: real counting global allocator, A/B.** Wrapped the system allocator to count every `alloc`/`realloc` op, toggling *only* the inline capacity: `[_; 2]` (this PR) vs `[_; 0]` (no inline buffer → allocates like `Vec` for every non-empty container; the walk confirms `spilled = 24061` in that build).

| build | total alloc/realloc ops | total bytes requested |
|---|---|---|
| `SmallVec<[_; 2]>` (this PR) | ~1,003,900 | ~1316 MB |
| `SmallVec<[_; 0]>` (≈ `Vec`) | ~1,029,000 | ~1316 MB |
| **delta** | **≈ −24,600 … −25,000 ops** | **~0** |

Run-to-run noise was ±~150–270 ops, so the ~24k delta is ~100× the noise floor.

> **Measured vs derived.** The `24061` allocation count is empirical: both the per-site walk and the inline-cap-0 A/B build report it directly (`spilled = 24061`). The `~775 KB` is the one figure *derived* rather than read live: it applies `Vec`'s real `MIN_NON_ZERO_CAP = 4` rounding, whereas the `[_; 0]` A/B build spills exact-fit (~218 KB), so that build confirms the count but not the bytes.
>
> **What this does and doesn't buy.** The win is **allocation *count*** (~24k near-per-statement `malloc`s on a hot path turned into inline writes), not memory footprint. Whole-program allocation *volume* is essentially unchanged (~1316 MB either way), and even this struct's live bytes (~746 KB saved) are tiny against that. So the absolute whole-program memory win is modest by design; the value is allocator/CPU pressure, bought with **zero growth of `StmtInfo`** (the `size_of` assertion).

### Verification

- `cargo clippy` (both crates, all targets) + `cargo fmt`: clean.
- Release `StmtInfo == 80` assertion: holds.
- Snapshot tests (`cargo test -p rolldown`, excluding hmr/test262): 1723 passed, 0 failed; zero behavior change.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
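The per-site figures in the Impact section can be re-derived from the length histogram alone. The following is a minimal sketch, not the instrumentation used in the PR: it assumes each container is filled one push at a time, that the first heap allocation has capacity 4, and that capacity doubles afterwards. The names `HISTOGRAM`, `cap_after_pushes`, and `live_heap` are hypothetical helpers introduced only for this illustration.

```rust
/// Length histogram quoted in the PR description, as (len, count) pairs.
const HISTOGRAM: &[(usize, usize)] = &[
    (0, 7920), (1, 23531), (2, 350), (3, 40), (4, 10), (5, 10), (6, 10),
    (9, 30), (10, 30), (14, 10), (17, 10), (20, 10), (27, 10), (68, 10),
];

/// Size of one packed `TaggedSymbolRef`, per the PR description.
const ELEM_SIZE: usize = 8;

/// Assumed growth model: the first heap allocation has capacity 4 and the
/// capacity doubles whenever it is exhausted.
fn cap_after_pushes(len: usize) -> usize {
    let mut cap = 4;
    while cap < len {
        cap *= 2;
    }
    cap
}

/// Returns (heap allocations, live heap bytes) for containers that keep up to
/// `inline_cap` elements inline. `inline_cap = 0` models a plain `Vec`.
fn live_heap(inline_cap: usize) -> (usize, usize) {
    let mut allocations = 0;
    let mut bytes = 0;
    for &(len, count) in HISTOGRAM {
        // Empty containers and containers that fit inline never touch the heap.
        if len == 0 || len <= inline_cap {
            continue;
        }
        allocations += count;
        bytes += count * cap_after_pushes(len) * ELEM_SIZE;
    }
    (allocations, bytes)
}

fn main() {
    let (vec_allocs, vec_bytes) = live_heap(0);
    let (small_allocs, small_bytes) = live_heap(2);
    println!("Vec:              {vec_allocs} allocations, {vec_bytes} B");
    println!("SmallVec<[_; 2]>: {small_allocs} allocations, {small_bytes} B");
}
```

Under these assumptions the model yields 24061 allocations and 793,952 B for `Vec`, and 180 allocations and 29,760 B for inline capacity 2, matching the per-site table. It says nothing about the whole-program A/B numbers, which come from the counting allocator.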
shulaoda added a commit that referenced this pull request on Jul 1, 2026:
## [1.1.4] - 2026-07-01

### 🚀 Features

- disable `experimental.lazyBarrel` by default (#10071) by @shulaoda

### 🐛 Bug Fixes

- dev: disable lazy barrel in dev mode (#10060) by @shulaoda
- generate: keep full JSON interface under preserveModules namespa… (#10056) by @IWANABETHATGUY
- check finalize_other_specifiers in its own Debug attribute (#10032) by @shulaoda
- serialize the KeepAssign unused minify option as "keep_assign" (#10031) by @shulaoda
- keep fragments after the newline fragment in MagicString::last_line (#10023) by @shulaoda
- generate: undeclared JSON named exports under preserveModules (#10020) (#10027) by @IWANABETHATGUY
- deconflict: rename CJS-wrapped locals that shadow chunk-root bindings (#9921) by @IWANABETHATGUY
- rolldown: keep entry facade when a shared chunk holds another entry's module (#9997) by @hyf0
- treeshake: also bail JSON default split when the object escapes (#9996) by @IWANABETHATGUY
- don't classify await in a strict-mode function as top-level await (#9987) by @shulaoda
- avoid spurious leading newline in addon hooks (banner/footer/intro/outro) (#9989) by @shulaoda
- handle JSON default mutation bailouts (#9972) by @TheAlexLichter
- plugin: make lazy hook metadata enumerable (#9991) by @TheAlexLichter
- dev: make init errors in lazy-compiled modules catchable (#9981) by @h-a-n-a
- treeshake: keep computed-key side effects on namespace member access (#9986) by @shulaoda
- binding: validate replace plugin delimiters length instead of panicking (#9984) by @shulaoda
- reconstruct nested rest patterns in into_expression (#9980) by @IWANABETHATGUY
- reconstruct rest patterns as spread in into_expression (#9976) by @shulaoda
- preserve export keyword on multi-declarator exports under keepNames (#9974) by @shulaoda
- deterministically keep the shortest name for deduplicated assets (#9948) by @x1024
- treeshake: apply @__NO_SIDE_EFFECTS__ to cross-chunk namespace calls (#9960) by @IWANABETHATGUY

### 🚜 Refactor

- drop redundant program scope enter/leave in finalizer (#10049) by @shulaoda
- deconflict: extract collect_chunk_scope_captured_names (#10006) by @IWANABETHATGUY
- unify pre-scan multi-declarator split into one decision site (#9982) by @IWANABETHATGUY
- common: return bool from SymbolRef::is_not_reassigned (#9962) by @IWANABETHATGUY

### 📚 Documentation

- rolldown: remove outdated comment for removing parenthesized expression (#10062) by @Dunqing
- use GitHub-flavored alert for Etiquette note in contribution guide (#10012) by @IWANABETHATGUY
- replace: explain the delimiters left and right boundaries (#9985) by @shulaoda
- ast-mutation: remove stale Address Use section after pre-scan refactor (#9983) by @IWANABETHATGUY
- remove fathom (#9968) by @mdong1909
- contribution-guide: code-format main branch references (#9966) by @IWANABETHATGUY
- contribution-guide: fix stale REPL note and tidy wording (#9957) by @hyf0
- contribution-guide: clarify when to discuss before opening a PR (#9955) by @hyf0

### ⚡ Performance

- disable preserve_parens across all parse paths (#10057) by @Dunqing
- common: inline declared_symbols with SmallVec (#9920) by @IWANABETHATGUY
- common: pack TaggedSymbolRef into 8 bytes (#9919) by @IWANABETHATGUY
- sourcemap: skip newline scan on the no-sourcemap join fast path (#9936) by @Boshen

### 🧪 Testing

- dev: error in lazy module should be catchable (#9975) by @sapphi-red
- dev: reject unknown lazy compile modules (#9969) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- deps: update actions/cache action to v6 (#10001) by @renovate[bot]
- trigger vite ecosystem-ci from PR comments (#10058) by @shulaoda
- deps: update napi to v3.10.0 (#10063) by @renovate[bot]
- remove unused From impl for RolldownLabelSpan (#10055) by @shulaoda
- remove dead Diagnostic::with_kind method (#10054) by @shulaoda
- remove unused StatementExt methods (#10053) by @shulaoda
- remove unused ExpressionExt methods (#10052) by @shulaoda
- remove commented-out re_export_all_names field (#10051) by @shulaoda
- deps: update pnpm to v11.9.0 (#10047) by @renovate[bot]
- remove the unused BindingGenerateHmrPatchReturn napi type (#10034) by @shulaoda
- remove the dead inline_entry_chunk_wrapping scaffolding (#10037) by @shulaoda
- deps: bump oxc_resolver to 11.22.0 (#10045) by @Boshen
- remove never-constructed MatchImportKind::_Ignore variant (#10041) by @shulaoda
- remove the unused ScheduledBuild napi struct (#10033) by @shulaoda
- remove dead compute_hmr_update_single method (#10040) by @shulaoda
- drop the redundant visited.insert in manual code splitting (#10038) by @shulaoda
- remove the dead output_assets vector in render_chunk_to_assets (#10036) by @shulaoda
- remove the unused From<String>/Display impls for BindingLogLevel (#10035) by @shulaoda
- deps: upgrade oxc to 0.138.0 and migrate to per-type AST construction (#10018) by @shulaoda
- deps: update rust crates (#9911) by @renovate[bot]
- deps: update test262 submodule for tests (#10016) by @rolldown-guard[bot]
- deps: update github actions (#9999) by @renovate[bot]
- deps: update npm packages (#10000) by @renovate[bot]

### ◀️ Revert

- "fix(plugin): make lazy hook metadata enumerable (#9991)" (#10005) by @shulaoda

### ❤️ New Contributors

* @x1024 made their first contribution in [#9948](#9948)

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>