perf(common): inline declared_symbols with SmallVec#9920

Merged: graphite-app[bot] merged 1 commit into main from perf/inline-declared-symbols on Jun 28, 2026

@IWANABETHATGUY (Member) commented Jun 22, 2026

Note

Base PR #9919 (pack TaggedSymbolRef to 8 bytes) is now merged into main; this PR contains only the Vec → SmallVec change on top of it.

What

Switch StmtInfo::declared_symbols storage from Vec<TaggedSymbolRef> to SmallVec<[TaggedSymbolRef; 2]>, keeping up to two declared symbols inline and only spilling to the heap beyond that.

Why

The overwhelming majority of statements declare zero, one, or two top-level symbols, but Vec rounds its first allocation up to capacity 4 (an 8-byte element falls in the MIN_NON_ZERO_CAP = 4 bucket). That is a ~3.7× over-allocation, paid as a separate heap allocation by every statement that declares at least one symbol.
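The first-push rounding described above can be observed with a stand-in element type. This is a minimal sketch, assuming a 64-bit target: `Elem` is a hypothetical 8-byte placeholder rather than the real `TaggedSymbolRef`, and the rounding to capacity 4 is current standard-library behaviour, not a documented guarantee.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte stand-in for the packed `TaggedSymbolRef`.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Elem(u64);

/// Pushes `n` elements one at a time and returns
/// (capacity, heap bytes reserved for that capacity).
fn reserved_after_pushes(n: usize) -> (usize, usize) {
    let mut v: Vec<Elem> = Vec::new();
    for i in 0..n {
        v.push(Elem(i as u64));
    }
    (v.capacity(), v.capacity() * size_of::<Elem>())
}

fn main() {
    // An empty Vec reserves nothing; the first push is where the rounding happens.
    for n in 0..=5 {
        let (cap, bytes) = reserved_after_pushes(n);
        println!("len {n}: capacity {cap}, {bytes} heap bytes reserved");
    }
}
```

On current toolchains the lengths 1 through 4 are expected to report the same capacity, which is the over-allocation the paragraph refers to for single-symbol statements.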

Because TaggedSymbolRef is packed to 8 bytes (the base PR), SmallVec<[_; 2]> is 24 bytes — identical to Vec — so StmtInfo does not grow. A const assertion pins size_of::<DeclaredSymbols>() == size_of::<Vec<TaggedSymbolRef>>().
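Why two 8-byte elements fit without growing the struct can be sketched with a hand-written layout. Everything below is illustrative and assumes a 64-bit target: `Elem`, `HeapPart`, `Storage`, and `InlineTwo` are hypothetical names, not the `smallvec` crate's definitions and not the PR's `DeclaredSymbols` type. The point is only that a capacity word plus storage shared between two inline elements and a pointer/length pair comes to three words, the same as `Vec`.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte stand-in for the packed `TaggedSymbolRef`.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Elem(u64);

/// Heap case: pointer to the spilled buffer plus its length.
#[allow(dead_code)]
#[derive(Clone, Copy)]
#[repr(C)]
struct HeapPart {
    ptr: *mut Elem,
    len: usize,
}

/// Inline and heap storage occupy the same 16 bytes.
#[allow(dead_code)]
#[repr(C)]
union Storage {
    inline: [Elem; 2],
    heap: HeapPart,
}

/// Illustrative small-vector layout: one capacity word plus the shared storage.
#[allow(dead_code)]
#[repr(C)]
struct InlineTwo {
    capacity: usize,
    storage: Storage,
}

// Compile-time size check in the spirit of the PR's `const` assertion.
// Restricted to 64-bit targets, where the sizes line up.
#[cfg(target_pointer_width = "64")]
const _: () = assert!(size_of::<InlineTwo>() == size_of::<Vec<Elem>>());

fn main() {
    println!(
        "InlineTwo: {} bytes, Vec<Elem>: {} bytes",
        size_of::<InlineTwo>(),
        size_of::<Vec<Elem>>()
    );
}
```

With a 16-byte element the inline array would no longer fit in the space of the pointer/length pair, which is why the change depends on the base PR packing the element to 8 bytes.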

Impact (threejs10x)

Measured two independent ways — both agree.

1. Per-site — exact final heap state of every container. Instrumented LinkStage::link() to walk every StmtInfo.declared_symbols after a real threejs10x link and read each one's len() / spilled() / capacity().

| storage | heap allocations | live heap bytes |
|---|---|---|
| Vec | 24061 | 793,952 B (~775 KB) |
| SmallVec<[_; 2]> | 180 | 29,760 B (~29 KB) |

~99% of these allocations elided. Length histogram (len: count): {0: 7920, 1: 23531, 2: 350, 3: 40, 4: 10, 5: 10, 6: 10, 9: 30, 10: 30, 14: 10, 17: 10, 20: 10, 27: 10, 68: 10} — non-empty = 24061, len ≥ 3 (what still spills with inline cap 2) = 180. (Large buckets are multiples of 10: threejs10x = 10× three.js.)
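The per-site figures can be re-derived from the length histogram alone. The sketch below is a model, not the PR's instrumentation. It assumes every container was filled one push at a time, that heap capacity is the next power of two with a floor of 4, and that each element is 8 bytes. The function and constant names are hypothetical.

```rust
/// (len, count) pairs copied from the length histogram above.
const HISTOGRAM: [(usize, usize); 14] = [
    (0, 7920), (1, 23531), (2, 350), (3, 40), (4, 10), (5, 10), (6, 10),
    (9, 30), (10, 30), (14, 10), (17, 10), (20, 10), (27, 10), (68, 10),
];

/// Size of the packed element, per the PR description.
const ELEM_SIZE: usize = 8;
/// Smallest non-zero heap capacity assumed by this model.
const MIN_NON_ZERO_CAP: usize = 4;

/// Modelled heap capacity after pushing `len` elements one at a time.
fn grown_capacity(len: usize) -> usize {
    len.next_power_of_two().max(MIN_NON_ZERO_CAP)
}

/// Returns (heap allocations, live heap bytes) for containers that keep up to
/// `inline_cap` elements inline. `inline_cap = 0` models a plain `Vec`.
fn heap_usage(inline_cap: usize) -> (usize, usize) {
    HISTOGRAM
        .iter()
        .filter(|&&(len, _)| len > inline_cap)
        .fold((0, 0), |(allocs, bytes), &(len, count)| {
            (allocs + count, bytes + count * grown_capacity(len) * ELEM_SIZE)
        })
}

fn main() {
    let (vec_allocs, vec_bytes) = heap_usage(0);
    let (small_allocs, small_bytes) = heap_usage(2);
    println!("Vec:              {vec_allocs} allocations, {vec_bytes} B");
    println!("SmallVec<[_; 2]>: {small_allocs} allocations, {small_bytes} B");
    println!("saved:            {} B", vec_bytes - small_bytes);
}
```

Under those assumptions the model gives 24061 allocations and 793,952 B for `Vec`, and 180 allocations and 29,760 B with two inline slots, matching both rows of the table.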

2. Whole-program — real counting global allocator, A/B. Wrapped the system allocator to count every alloc/realloc op, toggling only the inline capacity: [_; 2] (this PR) vs [_; 0] (no inline buffer → allocates like Vec for every non-empty container; the walk confirms spilled = 24061 in that build).

| build | total alloc/realloc ops | total bytes requested |
|---|---|---|
| SmallVec<[_; 2]> (this PR) | ~1,003,900 | ~1316 MB |
| SmallVec<[_; 0]> (≈ Vec) | ~1,029,000 | ~1316 MB |
| delta | ≈ −24,600 … −25,000 ops | ~0 |

Run-to-run noise was ±~150–270 ops, so the ~24k delta is ~100× the noise floor.
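A counting wrapper of the kind described for the whole-program measurement can be sketched with the standard `GlobalAlloc` trait. This is an assumption about the general shape, not the harness used in the PR. `CountingAlloc`, `ALLOC_OPS`, `BYTES_REQUESTED`, and `count_ops` are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of `alloc` and `realloc` calls seen so far.
static ALLOC_OPS: AtomicUsize = AtomicUsize::new(0);
/// Sum of the sizes requested by those calls.
static BYTES_REQUESTED: AtomicUsize = AtomicUsize::new(0);

/// Forwards every request to the system allocator and counts it on the way.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_OPS.fetch_add(1, Ordering::Relaxed);
        BYTES_REQUESTED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Frees are forwarded but not counted.
        unsafe { System.dealloc(ptr, layout) }
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        ALLOC_OPS.fetch_add(1, Ordering::Relaxed);
        BYTES_REQUESTED.fetch_add(new_size, Ordering::Relaxed);
        unsafe { System.realloc(ptr, layout, new_size) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `work` and returns (alloc/realloc ops, bytes requested) observed meanwhile.
/// Other threads allocating at the same time are included in the counts.
fn count_ops(work: impl FnOnce()) -> (usize, usize) {
    let ops_before = ALLOC_OPS.load(Ordering::Relaxed);
    let bytes_before = BYTES_REQUESTED.load(Ordering::Relaxed);
    work();
    (
        ALLOC_OPS.load(Ordering::Relaxed) - ops_before,
        BYTES_REQUESTED.load(Ordering::Relaxed) - bytes_before,
    )
}

fn main() {
    let (ops, bytes) = count_ops(|| {
        let mut v: Vec<u64> = Vec::new();
        v.push(1);
        // Keep the vector observable so the allocation is not optimised away.
        std::hint::black_box(&v);
    });
    println!("{ops} alloc/realloc ops, {bytes} bytes requested");
}
```

Counting process-wide like this picks up unrelated allocations too, which is one reason to compare two builds that differ in a single parameter and to report run-to-run noise alongside the delta.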

Measured vs derived. The 24061 allocation count is empirical — both the per-site walk and the inline-cap-0 A/B build report it directly (spilled = 24061). The ~775 KB is the one figure derived rather than read live: it applies Vec's real MIN_NON_ZERO_CAP = 4 rounding, whereas the [_; 0] A/B build spills exact-fit (~218 KB), so that build confirms the count but not the bytes.

What this does and doesn't buy. The win is allocation count — ~24k near-per-statement mallocs on a hot path turned into inline writes — not memory footprint. Whole-program allocation volume is essentially unchanged (~1316 MB either way), and even this struct's live bytes (~746 KB saved) are tiny against that. So the absolute whole-program memory win is modest by design; the value is allocator/CPU pressure, bought with zero growth of StmtInfo (the size_of assertion).

Verification

  • cargo clippy (both crates, all targets) + cargo fmt: clean.
  • Release StmtInfo == 80 assertion: holds.
  • Snapshot tests (cargo test -p rolldown, excluding hmr/test262): 1723 passed, 0 failed — zero behavior change.

🤖 Generated with Claude Code

@IWANABETHATGUY IWANABETHATGUY force-pushed the perf/tagged-symbol-ref-split branch from bf46224 to 019485f Compare June 28, 2026 04:00
@graphite-app graphite-app Bot force-pushed the perf/tagged-symbol-ref-split branch from 019485f to b6494d9 Compare June 28, 2026 06:12
Base automatically changed from perf/tagged-symbol-ref-split to main June 28, 2026 06:16
@IWANABETHATGUY IWANABETHATGUY force-pushed the perf/inline-declared-symbols branch from 1aa3534 to 801cf9d Compare June 28, 2026 09:21
netlify Bot commented Jun 28, 2026

Deploy Preview for rolldown-rs canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | a2e6966 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/rolldown-rs/deploys/6a40e887218dd60008c91de3 |

@IWANABETHATGUY IWANABETHATGUY marked this pull request as ready for review June 28, 2026 09:22

@hyf0 (Member) left a comment

Nice!

hyf0 (Member) commented Jun 28, 2026

Merge activity

@graphite-app graphite-app Bot force-pushed the perf/inline-declared-symbols branch from 801cf9d to a2e6966 Compare June 28, 2026 09:25
@graphite-app graphite-app Bot merged commit a2e6966 into main Jun 28, 2026
33 of 34 checks passed
@graphite-app graphite-app Bot deleted the perf/inline-declared-symbols branch June 28, 2026 09:52
@rolldown-guard rolldown-guard Bot mentioned this pull request Jul 1, 2026
shulaoda added a commit that referenced this pull request Jul 1, 2026
## [1.1.4] - 2026-07-01

### 🚀 Features

- disable `experimental.lazyBarrel` by default (#10071) by @shulaoda

### 🐛 Bug Fixes

- dev: disable lazy barrel in dev mode (#10060) by @shulaoda
- generate: keep full JSON interface under preserveModules namespa… (#10056) by @IWANABETHATGUY
- check finalize_other_specifiers in its own Debug attribute (#10032) by @shulaoda
- serialize the KeepAssign unused minify option as "keep_assign" (#10031) by @shulaoda
- keep fragments after the newline fragment in MagicString::last_line (#10023) by @shulaoda
- generate: undeclared JSON named exports under preserveModules (#10020) (#10027) by @IWANABETHATGUY
- deconflict: rename CJS-wrapped locals that shadow chunk-root bindings (#9921) by @IWANABETHATGUY
- rolldown: keep entry facade when a shared chunk holds another entry's module (#9997) by @hyf0
- treeshake: also bail JSON default split when the object escapes (#9996) by @IWANABETHATGUY
- don't classify await in a strict-mode function as top-level await (#9987) by @shulaoda
- avoid spurious leading newline in addon hooks (banner/footer/intro/outro) (#9989) by @shulaoda
- handle JSON default mutation bailouts (#9972) by @TheAlexLichter
- plugin: make lazy hook metadata enumerable (#9991) by @TheAlexLichter
- dev: make init errors in lazy-compiled modules catchable (#9981) by @h-a-n-a
- treeshake: keep computed-key side effects on namespace member access (#9986) by @shulaoda
- binding: validate replace plugin delimiters length instead of panicking (#9984) by @shulaoda
- reconstruct nested rest patterns in into_expression (#9980) by @IWANABETHATGUY
- reconstruct rest patterns as spread in into_expression (#9976) by @shulaoda
- preserve export keyword on multi-declarator exports under keepNames (#9974) by @shulaoda
- deterministically keep the shortest name for deduplicated assets (#9948) by @x1024
- treeshake: apply @__NO_SIDE_EFFECTS__ to cross-chunk namespace calls (#9960) by @IWANABETHATGUY

### 🚜 Refactor

- drop redundant program scope enter/leave in finalizer (#10049) by @shulaoda
- deconflict: extract collect_chunk_scope_captured_names (#10006) by @IWANABETHATGUY
- unify pre-scan multi-declarator split into one decision site (#9982) by @IWANABETHATGUY
- common: return bool from SymbolRef::is_not_reassigned (#9962) by @IWANABETHATGUY

### 📚 Documentation

- rolldown: remove outdated comment for removing parenthesized expression (#10062) by @Dunqing
- use GitHub-flavored alert for Etiquette note in contribution guide (#10012) by @IWANABETHATGUY
- replace: explain the delimiters left and right boundaries (#9985) by @shulaoda
- ast-mutation: remove stale Address Use section after pre-scan refactor (#9983) by @IWANABETHATGUY
- remove fathom (#9968) by @mdong1909
- contribution-guide: code-format main branch references (#9966) by @IWANABETHATGUY
- contribution-guide: fix stale REPL note and tidy wording (#9957) by @hyf0
- contribution-guide: clarify when to discuss before opening a PR (#9955) by @hyf0

### ⚡ Performance

- disable preserve_parens across all parse paths (#10057) by @Dunqing
- common: inline declared_symbols with SmallVec (#9920) by @IWANABETHATGUY
- common: pack TaggedSymbolRef into 8 bytes (#9919) by @IWANABETHATGUY
- sourcemap: skip newline scan on the no-sourcemap join fast path (#9936) by @Boshen

### 🧪 Testing

- dev: error in lazy module should be catchable (#9975) by @sapphi-red
- dev: reject unknown lazy compile modules (#9969) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- deps: update actions/cache action to v6 (#10001) by @renovate[bot]
- trigger vite ecosystem-ci from PR comments (#10058) by @shulaoda
- deps: update napi to v3.10.0 (#10063) by @renovate[bot]
- remove unused From impl for RolldownLabelSpan (#10055) by @shulaoda
- remove dead Diagnostic::with_kind method (#10054) by @shulaoda
- remove unused StatementExt methods (#10053) by @shulaoda
- remove unused ExpressionExt methods (#10052) by @shulaoda
- remove commented-out re_export_all_names field (#10051) by @shulaoda
- deps: update pnpm to v11.9.0 (#10047) by @renovate[bot]
- remove the unused BindingGenerateHmrPatchReturn napi type (#10034) by @shulaoda
- remove the dead inline_entry_chunk_wrapping scaffolding (#10037) by @shulaoda
- deps: bump oxc_resolver to 11.22.0 (#10045) by @Boshen
- remove never-constructed MatchImportKind::_Ignore variant (#10041) by @shulaoda
- remove the unused ScheduledBuild napi struct (#10033) by @shulaoda
- remove dead compute_hmr_update_single method (#10040) by @shulaoda
- drop the redundant visited.insert in manual code splitting (#10038) by @shulaoda
- remove the dead output_assets vector in render_chunk_to_assets (#10036) by @shulaoda
- remove the unused From<String>/Display impls for BindingLogLevel (#10035) by @shulaoda
- deps: upgrade oxc to 0.138.0 and migrate to per-type AST construction (#10018) by @shulaoda
- deps: update rust crates (#9911) by @renovate[bot]
- deps: update test262 submodule for tests (#10016) by @rolldown-guard[bot]
- deps: update github actions (#9999) by @renovate[bot]
- deps: update npm packages (#10000) by @renovate[bot]

### ◀️ Revert

- "fix(plugin): make lazy hook metadata enumerable (#9991)" (#10005) by @shulaoda

### ❤️ New Contributors

* @x1024 made their first contribution in [#9948](#9948)

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>