perf(common): pack TaggedSymbolRef into 8 bytes #9919

Merged: graphite-app[bot] merged 1 commit into main from perf/tagged-symbol-ref-split on Jun 28, 2026

@IWANABETHATGUY

Member

### What

Replace the `TaggedSymbolRef` enum

```rust
enum TaggedSymbolRef { LinkOnly(SymbolRef), Normal(SymbolRef) } // 12 bytes
```

with a packed struct that stores the link-only tag in the **high bit of the `SymbolId`**, bringing it down to **8 bytes** (same size as `SymbolRef`).
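
A minimal sketch of the packing scheme described above, not the actual rolldown implementation. Only the type names and `pack()` come from the description; the field names, `LINK_ONLY_BIT`, and the accessors are hypothetical, and plain `u32` fields stand in for `ModuleIdx` and `SymbolId` so the example stays std-only.

```rust
/// Stand-in for rolldown's `SymbolRef` (assumption: plain `u32` fields).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SymbolRef {
    owner: u32,
    symbol: u32,
}

/// Hypothetical packed form: bit 31 of `tagged_symbol` carries the link-only flag.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TaggedSymbolRef {
    owner: u32,
    tagged_symbol: u32,
}

const LINK_ONLY_BIT: u32 = 1 << 31;

impl TaggedSymbolRef {
    /// Packs a symbol reference and its tag; panics if the id needs bit 31.
    fn pack(symbol_ref: SymbolRef, link_only: bool) -> Self {
        assert!(
            (symbol_ref.symbol & LINK_ONLY_BIT) == 0,
            "symbol id does not fit in 31 bits"
        );
        let tag = if link_only { LINK_ONLY_BIT } else { 0 };
        Self { owner: symbol_ref.owner, tagged_symbol: symbol_ref.symbol | tag }
    }

    /// Returns true when the link-only bit is set.
    fn is_link_only(self) -> bool {
        (self.tagged_symbol & LINK_ONLY_BIT) != 0
    }

    /// Recovers the original symbol reference by masking the tag bit off.
    fn symbol_ref(self) -> SymbolRef {
        SymbolRef { owner: self.owner, symbol: self.tagged_symbol & !LINK_ONLY_BIT }
    }
}

// Compile-time check that the tag costs no extra space.
const _: () = assert!(
    std::mem::size_of::<TaggedSymbolRef>() == std::mem::size_of::<SymbolRef>()
);
```

Packing `SymbolRef { owner: 3, symbol: 42 }` with `link_only = true` and calling `symbol_ref()` on the result gives back the same reference, while `is_link_only()` reports the tag.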

### Why 12 bytes?

`SymbolRef` is `{ owner: ModuleIdx (u32), symbol: SymbolId (NonMaxU32) }` = 8 bytes. The `NonMaxU32` niche could host a discriminant for a *single* payload variant, but two payload-carrying variants force Rust to add a separate 4-byte-aligned discriminant → 12 bytes.
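
The size difference can be observed with a small std-only experiment. This is a sketch under assumptions: `NonZeroU32` stands in for `NonMaxU32` (both offer a single niche value), and the sizes are what current rustc produces for the default representation, which the language does not formally guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Stand-in for `SymbolRef`; `NonZeroU32` plays the role of `NonMaxU32`.
#[allow(dead_code)]
struct SymbolRef {
    owner: u32,
    symbol: NonZeroU32,
}

/// Two payload-carrying variants: the single niche value cannot tell them apart,
/// so the compiler adds a separate tag and pads to the 4-byte alignment.
#[allow(dead_code)]
enum TaggedSymbolRef {
    LinkOnly(SymbolRef),
    Normal(SymbolRef),
}

/// Sizes in bytes of the bare struct, `Option` of it, and the two-variant enum.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<SymbolRef>(),
        // One payload variant plus a unit variant fits in the niche.
        size_of::<Option<SymbolRef>>(),
        size_of::<TaggedSymbolRef>(),
    )
}
```

`Option<SymbolRef>` is the "single payload variant" case: `None` is encoded in the niche, so it stays at the size of `SymbolRef`, whereas the two-variant enum grows.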

### Why the bit-steal is sound

A single module can never reach 2³¹ symbols: oxc caps a source file at <4GB (u32 spans), and the densest distinct-symbol binding (`a=>a=>…`, 3 bytes each) tops out well under 2³¹. `pack()` asserts this invariant, so a pathological input panics loudly instead of silently corrupting a symbol id.
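
The bound can be checked with a few lines of arithmetic. The constant names below are illustrative, and the 3-bytes-per-binding figure is taken from the description above.

```rust
/// Upper bound on source length implied by u32 spans (2^32 bytes).
const MAX_SOURCE_BYTES: u64 = 1 << 32;
/// Densest binding pattern from the description: `a=>` is 3 bytes per symbol.
const MIN_BYTES_PER_BINDING: u64 = 3;
/// Most symbols a single file could declare under those two assumptions.
const MAX_SYMBOLS_PER_MODULE: u64 = MAX_SOURCE_BYTES / MIN_BYTES_PER_BINDING;
/// Number of ids representable once bit 31 is reserved for the tag.
const TAG_FREE_IDS: u64 = 1 << 31;

// Roughly 1.43e9 symbols at most, against roughly 2.15e9 available ids.
const _: () = assert!(MAX_SYMBOLS_PER_MODULE < TAG_FREE_IDS);
```

Under these assumptions the worst case uses about two thirds of the 31-bit id space.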

### Impact

Shrinks every `StmtInfo::declared_symbols` element from 12 → 8 bytes, with **no growth of `StmtInfo` itself**. A `const` assertion pins `size_of::<TaggedSymbolRef>() == size_of::<SymbolRef>()`.

### Verification

- Unit tests: normal / link-only roundtrip, largest taggable `SymbolId`, panic-on-overflow, size invariant.
- `cargo clippy` + `cargo fmt`: clean.
- Snapshot tests: identical (behavior unchanged).

🤖 Generated with Claude Code

@netlify

netlify Bot commented Jun 22, 2026

Deploy Preview for rolldown-rs canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | b6494d9 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/rolldown-rs/deploys/6a40bb4d4b88fe00083479a1 |

@IWANABETHATGUY IWANABETHATGUY marked this pull request as ready for review June 28, 2026 03:58
@IWANABETHATGUY IWANABETHATGUY force-pushed the perf/tagged-symbol-ref-split branch from bf46224 to 019485f Compare June 28, 2026 04:00
@codspeed-hq

codspeed-hq Bot commented Jun 28, 2026

Merging this PR will not alter performance

✅ 7 untouched benchmarks
⏩ 10 skipped benchmarks [1]


Comparing perf/tagged-symbol-ref-split (019485f) with main (e0dfef1)

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

IWANABETHATGUY commented Jun 28, 2026

Member Author

Merge activity

  • Jun 28, 6:11 AM UTC: The merge label 'graphite: merge-when-ready' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Jun 28, 6:11 AM UTC: IWANABETHATGUY added this pull request to the Graphite merge queue.
  • Jun 28, 6:16 AM UTC: Merged by the Graphite merge queue.

@graphite-app graphite-app Bot force-pushed the perf/tagged-symbol-ref-split branch from 019485f to b6494d9 Compare June 28, 2026 06:12
@graphite-app graphite-app Bot merged commit b6494d9 into main Jun 28, 2026
33 of 34 checks passed
@graphite-app graphite-app Bot deleted the perf/tagged-symbol-ref-split branch June 28, 2026 06:16
graphite-app Bot pushed a commit that referenced this pull request Jun 28, 2026
> [!NOTE]
> Base PR #9919 (pack `TaggedSymbolRef` to 8 bytes) is now merged into `main`; this PR contains only the `Vec` → `SmallVec` change on top of it.

### What

Switch `StmtInfo::declared_symbols` storage from `Vec<TaggedSymbolRef>` to `SmallVec<[TaggedSymbolRef; 2]>`, keeping up to two declared symbols inline and only spilling to the heap beyond that.

### Why

The overwhelming majority of statements declare zero, one, or two top-level symbols, but `Vec` rounds its first allocation up to capacity 4 (an 8-byte element falls in the `MIN_NON_ZERO_CAP = 4` bucket) — a ~3.7× over-allocation that heap-allocates for nearly every statement.

Because `TaggedSymbolRef` is packed to 8 bytes (the base PR), `SmallVec<[_; 2]>` is **24 bytes — identical to `Vec`** — so `StmtInfo` does not grow. A `const` assertion pins `size_of::<DeclaredSymbols>() == size_of::<Vec<TaggedSymbolRef>>()`.
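
The 24-byte figure follows from how an inline-or-heap container can share storage. The sketch below is a std-only model, not smallvec's actual definition: the names `InlineOrHeap` and `InlineVec2` are hypothetical, `u64` stands in for the 8-byte `TaggedSymbolRef`, and it assumes a 64-bit target and a union-based layout like the one smallvec's `union` feature is understood to enable.

```rust
use std::mem::size_of;

/// Stand-in for the packed 8-byte `TaggedSymbolRef`.
type Elem = u64;

/// Either two elements stored inline, or a (pointer, length) pair once spilled.
/// Both alternatives occupy 16 bytes on a 64-bit target, so they can overlap.
#[allow(dead_code)]
union InlineOrHeap {
    inline: [Elem; 2],
    heap: (*mut Elem, usize),
}

/// Model of a small vector with inline capacity 2: one word of capacity/length
/// bookkeeping plus the shared 16-byte payload.
#[allow(dead_code)]
struct InlineVec2 {
    capacity: usize,
    data: InlineOrHeap,
}

// On 64-bit targets the model is exactly as large as `Vec`: 8 + 16 = 24 bytes.
#[cfg(target_pointer_width = "64")]
const _: () = assert!(size_of::<InlineVec2>() == size_of::<Vec<Elem>>());
```

With 12-byte elements the inline array would need 24 bytes on its own, which is why the packing in the base PR is what keeps the container at `Vec`'s size.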

### Impact (threejs10x)

Measured two independent ways — both agree.

**1. Per-site — exact final heap state of every container.** Instrumented `LinkStage::link()` to walk every `StmtInfo.declared_symbols` after a real threejs10x link and read each one's `len()` / `spilled()` / `capacity()`.

| storage | heap allocations | live heap bytes |
|---|---|---|
| `Vec` | 24061 | 793,952 B (~775 KB) |
| `SmallVec<[_; 2]>` | 180 | 29,760 B (~29 KB) |

**~99% of these allocations elided.** Length histogram (`len: count`): `{0: 7920, 1: 23531, 2: 350, 3: 40, 4: 10, 5: 10, 6: 10, 9: 30, 10: 30, 14: 10, 17: 10, 20: 10, 27: 10, 68: 10}` — non-empty = 24061, `len ≥ 3` (what still spills with inline cap 2) = 180. (Large buckets are multiples of 10: threejs10x = 10× three.js.)

**2. Whole-program — real counting global allocator, A/B.** Wrapped the system allocator to count every `alloc`/`realloc` op, toggling *only* the inline capacity: `[_; 2]` (this PR) vs `[_; 0]` (no inline buffer → allocates like `Vec` for every non-empty container; the walk confirms `spilled = 24061` in that build).

| build | total alloc/realloc ops | total bytes requested |
|---|---|---|
| `SmallVec<[_; 2]>` (this PR) | ~1,003,900 | ~1316 MB |
| `SmallVec<[_; 0]>` (≈ `Vec`) | ~1,029,000 | ~1316 MB |
| **delta** | **≈ −24,600 … −25,000 ops** | **~0** |

Run-to-run noise was ±~150–270 ops, so the ~24k delta is ~100× the noise floor.
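
A counting allocator of the kind described can be sketched with the standard `GlobalAlloc` trait. This is an illustration, not the instrumentation used for the measurements: `CountingAlloc`, the two counters, and the accessor functions are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of `alloc` and `realloc` calls seen so far.
static OPS: AtomicUsize = AtomicUsize::new(0);
/// Total bytes requested across those calls.
static BYTES: AtomicUsize = AtomicUsize::new(0);

/// Forwards every request to the system allocator while counting it.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        OPS.fetch_add(1, Ordering::Relaxed);
        BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        OPS.fetch_add(1, Ordering::Relaxed);
        BYTES.fetch_add(new_size, Ordering::Relaxed);
        unsafe { System.realloc(ptr, layout, new_size) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Allocation operations counted since process start.
fn alloc_ops() -> usize {
    OPS.load(Ordering::Relaxed)
}

/// Bytes requested since process start.
fn alloc_bytes() -> usize {
    BYTES.load(Ordering::Relaxed)
}
```

Reading `alloc_ops()` before and after the workload in each build gives the per-build totals; comparing two builds that differ only in inline capacity isolates the effect of the container.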

> **Measured vs derived.** The `24061` allocation count is empirical — both the per-site walk and the inline-cap-0 A/B build report it directly (`spilled = 24061`). The `~775 KB` is the one figure *derived* rather than read live: it applies `Vec`'s real `MIN_NON_ZERO_CAP = 4` rounding, whereas the `[_; 0]` A/B build spills exact-fit (~218 KB), so that build confirms the count but not the bytes.
>
> **What this does and doesn't buy.** The win is **allocation *count*** — ~24k near-per-statement `malloc`s on a hot path turned into inline writes — not memory footprint. Whole-program allocation *volume* is essentially unchanged (~1316 MB either way), and even this struct's live bytes (~746 KB saved) are tiny against that. So the absolute whole-program memory win is modest by design; the value is allocator/CPU pressure, bought with **zero growth of `StmtInfo`** (the `size_of` assertion).
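
The per-site figures can be re-derived from the length histogram. The sketch below is a model rather than a measurement: it assumes each container was filled one `push` at a time, that `Vec` starts at capacity 4 and doubles, and that a spilled `SmallVec` grows to the next power of two. The function names are illustrative.

```rust
/// Size of one packed `TaggedSymbolRef` in bytes.
const ELEM_BYTES: usize = 8;

/// `(len, count)` pairs copied from the histogram reported above.
const HISTOGRAM: [(usize, usize); 14] = [
    (0, 7920), (1, 23531), (2, 350), (3, 40), (4, 10), (5, 10), (6, 10),
    (9, 30), (10, 30), (14, 10), (17, 10), (20, 10), (27, 10), (68, 10),
];

/// Modeled final `Vec` capacity: nothing when empty, otherwise at least 4, doubling.
fn vec_capacity(len: usize) -> usize {
    if len == 0 { 0 } else { len.next_power_of_two().max(4) }
}

/// Modeled heap capacity of a small vector: zero while the data fits inline.
fn smallvec_heap_capacity(len: usize, inline_cap: usize) -> usize {
    if len <= inline_cap { 0 } else { len.next_power_of_two() }
}

/// Returns `(heap allocations, live heap bytes)` for a given capacity model.
fn heap_stats(cap_of: impl Fn(usize) -> usize) -> (usize, usize) {
    HISTOGRAM.iter().fold((0, 0), |(allocs, bytes), &(len, count)| {
        let cap = cap_of(len);
        if cap == 0 {
            (allocs, bytes)
        } else {
            (allocs + count, bytes + count * cap * ELEM_BYTES)
        }
    })
}
```

Under these assumptions `heap_stats(vec_capacity)` yields 24061 allocations and 793,952 bytes, `heap_stats(|len| smallvec_heap_capacity(len, 2))` yields 180 allocations and 29,760 bytes, and an inline capacity of 0 yields 24061 allocations and 223,608 bytes, which is the ~218 KB exact-fit figure.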

### Verification

- `cargo clippy` (both crates, all targets) + `cargo fmt`: clean.
- Release-build `size_of::<StmtInfo>() == 80` assertion: holds.
- Snapshot tests (`cargo test -p rolldown`, excluding hmr/test262): 1723 passed, 0 failed — zero behavior change.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@rolldown-guard rolldown-guard Bot mentioned this pull request Jul 1, 2026
shulaoda added a commit that referenced this pull request Jul 1, 2026
## [1.1.4] - 2026-07-01

### 🚀 Features

- disable `experimental.lazyBarrel` by default (#10071) by @shulaoda

### 🐛 Bug Fixes

- dev: disable lazy barrel in dev mode (#10060) by @shulaoda
- generate: keep full JSON interface under preserveModules namespa… (#10056) by @IWANABETHATGUY
- check finalize_other_specifiers in its own Debug attribute (#10032) by @shulaoda
- serialize the KeepAssign unused minify option as "keep_assign" (#10031) by @shulaoda
- keep fragments after the newline fragment in MagicString::last_line (#10023) by @shulaoda
- generate: undeclared JSON named exports under preserveModules (#10020) (#10027) by @IWANABETHATGUY
- deconflict: rename CJS-wrapped locals that shadow chunk-root bindings (#9921) by @IWANABETHATGUY
- rolldown: keep entry facade when a shared chunk holds another entry's module (#9997) by @hyf0
- treeshake: also bail JSON default split when the object escapes (#9996) by @IWANABETHATGUY
- don't classify await in a strict-mode function as top-level await (#9987) by @shulaoda
- avoid spurious leading newline in addon hooks (banner/footer/intro/outro) (#9989) by @shulaoda
- handle JSON default mutation bailouts (#9972) by @TheAlexLichter
- plugin: make lazy hook metadata enumerable (#9991) by @TheAlexLichter
- dev: make init errors in lazy-compiled modules catchable (#9981) by @h-a-n-a
- treeshake: keep computed-key side effects on namespace member access (#9986) by @shulaoda
- binding: validate replace plugin delimiters length instead of panicking (#9984) by @shulaoda
- reconstruct nested rest patterns in into_expression (#9980) by @IWANABETHATGUY
- reconstruct rest patterns as spread in into_expression (#9976) by @shulaoda
- preserve export keyword on multi-declarator exports under keepNames (#9974) by @shulaoda
- deterministically keep the shortest name for deduplicated assets (#9948) by @x1024
- treeshake: apply @__NO_SIDE_EFFECTS__ to cross-chunk namespace calls (#9960) by @IWANABETHATGUY

### 🚜 Refactor

- drop redundant program scope enter/leave in finalizer (#10049) by @shulaoda
- deconflict: extract collect_chunk_scope_captured_names (#10006) by @IWANABETHATGUY
- unify pre-scan multi-declarator split into one decision site (#9982) by @IWANABETHATGUY
- common: return bool from SymbolRef::is_not_reassigned (#9962) by @IWANABETHATGUY

### 📚 Documentation

- rolldown: remove outdated comment for removing parenthesized expression (#10062) by @Dunqing
- use GitHub-flavored alert for Etiquette note in contribution guide (#10012) by @IWANABETHATGUY
- replace: explain the delimiters left and right boundaries (#9985) by @shulaoda
- ast-mutation: remove stale Address Use section after pre-scan refactor (#9983) by @IWANABETHATGUY
- remove fathom (#9968) by @mdong1909
- contribution-guide: code-format main branch references (#9966) by @IWANABETHATGUY
- contribution-guide: fix stale REPL note and tidy wording (#9957) by @hyf0
- contribution-guide: clarify when to discuss before opening a PR (#9955) by @hyf0

### ⚡ Performance

- disable preserve_parens across all parse paths (#10057) by @Dunqing
- common: inline declared_symbols with SmallVec (#9920) by @IWANABETHATGUY
- common: pack TaggedSymbolRef into 8 bytes (#9919) by @IWANABETHATGUY
- sourcemap: skip newline scan on the no-sourcemap join fast path (#9936) by @Boshen

### 🧪 Testing

- dev: error in lazy module should be catchable (#9975) by @sapphi-red
- dev: reject unknown lazy compile modules (#9969) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- deps: update actions/cache action to v6 (#10001) by @renovate[bot]
- trigger vite ecosystem-ci from PR comments (#10058) by @shulaoda
- deps: update napi to v3.10.0 (#10063) by @renovate[bot]
- remove unused From impl for RolldownLabelSpan (#10055) by @shulaoda
- remove dead Diagnostic::with_kind method (#10054) by @shulaoda
- remove unused StatementExt methods (#10053) by @shulaoda
- remove unused ExpressionExt methods (#10052) by @shulaoda
- remove commented-out re_export_all_names field (#10051) by @shulaoda
- deps: update pnpm to v11.9.0 (#10047) by @renovate[bot]
- remove the unused BindingGenerateHmrPatchReturn napi type (#10034) by @shulaoda
- remove the dead inline_entry_chunk_wrapping scaffolding (#10037) by @shulaoda
- deps: bump oxc_resolver to 11.22.0 (#10045) by @Boshen
- remove never-constructed MatchImportKind::_Ignore variant (#10041) by @shulaoda
- remove the unused ScheduledBuild napi struct (#10033) by @shulaoda
- remove dead compute_hmr_update_single method (#10040) by @shulaoda
- drop the redundant visited.insert in manual code splitting (#10038) by @shulaoda
- remove the dead output_assets vector in render_chunk_to_assets (#10036) by @shulaoda
- remove the unused From<String>/Display impls for BindingLogLevel (#10035) by @shulaoda
- deps: upgrade oxc to 0.138.0 and migrate to per-type AST construction (#10018) by @shulaoda
- deps: update rust crates (#9911) by @renovate[bot]
- deps: update test262 submodule for tests (#10016) by @rolldown-guard[bot]
- deps: update github actions (#9999) by @renovate[bot]
- deps: update npm packages (#10000) by @renovate[bot]

### ◀️ Revert

- "fix(plugin): make lazy hook metadata enumerable (#9991)" (#10005) by @shulaoda

### ❤️ New Contributors

* @x1024 made their first contribution in [#9948](#9948)

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>