perf(semantic): use smallvec for storing reference IDs #17731

Merged
graphite-app[bot] merged 1 commit into main from 01-06-perf_semantic_use_smallvec_for_storing_reference_ids
Jan 7, 2026

@camchenry camchenry commented Jan 7, 2026

We process a lot of identifiers. Some of the most commonly allocating operations are declaring identifier references and resolving them. However, most identifiers are not referenced hundreds of times. Instead, identifiers are typically referenced just a few times. So, we can optimize for the common case by inlining the list of reference IDs onto the stack through smallvec. As long as there are 8 IDs or fewer, then all of the references can be stored on the stack instead of in a heap allocated Vec. If there are more references than that, then it will heap allocate a larger array.

Based on some benchmarking of different inline lengths, I think inlining 8 IDs is close to maximizing the performance here and makes for a nice round number: 8 × u32 = 32 bytes. This yields good performance on the benchmarks and shouldn't pessimize performance much when there are more than 8 IDs, since it will just fall back to acting like a normal Vec.
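The inline-then-spill behavior described above can be sketched with the standard library alone. This is only an illustration of the idea, not the `smallvec` implementation and not the code in this PR: `InlineIds`, `INLINE_CAP`, and `inline_buffer_bytes` are hypothetical names, and `ReferenceId` is assumed to be a plain `u32`, based on the "8 u32 = 32 bytes" arithmetic in the description.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the real `ReferenceId` (assumed to be 32 bits wide).
type ReferenceId = u32;

/// Inline capacity discussed in the PR description.
const INLINE_CAP: usize = 8;

/// Simplified small-vector: up to `INLINE_CAP` IDs live inside the value itself,
/// and anything beyond that moves to a heap-allocated `Vec`.
#[derive(Debug)]
enum InlineIds {
    Inline { len: usize, buf: [ReferenceId; INLINE_CAP] },
    Heap(Vec<ReferenceId>),
}

impl InlineIds {
    fn new() -> Self {
        InlineIds::Inline { len: 0, buf: [0; INLINE_CAP] }
    }

    fn push(&mut self, id: ReferenceId) {
        match self {
            InlineIds::Inline { len, buf } => {
                if *len < INLINE_CAP {
                    // Common case: no allocation, just write into the inline buffer.
                    buf[*len] = id;
                    *len += 1;
                } else {
                    // Inline buffer is full: copy everything to the heap once.
                    let mut spilled = Vec::with_capacity(INLINE_CAP * 2);
                    spilled.extend_from_slice(&buf[..]);
                    spilled.push(id);
                    *self = InlineIds::Heap(spilled);
                }
            }
            // After spilling, behave like an ordinary `Vec`.
            InlineIds::Heap(ids) => ids.push(id),
        }
    }

    /// True once the IDs have moved to the heap.
    fn spilled(&self) -> bool {
        matches!(self, InlineIds::Heap(_))
    }

    fn as_slice(&self) -> &[ReferenceId] {
        match self {
            InlineIds::Inline { len, buf } => &buf[..*len],
            InlineIds::Heap(ids) => ids.as_slice(),
        }
    }
}

/// Size of the inline buffer alone, excluding the length and enum tag.
fn inline_buffer_bytes() -> usize {
    size_of::<[ReferenceId; INLINE_CAP]>()
}
```

Pushing eight IDs keeps the value in its `Inline` variant; the ninth push triggers the single copy to the heap, after which growth follows the usual `Vec` strategy. The trade-off is that every list carries the inline buffer even when it holds only one ID, which is presumably why several inline lengths were benchmarked.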

@github-actions github-actions bot added A-semantic Area - Semantic C-performance Category - Solution not expected to change functional behavior, only performance labels Jan 7, 2026
@camchenry camchenry force-pushed the 01-06-perf_semantic_use_smallvec_for_storing_reference_ids branch from c20628b to 9d509af Compare January 7, 2026 03:28

codspeed-hq bot commented Jan 7, 2026

Merging this PR will improve performance by 8.44%

Summary

⚡ 5 improved benchmarks
✅ 33 untouched benchmarks
⏩ 7 skipped benchmarks (see footnote 1)

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | semantic[react.development.js] | 1.3 ms | 1.3 ms | +6.44% |
| Simulation | semantic[cal.com.tsx] | 22.1 ms | 21.2 ms | +4.54% |
| Simulation | semantic[RadixUIAdoptionSection.jsx] | 64.4 µs | 62.5 µs | +3.04% |
| Simulation | semantic[binder.ts] | 3.4 ms | 3.1 ms | +8.44% |
| Simulation | codegen[cal.com.tsx] | 34.9 ms | 33.8 ms | +3.1% |

Comparing 01-06-perf_semantic_use_smallvec_for_storing_reference_ids (394c38c) with main (c115f4e)

Footnotes

  1. 7 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

camchenry commented Jan 7, 2026

I'm going to benchmark a few sizes. These are the results for inlining 1 ID:

Screenshot 2026-01-06 at 10 40 30 PM

2 IDs:

Screenshot 2026-01-06 at 10 33 55 PM

4 IDs:

Screenshot 2026-01-06 at 10 48 38 PM

8 IDs:

Screenshot 2026-01-06 at 10 58 19 PM

12 IDs:

Screenshot 2026-01-06 at 11 09 25 PM

@camchenry camchenry force-pushed the 01-06-perf_semantic_use_smallvec_for_storing_reference_ids branch 5 times, most recently from e48449b to 394c38c Compare January 7, 2026 04:12
@camchenry camchenry marked this pull request as ready for review January 7, 2026 04:34
@camchenry camchenry requested a review from Dunqing as a code owner January 7, 2026 04:34
Copilot AI review requested due to automatic review settings January 7, 2026 04:34
Copilot AI left a comment

Pull request overview

This PR optimizes memory allocation performance in the semantic analyzer by using SmallVec to store reference IDs inline on the stack instead of heap-allocating Vec instances. The optimization targets the common case where identifiers are referenced 8 times or fewer within a scope, storing these references on the stack (32 bytes for 8 u32 IDs) and only falling back to heap allocation when needed.

  • Introduces a ReferenceIds type alias using SmallVec<[ReferenceId; 8]> for storing reference IDs
  • Updates all references from Vec<ReferenceId> to use the new ReferenceIds type
  • Adjusts closure signature in builder.rs to accommodate SmallVec's different retain API
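The closure change in the last bullet comes down to which reference type `retain` hands to its callback. The sketch below uses only the standard library to show the two shapes: `Vec::retain` passes `&T`, while `Vec::retain_mut` passes `&mut T`, which is the shape that `SmallVec::retain` in smallvec 1.x is understood to use (an assumption taken from its documentation, not verified against this PR's diff). The function names and the `u32` alias for `ReferenceId` are hypothetical.

```rust
/// Hypothetical stand-in for the real `ReferenceId`.
type ReferenceId = u32;

/// `Vec::retain` passes `&T`, so the closure can destructure with `&reference_id`.
fn keep_even_shared(ids: &mut Vec<ReferenceId>) {
    ids.retain(|&reference_id| reference_id % 2 == 0);
}

/// `Vec::retain_mut` passes `&mut T`. A `&reference_id` pattern does not match
/// a `&mut` reference, so the closure takes the reference and dereferences it.
fn keep_even_mut(ids: &mut Vec<ReferenceId>) {
    ids.retain_mut(|reference_id| *reference_id % 2 == 0);
}
```

Both functions filter identically; only the closure parameter differs, which matches the "`&reference_id` to `reference_id`" adjustment described for builder.rs.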

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| crates/oxc_semantic/src/unresolved_stack.rs | Introduces ReferenceIds type alias using SmallVec with inline capacity of 8 and updates TempUnresolvedReferences to use it |
| crates/oxc_semantic/src/scoping.rs | Imports ReferenceIds and updates set_root_unresolved_references signature to use the new type |
| crates/oxc_semantic/src/builder.rs | Adjusts retain closure parameter from &reference_id to reference_id to work with SmallVec's API |
| crates/oxc_semantic/Cargo.toml | Adds smallvec workspace dependency |
| Cargo.lock | Updates dependency graph with smallvec 1.15.1 |

@Dunqing Dunqing left a comment

Nice work! A few line changes, big performance improvements!

@camc314 camc314 added the 0-merge Merge with Graphite Merge Queue label Jan 7, 2026
camc314 commented Jan 7, 2026

Merge activity

@graphite-app graphite-app bot force-pushed the 01-06-perf_semantic_use_smallvec_for_storing_reference_ids branch from 394c38c to 3a452b8 Compare January 7, 2026 09:19
@graphite-app graphite-app bot merged commit 3a452b8 into main Jan 7, 2026
21 checks passed
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label Jan 7, 2026
@graphite-app graphite-app bot deleted the 01-06-perf_semantic_use_smallvec_for_storing_reference_ids branch January 7, 2026 09:25
graphite-app bot pushed a commit that referenced this pull request Jan 8, 2026
graphite-app bot pushed a commit that referenced this pull request Jan 8, 2026
Dunqing pushed a commit that referenced this pull request Jan 12, 2026
### 🚀 Features

- 10426af codegen: Print soft space between inline block comments on the
same line (#17799) (camc314)
- 2261e6e semantic: Improve error message to add `#` for private
identifiers (#17779) (Dunqing)

### 🐛 Bug Fixes

- 7422b7e parser/trivia: Correctly mark whether a block comment is on a
newline (#17754) (camc314)
- c32e8d5 codegen: Wrap `TSAsExpression` in parens when used with
in/instanceof operators (#17752) (camc314)
- 5755b2d semantic: Report duplicate private identifier for static and
instance elements (#17591) (camc314)
- 0600df3 isolated_declarations: Only print jsdoc comments (#17748)
(camc314)
- ef7e014 parser: Preserve `@__NO_SIDE_EFFECTS__` annotation with
parenthesized expressions (#17711) (camc314)
- 59a6228 parser: Detect TS1363 error for type-only imports with mixed
default and named/namespace bindings (#17712) (Copilot)

### ⚡ Performance

- 864f1fa semantic: Mark duplicate class element error reporting as cold
(#17746) (camc314)
- 3a452b8 semantic: Use smallvec for storing reference IDs (#17731)
(camchenry)
- d5979dc minifier: Do not allocate when checking to convert `const` to
`let` (#17730) (camchenry)
- 3f4429c parser: Do not re-allocate TS interface heritage (#17692)
(camchenry)

### 📚 Documentation

- 120a27c minifier: Add prettier-ignore for js-in-md part (#17687)
(leaysgur)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>