perf(semantic): use smallvec for storing reference IDs#17731
Conversation
c20628b to
9d509af
Compare
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
Merging this PR will improve performance by 8.44%Summary
Performance Changes
Comparing Footnotes
|
e48449b to
394c38c
Compare
There was a problem hiding this comment.
Pull request overview
This PR optimizes memory allocation performance in the semantic analyzer by using SmallVec to store reference IDs inline on the stack instead of heap-allocating Vec instances. The optimization targets the common case where identifiers are referenced 8 times or fewer within a scope, storing these references on the stack (32 bytes for 8 u32 IDs) and only falling back to heap allocation when needed.
- Introduces a
ReferenceIdstype alias usingSmallVec<[ReferenceId; 8]>for storing reference IDs - Updates all references from
Vec<ReferenceId>to use the newReferenceIdstype - Adjusts closure signature in
builder.rsto accommodate SmallVec's differentretainAPI
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| crates/oxc_semantic/src/unresolved_stack.rs | Introduces ReferenceIds type alias using SmallVec with inline capacity of 8 and updates TempUnresolvedReferences to use it |
| crates/oxc_semantic/src/scoping.rs | Imports ReferenceIds and updates set_root_unresolved_references signature to use the new type |
| crates/oxc_semantic/src/builder.rs | Adjusts retain closure parameter from &reference_id to reference_id to work with SmallVec's API |
| crates/oxc_semantic/Cargo.toml | Adds smallvec workspace dependency |
| Cargo.lock | Updates dependency graph with smallvec 1.15.1 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Dunqing
left a comment
There was a problem hiding this comment.
Nice work! A few line changes, big performance improvements!
Merge activity
|
We process a lot of identifiers. Some of the most commonly allocating operations are declaring identifier references and resolving them. However, most identifiers are not referenced hundreds of times. Instead, identifiers are typically referenced just a few times. So, we can optimize for the common case by inlining the list of reference IDs onto the stack through `smallvec`. As long as there are 8 IDs or fewer, then all of the references can be stored on the stack instead of in a heap allocated `Vec`. If there are more references than that, then it will heap allocate a larger array. Based on some [benchmarking of different inline lengths](#17731 (comment)), I think inlining 8 IDs is close to maximizing the performance here and makes for a nice round number: 8 u32 = 32 bytes. This yields good performance on the benchmarks and shouldn't pessimize the performance too much when there are more than 8 IDs, since it will just fall back to acting like a normal `Vec`.
394c38c to
3a452b8
Compare
### 🚀 Features - 10426af codegen: Print soft space between inline block comments on the same line (#17799) (camc314) - 2261e6e semantic: Improve error message to add `#` for private identifiers (#17779) (Dunqing) ### 🐛 Bug Fixes - 7422b7e parser/trivia: Correctly mark whether a block comment is on a newline (#17754) (camc314) - c32e8d5 codegen: Wrap `TSAsExpression` in parens when used with in/instanceof operators (#17752) (camc314) - 5755b2d semantic: Report duplicate private identifier for static and instance elements (#17591) (camc314) - 0600df3 isolated_declarations: Only print jsdoc comments (#17748) (camc314) - ef7e014 parser: Preserve `@__NO_SIDE_EFFECTS__` annotation with parenthesized expressions (#17711) (camc314) - 59a6228 parser: Detect TS1363 error for type-only imports with mixed default and named/namespace bindings (#17712) (Copilot) ### ⚡ Performance - 864f1fa semantic: Mark duplicate class element error reporting as cold (#17746) (camc314) - 3a452b8 semantic: Use smallvec for storing reference IDs (#17731) (camchenry) - d5979dc minifier: Do not allocate when checking to convert `const` to `let` (#17730) (camchenry) - 3f4429c parser: Do not re-allocate TS interface heritage (#17692) (camchenry) ### 📚 Documentation - 120a27c minifier: Add prettier-ignore for js-in-md part (#17687) (leaysgur) Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>






We process a lot of identifiers. Some of the most commonly allocating operations are declaring identifier references and resolving them. However, most identifiers are not referenced hundreds of times. Instead, identifiers are typically referenced just a few times. So, we can optimize for the common case by inlining the list of reference IDs onto the stack through
smallvec. As long as there are 8 IDs or fewer, then all of the references can be stored on the stack instead of in a heap allocatedVec. If there are more references than that, then it will heap allocate a larger array.Based on some benchmarking of different inline lengths, I think inlining 8 IDs is close to maximizing the performance here and makes for a nice round number: 8 u32 = 32 bytes. This yields good performance on the benchmarks and shouldn't pessimize the performance too much when there are more than 8 IDs, since it will just fall back to acting like a normal
Vec.