perf(minifier): use BitSet for LiveUsageCollector live references#22425
Conversation
Merging this PR will improve performance by 3.28%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | mangler[RadixUIAdoptionSection.jsx_keep_names] |
19.3 µs | 20 µs | -3.37% |
| ⚡ | Simulation | minifier[binder.ts] |
4 ms | 3.7 ms | +6.3% |
| ⚡ | Simulation | minifier[react.development.js] |
2.6 ms | 2.5 ms | +4.51% |
| ⚡ | Simulation | minifier[cal.com.tsx] |
36.5 ms | 34.4 ms | +5.97% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing perf/minifier-live-usage-bitset (5eb1e9d) with main (703557c)
Footnotes
-
3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
a1e2e4b to
5eb1e9d
Compare
Merge activity
|
…2425) ## Summary `LiveUsageCollector` runs after each peephole pass to recollect surviving `IdentifierReference` IDs, so dead references can be batch-pruned from each symbol's resolved-reference list. The set was a `FxHashSet<ReferenceId>` — `.insert()` per identifier during the walk, `.contains()` per resolved-reference entry during `Scoping::retain_resolved_references`. `ReferenceId` is a `nonmax_u32` index, dense from 0 to `scoping.references_len()`, so a fixed-size bitset is the natural fit. This swaps to the existing arena-allocated `oxc_allocator::BitSet` (already used by `oxc_mangler` and `oxc_linter` for `SymbolId`/`ScopeId`/CFG bitsets). | | FxHashSet | BitSet | |---|---|---| | insert | ~25 cycles | ~5 cycles | | contains | ~25 cycles | ~3 cycles | | memory | MB-scale | KB-scale, arena | Also adds `Scoping::references_len()`, mirroring the existing `symbols_len()` / `scopes_len()` getters. ## Behavior Unchanged. - `just minsize` snapshot byte-identical - `cargo test -p oxc_minifier` 492/492 pass, `cargo test -p oxc_semantic --features jsdoc` 60/60 pass - test262 parser conformance 47086/47086 positive, 4588/4588 negative `allocs_minifier.snap` updates reflect the architectural side benefit: heap allocations drop across every test input as the bitset goes through the arena instead. ## Wall-time `profile minify` loop, `--profile release-with-debug`, median of 5 runs: | file | iters | before | after | delta | |---|---|---|---|---| | cal.com.tsx (1.0M) | 500 | 16.76 ms | **16.09 ms** | **-4.0%** (variance 0.87 -> 0.04) | | pdf.mjs (554K) | 400 | 13.03 ms | 13.09 ms | noise | | antd.js (6.7M) | 60 | 88.20 ms | 89.21 ms | noise / +1.1% | | Radix.jsx (2.5K) | 20000 | 13.62 us | 14.04 us | +3.1% | Mixed picture: the win is clean and consistent on the file with the most references (cal.com.tsx, where the hashset's hashing + bucket-walk overhead is the dominant share of `LiveUsageCollector` cost), and the variance reduction there suggests improved cache behavior. Tiny files see a small absolute regression that's roughly proportional to the per-peephole-iteration bitset allocation cost; the absolute cost is bounded by the file's `references_len`, so it doesn't grow with workload size. A natural follow-up - store one bitset on `MinifierState`, `clear()` instead of allocating per iteration - would amortize the allocation cost away on small files. Deferred to keep this PR focused. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
5eb1e9d to
d782b78
Compare
Stacked on top of #22425. ## Summary Swap `SymbolValues`'s backing store from `FxHashMap<SymbolId, SymbolValue>` to `IndexVec<SymbolId, Option<SymbolValue>>`. Symbol IDs are dense `u32`s, so indexed access drops the hash + probe on every hot path — `inline_identifier_reference`, `is_symbol_mutated`, and the dead-code checks in `remove_unused_expression` / `minimize_statements`. The buffer is sized once from `Scoping::symbols_len()` at `MinifierState::new` and reset in place between peephole iterations, so the whole fixed-point loop stays on the indexed-write fast path. No minifier pass currently mints `SymbolId`s mid-run (verified across `peephole/`, `normalize/`, the mangler, and `keep_var.rs` — only `traverse_context/` defines `generate_uid_name` / `create_symbol`, with zero callers), so `init_value` is a single indexed write. If a future pass starts minting UIDs, that write panics on out-of-range — the signal to add a grow path. Local harness running `Compressor::build_with_scoping` against `TestFiles::minimal()` (`RadixUIAdoptionSection.jsx`, `react.development.js`, `cal.com.tsx`, `binder.ts`), 27 paired cycles in randomized order after warm-up: | binary | min | median | p75 | std | |-----------------------|----------:|----------:|----------:|---------:| | bitset base (#22425) | 6 905 125 | 6 958 167 | 6 975 501 | 37 944 | | this PR on top | 6 693 875 | 6 750 875 | 6 774 626 | 40 482 | Paired delta: this PR wins **27 / 27** cycles, median **−207 µs (−2.97 %)**, sign-test p < 0.000001. `cargo test -p oxc_minifier --tests` passes 494/494; allocation snapshot shows reduced sys allocs on every file.
### 🐛 Bug Fixes - 0f26de6 ecmascript: Resolve identifier value type via tracked constants (#22234) (Alexander Lichter) - c27a8cf minifier: Normalize `{ x: x }` shorthand so adjacent-if merge is idempotent (#22401) (Dunqing) - e431a0e parser: Break extends clause loop on fatal error (#22517) (Boshen) - e9ec7c6 minifier: Fold optional chains by base nullishness (#22236) (Alexander Lichter) - e6090e7 transformer: Keep enum IIFE when a non-inlinable value reference remains (#22501) (Dunqing) - 931b7d6 transformer: Inline const enum members through type-cast wrappers (#22500) (Dunqing) - b9615b2 codegen: Preserve string quotes in require() calls during minification (#22475) (zennnnnnn11) - c73c159 transformer/async-to-generator: Reparent parameter initializer scopes (#22507) (camc314) - ecfd3ca transformer/async-to-generator: Move only parameter bindings (#22503) (camc314) - 3ce3431 transformer/explicit-resource-managment: Preserve shadowed for-head block (#22451) (camc314) ### ⚡ Performance - ce92c6c semantic: `#[inline]` `Scoping::get_binding` (#22414) (Dunqing) - 98be95c regular_expression: Track regex flags via bitflags (#22427) (Boshen) - dbbc059 jsdoc: Skip should_attach_jsdoc when no remaining comments (#22409) (Boshen) - 217d7d8 minifier: Index `SymbolValues` by `SymbolId` (#22441) (Dunqing) - d782b78 minifier: Use BitSet for LiveUsageCollector live references (#22425) (Boshen)

Summary
LiveUsageCollectorruns after each peephole pass to recollect survivingIdentifierReferenceIDs, so dead references can be batch-pruned from each symbol's resolved-reference list. The set was aFxHashSet<ReferenceId>—.insert()per identifier during the walk,.contains()per resolved-reference entry duringScoping::retain_resolved_references.ReferenceIdis anonmax_u32index, dense from 0 toscoping.references_len(), so a fixed-size bitset is the natural fit. This swaps to the existing arena-allocatedoxc_allocator::BitSet(already used byoxc_manglerandoxc_linterforSymbolId/ScopeId/CFG bitsets).Also adds
Scoping::references_len(), mirroring the existingsymbols_len()/scopes_len()getters.Behavior
Unchanged.
just minsizesnapshot byte-identicalcargo test -p oxc_minifier492/492 pass,cargo test -p oxc_semantic --features jsdoc60/60 passallocs_minifier.snapupdates reflect the architectural side benefit: heap allocations drop across every test input as the bitset goes through the arena instead.Wall-time
profile minifyloop,--profile release-with-debug, median of 5 runs:Mixed picture: the win is clean and consistent on the file with the most references (cal.com.tsx, where the hashset's hashing + bucket-walk overhead is the dominant share of
LiveUsageCollectorcost), and the variance reduction there suggests improved cache behavior. Tiny files see a small absolute regression that's roughly proportional to the per-peephole-iteration bitset allocation cost; the absolute cost is bounded by the file'sreferences_len, so it doesn't grow with workload size.A natural follow-up - store one bitset on
MinifierState,clear()instead of allocating per iteration - would amortize the allocation cost away on small files. Deferred to keep this PR focused.🤖 Generated with Claude Code