Skip to content

perf(minifier): fix O(n²) perf on very many var decls#21062

Merged
Boshen merged 4 commits intooxc-project:mainfrom
gthb:fix-oxc17369-minifier-quadratic-perf
Apr 8, 2026
Merged

perf(minifier): fix O(n²) perf on very many var decls#21062
Boshen merged 4 commits intooxc-project:mainfrom
gthb:fix-oxc17369-minifier-quadratic-perf

Conversation

@gthb
Copy link
Copy Markdown
Contributor

@gthb gthb commented Apr 5, 2026

What

Fix two O(n²) bottlenecks in the compressor that cause severe slowdowns on bundler output (esbuild, webpack CJS) containing thousands of root-level var import_X = __toESM(require_Y()) declarations, the pattern reported in #17369, where a file only 6% larger took 2.5x longer to minify.

Fixes #17369

Specifically:

Batch reference retention in exit_program

After each compressor iteration, stale references are cleaned up by diffing live references against the symbol table. The per-reference delete_resolved_reference method uses .position() internally, and that's a linear scan of the symbol's reference list, so repeated deletions are O(n²) when many references to the same symbol are removed.

Replace with a new Scoping::retain_resolved_references method that iterates each symbol's reference list once, retaining only references present in the live set. Total cost: O(total_refs).

Cached is_symbol_mutated helper

Scoping::symbol_is_mutated(symbol_id) scans all resolved references for the symbol looking for a write. For heavily-referenced symbols like __toESM (10k+ refs in the issue's bad.js) each call is O(num_refs), and when it is called repeatedly during compression the total becomes O(n²).

Add PeepholeOptimizations::is_symbol_mutated, which reads the write_references_count already cached in SymbolValue (populated during exit_variable_declaratorinit_symbol_value). This is O(1). It falls back to the full scan for symbols without cached values (function declarations and other non-variable bindings).

Use this helper instead of Scoping::symbol_is_mutated in two places:

  • Directly in substitute_single_use_symbol_in_expression, which walks target expressions looking for single-use symbols to inline. When it encounters an identifier it checks whether reordering the replacement past that identifier is safe. With ~10k substitution attempts per iteration on bad.js, this was the dominant O(n²) contributor.

  • Via is_expression_that_reference_may_change, which checks whether the base object of a member-expression assignment target can be mutated. It is called from:

    • substitute_single_use_symbol_in_expression, when the assignment target is foo.bar = …
    • member_object_may_be_mutated, which is called from minimize_logical_expression and remove_unused_expression

Performance

On the real-world files from #17369, on my machine, the time taken by

Compressor::new(&allocator).build_with_scoping(
    &mut program,
    scoping,
    CompressOptions::smallest(),
)

changes thus:

good.js (10.4 MB):  48.61 → 47.56 ms  (2.2% faster, or maybe that's just noise)
bad.js  (11.1 MB): 389.18 → 57.18 ms  (6.8x faster)
ratio bad/good:      8.0x →  1.2x

The total minification time (parse + compress + mangle) changes thus:

good.js (10.4 MB):  91.50 → 91.04 ms  (0.5% faster, probably mostly noise!)
bad.js  (11.1 MB): 436.28 → 104.40 ms  (4.2x faster)
ratio bad/good:      4.8x →  1.15x

For both compress and total, the bad/good ratio is now much closer to the ratio between the file sizes.

…th many var declarations

What
---

Fix two O(n²) bottlenecks in the compressor that cause severe slowdowns on
bundler output (esbuild, webpack CJS) containing thousands of root-level `var
import_X = __toESM(require_Y())` declarations, the pattern reported in oxc-project#17369,
where a file only 6% larger took 2.5x longer to minify.

Fixes oxc-project#17369

How
---

Two independent quadratic bottlenecks, both triggered by symbols with many
resolved references:

After each compressor iteration, stale references are cleaned up by diffing
live references against the symbol table. The per-reference
`delete_resolved_reference` method uses `.position()` internally, and that's a
linear scan of the symbol's reference list, so repeated deletions are O(n²)
when many references to the same symbol are removed.

Replace with a new `Scoping::retain_resolved_references` method that iterates
each symbol's reference list once, retaining only references present in the
live set. Total cost: O(total_refs).

`substitute_single_use_symbol_in_expression` walks target expressions looking
for single-use symbols to inline. When it encounters an identifier, it calls
`Scoping::symbol_is_mutated(symbol_id)` to check whether reordering past it is
safe. That method scans all resolved references for the symbol looking for a
write. For heavily-referenced symbols like `__toESM` (10k+ refs in the issue's
bad.js), each call is O(num_refs), and with ~10k substitution attempts per
iteration the total is O(n²).

Add `PeepholeOptimizations::is_symbol_mutated` which checks the
`write_references_count` already cached in `SymbolValue` (populated during
`exit_variable_declarator` → `init_symbol_value`). This is O(1). Fall back to
the full scan for symbols without cached values (function declarations and
other non-variable bindings).

Performance
---

On the real-world files from oxc-project#17369:

```
good.js (10.4 MB):  37.6 → 38.7 ms  (no change)
bad.js  (11.1 MB): 370.5 → 47.9 ms  (7.7x faster)
ratio bad/good:      9.9x →  1.2x
```

The bad/good ratio dropped from 9.9x to 1.2x, much closer to the ratio between
the file sizes.
@gthb gthb requested a review from Dunqing as a code owner April 5, 2026 13:41
@github-actions github-actions Bot added A-semantic Area - Semantic A-minifier Area - Minifier C-performance Category - Solution not expected to change functional behavior, only performance labels Apr 5, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 5, 2026

Merging this PR will improve performance by 5.17%

⚡ 3 improved benchmarks
✅ 45 untouched benchmarks
⏩ 3 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation minifier[cal.com.tsx] 39.4 ms 37.4 ms +5.17%
Simulation minifier[react.development.js] 2.8 ms 2.7 ms +4.17%
Simulation minifier[binder.ts] 4.2 ms 4 ms +5.08%

Comparing gthb:fix-oxc17369-minifier-quadratic-perf (b948def) with main (edd0865)

Open in CodSpeed

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@gthb
Copy link
Copy Markdown
Contributor Author

gthb commented Apr 5, 2026

I used Claude for the investigation and fixes, then reviewed and iterated.

Comment thread crates/oxc_minifier/src/peephole/minimize_statements.rs Outdated
@Brooooooklyn
Copy link
Copy Markdown
Member

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. What shall we delve into next?

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Move `is_symbol_mutated` out of `peephole/minimize_statements.rs` into
`peephole/mod.rs`

Change `is_expression_that_reference_may_change` to use it instead of
`ctx.scoping().symbol_is_mutated`
@gthb
Copy link
Copy Markdown
Contributor Author

gthb commented Apr 7, 2026

Updated PR description to account for the addition 6fc05ee and updated performance measurements.

@Boshen Boshen merged commit 61adedd into oxc-project:main Apr 8, 2026
34 checks passed
camc314 pushed a commit that referenced this pull request Apr 13, 2026
### 💥 BREAKING CHANGES

- 36cdc31 str: [**BREAKING**] Remove identity `FromIn` impl for `Ident`
(#21251) (overlookmotel)
- 382958a span: [**BREAKING**] Remove re-exports of string types from
`oxc_span` crate (#21246) (overlookmotel)
- c4aedfa str: [**BREAKING**] Add `static_ident!` macro (#21245)
(overlookmotel)

### 🚀 Features

- e7e1aea transformer/typescript: Add `optimize_enums` option for
regular enum inlining (#20539) (Dunqing)
- 679f57f transformer/typescript: Implement const enum inlining and
declaration removal (#20508) (Dunqing)
- 6dd061c semantic: Extend `MemberWriteTarget` to cover all property
modification patterns (#21205) (Dunqing)
- f134e24 minifier: Support `property_write_side_effects` option to drop
unused property assignments (#20773) (Dunqing)
- 75663c0 semantic: Add enum member value evaluation for const enum
support (#20602) (Dunqing)
- 3cfe8ed semantic: Add `MemberWriteTarget` flag to `ReferenceFlags`
(#20772) (Dunqing)

### 🐛 Bug Fixes

- af1a586 transformer/class-properties: Use correct property name when
converting parameter properties (#21268) (Amal Jossy)
- b43250a allocator: Move allocation tracking into `Bump` (#21342)
(overlookmotel)
- 36f505f allocator: `StringBuilder` use `Allocator::alloc_layout`
(#21340) (overlookmotel)
- 7a08a6f allocator: Fix allocation counting in
`Allocator::alloc_concat_strs_array` (#21336) (overlookmotel)
- 2338e28 ecmascript: Treat `this` as potentially having side effects
(#21297) (sapphi-red)
- bd8bd39 allocator: Remove unsafe hacks from `from_raw_parts` methods
(#21283) (overlookmotel)
- 8f4c340 allocator: Remove dangerous pointer const to mut cast (#21279)
(overlookmotel)
- aa9259f parser: Add missing error code for optional param diagnostic
(#21258) (camc314)
- 04b3c2f str: Fix unsound casting const pointers to mut pointers
(#21242) (overlookmotel)
- ceadf6c str: Make `Ident::from_raw` an unsafe function (#21241)
(overlookmotel)
- eab13b3 transformer/decorators: Avoid accessor storage name collisions
(#21106) (Dunqing)
- 07e8a30 transformer/react-refresh: Handle parenthesized variable
initializers (#21047) (camc314)

### ⚡ Performance

- c3ca6f6 allocator: `StringBuilder::from_strs_array_in` check for 0
length earlier (#21338) (overlookmotel)
- c2422bb allocator: `Allocator::alloc_concat_strs_array` check for 0
length earlier (#21337) (overlookmotel)
- 04b0fdc allocator: Mark `Allocator::alloc_layout` as
`#[inline(always)]` (#21335) (overlookmotel)
- 17aee9e allocator: Use `offset_from_unsigned` in
`ChunkFooter::as_raw_parts` (#21280) (overlookmotel)
- 61adedd minifier: Fix O(n²) perf on very many var decls (#21062)
(Gunnlaugur Thor Briem)
- addcd02 napi/parser, linter/plugins: Raw transfer deserializer for
`Vec`s use shift instead of multiply where possible (#21142)
(overlookmotel)
- 3068ded napi/parser, linter/plugins: Shift before add when calculating
positions in raw transfer deserializer (#21141) (overlookmotel)
- eb400b8 napi/parser, linter/plugins: Remove `uint32` buffer view
(#21140) (overlookmotel)
- 2675085 napi/parser: Lazy deserialization use only `Int32Array`
(#21139) (overlookmotel)
- 5b35a53 napi/parser: Deserializing tokens use only `int32` array
(#21138) (overlookmotel)
- f163d10 parser: Tokens raw deserialization use `Int32Array` (#21137)
(overlookmotel)
- 7a86613 linter/plugins: Use `Int32Array`s for tokens and comments
buffers (#21136) (overlookmotel)
- 8c51121 napi/parser, linter/plugins: Raw transfer deserialize `Span`
fields as `i32`s (#21135) (overlookmotel)
- bc1bcdd napi/parser, linter/plugins: Inline trivial raw transfer field
deserializers into node object definitions (#21134) (overlookmotel)
- c0278ab napi/parser, linter/plugins: Use `Int32Array` in raw transfer
deserializer (#21132) (overlookmotel)
- 43482c7 linter/plugins: Use `>>` not `>>>` in binary search loops
(#21129) (overlookmotel)

### 📚 Documentation

- f5e1845 allocator: Upgrade headers in doc comments for `Bump` (#21263)
(overlookmotel)
- 2870174 allocator: Upper case `SAFETY` in comments (#21253)
(overlookmotel)
- 01bc269 str: Reformat `Ident` doc comments (#21240) (overlookmotel)
- dd47359 allocator: Add doc comments for panics and errors (#21230)
(overlookmotel)
@gthb gthb deleted the fix-oxc17369-minifier-quadratic-perf branch April 15, 2026 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-minifier Area - Minifier A-semantic Area - Semantic C-performance Category - Solution not expected to change functional behavior, only performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

minifier: Performance regression: 2.5x slower minification with minor input size increase

4 participants