perf(minifier): fix O(n²) perf on very many var decls#21062
perf(minifier): fix O(n²) perf on very many var decls#21062Boshen merged 4 commits intooxc-project:mainfrom
Conversation
…th many var declarations What --- Fix two O(n²) bottlenecks in the compressor that cause severe slowdowns on bundler output (esbuild, webpack CJS) containing thousands of root-level `var import_X = __toESM(require_Y())` declarations, the pattern reported in oxc-project#17369, where a file only 6% larger took 2.5x longer to minify. Fixes oxc-project#17369 How --- Two independent quadratic bottlenecks, both triggered by symbols with many resolved references: After each compressor iteration, stale references are cleaned up by diffing live references against the symbol table. The per-reference `delete_resolved_reference` method uses `.position()` internally, and that's a linear scan of the symbol's reference list, so repeated deletions are O(n²) when many references to the same symbol are removed. Replace with a new `Scoping::retain_resolved_references` method that iterates each symbol's reference list once, retaining only references present in the live set. Total cost: O(total_refs). `substitute_single_use_symbol_in_expression` walks target expressions looking for single-use symbols to inline. When it encounters an identifier, it calls `Scoping::symbol_is_mutated(symbol_id)` to check whether reordering past it is safe. That method scans all resolved references for the symbol looking for a write. For heavily-referenced symbols like `__toESM` (10k+ refs in the issue's bad.js), each call is O(num_refs), and with ~10k substitution attempts per iteration the total is O(n²). Add `PeepholeOptimizations::is_symbol_mutated` which checks the `write_references_count` already cached in `SymbolValue` (populated during `exit_variable_declarator` → `init_symbol_value`). This is O(1). Fall back to the full scan for symbols without cached values (function declarations and other non-variable bindings). Performance --- On the real-world files from oxc-project#17369: ``` good.js (10.4 MB): 37.6 → 38.7 ms (no change) bad.js (11.1 MB): 370.5 → 47.9 ms (7.7x faster) ratio bad/good: 9.9x → 1.2x ``` The bad/good ratio dropped from 9.9x to 1.2x, much closer to the ratio between the file sizes.
Merging this PR will improve performance by 5.17%
Performance Changes
Comparing Footnotes
|
|
I used Claude for the investigation and fixes, then reviewed and iterated. |
|
@codex review |
|
Codex Review: Didn't find any major issues. What shall we delve into next? ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Move `is_symbol_mutated` out of `peephole/minimize_statements.rs` into `peephole/mod.rs` Change `is_expression_that_reference_may_change` to use it instead of `ctx.scoping().symbol_is_mutated`
|
Updated PR description to account for the addition 6fc05ee and updated performance measurements. |
### 💥 BREAKING CHANGES - 36cdc31 str: [**BREAKING**] Remove identity `FromIn` impl for `Ident` (#21251) (overlookmotel) - 382958a span: [**BREAKING**] Remove re-exports of string types from `oxc_span` crate (#21246) (overlookmotel) - c4aedfa str: [**BREAKING**] Add `static_ident!` macro (#21245) (overlookmotel) ### 🚀 Features - e7e1aea transformer/typescript: Add `optimize_enums` option for regular enum inlining (#20539) (Dunqing) - 679f57f transformer/typescript: Implement const enum inlining and declaration removal (#20508) (Dunqing) - 6dd061c semantic: Extend `MemberWriteTarget` to cover all property modification patterns (#21205) (Dunqing) - f134e24 minifier: Support `property_write_side_effects` option to drop unused property assignments (#20773) (Dunqing) - 75663c0 semantic: Add enum member value evaluation for const enum support (#20602) (Dunqing) - 3cfe8ed semantic: Add `MemberWriteTarget` flag to `ReferenceFlags` (#20772) (Dunqing) ### 🐛 Bug Fixes - af1a586 transformer/class-properties: Use correct property name when converting parameter properties (#21268) (Amal Jossy) - b43250a allocator: Move allocation tracking into `Bump` (#21342) (overlookmotel) - 36f505f allocator: `StringBuilder` use `Allocator::alloc_layout` (#21340) (overlookmotel) - 7a08a6f allocator: Fix allocation counting in `Allocator::alloc_concat_strs_array` (#21336) (overlookmotel) - 2338e28 ecmascript: Treat `this` as potentially having side effects (#21297) (sapphi-red) - bd8bd39 allocator: Remove unsafe hacks from `from_raw_parts` methods (#21283) (overlookmotel) - 8f4c340 allocator: Remove dangerous pointer const to mut cast (#21279) (overlookmotel) - aa9259f parser: Add missing error code for optional param diagnostic (#21258) (camc314) - 04b3c2f str: Fix unsound casting const pointers to mut pointers (#21242) (overlookmotel) - ceadf6c str: Make `Ident::from_raw` an unsafe function (#21241) (overlookmotel) - eab13b3 transformer/decorators: Avoid accessor storage name collisions (#21106) (Dunqing) - 07e8a30 transformer/react-refresh: Handle parenthesized variable initializers (#21047) (camc314) ### ⚡ Performance - c3ca6f6 allocator: `StringBuilder::from_strs_array_in` check for 0 length earlier (#21338) (overlookmotel) - c2422bb allocator: `Allocator::alloc_concat_strs_array` check for 0 length earlier (#21337) (overlookmotel) - 04b0fdc allocator: Mark `Allocator::alloc_layout` as `#[inline(always)]` (#21335) (overlookmotel) - 17aee9e allocator: Use `offset_from_unsigned` in `ChunkFooter::as_raw_parts` (#21280) (overlookmotel) - 61adedd minifier: Fix O(n²) perf on very many var decls (#21062) (Gunnlaugur Thor Briem) - addcd02 napi/parser, linter/plugins: Raw transfer deserializer for `Vec`s use shift instead of multiply where possible (#21142) (overlookmotel) - 3068ded napi/parser, linter/plugins: Shift before add when calculating positions in raw transfer deserializer (#21141) (overlookmotel) - eb400b8 napi/parser, linter/plugins: Remove `uint32` buffer view (#21140) (overlookmotel) - 2675085 napi/parser: Lazy deserialization use only `Int32Array` (#21139) (overlookmotel) - 5b35a53 napi/parser: Deserializing tokens use only `int32` array (#21138) (overlookmotel) - f163d10 parser: Tokens raw deserialization use `Int32Array` (#21137) (overlookmotel) - 7a86613 linter/plugins: Use `Int32Array`s for tokens and comments buffers (#21136) (overlookmotel) - 8c51121 napi/parser, linter/plugins: Raw transfer deserialize `Span` fields as `i32`s (#21135) (overlookmotel) - bc1bcdd napi/parser, linter/plugins: Inline trivial raw transfer field deserializers into node object definitions (#21134) (overlookmotel) - c0278ab napi/parser, linter/plugins: Use `Int32Array` in raw transfer deserializer (#21132) (overlookmotel) - 43482c7 linter/plugins: Use `>>` not `>>>` in binary search loops (#21129) (overlookmotel) ### 📚 Documentation - f5e1845 allocator: Upgrade headers in doc comments for `Bump` (#21263) (overlookmotel) - 2870174 allocator: Upper case `SAFETY` in comments (#21253) (overlookmotel) - 01bc269 str: Reformat `Ident` doc comments (#21240) (overlookmotel) - dd47359 allocator: Add doc comments for panics and errors (#21230) (overlookmotel)
What
Fix two O(n²) bottlenecks in the compressor that cause severe slowdowns on bundler output (esbuild, webpack CJS) containing thousands of root-level
var import_X = __toESM(require_Y())declarations, the pattern reported in #17369, where a file only 6% larger took 2.5x longer to minify.Fixes #17369
Specifically:
Batch reference retention in
exit_programAfter each compressor iteration, stale references are cleaned up by diffing live references against the symbol table. The per-reference
delete_resolved_referencemethod uses.position()internally, and that's a linear scan of the symbol's reference list, so repeated deletions are O(n²) when many references to the same symbol are removed.Replace with a new
Scoping::retain_resolved_referencesmethod that iterates each symbol's reference list once, retaining only references present in the live set. Total cost: O(total_refs).Cached
is_symbol_mutatedhelperScoping::symbol_is_mutated(symbol_id)scans all resolved references for the symbol looking for a write. For heavily-referenced symbols like__toESM(10k+ refs in the issue's bad.js) each call is O(num_refs), and when it is called repeatedly during compression the total becomes O(n²).Add
PeepholeOptimizations::is_symbol_mutated, which reads thewrite_references_countalready cached inSymbolValue(populated duringexit_variable_declarator→init_symbol_value). This is O(1). It falls back to the full scan for symbols without cached values (function declarations and other non-variable bindings).Use this helper instead of
Scoping::symbol_is_mutatedin two places:Directly in
substitute_single_use_symbol_in_expression, which walks target expressions looking for single-use symbols to inline. When it encounters an identifier it checks whether reordering the replacement past that identifier is safe. With ~10k substitution attempts per iteration on bad.js, this was the dominant O(n²) contributor.Via
is_expression_that_reference_may_change, which checks whether the base object of a member-expression assignment target can be mutated. It is called from:substitute_single_use_symbol_in_expression, when the assignment target isfoo.bar = …member_object_may_be_mutated, which is called fromminimize_logical_expressionandremove_unused_expressionPerformance
On the real-world files from #17369, on my machine, the time taken by
changes thus:
The total minification time (parse + compress + mangle) changes thus:
For both compress and total, the bad/good ratio is now much closer to the ratio between the file sizes.