Skip to content

refactor(minifier)!: typed mutation helpers + incremental scoping refresh#22736

Closed
Dunqing wants to merge 56 commits into
mainfrom
spec/minifier-incremental-scoping
Closed

refactor(minifier)!: typed mutation helpers + incremental scoping refresh#22736
Dunqing wants to merge 56 commits into
mainfrom
spec/minifier-incremental-scoping

Conversation

@Dunqing

@Dunqing Dunqing commented May 26, 2026

Copy link
Copy Markdown
Member

AI disclosure (per CONTRIBUTING.md): This PR was developed with substantial AI assistance (Claude Code). All commits were reviewed for correctness, tested locally, and benchmarked. The contributor is responsible for the content.

Summary

Two-stage refactor of how oxc_minifier tracks AST mutations:

  1. Helper-driven mutation API (commits 1-21): replaces ~190 manual ctx.state.changed = true writes scattered across crates/oxc_minifier/src/peephole/ with 8 typed helpers on TraverseCtx<MinifierState> (replace_expression, replace_statement, replace_assignment_target_property, replace_property_key, replace_for_statement_left, drop_expression, drop_statement, notice_change). CI gates (tools/check_state_changed.sh + an ast-grep rule) prevent future bypasses. Makes "forgot to mark the loop dirty" — the bug pattern behind fix(minifier): mark peephole loop changed when dropping dead-after-throw statement #22722, refactor(codegen): consolidate cjs-module-lexer helpers into one module #22552, and others — a compile-error rather than a silent miss.

  2. Incremental scoping refresh (commits 22-39): deletes the per-pass LiveUsageCollector walk over the entire live program in exit_program. Mutation helpers now walk the dropped + replacement subtrees in place, accumulating dead ReferenceIds into a per-pass PassDirty set that exit_program consumes directly. Direct-eval flag refresh runs only when an eval(...) call was dropped. state.changed: bool removed; replaced by mutations: u64 snapshot-compared by the fixed-point loop driver.

Both stages are documented in spec/plan files committed in the PR:

  • docs/superpowers/specs/2026-05-25-minifier-eliminate-changed-flag-design.md
  • docs/superpowers/specs/2026-05-26-minifier-incremental-scoping-refresh-design.md
  • docs/superpowers/plans/* (per-spec)

Latent bugs surfaced and fixed during the migration

The discipline of routing every AST mutation through a typed helper exposed real silent-mutation bugs the prior global walk was masking:

  • 3 missed *e = … writes in remove_unused_expression.rs (remove_unused_array_expr two arms + remove_unused_class_expr) where callers consumed the bool return without bumping notice_change.
  • 2 field-writes in convert_to_dotted_properties.rs (the function took &TraverseCtx immutably).
  • 1 caller-tracked bypass in minimize_statements.rs:1387 (substitute_single_use_symbol_in_expression).
  • 9 Pattern A field-write bypasses across 5 files where direct <slot>.<field> = X writes lacked any change tracking (callee, for-statement-left, conditional-expression injects).
  • 4 idempotency-blocked sites in try_fold_if (gated structurally so typed helpers can land without infinite fixed-point iteration).
  • 4 additional silent leaks surfaced by deleting LiveUsageCollector — fold-constants ChainFold::Collapse, minimize_conditional transient wrapping, fold_arguments_into_needed_expressions drain, and retain_mut filters in declaration removal.

None of these changed minified output on the tasks/minsize/ corpus, but each was a latent correctness hazard.

Performance

Single-file minifier benchmarks (baseline = spec/minifier-eliminate-changed-flag with helpers but no incremental scoping; new = this PR's HEAD):

Workload Baseline New Δ
kitchen-sink.tsx 13090 µs 6977 µs −46.70%
cal.com.tsx 5901 µs 5712 µs −3.21%
binder.ts 552 µs 543 µs −1.61%
react.development.js 376 µs 402 µs +6.77%

Mangler benchmarks unchanged (within noise on all 6 files).

Memory (tasks/track_memory_allocations/allocs_minifier.snap) shows the same shape: small sys-alloc increases on most files (+4 to +31), a large drop on kitchen-sink.tsx (−1356 sys allocs, −46566 arena allocs). The big files where LiveUsageCollector dominated win sharply; the small file regression on react.development.js is the per-helper FxHashSet insert overhead the spec's §6.3 already documented mitigation tiers for (hybrid BitSet + overflow set, lazy init, etc.). Deferred — kitchen-sink.tsx-class inputs are the load-bearing case for bundler users.

Breaking changes

  • MinifierState::changed: bool field removed. External crates reading the field will not compile. The replacement signal is MinifierState::mutations: u64 (read-only outside the helpers).
  • New helper API: TraverseCtx::replace_* / drop_* / notice_change are the only sanctioned mutation entry points; direct state.changed = true writes are forbidden by tools/check_state_changed.sh.

Verification

Rebased onto latest main

Rebased on top of #22722 (dead-after-throw fix), #22729 (cjs-module-lexer hint), and the other 4 commits added since branching. Three semantic conflicts in minimize_statements.rs (upstream's state.changed = true writes from #22722 needed to flow through ctx.drop_statement(&stmt) / ctx.notice_change() to satisfy the new lockdown gate) — resolved in commit 0335d0b562. The Pattern E dead_drop_mutates_ast gate from #22722 is the very pattern this PR's spec was named after; it now correctly uses the typed helpers from this PR.

Commit structure

39 commits, organized by stage:

  • Helper migration (27 commits, originally on spec/minifier-eliminate-changed-flag): spec → plan → PR 1 adds helpers → PRs 2-17 per-file mechanical migrations → PR 18 lockdown (pub(crate) + CI gates).
  • Incremental scoping (12 commits): spec → plan → Commit 1 latent-bug fixes → Commit 2 drop_* helpers + Pattern C/D migrations → Pattern A field-write bypass sweep → Commit 3 counter → Commit 4 DropDiff infrastructure (no-op observable) → try_fold_if idempotency gate → Commit 5 switch consumer + delete LiveUsageCollector.

Each commit is individually revertable for targeted rollback (see specs §9).

@github-actions github-actions Bot added A-parser Area - Parser A-semantic Area - Semantic A-minifier Area - Minifier labels May 26, 2026
@codspeed-hq

codspeed-hq Bot commented May 26, 2026

Copy link
Copy Markdown

Merging this PR will improve performance by 58.85%

⚡ 2 improved benchmarks
✅ 60 untouched benchmarks
⏩ 9 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation minifier[kitchen-sink.tsx] 63.6 ms 32 ms +99.11%
Simulation pipeline[kitchen-sink.tsx] 122 ms 96.3 ms +26.72%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing spec/minifier-incremental-scoping (aa36ef3) with main (123d4f4)

Open in CodSpeed

Footnotes

  1. 9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions github-actions Bot added the A-allocator Area - Allocator label May 26, 2026
Dunqing added a commit that referenced this pull request May 26, 2026
…places an identifier

Before the minifier's incremental-scoping refactor (#22736), the per-pass
`LiveUsageCollector` walked the live program at `exit_program` and rebuilt
`Scoping::resolved_references` from scratch. That walk silently masked the
fact that `ReplaceGlobalDefines` never told `Scoping` about the identifier
references it was replacing — the next minifier pass would simply recompute
them and find that the replaced identifier had no surviving occurrences.

With the LiveUsageCollector gone, only references that the minifier itself
drops via `replace_*`/`drop_*` helpers are pruned. References that callers
drop before `dead_code_elimination_with_scoping` get to run are now visible
as stale entries in `resolved_references`, and `symbol_is_unused` returns
`false` for symbols whose every textual occurrence has been replaced.

Concretely, the existing `replace_global_defines::declare_const` integration
test (added by #12427 to align with rollup) fails: the test expects the
`declare const IS_PROD: boolean;` line to be DCE'd after every reference to
`IS_PROD` is replaced with `true`, but the leaked references keep DCE from
seeing IS_PROD as unused.

The fix delegates reference deletion to the plugin: when
`replace_identifier_define_impl` returns a replacement value, the caller
captures the old identifier's `(reference_id, symbol_id, name)` BEFORE the
replacement and calls `Scoping::delete_reference` (resolved case) or
`delete_root_unresolved_reference` (truly global case) only AFTER the AST
mutation actually commits. The split matters at the assignment-target site —
`replace_define_with_assignment_expr` may bail when the replacement value
isn't a valid LHS (e.g. `IS_PROD = 0` with define `IS_PROD -> true` would
need `true = 0`, which `assignment_target_from_expr` rejects); the
reference must stay in that case.

New test `declare_const_assignment_target_bailout` pins the bail path.
Dunqing added a commit that referenced this pull request May 26, 2026
…s into PassDirty

Two more silent reference-drop sites surfaced by Codex adversarial
review of #22736, same bug class as the four fixed in 0856533 and
the two fixed in faaa4ba. Both sites mutated the AST through
`notice_change` only, even though the mutation discarded a subtree
that could carry identifier references — leaving stale entries in
`Scoping::resolved_references` across passes.

1. `remove_unused_array_expr` (`peephole/remove_unused_expression.rs`):
   the `retain_mut` predicate dropped a side-effect-free
   `ArrayExpressionElement::SpreadElement` without walking its
   `argument`. The `may_have_side_effects` check still uses the
   `ArrayExpressionElement` impl (so `[...ident]` is still treated as
   side-effecting via the iterator protocol and kept), but when the
   spread IS elided the argument now routes through
   `ctx.drop_expression`.

2. `fold_object_exp` (`peephole/fold_constants.rs`) had three skipped
   drop paths:
   - the `is_expression_undefined` short-circuit, which discarded the
     spread argument with `continue`,
   - the `should_fold_spread_element` branch, which dropped the
     argument with a bare `// skip` comment,
   - the inlined-object `__proto__` filter, which used
     `filter(...).collect` to drop non-computed `__proto__` properties
     and their value expressions.
   The first two now call `ctx.drop_expression(&spread.argument)`
   before continuing. The filter chain is rewritten as an explicit loop
   that calls `ctx.drop_expression(&p.value)` on each elided
   `__proto__` property; the property key is always a non-computed
   static name and carries no references.

Regression tests:

- `test_array_spread_drop_walks_argument_refs` in
  `tests/peephole/remove_unused_expression.rs` covers the elided
  spread path through nested array literals
  (`[...[function(){}]]`). Existing `MayHaveSideEffects` rules treat
  every spread shape that retains identifier refs as side-effecting,
  so there is no source-level case where the new walk is observable
  on stable inputs — the test is documented as defensive and
  conformance is the live discriminator.
- `test_fold_object_spread_drop_walks_argument_refs` in
  `tests/peephole/fold_constants.rs` covers the two
  ref-bearing fold paths: `{...function(){ x = 'b' }}` (path 2) and
  `{...{__proto__: function(){ x = 'b' }}}` (path 3). Without the
  fix, the stale write-reference on `x` blocks the constant inline
  on the follow-up pass and the test fails; with the fix, the inline
  fires.

Verification: `cargo coverage -- minifier` returns 44795/44799 (matches
main). Unit tests pass (508 + 2 doctests). `just minsize` shows zero
delta against the corpus snapshots. Clippy/fmt/ast-grep clean.
Dunqing added 23 commits June 9, 2026 15:35
The active branch fix/minifier-mark-changed-on-dead-stmt-drop is the
latest instance of a recurring bug class: missing one manual
`ctx.state.changed = true` silently disables peephole fixed-point
convergence. This spec covers replacing all 188 manual writes across
16 files with typed mutation helpers (`replace_expression`,
`replace_statement`, `notice_change`, `reset_changed`), with CI
checks to prevent regression.

The narrower scope reflects two adversarial review rounds: an earlier
draft also redesigned scoping refresh, which proved to have
correctness gaps requiring their own design pass (deferred in section
6).
Implementation plan for the spec committed in
`docs/superpowers/specs/2026-05-25-minifier-eliminate-changed-flag-design.md`.

Decomposes the migration into 18 PRs: PR 1 adds the four mutation
helpers and migrates the per-pass false-reset; PRs 2-17 mechanically
convert one peephole module each, ordered smallest-first; PR 18 is the
final lockdown that makes `MinifierState::changed` `pub(crate)` and
adds two CI checks (a grep for direct field writes plus an ast-grep
rule that catches the `*slot = new; ctx.notice_change();` bypass).

Each per-file task includes baseline + post counts so executing
engineers can verify zero drift, and `minimize_statements.rs` (Task
17) flags the `dead_drop_mutates_ast` gate that, if removed, would
spin the peephole fixed-point loop forever.
The per-file breakdown summed to 188 but the actual count is 187 — the
minimize_statements.rs entry was 45 but is actually 44. Update both the
spec and the plan to match the real numbers grep produces.
…r re-export

Clippy 1.95.0 flags the `pub(crate) use` at `lexer/mod.rs:38` because the
`mod lexer` declaration is private in the default build. But the feature-gated
`pub mod lexer` (cfg(feature = "benchmarking")) means widening to `pub` would
change visibility under benchmarking. Annotate with `#[expect(...)]` instead.

Unblocks the pre-commit clippy hook for any crate that depends on oxc_parser.
Introduces `replace_expression`, `replace_statement`, `notice_change`,
and `reset_changed` as the sanctioned API for marking a peephole pass as
having mutated the AST. Migrates the single `state.changed = false`
reset in `enter_program` to `reset_changed()`; all `= true` sites
will be migrated in subsequent PRs in this stack.

The legacy field stays `pub` for now so unmigrated modules can keep
writing to it directly. The final PR (`refactor(minifier): lock down
state.changed writes`) makes the field `pub(crate)` and adds the CI
check.
Adds rule 1a to the Site classification rule: field assignments to
`Statement<'a>` / `Expression<'a>` fields (e.g. `for_stmt.body = X`)
count as slot replacements and must use the typed `replace_*` helpers,
not `notice_change()`. The ast-grep CI rule only catches deref-write
bypasses, so reviewers enforce the broader semantic contract.

Caught during PR 4 spec review where both sites had been migrated to
`notice_change()` based on a literal reading of the syntactic rule.
Fixing PR 4's commit follows in the next commit on this branch.
…inimize_for_statement.rs

Spec review on PR 4 flagged that for_stmt.body = drop_first_statement(...)
is a slot replacement (`for_stmt.body` is `Statement<'a>`), not an
in-place tweak — replace_statement applies, not notice_change. Plan
rule 1a (added in commit d18379e) makes this explicit.
Preserves the `dead_drop_mutates_ast` gate at line 22 — dropping
that gate would let the peephole fixed-point loop spin forever on
`var x;` declarations that `KeepVar` re-emits unchanged.
Dunqing and others added 19 commits June 9, 2026 15:35
Adds the per-pass dirty-data accumulator that commit 5 will consume to
drive an incremental scoping refresh. This commit is no-op observable:
`LiveUsageCollector` still authoritatively prunes dead references and
refreshes direct-eval flags in `exit_program`. Zero minsize delta.

Pieces:

- `MinifierState.dirty: PassDirty<'a>` holds the per-pass accumulator
  (`dead_refs: FxHashSet<ReferenceId>`,
  `dead_unresolved: FxHashSet<Ident<'a>>`, `eval_dropped: bool`).
  Reset in `enter_program` alongside the existing `symbol_values` /
  `proto_write_symbols` resets.

- New `traverse_context::drop_diff::DropDiff` collector wraps
  `PassDirty` + a `&Scoping` snapshot and exposes `walk_old_*`
  (mark-dead) and `resurrect_from_*` (un-mark-dead) walks for the 5
  AST shapes the helpers handle: expression, statement,
  assignment-target-property, property-key, for-statement-left. It also
  recognises direct `eval(...)` calls in mark-dead mode and sets
  `eval_dropped` accordingly.

- The 5 `replace_*` and 2 `drop_*` walking helpers in
  `ecma_context.rs` now go through a private `dirty_diff()` accessor
  that destructures `TraverseCtx` into disjoint `state` and
  `scoping` field-level borrows. `replace_*` invokes
  `walk_old_<shape>` then `resurrect_from_<shape>` so references
  reused via `take_in` aren't spuriously marked dead. The two
  `drop_*` helpers only walk the old subtree. `notice_change` stays
  walk-free.

- `DropDiff` tolerates `IdentifierReference` nodes whose
  `reference_id` is `None`: peephole code routinely calls
  `take_in` which leaves a dummy in the old slot, and the synthetic
  identifiers built via `AstBuilder::identifier_reference` have no
  semantic data. Such nodes carry nothing dead to mark.

Sanity-checked with a debug print in `exit_program` running across the
peephole unit-test suite: all three dirty fields (`dead_refs`,
`dead_unresolved`, `eval_dropped`) get populated, confirming the
wiring fires. Debug print removed before commit.

Gates: `cargo test -p oxc_minifier` (504 passed) /
`cargo test -p oxc_mangler` (7 passed) pass, `cargo minsize` is
zero-delta, `check_state_changed.sh` and `ast-grep scan` are clean,
`cargo clippy -p oxc_minifier -- -D warnings` is clean.
…pass sites

Pre-Commit-5 prerequisite for the incremental scoping refresh. The 4
Pattern A field-write bypasses in try_fold_if (test rewrite + alternate
install + consequent install + consequent empty) were deferred during
the Commit 2 bypass sweep because routing them through the typed helpers
(which bump state.mutations) would prevent fixed-point convergence.

Two distinct non-idempotency causes, each gated separately:

1. Test rewrite (was line 78): NumericLiteral(1.0 or 0.0) still
   re-evaluates to Some(true)/Some(false) via the enclosing
   evaluate_value_to_boolean predicate. Skip the rewrite when the test
   is already the canonical numeric form matching `boolean`, then route
   through replace_expression.

2. Var-hoisting (was lines 99/101): KeepVar re-extracts the same names
   from a fresh `var x, y;` it produced on the prior iteration, yielding
   a structurally-equivalent fresh allocation. Skip when the target slot
   is already in canonical KeepVar output shape (a `var` declaration
   whose declarators all lack initializers) -- new is_keep_var_canonical
   helper -- then route through replace_statement.

Two more sites in the test-has-side-effects branch:

3. `alternate = None` (was line 108) -> switched to Option::take +
   ctx.drop_statement; naturally idempotent (no-op when already None).

4. `consequent = empty` (was line 110) -> route through replace_statement
   gated on the consequent not already being EmptyStatement.

This closes the last 4 Pattern A bypasses in remove_dead_code.rs so
Commit 5 can delete LiveUsageCollector without leaking dead refs into
Scoping.

Verification:
- cargo build -p oxc_minifier -- clean
- cargo test -p oxc_minifier -- 504 pass
- cargo test -p oxc_mangler -- pass
- cargo clippy -p oxc_minifier -- pass (with #[expect(clippy::float_cmp)]
  on try_fold_if, matching the pre-existing convention in fold_constants.rs)
- just minsize -- zero delta (gate is invisible to output)
- ./tools/check_state_changed.sh -- pass
- cd crates/oxc_minifier && ast-grep scan -- pass
- Existing test_fold_if_statement case "if (false) { var a; console.log(a) }"
  -> "if (0) var a" exercises the previously-non-idempotent var-hoist path
  and converges (would have hit "Ran loop more than 10 times" debug_assert
  in compressor.rs otherwise).
…lector

Stop walking the entire live program in `exit_program` to rebuild resolved
references and direct-eval flags. Each `replace_*` / `drop_*` helper now feeds
a per-pass `PassDirty` accumulator that `exit_program` consumes directly:
dead resolved-reference ids are batch-removed from `Scoping` without any walk,
unresolved-by-name entries are confirmed via a small `NamedRefCollector` walk
only when any such candidate exists, and the direct-eval-flag refresh's
`LiveDirectEvalCollector` walk fires only when at least one direct eval call
was dropped this pass.

Implementation:

- `oxc_semantic::Scoping` grows two methods: `retain_resolved_references_excluding`
  (batch removal complementing the existing `retain_resolved_references` bitset
  variant) and `remove_unresolved_reference` (name-keyed pruning).
- `PeepholeOptimizations::exit_program` rewritten to consume `PassDirty`
  directly. `LiveUsageCollector` is gone; the bitset-of-all-refs allocation
  it built each pass is gone with it.
- `MinifierState::changed: bool` and `TraverseCtx::reset_changed` are removed;
  the fixed-point loop driver already snapshot-compares `state.mutations` from
  commit 3, and `exit_program` no longer needs the bool. Every walking helper
  now bumps just `mutations`.

Four leaked-reference sites that `LiveUsageCollector`'s whole-program walk
was silently covering were uncovered while bringing `just minsize` to zero
delta — they had to be fixed in this commit for the new path to be sound:

- `fold_constants.rs` `fold_chain_expr` `ChainFold::Collapse`: when the base
  has no side effects, the `take_in`'d base was thrown away instead of being
  preserved in the new expression. Added `ctx.drop_expression(&base)` before
  the discard.
- `minimize_conditional_expression.rs::minimize_conditional`: the transient
  `ConditionalExpression` built on the stack was being mutated by the fold
  (slot `take_in`s) and then discarded without any walk, leaking refs in
  untouched slots. Wrap the transient in an `Expression` slot so a successful
  fold can be installed via `ctx.replace_expression`, whose walks pair up
  correctly. This pattern ("transient AST node mutated then discarded")
  generalises to any future helper that materialises a fresh node only to
  destructure it.
- `remove_unused_expression.rs::fold_arguments_into_needed_expressions`: the
  filter_map drain dropped `Argument`s whose unwrapped expression
  `remove_unused_expression` collapsed to nothing — silently leaking refs in
  e.g. `pure(x)` where the manual-pure call is treeshaken and `x` would
  otherwise stay live in the symbol's resolved-references list. Replaced the
  filter_map with an explicit loop that `drop_expression`s discarded args.
- `remove_unused_declaration.rs::remove_unused_variable_declaration` +
  the matching for-statement-init retain in `minimize_statements.rs`: the
  `retain` predicate silently drops declarators whose init has refs (e.g.
  `var y = pure(x)` where the symbol is unused). Switched to `retain_mut`
  with an explicit `drop_expression(&init)` for removed declarators.

Allocs-snapshot drift is expected per spec §6.6 — `PassDirty`'s reusable
`FxHashSet`s replace the per-pass arena `BitSet`. `tasks/minsize/` shows
zero delta, confirming `PassDirty` is now sound for the full size-test
corpus.

BREAKING CHANGE: `MinifierState::changed: bool` and
`TraverseCtx::reset_changed` are removed; downstream consumers that read or
write `state.changed` should snapshot-compare `state.mutations` instead.
… to helpers

After rebasing onto origin/main, three new `state.changed = true` writes
added by upstream #22722 (dead-after-throw drop) survived the auto-merge
into our migrated `minimize_statements.rs`. The lockdown gate at
`tools/check_state_changed.sh` flags them.

Convert each to its appropriate per-helper equivalent:

- Dead-stmt-drop site: `ctx.drop_statement(&stmt)` — walks the dropped
  subtree to record dead references in PassDirty, satisfies the Pattern E
  contract this site was named after in the spec.
- Coalesced identity drops: `ctx.notice_change()` — the underlying
  mutations were already accounted for by `stmts.push` and inside
  `remove_unused_variable_declaration`; this is a pure 'flag the loop'
  intent.
- Re-hoist-then-remove None branch: `ctx.notice_change()` — same
  reasoning.

506 minifier tests pass (504 baseline + 2 new upstream tests:
dead_after_throw_drop_triggers_unused_declarator_removal and
test_preserve_cjs_module_lexer_hint).
Swaps `PassDirty::dead_refs` from `FxHashSet<ReferenceId>` to
`oxc_allocator::BitSet` sized at `enter_program` to
`scoping.references_len()`. The `dead_refs` lifetime is one pass —
arena allocation is the right shape.

Per-ident hot-path cost drops from `FxHashSet::insert/remove` (~25
cycles, hashed, heap) to `BitSet::set_bit/unset_bit` (~5 cycles,
direct array store, arena). System allocs drop on every workload.

Bench recovery on the regressing `react.development.js` case was
smaller than spec §6.3 Tier 1 predicted — the per-ident leaf cost
is not the dominant factor; the `DropDiff` traversal frequency (1.49×
more visits than the old single `LiveUsageCollector` walk) dominates.
Net measured improvement on react.dev: ~1pp (within noise). Other
workloads unchanged within noise. Worth landing as a correctness-
neutral micro-improvement; further work to reduce walk frequency
tracked separately.

Residual risk: mid-pass-created `ReferenceId` (e.g.
`substitute_chain_call_expression`'s `create_unbound_reference`) could
exceed the bitset capacity if later dropped in the same pass. The
`Scoping::retain_resolved_references_excluding` consumer ignores
out-of-range indices, but the `DropDiff::visit_identifier_reference`
hot path uses `debug_assert!` for the bounds check — verified safe by
probe `assert!` over the test corpus and `just minsize` corpus, no
trips. A proper overflow set is left as a follow-up if the assertion
ever trips in CI.
`mod lexer` is `pub(crate)` in the default build but `pub` under the
`benchmarking` feature. `clippy::redundant_pub_crate` therefore only fires
in the non-benchmarking build. Commit a486e42 added an unconditional
`#[expect]` to silence that; CI runs clippy with `--all-features`, where the
lint doesn't fire and the expectation becomes unfulfilled.

Gate the `#[expect]` with `#[cfg_attr(not(feature = "benchmarking"), …)]` so
both configurations are clean.
…places an identifier

Before the minifier's incremental-scoping refactor (#22736), the per-pass
`LiveUsageCollector` walked the live program at `exit_program` and rebuilt
`Scoping::resolved_references` from scratch. That walk silently masked the
fact that `ReplaceGlobalDefines` never told `Scoping` about the identifier
references it was replacing — the next minifier pass would simply recompute
them and find that the replaced identifier had no surviving occurrences.

With the LiveUsageCollector gone, only references that the minifier itself
drops via `replace_*`/`drop_*` helpers are pruned. References that callers
drop before `dead_code_elimination_with_scoping` get to run are now visible
as stale entries in `resolved_references`, and `symbol_is_unused` returns
`false` for symbols whose every textual occurrence has been replaced.

Concretely, the existing `replace_global_defines::declare_const` integration
test (added by #12427 to align with rollup) fails: the test expects the
`declare const IS_PROD: boolean;` line to be DCE'd after every reference to
`IS_PROD` is replaced with `true`, but the leaked references keep DCE from
seeing IS_PROD as unused.

The fix delegates reference deletion to the plugin: when
`replace_identifier_define_impl` returns a replacement value, the caller
captures the old identifier's `(reference_id, symbol_id, name)` BEFORE the
replacement and calls `Scoping::delete_reference` (resolved case) or
`delete_root_unresolved_reference` (truly global case) only AFTER the AST
mutation actually commits. The split matters at the assignment-target site —
`replace_define_with_assignment_expr` may bail when the replacement value
isn't a valid LHS (e.g. `IS_PROD = 0` with define `IS_PROD -> true` would
need `true = 0`, which `assignment_target_from_expr` rejects); the
reference must stay in that case.

New test `declare_const_assignment_target_bailout` pins the bail path.
… into PassDirty

After the incremental-scoping refactor, two peephole sites were still
silently dropping AST subtrees without telling `PassDirty` — so their
identifier references stayed in `Scoping::resolved_references` across
passes. The next pass would either inline what should have been a
multi-reference symbol (idempotency failure) or skip an inline that
should have fired (also idempotency failure).

Both surface as new test262 conformance failures only revealed when
`LiveUsageCollector` was deleted in commit 0856533:

- `expressions/await/syntax-await-in-ConditionalExpression` — the
  object literal `{ then: function() { x = 'unexpected'; … } }` gets
  removed via `let z = …` DCE, but the unused property values inside
  the object are dropped via `remove_unused_object_expr` without a
  `drop_expression` walk. Reference to `x` inside the dropped `then`
  function body leaks; `x`'s write-references count stays at 1 on the
  next pass; the constant-value inline for `let x = 'initial value'`
  is blocked. The second `dead_code_elimination` call then sees the
  correct count and inlines, producing different output.
- `expressions/in/private-field-presence-{accessor,method}-shadowed`,
  `class/elements/private-{accessor,field}-is-visible-in-computed-…`
  — `remove_unused_private_members` drops class elements via
  `Vec::retain` without walking the removed method/accessor bodies.
  Refs to outer-scope identifiers inside those bodies (e.g.
  `parentCount += 1` inside a removed `#method`) leak; the same
  idempotency-via-ref-count-drift effect ensues.

Fix:

1. `remove_unused_object_expr` calls `ctx.drop_expression` on each
   dropped property key (computed) and value before letting it fall
   off the end of the iteration.
2. `remove_unused_private_members` calls a new
   `TraverseCtx::drop_class_element` helper for every element it
   removes inside the `retain` predicate. The helper threads through a
   new `DropDiff::walk_old_class_element` entry that uses the default
   `Visit` walk to record refs and direct-eval calls.

Conformance: `cargo coverage -- minifier` returns to the main-branch
44795/44799 pass rate. Unit tests (506 + 2 doctests) all pass.
`allocs_minifier.snap` shifts by 2 sys allocs / -3 arena reallocs on
kitchen-sink.tsx — small, expected drift from the new
`drop_class_element` walk paths in classes-heavy files.
…s into PassDirty

Two more silent reference-drop sites surfaced by Codex adversarial
review of #22736, same bug class as the four fixed in 0856533 and
the two fixed in faaa4ba. Both sites mutated the AST through
`notice_change` only, even though the mutation discarded a subtree
that could carry identifier references — leaving stale entries in
`Scoping::resolved_references` across passes.

1. `remove_unused_array_expr` (`peephole/remove_unused_expression.rs`):
   the `retain_mut` predicate dropped a side-effect-free
   `ArrayExpressionElement::SpreadElement` without walking its
   `argument`. The `may_have_side_effects` check still uses the
   `ArrayExpressionElement` impl (so `[...ident]` is still treated as
   side-effecting via the iterator protocol and kept), but when the
   spread IS elided the argument now routes through
   `ctx.drop_expression`.

2. `fold_object_exp` (`peephole/fold_constants.rs`) had three skipped
   drop paths:
   - the `is_expression_undefined` short-circuit, which discarded the
     spread argument with `continue`,
   - the `should_fold_spread_element` branch, which dropped the
     argument with a bare `// skip` comment,
   - the inlined-object `__proto__` filter, which used
     `filter(...).collect` to drop non-computed `__proto__` properties
     and their value expressions.
   The first two now call `ctx.drop_expression(&spread.argument)`
   before continuing. The filter chain is rewritten as an explicit loop
   that calls `ctx.drop_expression(&p.value)` on each elided
   `__proto__` property; the property key is always a non-computed
   static name and carries no references.

Regression tests:

- `test_array_spread_drop_walks_argument_refs` in
  `tests/peephole/remove_unused_expression.rs` covers the elided
  spread path through nested array literals
  (`[...[function(){}]]`). Existing `MayHaveSideEffects` rules treat
  every spread shape that retains identifier refs as side-effecting,
  so there is no source-level case where the new walk is observable
  on stable inputs — the test is documented as defensive and
  conformance is the live discriminator.
- `test_fold_object_spread_drop_walks_argument_refs` in
  `tests/peephole/fold_constants.rs` covers the two
  ref-bearing fold paths: `{...function(){ x = 'b' }}` (path 2) and
  `{...{__proto__: function(){ x = 'b' }}}` (path 3). Without the
  fix, the stale write-reference on `x` blocks the constant inline
  on the follow-up pass and the test fails; with the fix, the inline
  fires.

Verification: `cargo coverage -- minifier` returns 44795/44799 (matches
main). Unit tests pass (508 + 2 doctests). `just minsize` shows zero
delta against the corpus snapshots. Clippy/fmt/ast-grep clean.
`replace_*` walked the replacement value to un-mark resolved references
that `walk_old_*` had marked dead. When `walk_old_*` marked nothing —
the common case, since survivors are moved out via `take_in`, leaving
reference-id-less dummies — that walk is a provable no-op over an often
large moved-in subtree. On react.development.js it was ~99% of the
per-mutation walk cost and the source of the incremental-scoping
regression.

Skip the walk when nothing was marked. Resurrect still runs at the only
id-aliasing site (the `clone_in_with_semantic_ids` fold), where the
original is marked dead, so the surviving clone is un-marked there.

minifier[react.development.js] -17.6%, kitchen-sink.tsx -2.8%,
cal.com.tsx -7.2%, binder.ts -9.9% vs branch HEAD. Output unchanged:
minsize bit-identical and conformance snapshots untouched.
Draining a catch body via clear() skipped the incremental dirty walk, so references removed from an unreachable catch block could remain live in scoping and block later DCE.

Route the dropped statements through drop_statement, and keep optional catch binding conservative when the rewrite leaves a same-name var in the catch body.
The per-pass `prune_unresolved_refs` walk (a full-program `NamedRefCollector`
pass) kept `Scoping::root_unresolved_references` accurate, but nothing observes
the result:

- `build_with_scoping` / `dead_code_elimination_with_scoping` take `scoping` by
  value and return `u8`, so the compressor's pruned scoping never escapes to any
  caller; `MinifierReturn.scoping` comes from the mangler's rebuilt semantic,
  not the compressor's.
- The only in-loop reader of `root_unresolved_references` is the UID generator,
  which is never invoked during minify (the `is_global_reference` checks read
  per-reference `symbol_id()`, not the map).
- main never pruned it (the prune and `remove_unresolved_reference` are new in
  this branch) yet minsize is bit-identical; disabling the walk leaves all unit
  tests, conformance snapshots, and minsize unchanged.

Remove the walk plus the now-unused `dead_unresolved` tracking,
`NamedRefCollector`, and `Scoping::remove_unresolved_reference`. This drops the
largest residual per-pass scoping cost on bundler/CJS inputs:
kitchen-sink.tsx -16.2%, cal.com.tsx -7.5%; react/binder unchanged (they never
trigger it). Output is unchanged (minsize bit-identical, conformance untouched).

Resolved-reference refresh (`dead_refs`) and direct-eval-flag refresh stay --
both are read in-loop and affect optimization decisions.
…prune

Removing `prune_unresolved_refs` and the `dead_unresolved` FxHashSet
(commit "drop unobserved unresolved-reference pruning") eliminates a few
per-file heap allocations. Sys allocs only: checker.ts 172->170, App.tsx
123->119, pdf.mjs 3937->3929, antd.js 3093->3084, kitchen-sink.tsx
1768->1756. Sys reallocs and arena allocs unchanged.

Values taken from the CI (Linux) run; not regenerated locally because
macOS reports different sys-alloc counts than the CI environment.
Add a debug-only check in `exit_program`: before
`retain_resolved_references_excluding` prunes the references marked dead this
pass, assert none of them still appear in the live program. A still-live
reference being pruned is the unsafe direction -- it leaves a symbol
under-referenced so a later pass can wrongly drop it, producing incorrect
output.

The resurrect-walk skip and the unresolved-prune removal both rest on
scoping-state invariants that minsize/conformance only check indirectly via
output. This walk runs once per dirty pass in debug builds only, so the unit
tests and the `cargo coverage -- minifier` corpus double as an over-prune
detector at zero release cost. Verified sound (forcing the resurrect walk to
skip makes it fire on test_fold_is_object_and_not_null) and silent across the
test262/babel/typescript corpus.
A >=3 element sequence kept in indirect-access position (e.g.
`(sideEffect(), 0, foo.bar)()`) re-wrapped its already-`0` second-to-last
element on every pass. Routing that through `replace_expression` bumps the
mutation counter each iteration, so the fixed-point loop never converges
and trips the 10-iteration debug_assert.

Skip the rewrite when the element is already `0`, mirroring the `len == 2`
guard above. `0` is side-effect-free so output is unchanged; this only
restores convergence (the spin is pre-existing on main).
The incremental scoping refresh trusts that the incoming `Scoping`
exactly matches the program; the deleted full walk used to self-heal any
staleness. Document this precondition on the `*_with_scoping` entry points
so external callers (e.g. rolldown) know they must keep scoping in sync
after mutating the program — a missing reference can cause incorrect
output, not just a missed optimization.

Also correct the resurrect-skip comment in drop_diff.rs: there are two
id-preserving sites (`clone_in_with_semantic_ids` and
`expression_identifier_with_reference_id`), not one, both harmless because
they install within a single `replace_expression` call.
Design for replacing the per-pass dead-ref bitset + batch list compaction
with a persistent, eagerly-initialized, live-maintained per-symbol
reference-count store inside SymbolValues. Counts are derived from the live
program once (self-healing), then maintained incrementally by the existing
DropDiff walk; Scoping's reference lists are left untouched.

Supersedes the consumption model from the incremental-scoping work on this
branch; gated on a minsize re-snapshot and a single-file benchmark.
Six-task implementation plan for the live-reference-counts design: add the
count store + eager program-walk init, maintain it in DropDiff as a shadow
validated by a count-drift debug assertion, switch consumers off Scoping's
lists, then delete the dead-ref bitset + batch retain. Benchmark-gated.
@Dunqing Dunqing force-pushed the spec/minifier-incremental-scoping branch from 1c0e6c8 to b5b3025 Compare June 9, 2026 07:50
Dunqing added 2 commits June 10, 2026 00:15
`resurrect_*` un-marks references that `walk_old_*` marked dead but which
survive in the replacement value. The only way a `ReferenceId` lands in both
the dropped slot and the replacement is an id-preserving copy, which happens
at exactly one site (`substitute_is_object_and_not_null`) and produces an
`Expression`. So the statement / property-key / assignment-target-property /
for-statement-left resurrect walks could never un-mark anything — dead work.

Keep `resurrect_from_expression` for that site; the other four `replace_*`
helpers now just `walk_old_*`. Behaviour-identical: 508 tests pass with the
over-prune assertion active, `just minsize` is zero-delta.
graphite-app Bot pushed a commit that referenced this pull request Jun 15, 2026
#23196)

> Developed with AI assistance (Claude Code); reviewed, tested, and benchmarked by the contributor.

## Summary

Replaces the ~190 manual `ctx.state.changed = true` writes scattered across `crates/oxc_minifier/src/peephole/` with typed mutation helpers on `MinifierTraverseCtx` (`replace_expression`, `replace_statement`, `replace_assignment_target_property`, `replace_property_key`, `replace_for_statement_left`, `drop_expression`, `drop_statement`, `drop_class_element`, `notice_change`).

The mutation signal itself becomes a private `mutated: bool` on `MinifierState`, readable only via `take_mutated()` (read-and-reset in a single call) — the compiler now enforces that the fixed-point loop's signal can only be set by the helpers and only consumed by the loop driver.

"Forgot to mark the loop dirty" — the bug pattern behind #22722 and #22552 — becomes structurally hard to write. The migration also carries the latent silent-mutation fixes surfaced by routing every mutation through a helper (missed `*slot = …` writes in `remove_unused_expression`, `convert_to_dotted_properties`, `minimize_statements`, and field-write bypass sites), plus idempotency gates in `try_fold_if` so the typed helpers don't block fixed-point convergence.

This is stage 1 of 2 re-cutting the approach prototyped in #22736; the incremental scoping refresh stacks on top.

## Verification

- `cargo test -p oxc_minifier` — 509 pass
- `just minsize` — **bit-identical output** across the entire size-test corpus (this PR is a pure refactor)
- `cargo coverage -- minifier` — conformance unchanged
- clippy (all features/targets) clean
- `allocs_minifier.snap`: small arena-only deltas (+0.02–0.08%) from helper call-site construction patterns; sys allocs unchanged
graphite-app Bot pushed a commit that referenced this pull request Jun 16, 2026
#23197)

> Developed with AI assistance (Claude Code); reviewed, tested, and benchmarked by the contributor.

## Summary

Deletes the per-pass `LiveUsageCollector` walk over the entire live program. The typed mutation helpers (previous PR in this stack) now walk only the dropped/replaced subtrees in place, accumulating dead `ReferenceId`s into a per-pass `PassDirty` arena `BitSet` that the `Compressor` driver consumes in one batch (after `Normalize` and after each peephole pass) via the new `Scoping::retain_resolved_references_excluding`. Direct-eval flag refresh runs only when an `eval(...)` call was actually dropped.

Cross-crate support riding in this PR: `Scoping::retain_resolved_references_excluding` (oxc_semantic) and `BitSet::capacity()` / `is_empty()` / `contains()` (oxc_allocator).

## Breaking

Removes `oxc_semantic::Scoping::retain_resolved_references` — its only consumer was `LiveUsageCollector`, deleted here. No other in-repo or known downstream callers.

## Benchmarks (CodSpeed, this PR vs stack base)

| Benchmark | Base → Head | Efficiency |
|---|---|---|
| `minifier[kitchen-sink.tsx]` | 63.6 ms → 52.5 ms | **+21.2%** |
| `pipeline[kitchen-sink.tsx]` | 122.3 ms → 110.7 ms | +10.5% |
| `minifier[App.tsx]` | 12.5 ms → 12.1 ms | +3.3% |
| everything else | — | unchanged |

> **Why earlier runs showed ~2× on kitchen-sink:** the initial cut of this branch had a reference-leak bug — dropped subtrees weren't always marked dead, leaked references made symbols look used, and the minifier silently skipped most of kitchen-sink's DCE, producing output **22% larger than main** (invisible to minsize, which doesn't include that file). Those runs measured a minifier doing less work, not a faster minifier. The leak is fixed (see below); the numbers above are measured at byte-identical output to main.

Memory: kitchen-sink.tsx arena allocs 100230 → 90553, sys allocs 2917 → 2639; all other tracked files at or slightly below base.

## Relation to #22736

This re-cuts the approach prototyped and discussed in #22736 (kept open during development as the design record), with three deviations from the POC:

1. **Mark-only `DropDiff`** — the POC's "Resurrect" mode existed for one call site (`substitute_is_object_and_not_null`) that reused `ReferenceId`s in replacement values. That site now mints fresh references (same symbol/resolution, new id), so the resurrect walk and its cross-call aliasing invariant are gone entirely.
2. **`mutated: bool` instead of a `mutations: u64` snapshot counter** — after this PR the signal's only reader is the fixed-point loop, so a private bool with read-and-reset `take_mutated()` suffices.
3. **Bounds-checked mark path** — references minted mid-pass and dropped in the same pass are skipped (treated live → conservative output) instead of the POC's debug-assert + potential release panic.

Safety net: `debug_assert_no_over_prune` walks the live program in debug builds after each dirty pass, so the whole test + conformance corpus doubles as an over-prune detector at zero release cost.

## Reference-leak fix + simplification follow-ups

The initial cut leaked references at three classes of drop sites. The leak direction is safe (stale extra references block optimizations, never corrupt output), but it cost real output size on `void ident`-heavy inputs — kitchen-sink.tsx minified 22% larger than main:

1. **`Normalize` raw drops** (`void x` → `void 0`, `drop_console`) never reached `PassDirty`. `Normalize` now records drops through the same typed helpers, and the driver flushes once before the fixed-point loop so pass 1 already observes pruned reference counts (no extra peephole pass).
2. **`remove_unused_template_literal`** silently drained elements that `remove_unused_expression` collapsed to nothing, without a drop walk.
3. **Dropped variable declarators** walked only `init`; references in the binding's TS type annotation (e.g. computed keys in a type literal) leaked. New `drop_variable_declarator` helper walks the whole declarator.

Each class has a leak-regression test; kitchen-sink.tsx output is verified **byte-identical to main**.

Two simplification passes followed: flush hoisted into the `Compressor` driver (no cross-traversal coupling), `DropDiff` reduced to a thin `Visit` over `&mut PassDirty` (no scoping lookup on the mark hot path; marks for unresolved refs are inert), `BitSet::clear()` reuse instead of a per-pass arena realloc, a shared `as_direct_eval_call` predicate for producer and consumer, plain `clone_in` (id-less) at the one site that cloned semantic ids, and removal of dead `drop_console` repair code.

## Verification

- `cargo test -p oxc_minifier -p oxc_mangler -p oxc_semantic -p oxc_allocator` — all pass, incl. the new leak-regression tests
- `just minsize` — **bit-identical output**; kitchen-sink.tsx (not in minsize) separately verified byte-identical to main
- `cargo coverage -- minifier` — test262 44825/44829, babel 1785/1785, snapshots unchanged
- clippy (all features/targets), fmt clean

## Test alignment in oxc_transformer_plugins

The `replace_global_defines` integration-test helper previously reused RGD's returned scoping directly for DCE — a path production does not take: `crates/oxc/src/compiler.rs` marks scoping dirty whenever RGD changed the AST (`scoping_dirty |= ret.changed`) and rebuilds before compress/DCE, and rolldown minifies freshly-parsed chunks. With the full-walk `LiveUsageCollector` gone, that test-only path surfaced stale ambient-`declare` references. The helper now mirrors the production pipeline (rebuild when `ret.changed`, reuse otherwise). No `ReplaceGlobalDefines` source changes. Two `declare`-define contract tests added.

## Known trade-off: defines-heavy DCE pays O(dropped) drop-walks

Profiling `pipeline[react.development.js]` (with `process.env.NODE_ENV` defines) shows the worst case of walk-what-you-drop: DCE drops one `if` subtree covering ~90% of the program's nodes, so the drop-walk approaches a full-program traversal where the old per-pass live walk only covered the small remainder (−3% CodSpeed / +7% local wall on that bench; attribution: ~72% drop-walks, 0% double-marking, ~0% bitset cost). A bounded-walk + live-walk-fallback mitigation was prototyped and validated (recovers to +1.4%, all other wins preserved), but we decided the added mechanism wasn't worth the maintenance cost for a one-bench regression — the same file's plain minification *improves* 12%, and kitchen-sink-class inputs (the load-bearing case) improve ~2×. Revisit if real-world defines-heavy workloads surface it.
@Dunqing Dunqing closed this Jun 17, 2026
@Dunqing

Dunqing commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Superseded by #23196 and #23197

camc314 pushed a commit that referenced this pull request Jul 3, 2026
#23196)

> Developed with AI assistance (Claude Code); reviewed, tested, and benchmarked by the contributor.

## Summary

Replaces the ~190 manual `ctx.state.changed = true` writes scattered across `crates/oxc_minifier/src/peephole/` with typed mutation helpers on `MinifierTraverseCtx` (`replace_expression`, `replace_statement`, `replace_assignment_target_property`, `replace_property_key`, `replace_for_statement_left`, `drop_expression`, `drop_statement`, `drop_class_element`, `notice_change`).

The mutation signal itself becomes a private `mutated: bool` on `MinifierState`, readable only via `take_mutated()` (read-and-reset in a single call) — the compiler now enforces that the fixed-point loop's signal can only be set by the helpers and only consumed by the loop driver.

"Forgot to mark the loop dirty" — the bug pattern behind #22722 and #22552 — becomes structurally hard to write. The migration also carries the latent silent-mutation fixes surfaced by routing every mutation through a helper (missed `*slot = …` writes in `remove_unused_expression`, `convert_to_dotted_properties`, `minimize_statements`, and field-write bypass sites), plus idempotency gates in `try_fold_if` so the typed helpers don't block fixed-point convergence.

This is stage 1 of 2 re-cutting the approach prototyped in #22736; the incremental scoping refresh stacks on top.

## Verification

- `cargo test -p oxc_minifier` — 509 pass
- `just minsize` — **bit-identical output** across the entire size-test corpus (this PR is a pure refactor)
- `cargo coverage -- minifier` — conformance unchanged
- clippy (all features/targets) clean
- `allocs_minifier.snap`: small arena-only deltas (+0.02–0.08%) from helper call-site construction patterns; sys allocs unchanged
camc314 pushed a commit that referenced this pull request Jul 3, 2026
#23197)

> Developed with AI assistance (Claude Code); reviewed, tested, and benchmarked by the contributor.

## Summary

Deletes the per-pass `LiveUsageCollector` walk over the entire live program. The typed mutation helpers (previous PR in this stack) now walk only the dropped/replaced subtrees in place, accumulating dead `ReferenceId`s into a per-pass `PassDirty` arena `BitSet` that the `Compressor` driver consumes in one batch (after `Normalize` and after each peephole pass) via the new `Scoping::retain_resolved_references_excluding`. Direct-eval flag refresh runs only when an `eval(...)` call was actually dropped.

Cross-crate support riding in this PR: `Scoping::retain_resolved_references_excluding` (oxc_semantic) and `BitSet::capacity()` / `is_empty()` / `contains()` (oxc_allocator).

## Breaking

Removes `oxc_semantic::Scoping::retain_resolved_references` — its only consumer was `LiveUsageCollector`, deleted here. No other in-repo or known downstream callers.

## Benchmarks (CodSpeed, this PR vs stack base)

| Benchmark | Base → Head | Efficiency |
|---|---|---|
| `minifier[kitchen-sink.tsx]` | 63.6 ms → 52.5 ms | **+21.2%** |
| `pipeline[kitchen-sink.tsx]` | 122.3 ms → 110.7 ms | +10.5% |
| `minifier[App.tsx]` | 12.5 ms → 12.1 ms | +3.3% |
| everything else | — | unchanged |

> **Why earlier runs showed ~2× on kitchen-sink:** the initial cut of this branch had a reference-leak bug — dropped subtrees weren't always marked dead, leaked references made symbols look used, and the minifier silently skipped most of kitchen-sink's DCE, producing output **22% larger than main** (invisible to minsize, which doesn't include that file). Those runs measured a minifier doing less work, not a faster minifier. The leak is fixed (see below); the numbers above are measured at byte-identical output to main.

Memory: kitchen-sink.tsx arena allocs 100230 → 90553, sys allocs 2917 → 2639; all other tracked files at or slightly below base.

## Relation to #22736

This re-cuts the approach prototyped and discussed in #22736 (kept open during development as the design record), with three deviations from the POC:

1. **Mark-only `DropDiff`** — the POC's "Resurrect" mode existed for one call site (`substitute_is_object_and_not_null`) that reused `ReferenceId`s in replacement values. That site now mints fresh references (same symbol/resolution, new id), so the resurrect walk and its cross-call aliasing invariant are gone entirely.
2. **`mutated: bool` instead of a `mutations: u64` snapshot counter** — after this PR the signal's only reader is the fixed-point loop, so a private bool with read-and-reset `take_mutated()` suffices.
3. **Bounds-checked mark path** — references minted mid-pass and dropped in the same pass are skipped (treated live → conservative output) instead of the POC's debug-assert + potential release panic.

Safety net: `debug_assert_no_over_prune` walks the live program in debug builds after each dirty pass, so the whole test + conformance corpus doubles as an over-prune detector at zero release cost.

## Reference-leak fix + simplification follow-ups

The initial cut leaked references at three classes of drop sites. The leak direction is safe (stale extra references block optimizations, never corrupt output), but it cost real output size on `void ident`-heavy inputs — kitchen-sink.tsx minified 22% larger than main:

1. **`Normalize` raw drops** (`void x` → `void 0`, `drop_console`) never reached `PassDirty`. `Normalize` now records drops through the same typed helpers, and the driver flushes once before the fixed-point loop so pass 1 already observes pruned reference counts (no extra peephole pass).
2. **`remove_unused_template_literal`** silently drained elements that `remove_unused_expression` collapsed to nothing, without a drop walk.
3. **Dropped variable declarators** walked only `init`; references in the binding's TS type annotation (e.g. computed keys in a type literal) leaked. New `drop_variable_declarator` helper walks the whole declarator.

Each class has a leak-regression test; kitchen-sink.tsx output is verified **byte-identical to main**.

Two simplification passes followed: flush hoisted into the `Compressor` driver (no cross-traversal coupling), `DropDiff` reduced to a thin `Visit` over `&mut PassDirty` (no scoping lookup on the mark hot path; marks for unresolved refs are inert), `BitSet::clear()` reuse instead of a per-pass arena realloc, a shared `as_direct_eval_call` predicate for producer and consumer, plain `clone_in` (id-less) at the one site that cloned semantic ids, and removal of dead `drop_console` repair code.

## Verification

- `cargo test -p oxc_minifier -p oxc_mangler -p oxc_semantic -p oxc_allocator` — all pass, incl. the new leak-regression tests
- `just minsize` — **bit-identical output**; kitchen-sink.tsx (not in minsize) separately verified byte-identical to main
- `cargo coverage -- minifier` — test262 44825/44829, babel 1785/1785, snapshots unchanged
- clippy (all features/targets), fmt clean

## Test alignment in oxc_transformer_plugins

The `replace_global_defines` integration-test helper previously reused RGD's returned scoping directly for DCE — a path production does not take: `crates/oxc/src/compiler.rs` marks scoping dirty whenever RGD changed the AST (`scoping_dirty |= ret.changed`) and rebuilds before compress/DCE, and rolldown minifies freshly-parsed chunks. With the full-walk `LiveUsageCollector` gone, that test-only path surfaced stale ambient-`declare` references. The helper now mirrors the production pipeline (rebuild when `ret.changed`, reuse otherwise). No `ReplaceGlobalDefines` source changes. Two `declare`-define contract tests added.

## Known trade-off: defines-heavy DCE pays O(dropped) drop-walks

Profiling `pipeline[react.development.js]` (with `process.env.NODE_ENV` defines) shows the worst case of walk-what-you-drop: DCE drops one `if` subtree covering ~90% of the program's nodes, so the drop-walk approaches a full-program traversal where the old per-pass live walk only covered the small remainder (−3% CodSpeed / +7% local wall on that bench; attribution: ~72% drop-walks, 0% double-marking, ~0% bitset cost). A bounded-walk + live-walk-fallback mitigation was prototyped and validated (recovers to +1.4%, all other wins preserved), but we decided the added mechanism wasn't worth the maintenance cost for a one-bench regression — the same file's plain minification *improves* 12%, and kitchen-sink-class inputs (the load-bearing case) improve ~2×. Revisit if real-world defines-heavy workloads surface it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-allocator Area - Allocator A-minifier Area - Minifier A-parser Area - Parser A-semantic Area - Semantic

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant