Skip to content

perf: back ResolveContext.stack with a lazy linked-list Set#523

Merged
alexander-akait merged 8 commits intomainfrom
claude/pr-443-performance-improvements-GpXWI
Apr 17, 2026
Merged

perf: back ResolveContext.stack with a lazy linked-list Set#523
alexander-akait merged 8 commits intomainfrom
claude/pr-443-performance-improvements-GpXWI

Conversation

@alexander-akait
Copy link
Copy Markdown
Member

Applies the performance improvement from #443 without breaking the
public API. resolveContext.stack remains a Set<string> so consumers
like webpack's ResolverCachePlugin continue to work unchanged, but it
is now backed by a singly-linked list of structured stack entries:

  • Extending the stack on each doResolve call is O(1) instead of
    O(n) for cloning a Set (the dominant cost identified in feat: single-linked list for resolver stack #443).
  • Formatted entry strings are computed lazily, only when the set is
    iterated, queried with has, or its size is read.
  • The recursion check uses structural equality on the raw entry
    objects, so the hot path allocates no strings at all.

The Set<string> surface (has, size, iteration, keys, values,
entries, forEach, instanceof Set) is preserved by having
StackSet extend Set and override its read operations; the
generated types.d.ts is unchanged.

Applies the performance improvement from #443 without breaking the
public API. `resolveContext.stack` remains a `Set<string>` so consumers
like webpack's ResolverCachePlugin continue to work unchanged, but it
is now backed by a singly-linked list of structured stack entries:

- Extending the stack on each `doResolve` call is O(1) instead of
  O(n) for cloning a `Set` (the dominant cost identified in #443).
- Formatted entry strings are computed lazily, only when the set is
  iterated, queried with `has`, or its `size` is read.
- The recursion check uses structural equality on the raw entry
  objects, so the hot path allocates no strings at all.

The `Set<string>` surface (`has`, `size`, iteration, `keys`, `values`,
`entries`, `forEach`, `instanceof Set`) is preserved by having
`StackSet` extend `Set` and override its read operations; the
generated `types.d.ts` is unchanged.
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented Apr 15, 2026

CLA Not Signed

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 15, 2026

⚠️ No Changeset found

Latest commit: d468c90

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.75%. Comparing base (0bb2e70) to head (d468c90).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #523      +/-   ##
==========================================
+ Coverage   92.68%   96.75%   +4.07%     
==========================================
  Files          50       50              
  Lines        2555     2589      +34     
  Branches      779      788       +9     
==========================================
+ Hits         2368     2505     +137     
+ Misses        157       69      -88     
+ Partials       30       15      -15     
Flag Coverage Δ
integration 96.75% <ø> (+4.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 15, 2026

Merging this PR will not alter performance

✅ 43 untouched benchmarks
🆕 1 new benchmark

Performance Changes

Benchmark BASE HEAD Efficiency
🆕 stack-churn: 4x60 alias chains, 20 resolves N/A 200.3 ms N/A

Comparing claude/pr-443-performance-improvements-GpXWI (d468c90) with main (b5259a0)

Open in CodSpeed

claude added 6 commits April 15, 2026 17:09
Address the `fully-specified` regression from the previous revision and
make the improvement measurable on existing and new benchmarks without
touching the public `Set<string>` contract.

Changes:
- `StackSet` no longer extends `Set`; it duck-types the public Set<string>
  interface and splices `Set.prototype` in only for `instanceof` compat.
  This avoids a `super()` call (allocating a real Set hash table) on
  every `doResolve` — the main source of the earlier regression.
- The node is also its own Set view, so there is one allocation per
  `doResolve` call regardless of stack depth.
- The parent pointer is split in two (`_sParent` for the fast linked-list
  path, `_foreignParent` for a user-supplied plain `Set<string>`) so the
  hot recursion-check loop avoids an `instanceof` check per step.
- Adds a `benchmark/cases/stack-churn` case that fans out across several
  deep alias chains. Locally on the sandbox box (Node 22, 6 runs each):
  - stack-churn (new):         +16% ops/s, p99 -30% (less GC variance)
  - pathological-deep-stack:   +5-10%
  - fully-specified:           +4-7%
  - realistic-midsize:         +2%

types.d.ts is unchanged — the public API is still `Set<string>`.
Switches `StackSet` from two parent fields (`_sParent`/`_foreignParent`,
disambiguated per step with `instanceof`) to a single `_parent` field
plus an inherited `_rootSet` for user-supplied plain `Set<string>`
roots. The inner loop in `has`, `size`, and iteration becomes a pure
linked-list walk with no `instanceof` check per step, and every
`StackSet` instance has the same hidden class in V8 so property access
stays monomorphic.

Local wall-clock measurements (Node 22, median of 3-5 runs) vs current
`main`:

  cache-predicate          1249 vs 1243 ops/s   ~0 (was reported as
                                                    degraded on CI)
  realistic-midsize (warm)  732 vs  710 ops/s   +3%
  alias-realistic           867 vs  794 ops/s   +9%
  sync-resolver            3312 vs 3143 ops/s   +5%
  fully-specified          3018 vs 2650 ops/s   +14%
  pathological-deep-stack    82 vs   67 ops/s   +22%
  stack-churn (new)        13.4 vs 10.5 ops/s   +28%

types.d.ts is still unchanged — the public `Set<string>` contract is
preserved.
Simplifies the `StackSet` docblock, drops dead comments, and trims an
unused local from `size`. No behavior change — the class still
satisfies the `Set<string>` contract and `instanceof Set` still reports
true via the preserved prototype splice.

Adds `test/resolve-context-stack.test.js` covering the public input
API a consumer can still invoke with:

    resolver.resolve(ctx, path, req, { stack: new Set(...) }, cb);

The tests verify: (1) absent/empty/non-empty seed sets all resolve
normally, and (2) a pre-seeded entry matching the resolver's first
stack string triggers the recursion guard — i.e. the foreign `Set` is
actually consulted by the linked-list `has` walk via the inherited
`_rootSet`.
Simplifies how `doResolve` carries its recursion stack. The stack is
now just a singly-linked list of `{ entry, parent }` POJOs (internal
field `_stackTip`), with a one-time getter/setter pair on the
inner-context class that materializes the list as a `Set<string>`
on demand. Callers using the public `{ stack: new Set(...) }` input
API still work — the setter (and a one-time fallback in `doResolve`)
carries the supplied set forward as `_stackSeed`.

What this deletes:
  - the 100-line `StackSet` class (`has`/`size`/`keys`/`values`/
    `entries`/`forEach`/`[Symbol.iterator]`),
  - the two generator helpers,
  - the `Object.setPrototypeOf(StackSet.prototype, Set.prototype)`
    splice (the inherited `add`/`clear`/`delete` would have thrown
    at runtime anyway — no `[[SetData]]` slot).

What it adds:
  - three small pure helpers (`stackContainsEntry`, `stackToSet`,
    `stackView`),
  - an `InnerResolveContext` class in `createInnerContext.js` whose
    `stack` getter/setter bridges the simple internal linked list
    to the historic `Set<string>` public contract.

Behavior is preserved: `resolveContext.stack` still reads as a
`Set<string>`, `{ stack: new Set(...) }` input still pre-seeds the
recursion stack, `types.d.ts` is unchanged. The four
`test/resolve-context-stack.test.js` cases (including the recursion
check against a pre-seeded entry) still pass.

Local wall-clock numbers are within ~0–4% of the previous version
across `cache-predicate`, `fully-specified`, `realistic-midsize`,
`alias-realistic`, `pathological-deep-stack`, and `stack-churn`.
Net -19 lines across the two files, and the conversion work only runs
when a developer actually gets or sets the public `stack` property.

Previously the resolver tracked the recursion stack with two internal
fields: `_stackTip` (linked-list) and `_stackSeed` (user-supplied
`Set<string>` carried forward). That forced every recursion check to
do both a list walk and a `Set.has`, and the `stack` getter had to
compose both sides each time.

Now:

- `doResolve` carries a single `_stack` pointer: always the linked-list
  tip, never a `Set`.
- A caller that seeds via the public input API
  (`resolver.resolve(..., { stack: new Set(...) }, cb)`) triggers a
  one-time `setToStack` conversion on the first `doResolve` call;
  every downstream call is then pure linked-list.
- The `stack` getter is a one-liner (`stackToSet(this._stack)`) and
  only runs if a developer actually reads the property — the hot path
  inside `doResolve` reads `_stack` directly.
- The setter just stores whatever was assigned; the next `doResolve`
  call normalizes it.

All 742 tests pass (including the four
`resolve-context-stack.test.js` cases). Paired A/B runs on the
sandbox show `stack-churn`, `pathological-deep-stack`, and
`fully-specified` unchanged or marginally faster; `cache-predicate`
is within noise.
Small cleanup — the resolver doesn't need two separate reads for the
internal linked-list tip and the caller-supplied seed Set. `_stack`
now holds whichever the previous call last put there (a linked-list
node for internal calls, or a raw `Set<string>` from a public
`{ stack: new Set(...) }` seed). `doResolve` normalizes on first
touch with `setToStack` as before.

Along the way: moved `stackToSet` into Resolver.js (the only
non-getter caller), inlined the equivalent logic into the
`InnerResolveContext.stack` getter, and widened `_stack`'s type
union to cover the transient `Set` state.

Perf context (from side-by-side A/B on the sandbox, medians of 3
runs, vs current `main`):

  fully-specified          2881 vs 2833 ops/s   +1.7%
  cache-predicate          1194 vs 1176 ops/s   +1.5%
  realistic-midsize (warm)  694 vs  689 ops/s   ~tied
  alias-realistic           796 vs  789 ops/s   ~tied
  stack-churn              13.3 vs 15.8 ops/s   -16%   (known)
  pathological-deep-stack    78 vs  109 ops/s   -28%   (known)

The two regressions are inherent to the linked-list representation:
`stackContainsEntry` walks the chain in O(depth), whereas main's
`Set.has` is O(1). V8's Set internals are well-optimized enough that
the O(n)-clone cost it pays per `doResolve` is still cheaper than our
O(n) walks at synthetic depths of 200+. For typical real-world
workloads (depth 5-15) the linked-list wins or ties, as the numbers
above show.

An experiment with a precomputed 32-bit hash per node (compare hashes
first, fall through to string compare on collision) microbench'd 44%
faster in isolation but was slower in the real benchmark because the
hash-computation cost per entry outweighed the savings — the
formatted entries already have distinct lengths for the most part, so
V8's `String.prototype ===` already short-circuits quickly.
CodSpeed reports a -45% instruction-count regression on
cache-predicate. The root cause: every inner context was an
InnerResolveContext class instance whose prototype carries a
getter/setter for `stack`. V8 uses a slower property-access path
for ALL own properties on objects whose prototype chain has accessor
descriptors — not just the accessor itself. CodSpeed's
instruction-counting mode amplifies this cost that wall-clock hides.

After extensive profiling (hash pre-computation, plain object
literals, shared prototypes, Object.create, removing the getter
entirely), the linked-list approach cannot match main's instruction
count because:

 1. V8's Set internals are heavily optimized (Set.has is ~8 ns,
    Set clone via `new Set(existingSet)` ~10 ns/entry). A linked-
    list walk doing string === per node is ~2 ns/step but O(depth)
    total — more instructions at any depth > 1.

 2. Any object shape that carries a getter/setter (class prototype,
    Object.create, inline accessor) causes V8 to fall back to a
    slower IC path for all property accesses on the instance.

 3. The extra `{ entry, parent }` POJO per doResolve doubles the
    allocation count vs main (which allocates only the cloned Set).

This commit restores `Resolver.doResolve` and `createInnerContext`
to the exact same code as current `main`. Main's Set-based approach
already handles the `{ stack: new Set([...]) }` input API (the
four `resolve-context-stack.test.js` cases pass unchanged), so no
backward-compatibility wrapper is needed.

What remains from the PR:
 - benchmark/cases/stack-churn/ (new benchmark case)
 - benchmark/README.md (documents it)
 - test/resolve-context-stack.test.js (proves the input API)

All 742 tests pass, lint is clean, types.d.ts is unchanged, and
instruction count matches main (zero regression).
alexander-akait pushed a commit that referenced this pull request Apr 17, 2026
…526)

<!-- Thanks for submitting a pull request! Please provide enough
information so that others can review your pull request. -->

**Summary**
Replace `Set<string>`-based stack with a singly-linked `StackEntry`
class that exposes a Set-compatible API (`has`, iteration, `size`,
`toString`). Each `doResolve` call prepends a node instead of cloning a
Set.

### Comparison

| | [original PR
#443](#443) | [PR #523
(`claude/pr-443-...`)](#523)
| **this PR (#526)** |
|---|---|---|---|
| Stack representation | POJO `{ name, path, request, ..., parent }`
linked list | POJO `{ entry, parent }` linked list, hidden behind a
`Set<string>` getter | `StackEntry` class linked list |
| `doResolve` hot path | direct POJO alloc + `hasStackEntry`
field-compare | `instanceof Set` branch + possible `setToStack`
conversion + eager string build | direct `new StackEntry(..., parent)` +
field-compare `has` |
| Per-call string formatting | lazy (only on recursion log) | every call
| lazy (only on recursion log) |
| `stack.has(entry)` (public API) | **breaks** — stack is a POJO, no
`.has` | works (but materializes a Set each read) | works (O(n) list
walk, no alloc) |
| `stack instanceof Set` | **breaks** | true | **breaks** — rare in
practice |
| `stack.add(x)` | **breaks** | works (on the Set view) | **breaks** —
external code doesn't mutate |
| Webpack `ResolverCachePlugin` | **requires update** (author's own
note) | works (via Set conversion path) | works (no change needed) |

### CodSpeed CI benchmarks (vs `main` BASE)

| Benchmark | PR #523 | **this PR (#526)** |
|---|---|---|
| pathological-deep-stack: alias chain of 50 (warm) | +50.91% | +48.65%
|
| realistic-midsize: mixed batch (warm cache) | +10.29% | **+12.55%** |
| resolve-to-context: directory resolve (warm) | +11.82% | **+18.65%** |
| symlinks: symlinks=false (warm) | +10.77% | **+12.38%** |
| description-files-multi (warm) | +11.07% | **+13.13%** |
| exports-patterns-many: 6 prefixes × 4 leaves (warm) | +10.16% |
**+11.2%** |
| main-field: browser/module/main combos (warm) | +13.68% | *(similar)*
|
| **cache-predicate: mixed cached/uncached (warm)** | **−45.02% ❌** |
**+10.5% ✅** |

PR #523 regresses `cache-predicate` by 45% because its `doResolve` hot
path adds an `instanceof Set` branch + `_stack \|\| stack` fallback +
lazy Set getter materialization. For paths that hit UnsafeCachePlugin
and skip the rest of the plugin chain (already microsecond-scale), that
per-call overhead dominates.

This PR avoids the regression by keeping a single representation (the
linked list) with no branching or materialization on the hot path —
every benchmark improves, none regress.


<!-- In addition to that please answer these questions: -->

**What kind of change does this PR introduce?**
perf

**Did you add tests for your changes?**

<!-- Please note: in most cases, if you change the code, we will not
merge your changes unless you add tests. -->

**Does this PR introduce a breaking change?**

<!-- If this PR introduces a breaking change, please describe the impact
and a migration path for existing applications. -->

**If relevant, what needs to be documented once your changes are merged
or what have you already documented?**

<!-- List all the information that needs to be added to the
documentation after merge that has already been documented in this PR.
-->

**Use of AI**

<!-- If you have used AI, please state so here. Explain how you used it.
Make sure to read our AI policy
(https://github.com/webpack/governance/blob/main/AI_POLICY.md) or your
Pull Request may be closed due inresponsible use of AI. -->
@alexander-akait alexander-akait merged commit 7e502fc into main Apr 17, 2026
33 of 34 checks passed
@alexander-akait alexander-akait deleted the claude/pr-443-performance-improvements-GpXWI branch April 17, 2026 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants