Skip to content

perf: node_modules-anchor fast paths#1189

Closed
Boshen wants to merge 2 commits into
mainfrom
perf/node-modules-anchor-fast-paths
Closed

perf: node_modules-anchor fast paths#1189
Boshen wants to merge 2 commits into
mainfrom
perf/node-modules-anchor-fast-paths

Conversation

@Boshen

@Boshen Boshen commented May 30, 2026

Copy link
Copy Markdown
Member

Summary

Two fast paths for resolving inside node_modules, built on one allocation-free helper — Cache::pkg_anchor, which walks the cached parent chain (no syscalls) to the deepest <...>/node_modules/<pkg> directory (the "anchor").

1. Canonicalize the anchor, not every component

canonicalize_with_visited canonicalizes the real anchor and appends the suffix below it without a per-component read_link walk:

  • The anchor is canonicalized properly (recursively), so a symlinked project root (/tmp/private/tmp, a symlinked $HOME/checkout, a Windows junction) is still resolved.
  • A symlinked anchor — a linked workspace package, or the top-level link of an isolated layout — takes the normal walk, which follows it and everything below.
  • The suffix is appended only after suffix_below_anchor_is_symlink_free confirms every component strictly below the anchor is a real, non-symlink entry. That lstat reuses the cache (CachedMeta), so components already stat'd while resolving the file cost nothing — and any internal package symlink (lib -> dist, a re-export file) or .. suffix falls back to the normal walk.

This is correct for any layout — flat, isolated, PnP, workspace, nested, or a hand-built tree — because it verifies the suffix rather than assuming it from a detected layout. No layout detection, no markers, no trust gate.

2. Jump to the package's package.json

find_package_json_for_a_package jumps straight to the anchor's package.json instead of probing every intermediate directory, falling back to the directory walk when dependency tracking is enabled (resolve_with_context, so missing_dependencies stays byte-identical) or when the anchor has no manifest of its own.

Measured syscall reduction

FileSystem-call counts for the 18-request package_managers workload (which includes monorepo cases — a nested react@17 under a workspace and a workspace-to-workspace dep), this branch vs main, measured with one counting-FileSystem harness:

combo main now Δ
npm-flat / yarn-flat / bun-flat 108 86 −20%
pnpm-isolated / pnpm-hoisted / bun-isolated 143 120 −16%
yarn-isolated 135 112 −17%
yarn-pnp 174 134 −23%

The reduction comes from read (skipped intermediate package.json ENOENT probes) and the eliminated per-component canonicalize walk; the suffix check adds no syscalls in real installs because the lstats are cache hits.

Tests

  • node_modules_canonicalize.rs walks every path (~12.4k) under each installed fixtures/bench-pm/installs/<combo>/node_modules tree (all npm/pnpm/yarn/bun × flat/isolated/hoisted/pnp combos) and asserts the resolver's canonicalization equals std::fs::canonicalize — Windows-aware — exercising the symlinks into virtual stores, the .bin shim directory, and the monorepo layouts (a nested react@17 under packages/ui, a workspace-to-workspace link).
  • Committed symlink fixtures cover a real anchor with internal dir/file symlinks (internal-symlink) and a symlinked workspace anchor with a suffix symlink (symlinked-workspace); each test fails without the suffix check.
  • Full suite green: 271 unit + integration + dependencies + resolve_package + doctest, clippy --deny warnings, fmt.

🤖 Generated with Claude Code

@codecov

codecov Bot commented May 30, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 97.40260% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.78%. Comparing base (8252151) to head (7cc5b97).

Files with missing lines Patch % Lines
src/cache/cache_impl.rs 97.56% 1 Missing ⚠️
src/lib.rs 97.22% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1189      +/-   ##
==========================================
+ Coverage   93.75%   93.78%   +0.03%     
==========================================
  Files          22       22              
  Lines        4229     4282      +53     
==========================================
+ Hits         3965     4016      +51     
- Misses        264      266       +2     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81f5c0eeb3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cache/cache_impl.rs
Comment on lines +389 to +400
if path.inside_node_modules
&& let Some(anchor) = self.pkg_anchor(path)
&& anchor.path() != path.path()
{
let anchor_canonical = self.canonicalize_with_visited(&anchor, visited)?;
// `anchor` is a genuine path-ancestor node of `path`, so the prefix always strips and
// the remainder is non-empty (guaranteed by the `anchor != path` guard above).
let rest = path.path().strip_prefix(anchor.path()).unwrap();
let canonical = anchor_canonical.normalize_with(rest, self);
let _ =
path.canonicalized.set((Arc::downgrade(&canonical.0), canonical.0.path.clone()));
return Ok(canonical);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve symlink resolution below package anchors

When the resolved path is inside node_modules/<pkg> and any component below that package root is itself a symlink (for example node_modules/pkg/lib -> ../../shared and resolving pkg/lib/index.js), this fast path canonicalizes only the package anchor and appends the remaining suffix without checking lib. With the default ResolveOptions::symlinks = true, load_realpath should return the symlink target as the previous per-component walk did, but this now returns and caches the unresolved node_modules/pkg/lib/index.js path.

Useful? React with 👍 / 👎.

@codspeed-hq

codspeed-hq Bot commented May 30, 2026

Copy link
Copy Markdown

Merging this PR will degrade performance by 6.83%

❌ 7 regressed benchmarks
✅ 14 untouched benchmarks
⏩ 5 skipped benchmarks1

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
pm/npm-flat 954.7 µs 1,067.7 µs -10.58%
pm/yarn-isolated 1 ms 1.1 ms -8.95%
pm/pnpm-isolated 1.1 ms 1.2 ms -8.84%
pm/bun-isolated 1.1 ms 1.2 ms -6.85%
pm/pnpm-hoisted 1.1 ms 1.2 ms -4.6%
pm/yarn-flat 950.8 µs 992.4 µs -4.19%
resolver_real[multi-thread] 388 µs 402.1 µs -3.52%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing perf/node-modules-anchor-fast-paths (7cc5b97) with main (8252151)

Open in CodSpeed

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7c5991b5f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cache/cache_impl.rs
// `anchor` is a genuine path-ancestor node of `path`, so the prefix always strips and
// the remainder is non-empty (guaranteed by the `anchor != path` guard above).
let rest = path.path().strip_prefix(anchor.path()).unwrap();
let canonical = anchor_canonical.normalize_with(rest, self);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep walking suffixes that escape package anchors

When the cached path contains .. components after the package segment, this suffix is not necessarily “strictly below” the anchor. For example, resolving from an absolute directory like /repo/node_modules/a/.. to ./b/index.js produces a path whose anchor is node_modules/a and whose rest is ../b/index.js; appending that to the canonicalized a skips the normal per-component walk over sibling b. If node_modules/b is a symlink and symlinks is enabled, this now returns the unresolved sibling path instead of the symlink target, whereas the old walk would lstat and follow b.

Useful? React with 👍 / 👎.

@Boshen Boshen force-pushed the perf/node-modules-anchor-fast-paths branch from a7c5991 to 9080c54 Compare May 30, 2026 09:55

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9080c548d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/lib.rs Outdated
if ctx.missing_dependencies.is_none()
&& let Some(anchor) = self.cache.pkg_anchor(cached_path)
{
return self.cache.get_package_json(&anchor, &self.options, ctx);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fall back when the anchor lacks package.json

When resolving a file under node_modules/<pkg>/sub/... where the package anchor has no package.json but a nested directory does, this fast path returns None immediately for the normal resolve()/resolve_file() APIs. The previous directory walk (and the new slow path still used by resolve_with_context) would keep probing up to the node_modules boundary and return that nested package.json, so Resolution::package_json() now differs between APIs and regresses packages that are resolved by explicit subpaths even though their top-level directory has no manifest.

Useful? React with 👍 / 👎.

@Boshen Boshen marked this pull request as draft May 31, 2026 04:33
@Boshen Boshen force-pushed the perf/node-modules-anchor-fast-paths branch 3 times, most recently from 87ad871 to 50143c3 Compare June 4, 2026 08:30
Boshen added a commit that referenced this pull request Jun 4, 2026
Adds a `node_modules` canonicalization test suite + committed symlink
fixtures that assert the resolver's canonicalization equals
`std::fs::canonicalize` across the layout families:

- **`canonicalize_matches_os_for_all_node_modules`** — walks every path
in each installed `fixtures/bench-pm/installs/<combo>/node_modules` tree
(npm/pnpm/yarn/bun × flat/isolated/hoisted/pnp); skipped when the bench
fixtures aren't installed.
- **`symlinked_package_anchor_walks_suffix_symlinks`** — a symlinked
workspace anchor with a symlink in the suffix below it.
- **`real_package_anchor_walks_internal_symlinks`** — a real
`node_modules/<pkg>` anchor that ships an internal directory/file
symlink (`lib -> dist`, a re-export file).
- **`nested_monorepo_canonicalize_matches_os`** — a monorepo version
conflict where a workspace package nests its own dependency version via
an isolated-store symlink (root resolves `dep@2`, `packages/ui` nests
`dep@1`).

These all pass on `main` — they pin canonicalization correctness
independent of any resolver optimization, so they can land on their own
(and guard #1189).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Boshen added 2 commits June 5, 2026 01:23
Two fast paths for resolving inside `node_modules`, sharing one allocation-free
`Cache::pkg_anchor` (a parent-chain walk with no syscalls) to the deepest
`<...>/node_modules/<pkg>` directory (the "anchor").

1. Canonicalize the anchor, not every component. `canonicalize_with_visited`
   canonicalizes the real anchor — recursively, so a symlinked project root is
   still resolved — and appends the suffix below it without a per-component
   `read_link` walk. A symlinked anchor (a linked workspace package, or the
   top-level link of an isolated layout) takes the normal walk. The suffix is
   appended only after `suffix_below_anchor_is_symlink_free` confirms every
   component strictly below the anchor is a real, non-symlink entry — reusing the
   cached `lstat`, so it is free for components already stat'd while resolving the
   file. An internal package symlink or a `..` suffix falls back to the normal
   walk. Correct for any layout because it verifies the suffix rather than
   assuming it from a detected layout.

2. Jump to the package's `package.json`. `find_package_json_for_a_package` jumps
   straight to the anchor's `package.json` instead of probing every intermediate
   directory, falling back to the directory walk when dependency tracking is
   enabled or when the anchor has no manifest of its own.

Syscalls on the 18-request `package_managers` workload vs `main`: flat 108->86,
isolated 143->120, yarn-isolated 135->112, yarn-pnp 174->134 (-16% to -23%).
…kspace)

`packages/ui` depends on `react@17.0.2` while the app uses `react@18.3.1`, so the
conflict un-hoists a nested `packages/ui/node_modules/react`; `packages/utils`
links `@bench/ui` (workspace-to-workspace). Two `package_managers` workload
requests resolve `react` from `packages/ui` (the nested 17) and `@bench/ui` from
`packages/utils`; lockfiles regenerated for all combos.
@Boshen Boshen force-pushed the perf/node-modules-anchor-fast-paths branch from 65b36ce to 7cc5b97 Compare June 4, 2026 17:24
@Boshen

Boshen commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

Too hard, I give up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant