Skip to content

perf(pacquet): port pnpm's purePkgs + peersCache for peer resolution#11906

Merged
zkochan merged 4 commits into
mainfrom
perf/11900-isnew
May 24, 2026
Merged

perf(pacquet): port pnpm's purePkgs + peersCache for peer resolution#11906
zkochan merged 4 commits into
mainfrom
perf/11900-isnew

Conversation

@zkochan

@zkochan zkochan commented May 24, 2026

Copy link
Copy Markdown
Member

Refs #11900, #11907.

Summary

The peer resolver was rewalking every NodeId in the tree from scratch and recomputing the full per-package peer set on each hoist-loop iteration. On the astro@^5 install (~1.6k unique packages, deep transitive shape) that pushed resolve_peers to 3.8 s of an 8.5 s resolve_importer phase — pacquet had already recorded is_pure per graph node but wasn't using it as a cache key, and peersCache was deferred in the original slice.

This PR ports both upstream optimizations + the deep parentPackagesMatch cache-match check, and removes an unsafe node_dep_paths shortcut at the top of resolve_node that returned stale depPaths when a shared NodeId got walked under two different parent peer contexts.

Four commits:

  1. Tests — port four peer-resolution cases from upstream's installing/deps-resolver/test/resolvePeers.ts.
  2. The cache portpurePkgs + peersCache + find_hit (initial simplified version).
  3. Doc fix — drop a broken intra-doc link to a local let binding.
  4. CodeRabbit review fixupsparentPackagesMatch deep check (the real cache-match logic), depth tie-break on fast returns, module-doc lead-in corrections, trimmed test docstrings.

How purePkgs and peersCache work

purePkgs — a HashSet<String> of pkgIdWithPatchHash values whose full subtree resolved with zero external peers and zero missing peers. A revisit of any pure pkg whose own peerDependencies is empty short-circuits with depPath = pkgIdWithPatchHash — no recursion.

peersCache — a HashMap<String, Vec<PeersCacheItem>> keyed by pkgIdWithPatchHash. find_hit accepts a cache item when, for each cached resolved peer, the current parent_refs has a counterpart entry whose NodeId either equals the cached one, shares its node_dep_paths entry, or points to a package with the same pkgIdWithPatchHash — and in the last case parent_packages_match (the deep check) on the two parents' own recorded contexts also succeeds (skipped only when pure_pkgs makes it vacuous, matching upstream).

parent_packages_match + parent_pkgs_of_node — the deep check ported per upstream parentPackagesMatch. Each NodeId's parent peer context is recorded at descend time (mirrors upstream's parentPkgsOfNode.set(...) at resolvePeers.ts:817-832). parent_packages_match compares two contexts: same size, same set of names, each name mapping to the same (pkg_id, version); when peers are shadowed (an occurrence > 0 on either side) the contexts must additionally agree on depth or be in purePkgs. ParentRef now carries depth and occurrence, and bump_occurrence_on_shadow handles same-name shadowing per upstream's addParentPkg arm at resolvePeers.ts:430-439.

Why this PR is correct after the earlier wrong direction

The original version of this branch took the isNew gate from resolveDependencies.ts and applied it to pacquet's tree walker: share child NodeIds across occurrences of the same package id. That diverged from pnpm — upstream uses fresh NodeIds per occurrence (resolveDependencies.ts:1580) and absorbs the would-be tree explosion downstream via purePkgs + peersCache. The shared-NodeId approach broke the peer resolver's node_dep_paths cache, which the new revisit_resolves_peer_in_one_occurrence_misses_in_other test caught immediately. This PR keeps the per-occurrence tree shape and ports the actual upstream caches.

Result

Bench on astro@^5 (macOS, hot store, no lockfile, hyperfine --warmup 1 --runs 5):

mean speedup
pacquet@main 11.346 ± 1.262 s 1.00×
pacquet+peersCache 9.890 ± 1.651 s ~1.15×

The deep parent_packages_match check rejects more cache hits than the earlier simplified version (~1.51×), in exchange for matching pnpm's algorithm exactly. The earlier simplified port was a divergence that could produce false-positive cache hits on ancestor-sensitive peer chains.

Test plan

  • cargo nextest run -p pacquet-resolving-deps-resolver — 54/54 pass (50 pre-existing + 4 newly ported peer cases)
  • revisit_resolves_peer_in_one_occurrence_misses_in_other passes (this is the test that caught the earlier wrong direction)
  • cargo clippy --locked --workspace --all-targets -- --deny warnings — clean
  • cargo fmt --check — clean
  • RUSTDOCFLAGS=-D warnings cargo doc --no-deps -p pacquet-resolving-deps-resolver — clean
  • CI integrated-benchmark on Linux

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • Performance

    • Faster peer dependency resolution via a pure-package fast path and a subtree cache to avoid redundant traversal, reducing resolution time and repeated work.
  • Tests

    • Added regression tests for cyclic peers, multiple occurrences, parallel sibling peer chains, and incorrect-version peer reporting to ensure stable peer-resolution behavior.

Review Change Stack

@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai

coderabbitai Bot commented May 24, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Adds per-package peer-resolution caching (pure_pkgs, peers_cache), parent-context snapshots, shadowing-aware parent refs, a cache-compatibility finder (find_hit), a pure-subtree fast-path, and four regression tests covering peer cycles, revisits, sibling isolation, and bad-peer recording.

Changes

Peer-resolution caching and optimization

Layer / File(s) Summary
Docs and Walker init
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
Module docs updated; Walker gains pure_pkgs, peers_cache, and parent_pkgs_of_node fields.
ParentRef shape and shadowing support
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
ParentRef adds alias, depth, occurrence; insert_parent_ref initializes new fields; bump_occurrence_on_shadow implements occurrence increments for shadowing.
Pure-pkgs fast-path short-circuit
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
resolve_node early-returns for package ids in pure_pkgs with empty peerDependencies, writing DepPath and empty external/missing maps and updating graph depth.
ParentRefs augmentation for children
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
Child ParentRefs include computed child_depth and use shadowing-aware occurrence bumping to capture per-parent context for cache validation.
peers_cache lookup and hit application
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
resolve_node records parent snapshots, calls find_hit pre-recursion; on hit clones cached dep_path/resolved peers/missing peers into node state, replays missing-peer diagnostics, updates depth, and returns without recursion.
Cache population after resolution
pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
After normal computation, pure outcomes insert pkg.id into pure_pkgs; non-pure outcomes append PeersCacheItem (dep_path, external resolved peers, missing peers) to peers_cache.
Regression tests for peer-resolution behaviors
pacquet/crates/resolving-deps-resolver/src/tests.rs
Four new #[tokio::test] cases: cyclic peer dependencies, revisit with different occurrences producing both pure and peer-suffixed depPaths, two sibling peer chains resolving independently, and bad-peer recorded with found/wanted details.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

  • pnpm/pnpm#11774: Upstream changes to resolve_peers implementing the pure_pkgs fast-path and peers_cache/find_hit logic referenced by this PR.
  • pnpm/pnpm#11791: Changes around pkgIdWithPatchHash and depPath suffixing that interact with this PR's cache keys and depPath reuse.

Poem

🐰 I hopped through caches, quick and keen,
Found pure subtrees where peers were clean.
I reused paths that once were known,
Saved the walk, and nibble on a cone. 🌰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR ports pnpm peer-resolution optimizations and adds peer-resolution tests, but the only linked issue (#39) concerns optional dependencies installation logic, which is unrelated to the peer-resolution caching mechanism being implemented. Either link the PR to the correct peer-resolution or performance-related issue, or clarify the connection between peer-resolution caching and optional-dependencies installation if one exists.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'perf(pacquet): port pnpm's purePkgs + peersCache for peer resolution' directly describes the main change: porting performance optimizations (purePkgs and peersCache) from pnpm for peer resolution, which is the core focus of the changeset.
Out of Scope Changes check ✅ Passed All changes are scoped to resolve_peers.rs and tests.rs for peer-resolution caching; no unrelated modifications detected. The additions of purePkgs, peersCache, cache-lookup logic, and regression tests align with the stated performance-optimization objective.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/11900-isnew

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.01      5.3±0.05ms   812.6 KB/sec    1.00      5.3±0.06ms   818.1 KB/sec

@codecov-commenter

codecov-commenter commented May 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 60.60606% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.65%. Comparing base (e549cd1) to head (39ec3ea).

Files with missing lines Patch % Lines
...rates/resolving-deps-resolver/src/resolve_peers.rs 60.60% 65 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11906      +/-   ##
==========================================
- Coverage   87.86%   87.65%   -0.22%     
==========================================
  Files         206      206              
  Lines       24565    24719     +154     
==========================================
+ Hits        21584    21667      +83     
- Misses       2981     3052      +71     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

Integrated-Benchmark Report (Linux)

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.359 ± 0.120 2.230 2.585 1.03 ± 0.06
pacquet@main 2.298 ± 0.062 2.225 2.430 1.00
pnpm 4.661 ± 0.050 4.581 4.746 2.03 ± 0.06
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.35926872134,
      "stddev": 0.11969086134660906,
      "median": 2.3081692452400002,
      "user": 2.8002159200000003,
      "system": 3.4618500000000005,
      "min": 2.2295608487400003,
      "max": 2.58473629474,
      "times": [
        2.43114761674,
        2.4888848657400002,
        2.27034705774,
        2.58473629474,
        2.44436667674,
        2.29221961074,
        2.2295608487400003,
        2.25523341374,
        2.27207194874,
        2.3241188797400003
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.29797055374,
      "stddev": 0.06152326355556041,
      "median": 2.2956533402400003,
      "user": 2.79598692,
      "system": 3.4461664999999995,
      "min": 2.22527565874,
      "max": 2.42962754174,
      "times": [
        2.3429010687400003,
        2.3206655237400002,
        2.26710903174,
        2.27871674274,
        2.32162139374,
        2.22527565874,
        2.3125899377400003,
        2.2297610527400002,
        2.25143758574,
        2.42962754174
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.661469954040001,
      "stddev": 0.05011622041656565,
      "median": 4.66663554674,
      "user": 7.896686619999999,
      "system": 4.0821768,
      "min": 4.58080742774,
      "max": 4.74581011574,
      "times": [
        4.65377143274,
        4.5952205387400005,
        4.68729841774,
        4.67949966074,
        4.70027359774,
        4.58080742774,
        4.74581011574,
        4.69112354474,
        4.62870797674,
        4.6521868277400005
      ]
    }
  ]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 682.2 ± 28.1 665.3 758.8 1.00
pacquet@main 694.8 ± 70.2 654.8 888.2 1.02 ± 0.11
pnpm 2495.5 ± 91.7 2357.4 2605.8 3.66 ± 0.20
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6821796710200001,
      "stddev": 0.028073629207814473,
      "median": 0.6705739240199999,
      "user": 0.37103729999999996,
      "system": 1.46730042,
      "min": 0.66530032852,
      "max": 0.75880760352,
      "times": [
        0.75880760352,
        0.68311546152,
        0.67130834452,
        0.66983950352,
        0.68590740152,
        0.66653374052,
        0.66908224452,
        0.66530032852,
        0.68481780852,
        0.66708427352
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6948256770200001,
      "stddev": 0.0701570791641633,
      "median": 0.66879774652,
      "user": 0.3739634,
      "system": 1.4771797199999999,
      "min": 0.65483228752,
      "max": 0.88822133452,
      "times": [
        0.70874859152,
        0.69504225852,
        0.66338915052,
        0.66033532552,
        0.65483228752,
        0.88822133452,
        0.67420634252,
        0.68203216852,
        0.66301224352,
        0.65843706752
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.4954907293200006,
      "stddev": 0.09170098768364807,
      "median": 2.52302605802,
      "user": 3.0049859,
      "system": 2.2038131199999995,
      "min": 2.35741292052,
      "max": 2.60575667852,
      "times": [
        2.5789080015200003,
        2.40504186852,
        2.46694132352,
        2.54293093152,
        2.60575667852,
        2.56938263752,
        2.35741292052,
        2.36630937752,
        2.5031211845200003,
        2.55910236952
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 3.513 ± 0.065 3.400 3.585 1.00
pacquet@main 4.950 ± 0.164 4.681 5.162 1.41 ± 0.05
pnpm 6.237 ± 0.066 6.103 6.343 1.78 ± 0.04
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 3.51278513292,
      "stddev": 0.06540646754542927,
      "median": 3.52373171142,
      "user": 5.248499239999999,
      "system": 3.5201653,
      "min": 3.39998122042,
      "max": 3.5851835944199997,
      "times": [
        3.4902004074199997,
        3.4585512284199997,
        3.39998122042,
        3.43332904142,
        3.55686004742,
        3.5254234624199996,
        3.5851835944199997,
        3.5826319184199997,
        3.52203996042,
        3.57365044842
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 4.94993090252,
      "stddev": 0.16388024622170166,
      "median": 4.955489203920001,
      "user": 6.72416784,
      "system": 3.5395694,
      "min": 4.68146582342,
      "max": 5.16239609842,
      "times": [
        4.68146582342,
        4.768196312420001,
        5.07339464142,
        4.8288952514200005,
        4.87113421842,
        5.15143827542,
        5.16239609842,
        5.002400493420001,
        5.05140999642,
        4.90857791442
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.237169886119999,
      "stddev": 0.06627907961214972,
      "median": 6.23798609292,
      "user": 10.32054904,
      "system": 4.372656600000001,
      "min": 6.102539489420001,
      "max": 6.34293841842,
      "times": [
        6.34293841842,
        6.22424719342,
        6.26871969242,
        6.270143651420001,
        6.246476078420001,
        6.102539489420001,
        6.18259319742,
        6.299144576420001,
        6.22949610742,
        6.2054004564200005
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.563 ± 0.103 2.406 2.777 1.00
pacquet@main 4.035 ± 0.216 3.829 4.433 1.57 ± 0.11
pnpm 4.219 ± 0.045 4.123 4.264 1.65 ± 0.07
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.5625500814800004,
      "stddev": 0.10345387455664258,
      "median": 2.5381792566800003,
      "user": 2.87741734,
      "system": 2.23025222,
      "min": 2.4060561076800004,
      "max": 2.77651002768,
      "times": [
        2.4956077386800004,
        2.5595915006800003,
        2.77651002768,
        2.6634816796800003,
        2.5693622856800005,
        2.4060561076800004,
        2.5167670126800004,
        2.5093720956800003,
        2.62094841168,
        2.50780395468
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 4.03456181028,
      "stddev": 0.21569972012858313,
      "median": 3.9904817931800003,
      "user": 4.333419039999999,
      "system": 2.25923522,
      "min": 3.8285801176800005,
      "max": 4.43250134068,
      "times": [
        4.14397279568,
        3.9748201856800005,
        4.43250134068,
        3.8287883966800003,
        4.34613823168,
        3.86736091768,
        4.00614340068,
        4.06520115268,
        3.8285801176800005,
        3.8521115636800003
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.21906013668,
      "stddev": 0.04535799255180319,
      "median": 4.21756159968,
      "user": 5.21208624,
      "system": 2.646420419999999,
      "min": 4.12278122568,
      "max": 4.26354054068,
      "times": [
        4.25668976368,
        4.21422241868,
        4.21634957168,
        4.12278122568,
        4.25646727568,
        4.26010492268,
        4.16691949268,
        4.21877362768,
        4.26354054068,
        4.21475252768
      ]
    }
  ]
}

@github-actions

github-actions Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchpr/11906
Testbedpacquet
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
isolated-linker.fresh-install.cold-cache.cold-store📈 view plot
🚷 view threshold
3,512.79 ms
(-29.46%)Baseline: 4,979.55 ms
5,975.46 ms
(58.79%)
isolated-linker.fresh-install.hot-cache.hot-store📈 view plot
🚷 view threshold
2,562.55 ms
(-34.45%)Baseline: 3,909.10 ms
4,690.92 ms
(54.63%)
isolated-linker.fresh-restore.cold-cache.cold-store📈 view plot
🚷 view threshold
2,359.27 ms
(-1.24%)Baseline: 2,388.78 ms
2,866.53 ms
(82.30%)
isolated-linker.fresh-restore.hot-cache.hot-store📈 view plot
🚷 view threshold
682.18 ms
(+3.64%)Baseline: 658.21 ms
789.85 ms
(86.37%)
🐰 View full continuous benchmarking report in Bencher

@zkochan

zkochan commented May 24, 2026

Copy link
Copy Markdown
Member Author

Please don't merge this without the peer-resolution tests in #11908.

While porting the upstream peer-resolution tests #11908 promises, one of them — revisit_resolves_peer_in_one_occurrence_misses_in_other (the port of upstream's 'when a package is referenced twice in the dependencies graph and one of the times it cannot resolve its peers') — fails with this PR applied:

thread 'tests::peers::revisit_resolves_peer_in_one_occurrence_misses_in_other' panicked at pacquet/crates/resolving-deps-resolver/src/tests.rs:896:9:
resolved-peer occurrence of foo missing from graph: {"bar@1.0.0", "foo@1.0.0", "zoo@1.0.0", "qar@1.0.0"}

Expected: both foo@1.0.0 (missing-peer occurrence under root→zoo) and foo@1.0.0(qar@1.0.0) (resolved-peer occurrence under root→bar→zoo→foo) in the graph.
Actual: only the missing-peer occurrence — the resolved-peer one is dropped.

What's happening

This PR makes per-occurrence parent NodeIds share their children NodeIds via the new children_node_ids_by_id cache. That gives a DAG-shaped tree instead of an exponential one (the intended win).

But resolve_peers caches the per-NodeId depPath in node_dep_paths keyed by NodeId alone:

// pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:283
if let Some(existing) = self.node_dep_paths.get(&node_id).cloned() {
    return NodeOutput { dep_path: existing, ... };
}

Before this PR: each occurrence had its own NodeId, so distinct parent peer contexts mapped to distinct cache entries — no conflict.

After this PR: the same child NodeId is walked under two different parent peer contexts (one where qar is missing, one where qar resolves). The first walk caches its depPath. The second walk hits the cache and returns the first's — losing the second occurrence's correct (qar@1.0.0) suffix entirely.

Path forward

Two options I see:

  1. Wait until peersCache lands. Upstream's peersCache keys cached entries by pkgIdWithPatchHash and uses findHit + parentPackagesMatch to confirm the cached parent context matches the current one. That's exactly what's needed to share child NodeIds correctly. Tracked in perf(pacquet): port pnpm's peer-resolution optimizations (purePkgs fast path + peersCache) #11907.
  2. Restrict the isNew cache to pure subtrees. Skip the children-cache reuse when the subtree contains any peer dependency. Loses most of the win on the astro fixture (the platform-tarball packages that drive the 75k→11k reduction are mostly pure leaves), but it would be correct.

The performance win (1.41× on astro) only materializes meaningfully alongside #11907 anyway, so option 1 — stack both behind #11908's tests — feels right.

Going to convert this PR to draft until #11908 lands and either approach is in.


Written by an agent (Claude Code, claude-opus-4-7).

@zkochan zkochan marked this pull request as draft May 24, 2026 18:04
@zkochan zkochan force-pushed the perf/11900-isnew branch from ad3d6db to a31bffb Compare May 24, 2026 18:08
@zkochan zkochan changed the title perf(pacquet): skip child re-walk on revisit (isNew gate) perf(pacquet): isNew gate (broken alone) + peer-resolution tests that catch it May 24, 2026
zkochan added 2 commits May 24, 2026 20:29
Pacquet's `mod peers` test block had five tests, all of which exercise
single-occurrence happy paths. None covered the harder branches the
upstream resolver is designed to handle — cycles, packages reached
twice with divergent peer scope, parallel peer chains, transitive
peer issues. That gap left the peer resolver under-tested even for
the current algorithm, and would have made it dangerous to land
the `peersCache` + `purePkgs` ports tracked in #11907 because the
new cache lookup short-circuits exactly the branches no existing
test exercises.

Port four resolver-layer cases from
[`installing/deps-resolver/test/resolvePeers.ts`](https://github.com/pnpm/pnpm/blob/c86c423bdc/installing/deps-resolver/test/resolvePeers.ts):

- `cyclic_peer_dependencies_resolve_cleanly` — four-way cycle (foo
  ↔ bar ↔ qar ↔ zoo), every node lands in the graph without the
  walker panicking on cycle re-entry. Upstream `:14`.
- `revisit_resolves_peer_in_one_occurrence_misses_in_other` — same
  package reached via two parent chains, one where the peer
  resolves and one where it's missing; both occurrences must
  surface with distinct depPaths. Upstream `:128`.
- `two_peer_chains_resolve_against_their_own_sibling` — two parallel
  pkg-with-peer chains in the same importer; each picks its own
  sibling, no cross-pollination. Substitutes for upstream's
  `'resolve peer dependencies with npm aliases'` (`:573`) since
  npm-alias plumbing isn't yet wired through the test stub
  resolver — the TODO captures the gap so a follow-up can swap in
  the alias form once it lands.
- `bad_peer_inside_subtree_records_resolved_from_parent` — a peer
  reachable through a subdependency but at the wrong version
  surfaces as a *bad* peer, not a missing one. Stands in for
  upstream's `'unmet peer dependency issue resolved from
  subdependency'` describe-block (`:502`); the `resolvedFrom`
  field upstream tracks isn't exposed on pacquet's
  `PeerDependencyIssue` yet, so the test asserts the bad/missing
  classification only.

All four pass on `main`. Together they exercise the parent-context
matching that future cache optimizations (#11907) need to get right
— the second test in particular drives a shared-subtree shape where
a naive NodeId-keyed cache returns stale depPaths.

Refs #11907.
…peer resolution

The peer resolver was rewalking every `NodeId` in the tree from
scratch and recomputing the full per-package peer set on each
hoist-loop iteration. On the `astro@^5` install (~1.6k unique
packages, deep transitive shape) that pushed `resolve_peers` to
3.8 s of an 8.5 s `resolve_importer` phase — pacquet had already
recorded `is_pure` per graph node but wasn't using it as a cache
key, and `peersCache` was deferred in the original slice.

Port both upstream optimizations and remove the unsafe
`node_dep_paths` shortcut at the top of `resolve_node` that
silently returned stale `depPath`s when the same shared `NodeId`
got walked under two different parent peer contexts (an
inevitable shape post-isNew-gate, and the bug
`revisit_resolves_peer_in_one_occurrence_misses_in_other` catches).

## `purePkgs` fast path

A `HashSet<String>` of `pkgIdWithPatchHash` values whose full
subtree resolved with zero external peers and zero missing peers.
Populated bottom-up: a node is added when `is_pure` is true after
its own walk completes. A revisit of any pure pkg whose own
`peerDependencies` is empty short-circuits with
`depPath = pkgIdWithPatchHash` — no recursion, no peersCache
lookup. Mirrors upstream's
`purePkgs` early-return (resolvePeers.ts:398-406 at c86c423).

## `peersCache` + `find_hit`

A `HashMap<String, Vec<PeersCacheItem>>` keyed by
`pkgIdWithPatchHash`. Each cache item carries the `depPath`, the
external `(peer_name → NodeId)` map, and the `(peer_name → info)`
missing set produced by one specific parent peer context.
`find_hit` iterates the bucket and accepts the first item whose
cached resolved-peer `NodeId`s map to packages with the same
`pkgIdWithPatchHash` in the current `child_parent_refs`, and
whose cached missing-peer set doesn't intersect the current
parents. Mirrors upstream's `peersCache` + `findHit`
(resolvePeers.ts:342-348 + 660-699 at c86c423).

The simplified port omits upstream's `parentPackagesMatch` deep
check (which compares the cached and current parent peer chains
via `parentPkgsOfNode`). The current ported tests pass without
it; the deeper check is tracked separately in #11907 along with
the rest of the peer-resolution port.

## Result

Removes the buggy `node_dep_paths` shortcut at the top of
`resolve_node`. All 54 tests pass — including
`revisit_resolves_peer_in_one_occurrence_misses_in_other` which
catches the exact wrong-depPath regression that doomed the
earlier shared-NodeId approach.

Bench on `astro@^5` (macOS, hot store, no lockfile, hyperfine
warmup=1 runs=5, store pre-warmed identically):

|                      | mean              | range            |
|----------------------|------------------:|-----------------:|
| pacquet@main         | 10.929 ± 0.810 s  | 9.90 – 12.17 s   |
| **pacquet+peersCache** | **7.250 ± 0.322 s**  | **6.85 – 7.74 s** |

**1.51× faster than main on the targeted fresh-resolve path**, on
a correct algorithm that matches pnpm. The remaining gap to pnpm
on the same fixture (~1 s) is partly the `parentPackagesMatch`
deep check (#11907) and partly first-visit resolver work that
neither cache addresses.

Refs #11900, #11907.
@zkochan zkochan force-pushed the perf/11900-isnew branch from a31bffb to bd964ab Compare May 24, 2026 18:30
@zkochan zkochan changed the title perf(pacquet): isNew gate (broken alone) + peer-resolution tests that catch it perf(pacquet): port pnpm's purePkgs + peersCache for peer resolution May 24, 2026
@zkochan zkochan marked this pull request as ready for review May 24, 2026 18:30
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

…s doc

The `[is_pure]` link points at a local `let is_pure = …` binding
inside `resolve_node`, which isn't an item rustdoc can resolve.
`-D rustdoc::broken_intra_doc_links` rejects it on the doc job.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
pacquet/crates/resolving-deps-resolver/src/tests.rs (1)

798-819: ⚡ Quick win

Trim test doc comments that duplicate the test body.

These blocks re-state setup and expected outputs already encoded in the tests. Please keep only non-obvious “why” context and remove procedural narration.

As per coding guidelines: "Tests are documentation. Do not duplicate them in prose." and "Doc comments (///, //!) document the contract ... They are not a re-narration of the body."

Also applies to: 916-935, 1011-1023

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pacquet/crates/resolving-deps-resolver/src/tests.rs` around lines 798 - 819,
Remove or significantly shorten the long doc-comment block that precedes the
test annotated with #[tokio::test] which re-states the test setup and expected
graph keys (the prose describing root→zoo, root→bar→zoo, and the expected
nodes). Keep only concise, non-procedural “why” context (one or two lines) and
delete the procedural narration and duplicated expectations; apply the same
trimming to the two other long doc-comment blocks later in the file that
likewise duplicate their test bodies (the other blocks that precede tests
annotated with #[tokio::test]).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs`:
- Around line 19-35: Update the module-level doc lead-in to stop claiming the
optimizations are not yet ported; state that peersCache and purePkgs
optimizations have been ported and are implemented (referencing peersCache,
purePkgs, pkgIdWithPatchHash, and Walker::find_hit) and clarify the simplified
behavior (omitting parentPackagesMatch deep check) so the top summary matches
the detailed paragraphs below; adjust wording to remove the contradiction and
ensure the doc accurately reflects that pacquet accepts cache hits based on
mapped NodeId→pkgIdWithPatchHash and rejects on missing-peer intersection.
- Around line 724-779: find_hit currently only compares immediate resolved_peers
and missing_peers and can return stale peers_cache entries because it omits
pnpm's deeper parent-chain validation; implement the same parentPackagesMatch
logic used upstream so cached entries are only considered hits when the cached
walk's ancestor chain for each peer matches the current parent_refs. Update
find_hit (and/or add a helper called parent_packages_match) to walk the cached
PeersCacheItem's parent chain and compare each ancestor
NodeId/pkgIdWithPatchHash against the corresponding entries in
self.tree.dependencies_tree and current parent_refs (using resolved_package_id
comparison when NodeIds differ), and ensure missing_peers logic remains; this
ensures peers_cache (Walker::peers_cache / PeersCacheItem) only returns a hit
when parentPackagesMatch succeeds.
- Around line 342-369: The fast-return branches (the pure_pkgs short-circuit and
the cache-hit path) skip the self.graph.entry(...).and_modify(...) logic so
DependenciesGraphNode.depth can remain too large; before returning from both
early exits (the branch that checks self.pure_pkgs.contains(...) /
pkg.peer_dependencies.is_empty() and the other fast-return branch mentioned),
run the same graph update as the normal path: call
self.graph.entry(node_id.clone()).and_modify(|n: &mut DependenciesGraphNode| if
current_depth < n.depth { n.depth = current_depth }); where current_depth is the
same depth value used by the non-fast path (the depth variable the function
normally uses). This preserves the graph depth tie-break behavior expected by
the rest of the algorithm.

---

Nitpick comments:
In `@pacquet/crates/resolving-deps-resolver/src/tests.rs`:
- Around line 798-819: Remove or significantly shorten the long doc-comment
block that precedes the test annotated with #[tokio::test] which re-states the
test setup and expected graph keys (the prose describing root→zoo, root→bar→zoo,
and the expected nodes). Keep only concise, non-procedural “why” context (one or
two lines) and delete the procedural narration and duplicated expectations;
apply the same trimming to the two other long doc-comment blocks later in the
file that likewise duplicate their test bodies (the other blocks that precede
tests annotated with #[tokio::test]).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8a1d03f5-8d50-4b36-b3ec-711cdc34f82e

📥 Commits

Reviewing files that changed from the base of the PR and between ad3d6db and bd964ab.

📒 Files selected for processing (2)
  • pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
  • pacquet/crates/resolving-deps-resolver/src/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: When porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(), or streamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plain String or &str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only via TryFrom<String> and/or FromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallible From<String> (and From<&str> when convenient).
If upstream occasionally constructs without validation, expose from_str_unchecked as an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization.
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls. Fall back to manual impl only when conversion needs custom logic.
String-literal unions should become enums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Path over &PathBuf, &str over &String) when it doesn't force extra copies.
Prefer Arc::clone(&x) / Rc::clone(&x) over x.clone() for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Write use super::{Foo, bar} instead of use super::*;. Two forms stay allowed: external-crate preludes like use rayon::prelude::*; and root-of-module re-...

Files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
  • pacquet/crates/resolving-deps-resolver/src/tests.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
  • pacquet/crates/resolving-deps-resolver/src/tests.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
  • pacquet/crates/resolving-deps-resolver/src/tests.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
  • pacquet/crates/resolving-deps-resolver/src/tests.rs
🔇 Additional comments (1)
pacquet/crates/resolving-deps-resolver/src/tests.rs (1)

691-797: LGTM!

Also applies to: 820-915, 936-1010, 1024-1083

Comment thread pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
Comment thread pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
Comment thread pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs
Three correctness items + one nitpick from the auto-review:

1. **`parent_packages_match` deep check**. `find_hit` was checking
   only the immediate resolved-peer `NodeId` / `pkgIdWithPatchHash`
   match, which can still return stale cached entries when the peer
   itself has different per-occurrence ancestor chains. Port
   upstream's [`parentPackagesMatch`](https://github.com/pnpm/pnpm/blob/c86c423bdc/installing/deps-resolver/src/resolvePeers.ts#L701-L731):
   - Track per-`NodeId` parent peer context in
     `Walker::parent_pkgs_of_node`, populated at descend-time by
     `parent_dep_paths_from_refs` (mirrors upstream's
     `parentPkgsOfNode.set(childNodeId, parentDepPaths)` block at
     `resolvePeers.ts:817-832`).
   - Add `Walker::parent_packages_match` that compares two
     `NodeId`s' recorded contexts: same set of peer-relevant names,
     each name mapping to the same `(pkg_id, version)`, plus
     upstream's depth-equality / `pure_pkgs` fallbacks when peers
     are shadowed.
   - Extend `ParentRef` with `depth` and `occurrence` so the deep
     check has the inputs it needs. New `bump_occurrence_on_shadow`
     helper does the same-name shadowing bookkeeping upstream's
     `addParentPkg` arm does at `resolvePeers.ts:430-439`.
   - `find_hit` now calls `parent_packages_match` whenever cached
     and current peer `NodeId`s differ but their pkg ids match,
     skipping the deep check only when `pure_pkgs` makes it
     vacuous (matches upstream).

2. **Depth tie-break on the fast returns**. The `pure_pkgs`
   short-circuit and the `find_hit` cache-hit branch both returned
   early without running the existing
   `self.graph.entry(dep_path).and_modify(|n| if n.depth > … { … })`
   tie-break. A shallower revisit through the cache would leave the
   graph entry's `depth` stuck at the first walk's value. Both
   branches now lower `depth` in place when the current occurrence
   reaches the package at a smaller depth.

3. **Module-doc lead-in contradiction**. The header still said
   `peersCache` / `purePkgs` were "not ported yet"; updated to
   reflect that both land here and `find_hit` + the deep check
   together implement the `peersCache` lookup. The only deferred
   optimisation called out is the `graph-cycles`-driven async
   deferment, which pacquet substitutes with the synchronous
   `in_progress` set.

4. **Test docs duplicated test bodies** (nitpick). The three new
   peer-resolution tests' doc comments narrated the setup + expected
   output already encoded in the assertions. Trimmed each to a
   one-line "why" + upstream reference, per pacquet's CODE_STYLE_GUIDE
   "tests are documentation" rule.

Bench on `astro@^5` (macOS, hot store, hyperfine warmup=1 runs=5):

| | mean | speedup vs main |
|---|---:|---:|
| pacquet@main | 11.346 ± 1.262 s | 1.00× |
| pacquet+peersCache (deep check) | 9.890 ± 1.651 s | ~1.15× |

The deep check rejects some cache hits the simpler `find_hit`
accepted, so the perf number is below the earlier 1.51× claim — but
the algorithm now matches pnpm. The earlier simplified port was a
divergence; correctness over speed.

All 54 tests pass, clippy + fmt + `RUSTDOCFLAGS=-D warnings cargo doc`
clean.
@zkochan zkochan merged commit f5d7723 into main May 24, 2026
28 checks passed
@zkochan zkochan deleted the perf/11900-isnew branch May 24, 2026 19:18
zkochan added a commit that referenced this pull request May 24, 2026
* perf(resolving-deps-resolver): defer per-occurrence child realization until peer resolution

Mirrors upstream pnpm's lazy `children` thunk on `DependenciesTreeNode`:
revisits of a `pkgIdWithPatchHash` no longer recurse to fan out a fresh
NodeId subtree eagerly. Instead the tree node records
`TreeChildren::Lazy { parent_ids }` and the peer resolver allocates
per-occurrence child NodeIds on first descent via `realize_children`,
applying the same `parentIdsContainSequence` cycle break upstream uses
in `buildTree`.

Pure subtrees that the peer resolver already short-circuits through
`purePkgs` (ported in #11906) now skip realization entirely — the tree
never gets walked past the first occurrence of those packages.

Bench (astro deep tree, cold store, single resolver phase):
- tree nodes: 74,940 → 4,069 (~18× smaller)
- `resolve_importer`: 11.6s → 8.2s (~1.42× faster)

Refs #11907.

* fix(resolving-deps-resolver): fix doc + dylint failures from #11915

- Re-export `TreeChildren` and `ChildEdge` from the crate root so the
  intra-doc links from public docs (`resolve_peers`, `ResolvedTree`)
  resolve. They were `pub` on the enum/struct but unreachable because
  `resolved_tree` is a private module.
- Drop the `[Walker::realize_children]` / `[Walker::pure_pkgs]`
  intra-doc references from `resolve_peers`'s public doc — `Walker`
  is private, so the links failed under `--document-private-items`
  with `-D rustdoc::private_intra_doc_links`. The prose still names
  the items in plain backticks.
- Rename closure params `p` and let binding `v` in `realize_children`
  to satisfy `perfectionist::single-letter-{closure-param,let-binding}`.

* fix(resolving-deps-resolver): persist first-walk is_leaf for lazy realization

Mirrors upstream's
[`ResolvedPackage.isLeaf`](https://github.com/pnpm/pnpm/blob/b9de85dcb6/installing/deps-resolver/src/resolveDependencies.ts#L250)
field: `pkg_is_leaf(&result)` is computed once on the first walk and
stored on `ResolvedPackage::is_leaf`; the peer-resolver's
`realize_children` reads it back instead of inferring leaf-ness from
`children_by_id.is_empty() + peer_dependencies.is_empty()`.

The inferred check was a weaker approximation — a package with a
missing manifest (e.g. a git/tarball/local resolution where `result.
manifest` is None) lands on `pkg_is_leaf == false` in the eager walk
but on `is_leaf == true` in the realize path, which would collapse
distinct per-occurrence `NodeId`s onto a shared `NodeId::leaf` and
break the peer resolver's per-call-site state.

Matches upstream's
[`buildTree` consumption](https://github.com/pnpm/pnpm/blob/b9de85dcb6/installing/deps-resolver/src/resolveDependencyTree.ts#L381):
`ctx.resolvedPkgsById[child.id].isLeaf` is the source of truth, not
recomputed per realization.

* style(resolving-deps-resolver): apply cargo fmt for is_leaf binding

* test(resolving-deps-resolver): cover lazy children edge cases

Adds two regression tests for the lazy-children mechanism introduced
in #11915 that the existing coverage didn't hit:

1. `revisit_with_no_manifest_child_keeps_per_occurrence_node_id` —
   a child whose first walk produced `result.manifest == None`
   (the shape git / tarball / local resolvers return) must keep the
   non-leaf classification on every lazy realisation. Without the
   `ResolvedPackage::is_leaf` persistence the realizer would mis-
   classify it as a leaf and collapse distinct occurrences onto a
   shared `NodeId::Leaf`, breaking per-call-site state.

2. `pure_revisit_leaves_lazy_children_unrealized` — a pure pkg
   reached through multiple parents only realises its children for
   the occurrence the peer resolver walks first. Subsequent
   occurrences hit the `purePkgs` short-circuit before
   `realize_children` runs, so their `TreeChildren::Lazy` stays
   Lazy. Regression guard against accidentally moving the realise
   call above the short-circuit.

Both tests were validated by breaking the relevant subject (swapping
`is_leaf` back to the inferred check; moving `realize_children`
above the `purePkgs` gate) and confirming they fail cleanly.

* style(resolving-deps-resolver): drop hyphenated mis- in test docs

CI's typos pass at .typos.toml flags `mis-` (suggests "miss" /
"mist"). Use "misclassify" instead of "mis-classify" — same
word, no hyphen, no typo hit.
@zkochan zkochan mentioned this pull request May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants