Skip to content

perf(pacquet): prefetch packuments during fresh-resolve tree walk#11903

Closed
zkochan wants to merge 2 commits into
mainfrom
perf/11900
Closed

perf(pacquet): prefetch packuments during fresh-resolve tree walk#11903
zkochan wants to merge 2 commits into
mainfrom
perf/11900

Conversation

@zkochan

@zkochan zkochan commented May 24, 2026

Copy link
Copy Markdown
Member

Summary

Closes #11900.

Detaches packument fetches from the deps-resolver's cooperative try_join_all task by firing tokio::spawned resolver.resolve() calls just before each level of the tree walk descends. Two prefetch points, both modeled on pnpm's resolveDependencies BFS-batched fan-out:

  • Top-level prefetch in extend_tree. Direct deps from the importer (and each hoist iteration's freshly-picked peers) start fetching before the cooperative try_join_all walks them.
  • Lookahead inside resolve_node. As soon as a parent's manifest is known, packument fetches for every child name fire on background tasks so the recursive resolve_node calls below await a cache hit instead of a blocking fetch.

The npm resolver's per-(registry, name) packument fetch semaphore (shared_packument_fetch_locker) coalesces the prefetch with the later real resolve() call, so the recursion hits the in-memory packument cache and the per-version manifest cache instead of paying for a fresh network round-trip plus a fresh JSON parse + serde re-roundtrip per call site.

Detached spawns matter because the futures inside try_join_all cooperate on the current task: a CPU-heavy step in one resolve (JSON parse, semver pick, manifest serde) blocks the others until it yields. The prefetch tasks run on the multi-thread executor, so I/O and any duplicate-work-after-cache-hit can land on whichever worker is free.

Plumbing changes

To make tokio::spawn possible the resolver chain is now threaded as Arc<dyn Resolver> (instead of &Chain generic) through resolve_dependency_tree / resolve_importer / extend_tree, and stored on TreeCtx alongside an Arc<ResolveOptions> so the detached tasks bump refcounts instead of cloning PreferredVersions and PackageVersionPolicy per spawn.

Errors from the prefetch are silently dropped — the real resolve() later in resolve_node surfaces the same error through the normal error path. The result is otherwise identical to the existing tree walk: every assertion in the 50 deps-resolver tests + 299 package-manager tests + 96 cli tests still passes.

Test plan

  • cargo nextest run -p pacquet-resolving-deps-resolver — 50/50 pass
  • cargo nextest run -p pacquet-package-manager — 299/299 pass (3 slow, no new slowness)
  • cargo nextest run -p pacquet-cli — 96/96 pass (2 slow, expected)
  • cargo clippy --locked --workspace --all-targets -- --deny warnings — clean
  • cargo fmt --check — clean
  • Wall-clock measurement against the vlt.sh clean astro benchmark to confirm the issue's expected impact ratio. The change matches the upstream prefetch shape the issue calls for; a benchmark run on the same fixture is the next step.

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • Refactor
    • Improved the dependency resolution engine for safer concurrent operation, more efficient resource use, and more robust handling of complex resolution flows—resulting in faster, more reliable installs and fewer race-related issues during package resolution and prefetching.

Review Change Stack

…ve tree walk

Detach packument fetches from the cooperative `try_join_all` task by
firing `tokio::spawn`ed `resolver.resolve()` calls just before each
level recurses:

- Top-level prefetch in `extend_tree` for the importer's direct deps
  (and each hoist iteration's freshly-picked peers).
- Lookahead prefetch inside `resolve_node` as soon as the parent's
  manifest is known, so the grandchildren's packuments are already
  in-flight by the time the recursion enters them.

The npm resolver's per-`(registry, name)` packument fetch semaphore
coalesces the prefetch with the later real `resolve()` call, so the
recursion hits the in-memory packument and per-version manifest
caches instead of a fresh network round-trip plus a fresh JSON
parse + serde re-roundtrip per call site. Errors from the prefetch
are dropped — the real `resolve()` later in `resolve_node` surfaces
the same error through the normal error path.

To make the spawn possible the resolver chain is now threaded as
`Arc<dyn Resolver>` (instead of `&Chain` generic) through
`resolve_dependency_tree` / `resolve_importer` / `extend_tree`, and
stored on `TreeCtx` along with an `Arc<ResolveOptions>` so the
detached tasks just clone refcounts instead of cloning
`PreferredVersions` and `PackageVersionPolicy` per spawn.

Mirrors upstream pnpm's `resolveDependencies` BFS-batched fan-out at
`installing/deps-resolver/src/resolveDependencies.ts#L690-L697` where
every sibling's `pickPackage` runs concurrently via `Promise.all`
before the deferred `postponedResolution` queue descends.

Closes #11900.

---
Written by an agent (Claude Code, claude-opus-4-7).
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai

coderabbitai Bot commented May 24, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: f82af0d6-9ef5-459a-bd74-a04c1aa3c13a

📥 Commits

Reviewing files that changed from the base of the PR and between 5938f50 and bee3780.

📒 Files selected for processing (1)
  • pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Doc
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: When porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(), or streamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plain String or &str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only via TryFrom<String> and/or FromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallible From<String> (and From<&str> when convenient).
If upstream occasionally constructs without validation, expose from_str_unchecked as an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization.
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls. Fall back to manual impl only when conversion needs custom logic.
String-literal unions should become enums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Path over &PathBuf, &str over &String) when it doesn't force extra copies.
Prefer Arc::clone(&x) / Rc::clone(&x) over x.clone() for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Write use super::{Foo, bar} instead of use super::*;. Two forms stay allowed: external-crate preludes like use rayon::prelude::*; and root-of-module re-...

Files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

  • pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
🔇 Additional comments (1)
pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs (1)

378-384: LGTM!


📝 Walkthrough

Walkthrough

This PR refactors resolver ownership across the resolver and importer paths to use Arc, stores resolver and ResolveOptions as Arc in TreeCtx, invokes prefetch_packuments to prefetch child packuments before recursion, and updates call sites and tests to the Arc-based API.

Changes

Resolver Arc ownership and prefetch integration

Layer / File(s) Summary
TreeCtx Arc and Arc fields
pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
TreeCtx now stores resolver: Arc<dyn Resolver> and base_opts: Arc<ResolveOptions> enabling safe cloning into spawned prefetch tasks.
resolve_dependency_tree API: Arc ownership
pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
Public resolve_dependency_tree now takes resolver: Arc<dyn Resolver> and constructs TreeCtx with resolver and options.
extend_tree signature and prefetch_packuments integration
pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
extend_tree removes explicit resolver parameter and calls prefetch_packuments using ctx.resolver/ctx.base_opts; resolves wanted deps via resolve_node(ctx, ...).
resolve_node: ctx-based resolver and prefetch before recursion
pacquet/crates/resolving-deps-resolver/src/resolve_dependency_tree.rs
resolve_node uses ctx.resolver.resolve(&wanted, &ctx.base_opts), triggers prefetch_packuments for children before awaiting recursion, and updates recursive calls to the new signature.
resolve_importer: Arc API and TreeCtx wiring
pacquet/crates/resolving-deps-resolver/src/resolve_importer.rs
resolve_importer accepts Arc<dyn Resolver>, passes it into TreeCtx::new, and calls extend_tree(&ctx, ...) at each tree-expansion site without passing resolver explicitly.
install_with_fresh_lockfile: PrefetchingResolver as Arc
pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs
PrefetchingResolver wrapper changed from Box<dyn Resolver> to Arc<dyn Resolver>; calls resolve_importer(Arc::clone(&resolver), ...).
resolve_importer tests: Arc and shared call logging
pacquet/crates/resolving-deps-resolver/src/resolve_importer/tests.rs
Tests updated to construct Arc<dyn Resolver> from StubResolver, use Arc<Mutex<Vec<_>>> for shared call logs, and adapt invocations/assertions accordingly.
resolve_dependency_tree tests: Arc and imports
pacquet/crates/resolving-deps-resolver/src/tests.rs
Numerous tests updated to construct Arc<dyn Resolver>, adjust imports to include Arc and Resolver, and pass Arc::clone(&resolver) into resolve_dependency_tree.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #11900: This PR implements the packument prefetching strategy described in the perf issue by threading Arc<dyn Resolver> and Arc<ResolveOptions> through TreeCtx and adding prefetch_packuments calls before recursive resolution.

Possibly related PRs

  • pnpm/pnpm#11874: Modifies resolve_dependency_tree.rs (TreeCtx/resolve_node) and implements memoization/caching that complements this resolver/context refactor.
  • pnpm/pnpm#11787: Touches resolving-deps-resolver flows; both PRs change resolver wiring and error/option handling paths.
  • pnpm/pnpm#11856: Also touches install_with_fresh_lockfile.rs prefetch/resolver initialization; closely related to the PrefetchingResolver ownership change.

Poem

I hopped through arcs of shared resolve,
Cloned bits of resolver to speed the stroll,
Pre-fetching whispers on each tree's rung,
Mutexed notes in a shared little scroll. 🐇✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf(pacquet): prefetch packuments during fresh-resolve tree walk' accurately captures the main change—implementing prefetch optimization for packument fetches during dependency resolution.
Linked Issues check ✅ Passed The pull request fully implements the objectives from issue #11900: prefetching packuments before tree walk, lookahead fetches during recursion, using Arc for tokio::spawn compatibility, and storing Arc on TreeCtx.
Out of Scope Changes check ✅ Passed All changes are in-scope and directly support prefetch implementation: resolver refactoring to Arc, TreeCtx threading, prefetch_packuments spawning, and test updates to support the new signatures.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/11900

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.04      5.4±0.10ms   800.3 KB/sec    1.00      5.2±0.07ms   830.1 KB/sec

`pacquet-resolving-deps-resolver` doesn't depend on
`pacquet-resolving-npm-resolver`, so the intra-doc link to
`shared_packument_fetch_locker` resolves to nothing and trips
`-D rustdoc::broken_intra_doc_links`. Reference the helper by name in
backticks instead.
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.33333% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.84%. Comparing base (74a219e) to head (bee3780).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...lving-deps-resolver/src/resolve_dependency_tree.rs 94.59% 2 Missing ⚠️
...es/resolving-deps-resolver/src/resolve_importer.rs 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11903      +/-   ##
==========================================
- Coverage   87.84%   87.84%   -0.01%     
==========================================
  Files         206      206              
  Lines       24525    24533       +8     
==========================================
+ Hits        21545    21550       +5     
- Misses       2980     2983       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

Copy link
Copy Markdown
Contributor

Integrated-Benchmark Report (Linux)

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.495 ± 0.116 2.359 2.724 1.04 ± 0.06
pacquet@main 2.399 ± 0.073 2.298 2.476 1.00
pnpm 4.880 ± 0.030 4.829 4.924 2.03 ± 0.06
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.49483338376,
      "stddev": 0.11573594554268046,
      "median": 2.50313421026,
      "user": 2.73696174,
      "system": 3.6343801,
      "min": 2.3587478392600003,
      "max": 2.7240554942600004,
      "times": [
        2.50511265126,
        2.62325076026,
        2.36775714426,
        2.50115576926,
        2.51918680826,
        2.7240554942600004,
        2.3587478392600003,
        2.5142014932600003,
        2.37409513426,
        2.4607707432600003
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.3993250014600003,
      "stddev": 0.07287850767173795,
      "median": 2.40912214526,
      "user": 2.7042120400000003,
      "system": 3.5976482999999995,
      "min": 2.2981187482600003,
      "max": 2.47631817626,
      "times": [
        2.47631817626,
        2.47183195826,
        2.4712638252600003,
        2.2981187482600003,
        2.39876058726,
        2.46824691726,
        2.35973895926,
        2.32386414826,
        2.3056229912600004,
        2.41948370326
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.88006519976,
      "stddev": 0.029717276643322556,
      "median": 4.890332494759999,
      "user": 8.316276140000001,
      "system": 4.2179574,
      "min": 4.82868725526,
      "max": 4.92377747426,
      "times": [
        4.92377747426,
        4.90054831626,
        4.843837069259999,
        4.88392616026,
        4.82868725526,
        4.897169572259999,
        4.85237266426,
        4.896738829259999,
        4.87624837026,
        4.8973462862599995
      ]
    }
  ]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 693.1 ± 27.6 675.1 768.5 1.03 ± 0.07
pacquet@main 675.4 ± 37.5 638.7 739.0 1.00
pnpm 2565.1 ± 63.3 2501.6 2678.2 3.80 ± 0.23
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.69311290418,
      "stddev": 0.027608993851098988,
      "median": 0.68686870038,
      "user": 0.39370326,
      "system": 1.4800386599999997,
      "min": 0.67512800788,
      "max": 0.76853018888,
      "times": [
        0.76853018888,
        0.69159816888,
        0.67871079188,
        0.68698957888,
        0.68706238288,
        0.70070257388,
        0.68674782188,
        0.67784025288,
        0.67512800788,
        0.6778192738800001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6753573055800001,
      "stddev": 0.03750540057740294,
      "median": 0.65961719438,
      "user": 0.36563416000000004,
      "system": 1.4640745599999998,
      "min": 0.6387349598800001,
      "max": 0.7390316408800001,
      "times": [
        0.7390316408800001,
        0.6525406798800001,
        0.65866022688,
        0.7346974338800001,
        0.66764365488,
        0.65447563688,
        0.6402649688800001,
        0.6387349598800001,
        0.70694969188,
        0.6605741618800001
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.56513433508,
      "stddev": 0.06332603565450393,
      "median": 2.53984702438,
      "user": 3.2513525599999995,
      "system": 2.2288590599999996,
      "min": 2.50160932288,
      "max": 2.67824857088,
      "times": [
        2.6440181058800003,
        2.51442603088,
        2.58983103688,
        2.52581103388,
        2.50176718788,
        2.55388301488,
        2.61616969988,
        2.50160932288,
        2.67824857088,
        2.5255793468800003
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 6.465 ± 0.134 6.311 6.706 1.34 ± 0.05
pacquet@main 4.840 ± 0.142 4.680 5.098 1.00
pnpm 6.468 ± 0.049 6.378 6.533 1.34 ± 0.04
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 6.46503683852,
      "stddev": 0.1340345106490292,
      "median": 6.421239805920001,
      "user": 13.0965711,
      "system": 3.7909842999999994,
      "min": 6.31107717242,
      "max": 6.70605679242,
      "times": [
        6.40243186142,
        6.43278450742,
        6.70605679242,
        6.36635247242,
        6.43868930142,
        6.31107717242,
        6.64426520642,
        6.34852211042,
        6.59049385642,
        6.409695104420001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 4.84043628702,
      "stddev": 0.1424980129389664,
      "median": 4.76307670992,
      "user": 6.550233399999999,
      "system": 3.6315041,
      "min": 4.68017984342,
      "max": 5.098198286420001,
      "times": [
        4.74212069942,
        4.68017984342,
        5.060162822420001,
        5.098198286420001,
        4.75714755442,
        4.89816202442,
        4.756542764420001,
        4.75232285742,
        4.890520152420001,
        4.7690058654200005
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.46799614082,
      "stddev": 0.049356668291705215,
      "median": 6.48084137892,
      "user": 10.808118200000001,
      "system": 4.487744099999999,
      "min": 6.37787152242,
      "max": 6.532890749420001,
      "times": [
        6.461692244420001,
        6.37787152242,
        6.501571340420001,
        6.49361436942,
        6.456180570420001,
        6.5038383494200005,
        6.39061950442,
        6.532890749420001,
        6.47722014042,
        6.48446261742
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 5.266 ± 0.160 5.045 5.576 1.40 ± 0.05
pacquet@main 3.770 ± 0.078 3.672 3.921 1.00
pnpm 4.416 ± 0.081 4.301 4.547 1.17 ± 0.03
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 5.2659254644,
      "stddev": 0.16019328125847948,
      "median": 5.263438150600001,
      "user": 10.83401066,
      "system": 2.4446179199999998,
      "min": 5.0446142066,
      "max": 5.5761796186,
      "times": [
        5.0446142066,
        5.454158014600001,
        5.124632597600001,
        5.306157965600001,
        5.307611344600001,
        5.202063342600001,
        5.281986180600001,
        5.2448901206,
        5.1169612526,
        5.5761796186
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 3.7704669025,
      "stddev": 0.07841988062261988,
      "median": 3.7541864146,
      "user": 4.15980176,
      "system": 2.19442382,
      "min": 3.6719549315999997,
      "max": 3.9214461926,
      "times": [
        3.8286262236,
        3.6719549315999997,
        3.7280666256,
        3.6766699606,
        3.8418951466,
        3.8030178806,
        3.7480256206,
        3.9214461926,
        3.7246192346,
        3.7603472086
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.416248005600001,
      "stddev": 0.08117547840113731,
      "median": 4.419385741100001,
      "user": 5.57652716,
      "system": 2.60115772,
      "min": 4.3007384466000005,
      "max": 4.5471464696,
      "times": [
        4.4477229596,
        4.5137429196,
        4.391750404600001,
        4.3548616746,
        4.3007384466000005,
        4.5471464696,
        4.4470210776000005,
        4.4672589436,
        4.3291105956,
        4.3631265646
      ]
    }
  ]
}

@github-actions

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchpr/11903
Testbedpacquet

🚨 2 Alerts

BenchmarkMeasure
Units
ViewBenchmark Result
(Result Δ%)
Upper Boundary
(Limit %)
isolated-linker.fresh-install.cold-cache.cold-storeLatency
seconds (s)
📈 plot
🚷 threshold
🚨 alert (🔔)
6.47 s
(+29.60%)Baseline: 4.99 s
5.99 s
(108.00%)

isolated-linker.fresh-install.hot-cache.hot-storeLatency
seconds (s)
📈 plot
🚷 threshold
🚨 alert (🔔)
5.27 s
(+34.45%)Baseline: 3.92 s
4.70 s
(112.04%)

Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
isolated-linker.fresh-install.cold-cache.cold-store📈 view plot
🚷 view threshold
🚨 view alert (🔔)
6,465.04 ms
(+29.60%)Baseline: 4,988.42 ms
5,986.10 ms
(108.00%)

isolated-linker.fresh-install.hot-cache.hot-store📈 view plot
🚷 view threshold
🚨 view alert (🔔)
5,265.93 ms
(+34.45%)Baseline: 3,916.76 ms
4,700.12 ms
(112.04%)

isolated-linker.fresh-restore.cold-cache.cold-store📈 view plot
🚷 view threshold
2,494.83 ms
(+4.63%)Baseline: 2,384.43 ms
2,861.32 ms
(87.19%)
isolated-linker.fresh-restore.hot-cache.hot-store📈 view plot
🚷 view threshold
693.11 ms
(+5.53%)Baseline: 656.79 ms
788.14 ms
(87.94%)
🐰 View full continuous benchmarking report in Bencher

@zkochan

zkochan commented May 24, 2026

Copy link
Copy Markdown
Member Author

Closing this. The integrated benchmark proves the approach is wrong on the targeted path:

Scenario pacquet@main pacquet@HEAD Δ
Isolated linker: fresh install, cold cache + cold store 4.840 s 6.465 s +34%
Isolated linker: fresh install, hot cache + hot store 3.770 s 5.266 s +40%
Isolated linker: fresh restore, cold cache + cold store 2.399 s 2.495 s +4% (≈noise)
Isolated linker: fresh restore, hot cache + hot store 675 ms 693 ms +3% (≈noise)

The hot/hot regression is the smoking gun: with everything an in-memory cache hit there is no I/O to parallelize, so the prefetch just doubles every resolver.resolve() call (once on the spawned task, once on the recursion) and pays the tokio::spawn overhead per child spec — pure cost, no benefit. The original try_join_all was already giving the same I/O-multiplexed concurrency pnpm gets from Promise.all; multi-thread spawning on top of cooperative futures helps only when there is meaningful CPU work that isn't already serialized by a downstream resource (the per-(registry, name) packument semaphore, the throttled HTTP client, the tarball mem cache), and in this hot path there isn't.

Worth noting that the issue's premise (pacquet 4–12× slower than pnpm on fresh-resolve, sourced from benchmarks.vlt.sh/latest/chart-data.json) was based on pacquet 0.2.8 vs pnpm 11.2.2. The current integrated-benchmark numbers show pacquet@main is already 2× faster than pnpm on the same path — the original perf gap has closed under unrelated work since the issue was filed.

Leaving the commits in for the record in case a future attempt wants to look at what doesn't work and why. The two changes were:

  1. Arc<dyn Resolver> threading through extend_tree / resolve_importer / resolve_dependency_tree so the walker can spawn.
  2. prefetch_packuments helper called at every recursion level, firing tokio::spawned resolver.resolve() calls coalesced via the packument fetch semaphore.

Both compile and pass tests, but the runtime cost-benefit is upside down. Closing without merge.


Written by an agent (Claude Code, claude-opus-4-7).

@zkochan zkochan closed this May 24, 2026
zkochan added a commit that referenced this pull request Jun 11, 2026
…ndexed metadata mirror (#12322)

## Why

Profiling a warm babylon resolve (metadata mirrors hot, ~520 MB across 1,476 packuments) showed the dominant cost was not I/O but **hydration**: every version of every packument was deserialized into typed manifests — maps, strings, and `serde_json::Value` trees built, hashed, and dropped — even though a pick consults only the version *strings* plus the handful of manifests it actually considers. typescript ships 3,800 versions; a pick needs one. The `#[serde(flatten)]` catch-all on `PackageVersion` compounded it by routing the whole struct through serde's buffering deserializer. The same hydration cost was paid on cold resolves inside the `spawn_blocking` parses from #12318.

## What

Three changes, applied in profile-driven order:

**1. Lazy packument hydration** (`df6f70eb57`). `Package::versions` becomes a `PackageVersions` map whose entries hold the raw JSON fragment serde captured (`Arc<RawValue>`, shared not copied) and hydrate into `Arc<PackageVersion>` on first access, cached per slot. Key scans never hydrate; `pinned_version` hydrates only the winner; the publish-date filter moves slots without touching manifests; undecodable fragments degrade to "version absent" (matching the JS implementation's tolerance); serialization re-emits raw fragments verbatim. Picked manifests travel as `Arc<PackageVersion>`. Enables serde_json's `raw_value` feature (already a workspace dep).

**2. Sharded in-memory packument cache** (`4c28c6679e`). Every resolve edge consults the shared meta cache, and its single `Mutex<HashMap>` was the top mutex-wait site in the profile; it is now the same `DashMap` shape the crate's other shared maps use. Honest note: wall time was unchanged by this alone — the post-hydration profile shows the warm resolve is critical-path-bound, not lock-bound — but the contention disappears and the cache no longer serializes workers under load.

**3. Indexed on-disk metadata mirror** (`126a416ae8`, maintainer-approved cache-format break). The two-line NDJSON mirror (headers + verbatim body) becomes:

```
pacquet-meta-v1 <headers_len> <index_len>\n
<headers JSON>     etag, modified
<index JSON>       name, dist-tags, time, homepage, versions: [version, offset, len]
<fragments>        concatenated raw per-version JSON
```

The loader reads the file once and hands `PackageVersions` byte spans into that buffer — no structural re-scan, hydration parses a slice in place. The writer streams the registry's own bytes (fragments borrow from the lazily-parsed response body), so the **cold-install cost stays one temp-file + rename per package**. Old-format files read as cache misses and are rewritten on the next 200. A span-per-fragment `pread` variant was tried first and measured *worse* (sys 0.4 s → 4.5 s; the pick paths probe many candidate fragments per package), hence the single-buffer design.

## Measurements (warm babylon `--lockfile-only` resolve, 10-core M-series)

| build | wall | notes |
|---|---|---|
| before this PR | ~9.1 s | malloc/free + `deserialize_any` dominate the profile |
| + lazy hydration | ~7.7–8 s | parse microbench 3–3.8× (`typescript` 36.1→9.5 ms) |
| + indexed mirror | **~5.5–6.7 s** | sys 0.4 s; no whole-body scan |

Cold resolves keep the same hydration savings inside the parse tasks; cold write volume and syscall pattern are unchanged.

## Evaluated and deliberately not included

Consolidating the install-phase thread pools (tokio + global rayon + the dedicated CAS-write pool + capped blocking threads show ~100k involuntary context switches on cold installs vs ~750 for the pnpm CLI). After the resolution fixes, repeated container A/Bs show no measurable wall-clock cost from the churn — it hides entirely behind network time — and history warns that speculative concurrency reshuffles here regress badly (see the #11903 prefetch revert). Deferred until a benchmark shows it on the critical path.

## Tests

New `package_versions` unit tests (hydrate-on-demand + Arc-identity caching, undecodable-fragment-as-absent, verbatim raw round-trip incl. unknown keys, eager construction, hydration-free filtering) and rewritten `mirror` tests (headers/index/fragment round-trip, span hydration, truncation → cache miss, old-format → cache miss, atomic overwrite). Full suites green: `pacquet-registry` (23), `pacquet-resolving-npm-resolver` (235), `pacquet-package-manager` + `pacquet-cli` (768); workspace clippy `-D warnings` (pedantic set), dylint, fmt.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf(pacquet): batch/prefetch packument fetches during fresh-resolve tree walk

2 participants