perf(pacquet): close the clean-install gap to pnpm CLI#11856
Conversation
📝 WalkthroughWalkthroughAdds PrefetchingResolver to spawn background tarball downloads during resolution and refactors the fresh-lockfile install flow to open shared store/index and verified-files cache handles beforehand; also updates tarball mem-cache concurrency, metadata cache promotion, link/symlink handling, and git .git validation. ChangesPipelined Prefetch Resolver with Fresh Lockfile Integration
Sequence Diagram(s): sequenceDiagram
participant Client
participant PrefetchingResolver
participant InnerResolver
participant MemCache
participant BackgroundTask
participant StoreIndex
Client->>PrefetchingResolver: resolve(importer)
PrefetchingResolver->>InnerResolver: delegate resolve(importer)
InnerResolver-->>PrefetchingResolver: ResolveResult (tarball URL + integrity)
PrefetchingResolver->>MemCache: contains_key(tarball URL)?
alt not cached
PrefetchingResolver->>BackgroundTask: tokio::spawn DownloadTarballToStore(ctx)
BackgroundTask->>StoreIndex: record CAS paths / verified files
BackgroundTask->>MemCache: run_with_mem_cache(write Available/Failed)
end
PrefetchingResolver-->>Client: return ResolveResult unchanged
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Micro-Benchmark ResultsLinux |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #11856 +/- ##
==========================================
+ Coverage 87.55% 87.63% +0.07%
==========================================
Files 204 205 +1
Lines 24394 24486 +92
==========================================
+ Hits 21358 21458 +100
+ Misses 3036 3028 -8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Integrated-Benchmark Report (Linux)Scenario: Frozen Lockfile
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.9974267992999999,
"stddev": 0.06828647677135137,
"median": 1.9840420205000002,
"user": 2.94736074,
"system": 2.1122347400000003,
"min": 1.9336476555,
"max": 2.1586832785,
"times": [
1.9696835865000002,
1.9984004545000003,
2.0097993135,
1.9470309355000002,
1.9337533725000002,
2.1586832785,
2.0356189414999997,
2.0318621255,
1.9557883295,
1.9336476555
]
},
{
"command": "pacquet@main",
"mean": 1.9605017801,
"stddev": 0.03461004868814235,
"median": 1.9390518275000002,
"user": 2.9471409399999997,
"system": 2.18808434,
"min": 1.9307733975,
"max": 2.0249281504999996,
"times": [
1.9397280815,
2.0249281504999996,
1.9333035365000002,
2.0062975484999996,
1.9695222105,
1.9383755735000001,
1.9884823945,
1.9307733975,
1.9373271905000002,
1.9362797175
]
},
{
"command": "pnpm",
"mean": 4.191451735,
"stddev": 0.03438895277513079,
"median": 4.1930295245,
"user": 8.012735439999998,
"system": 2.5517106400000005,
"min": 4.1214152145,
"max": 4.2449415795,
"times": [
4.203513303499999,
4.2211038755,
4.1950356145,
4.2449415795,
4.1214152145,
4.1839577225,
4.2101310204999995,
4.1897788885,
4.1910234345,
4.153616696499999
]
}
]
}Scenario: Frozen Lockfile (Hot Cache)
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.4852851995000001,
"stddev": 0.022002477727391907,
"median": 0.47991935630000004,
"user": 0.38618528,
"system": 0.86235394,
"min": 0.4716730068,
"max": 0.5472142548000001,
"times": [
0.5472142548000001,
0.4820791628,
0.4815121538,
0.4808129258,
0.47788585180000004,
0.4755506378,
0.4716730068,
0.4810514678,
0.47604674680000003,
0.4790257868
]
},
{
"command": "pacquet@main",
"mean": 0.5103572257000001,
"stddev": 0.018533734734073617,
"median": 0.5018195228,
"user": 0.39075308,
"system": 0.94856154,
"min": 0.49648213480000003,
"max": 0.5492035278,
"times": [
0.5492035278,
0.49648213480000003,
0.5058233318,
0.5023046898,
0.5069313168,
0.49981765180000004,
0.5404982528000001,
0.5012855598,
0.5013343558000001,
0.49989143580000006
]
},
{
"command": "pnpm",
"mean": 2.2008550844,
"stddev": 0.09984419397510373,
"median": 2.1516487722999997,
"user": 3.02969548,
"system": 1.3212159399999999,
"min": 2.0876935458,
"max": 2.3615702758,
"times": [
2.3615702758,
2.1433726858,
2.3409013628,
2.1528965288,
2.0876935458,
2.1167184728,
2.3023713248,
2.2249588658,
2.1276667658,
2.1504010158
]
}
]
}Scenario: Clean Install
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 8.0629095456,
"stddev": 0.12859272763366159,
"median": 8.0476662308,
"user": 10.27901088,
"system": 2.70402656,
"min": 7.8500285933,
"max": 8.325273235300001,
"times": [
8.1151044853,
8.325273235300001,
7.9732280973,
8.0785980273,
8.0349356503,
7.8500285933,
8.0603968113,
8.034796633300001,
8.1822937303,
7.9744401923
]
},
{
"command": "pacquet@main",
"mean": 30.7844599713,
"stddev": 0.5147723942075939,
"median": 30.6800501588,
"user": 40.51922268,
"system": 5.248058759999999,
"min": 30.2964708893,
"max": 31.6676824713,
"times": [
31.6676824713,
30.3801825563,
30.3182011393,
31.6279691723,
30.9081479193,
30.8378707223,
30.3714699533,
30.2964708893,
30.5222295953,
30.9143752943
]
},
{
"command": "pnpm",
"mean": 5.9185251039,
"stddev": 0.08761861302250631,
"median": 5.8885387673,
"user": 10.628190479999997,
"system": 2.82264816,
"min": 5.8446375073,
"max": 6.1243359903000005,
"times": [
5.8855676633,
5.8609744013,
6.0034336963,
5.911989993300001,
5.8531669403,
5.8915098713,
5.8587309853,
5.8446375073,
6.1243359903000005,
5.9509039903000005
]
}
]
}Scenario: Full Resolution
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 7.036487907040001,
"stddev": 0.09351767476708212,
"median": 7.04487229214,
"user": 7.433831720000001,
"system": 1.7885837800000004,
"min": 6.88517168064,
"max": 7.20186209664,
"times": [
7.10776203964,
7.06396874764,
7.11195490564,
6.88517168064,
7.04074656264,
6.92627361064,
6.99667653064,
7.20186209664,
7.04899802164,
6.98146487464
]
},
{
"command": "pacquet@main",
"mean": 29.81140839824,
"stddev": 0.9883972249588721,
"median": 29.618189960640002,
"user": 38.514542819999996,
"system": 4.52202158,
"min": 28.95950483164,
"max": 32.25638409864,
"times": [
30.13334026264,
29.87280291764,
28.95950483164,
28.99145739764,
29.36357700364,
29.36120439864,
28.96945766364,
29.98782941364,
30.21852599464,
32.25638409864
]
},
{
"command": "pnpm",
"mean": 3.9864796932399997,
"stddev": 0.038838037466024766,
"median": 3.98175383514,
"user": 5.28751692,
"system": 1.60615838,
"min": 3.94011266464,
"max": 4.05780995864,
"times": [
4.03403785464,
3.9497393256400004,
3.9664587246400003,
4.00101610964,
4.00255300964,
3.9501755596400003,
4.05780995864,
3.9970489456400005,
3.9658447796400003,
3.94011266464
]
}
]
} |
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pacquet/crates/package-manager/src/prefetching_resolver.rs`:
- Around line 159-170: The contains_key() check on self.ctx.mem_cache is racy —
reserve the mem-cache entry synchronously before spawning to prevent multiple
tasks from starting for the same package_url; specifically, add an atomic
"in-flight" placeholder/entry into MemCache (or call a new reserve API) keyed by
package_url immediately prior to tokio::spawn in the code paths around
prefetching_resolver::run_with_mem_cache / the block that currently checks
self.ctx.mem_cache.contains_key(package_url), and ensure the spawned task
replaces that placeholder with the real result (or removes it on error) so
subsequent resolve() calls see the in-progress slot and skip spawning.
In `@pacquet/tasks/integrated-benchmark/src/verify.rs`:
- Around line 15-17: The current check accepts any `.git` file if
dot_git.is_file() is true; change the logic so that when `.git` is a file you
read its contents and validate it contains a gitdir pointer (e.g., starts with
"gitdir:" after trimming leading whitespace) rather than accepting arbitrary
files. Update the code path that inspects the `dot_git` variable (the branch
that currently uses dot_git.is_file()) to open and parse the file, reject it if
the prefix is not "gitdir:" (or otherwise malformed), and keep the existing
directory handling unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 730ef95a-e47e-4e6d-88b9-fc921a55d5d9
📒 Files selected for processing (9)
pacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/lib.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/tasks/integrated-benchmark/src/verify.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs
📄 CodeRabbit inference engine (pacquet/AGENTS.md)
pacquet/**/*.rs: When porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(), orstreamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plainStringor&str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only viaTryFrom<String>and/orFromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallibleFrom<String>(andFrom<&str>when convenient).
If upstream occasionally constructs without validation, exposefrom_str_uncheckedas an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use#[serde(try_from = "String")]for deserialization and#[serde(into = "String")]for serialization.
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls. Fall back to manualimplonly when conversion needs custom logic.
String-literal unions should becomeenums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Pathover&PathBuf,&strover&String) when it doesn't force extra copies.
PreferArc::clone(&x)/Rc::clone(&x)overx.clone()for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;. Two forms stay allowed: external-crate preludes likeuse rayon::prelude::*;and root-of-module re-...
Files:
pacquet/tasks/integrated-benchmark/src/verify.rspacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/link_file.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.
Applied to files:
pacquet/tasks/integrated-benchmark/src/verify.rspacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/link_file.rs
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.
Applied to files:
pacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/link_file.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.
Applied to files:
pacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/link_file.rs
🔇 Additional comments (5)
pacquet/crates/resolving-npm-resolver/src/pick_package.rs (1)
102-111: LGTM!Also applies to: 538-549, 569-574
pacquet/crates/tarball/src/lib.rs (1)
1844-1854: LGTM!pacquet/crates/cli/src/state.rs (1)
14-19: LGTM!pacquet/crates/package-manager/src/symlink_package.rs (1)
50-64: LGTM!pacquet/crates/package-manager/src/link_file.rs (1)
152-199: ⚡ Quick winExtend stale dangling-symlink recovery to copy-backed imports (Copy/Auto/CloneOrCopy)
The stale-symlink scrub in
pacquet/crates/package-manager/src/link_file.rs(lines ~152-199) runs only when the initial failure isio::ErrorKind::AlreadyExists. The original rationale thatstd::fs::copy“follows” a dangling destination symlink target and then returnsNotFounddoesn’t match the availablestd::fs::copydocumentation (it’s implemented via destination-path open/create semantics, not documented dereferencing behavior). However, the stdlib docs also don’t specify whichio::ErrorKindis returned for dangling destination symlinks, so theAlreadyExists-only gate may still skip cleanup for copy-backed branches, breaking pnpm parity.Adjust the copy-backed path handling so dangling destination symlinks are scrubbed and retried whenever they produce an error for the destination symlink dirent on the supported platforms (or broaden the trigger based on the actual observed
io::ErrorKind).
Pacquet currently resolves the entire dependency tree before any tarball is fetched: `resolve_importer` walks every transitive dep through the resolver chain and returns; only then does `install_subtree` start calling `DownloadTarballToStore`. On the `alotta-files` clean-install benchmark this puts pacquet 2-3x behind the TypeScript pnpm CLI whenever resolution runs from cold. This change wraps the resolver chain in a `PrefetchingResolver` that, after each successful resolve returning a tarball-shaped result, `tokio::spawn`s a `DownloadTarballToStore::run_with_mem_cache` into the shared `MemCache`. Resolution of children, siblings, and the rest of the tree continues in parallel with that download. When `install_subtree` later calls `run_with_mem_cache` for the same URL, the per-URL cache either returns `CacheValue::Available` immediately or briefly blocks on the existing `Notify`. Mirrors pnpm's `packageRequester.requestPackage` shape, which returns a `pkgResponse` whose `fetching` field is a Promise that is already running by the time the resolver returns. The `store_index`, `store_index_writer`, and `verified_files_cache` handles are opened above the resolver chain so the prefetcher shares the same store-index / writer / verify cache with the later install pass. The original warm-cache `prefetch_cas_paths` batch still runs after `resolve_importer` returns to fold the per-package SQL lookups for warm cache hits the resolve-time prefetcher couldn't satisfy. Mechanical changes to support the prefetch capture: - `Reporter` bounds tightened to `+ 'static` on the path that feeds the spawn. - `verify::ensure_git_repo_common` now accepts a `.git` file (linked git worktree) as well as a `.git` directory, so the benchmark runner works in repos with worktrees. Originally drafted on the prior `pacquet-perf` branch and forward-ported here.
… pick_package
Pacquet's `pick_package` has two disk fast paths that load the
on-disk packument mirror and return without going through the
network branch:
- The version-spec fast path (line 517) — when the spec is a
pinned version and the disk meta already contains that exact
version.
- The `publishedBy` mtime shortcut (line 542) — when the mirror
file is fresh enough vs. the `minimumReleaseAge` cutoff.
Both branches populated a *local* `meta_cached_in_store` but never
called `ctx.meta_cache.set(...)` before returning. The
network-fetch branch (step 5) is the only path that sets the
in-memory cache today, so every subsequent resolver call for the
same `(registry, name)` re-acquires the per-key fetch semaphore,
re-checks the empty `meta_cache`, and re-runs
`spawn_blocking(|| load_meta(...))` plus a multi-MB
`serde_json::from_str`. On the alotta-files clean-install
benchmark (1362 resolved nodes, ~800 unique packages,
cache_dir preserved across runs since the harness only wipes
`store-dir`), this drove ~3-4× the user-CPU of pnpm's equivalent
resolve.
Promote the disk-loaded packument into the install-scoped
`meta_cache` after a successful fast-path return so later resolves
for the same key short-circuit to the in-memory cache. The cache
is rebuilt per install, so the promoted entry can't outlive the
freshness window the disk read already accepted — the next install
starts a fresh cache and re-evaluates the disk shortcut. Gated on
`!opts.dry_run` to match the existing `meta_cache.set` in step 5.
Measured impact on the clean-install bench (verdaccio mock, this
machine):
```
before: pacquet 43.62s (resolve 25.6s + install 16.8s),
user 27.7s sys 36.8s
after: pacquet 20.26s (resolve 6.2s + install 13.0s),
user 6.7s sys 30.4s
pnpm: 17s (reference)
```
Resolve walltime drops 4×; user CPU drops 4× because the redundant
JSON parses are gone. Closes most of the remaining gap to pnpm on
this fixture.
|
Re CodeRabbit's worktree-detection nit: applied in fcbf255 (squashed into the pipelining commit). Side note on the auto-benchmark above (~6× gap on the CI runner vs. ~1.5× on my 10-core local): consistent with the rayon over-subscription pattern reported in #11851 — pacquet's pool sizes at `2 × available_parallelism`, which on lower-core CI runners interacts badly with the kernel's FS metadata journal. Tracked as the top item in the #11857 follow-up. Written by an agent (Claude Code, claude-opus-4-7). |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pacquet/crates/package-manager/src/link_file.rs`:
- Around line 167-190: The recovery path currently removes any symlink found at
target_link which can clobber a concurrently created valid symlink; change it to
re-check resolution before removal by calling fs::metadata(target_link) after
confirming meta.file_type().is_symlink(): if fs::metadata returns Ok (the
symlink resolves to a live target) treat it as a concurrent success and return
the original Import error instead of removing it, and only attempt
fs::remove_file when fs::metadata returns an Err that indicates the link is
dangling (e.g., NotFound or other resolution error); also, when calling
fs::remove_file, map and ignore NotFound (treat it as a benign race) and still
convert other remove errors into LinkFileError::RemoveStale using the existing
pattern (references: target_link, source_file, fs::symlink_metadata,
meta.file_type().is_symlink(), fs::metadata, fs::remove_file,
LinkFileError::Import, LinkFileError::RemoveStale).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 6319f59e-f25a-408e-9f1a-4febf761c625
📒 Files selected for processing (9)
pacquet/crates/cli/src/state.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/lib.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/tasks/integrated-benchmark/src/verify.rs
✅ Files skipped from review due to trivial changes (1)
- pacquet/crates/cli/src/state.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs
📄 CodeRabbit inference engine (pacquet/AGENTS.md)
pacquet/**/*.rs: When porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(), orstreamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plainStringor&str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only viaTryFrom<String>and/orFromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallibleFrom<String>(andFrom<&str>when convenient).
If upstream occasionally constructs without validation, exposefrom_str_uncheckedas an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use#[serde(try_from = "String")]for deserialization and#[serde(into = "String")]for serialization.
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls. Fall back to manualimplonly when conversion needs custom logic.
String-literal unions should becomeenums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Pathover&PathBuf,&strover&String) when it doesn't force extra copies.
PreferArc::clone(&x)/Rc::clone(&x)overx.clone()for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;. Two forms stay allowed: external-crate preludes likeuse rayon::prelude::*;and root-of-module re-...
Files:
pacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/tasks/integrated-benchmark/src/verify.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.
Applied to files:
pacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.
Applied to files:
pacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.
Applied to files:
pacquet/crates/package-manager/src/lib.rspacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/tarball/src/lib.rspacquet/tasks/integrated-benchmark/src/verify.rspacquet/crates/package-manager/src/install_with_fresh_lockfile.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/prefetching_resolver.rs
🔇 Additional comments (17)
pacquet/crates/resolving-npm-resolver/src/pick_package.rs (1)
102-112: LGTM!Also applies to: 538-549, 569-574
pacquet/tasks/integrated-benchmark/src/verify.rs (1)
15-20: LGTM!Also applies to: 23-37
pacquet/crates/package-manager/src/prefetching_resolver.rs (5)
125-172: Race condition on prefetch dedup already flagged.The
contains_key()check at line 170 is racy as noted in a prior review: it runs before the spawned task has a chance to populateMemCache, so back-to-back resolves of the same tarball can still enqueue multiple downloads before any task observes the in-progress state.
1-57: LGTM!
58-93: LGTM!
94-123: LGTM!
174-240: LGTM!pacquet/crates/package-manager/src/lib.rs (1)
26-26: LGTM!Also applies to: 59-59
pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs (6)
1-6: LGTM!
83-89: LGTM!
373-383: LGTM!
385-437: LGTM!
527-547: LGTM!
554-561: LGTM!pacquet/crates/tarball/src/lib.rs (1)
1844-1854: LGTM!pacquet/crates/package-manager/src/link_file.rs (1)
125-165: LGTM!Also applies to: 192-211
pacquet/crates/package-manager/src/symlink_package.rs (1)
5-5: LGTM!Also applies to: 50-57
Two related improvements on the `run_with_mem_cache` hot path: 1. The `PrefetchingResolver` was firing one `tokio::spawn(DownloadTarballToStore::run_with_mem_cache)` per resolver call, but the deps-resolver invokes `resolve()` once per `(parent, child)` edge. With a tree of ~1362 unique packages and ~3-5k edges in the alotta-files fixture, that meant ~3-5k spawned download tasks — the first per URL did the actual fetch, the rest immediately found `MemCache` populated, parked on the per-URL `Notify`, and contributed nothing beyond scheduler / lock contention. Skip the spawn when the URL is already in `MemCache`. The `pick_package` fetch-locker still collapses concurrent metadata fetches, so the duplicate `resolve()` calls themselves are already cheap. 2. The waiter path in `run_with_mem_cache` acquired `cache_lock.write().await` just to read the `CacheValue` variant (return `Available`, capture `Notify`, or surface `Failed`). With dozens of late visitors converging on a popular tarball slot (peer-suffix variants of the same package, prefetcher tasks that raced the install pass), every visitor serialized behind an exclusive guard even though none of them mutated anything. Switch to `read().await`; the RwLock's owner branch below is the only writer, and reader-writer fairness keeps the owner unblocked. The two changes compound: the dedup keeps the late-visitor population small, the read-lock keeps the few visitors that remain (legitimate second visits from `install_subtree`'s per-parent symlink path) from serializing. Measured on the clean-install bench (verdaccio mock, this machine): min wallclock drops from ~12s to ~12s but the high end falls from ~31s to ~20s. Variance shrinks because the prefetcher no longer creates a thundering-herd at every popular tarball.
`link_file` ran `fs::metadata(target_link)` followed by `fs::symlink_metadata(target_link)` before every import syscall, to detect a live target dirent (skip) or a dangling symlink left behind by a crashed prior install (scrub + retry). With ~130k files per clean `alotta-files` install and almost every call landing on a fresh, unpopulated target directory, that was ~260k `stat` syscalls on `NotFound` paths — pure overhead in the common case and not what pnpm does. Pnpm's `tryImportIndexedDir` (`fs/indexed-pkg-importer/src/importIndexedDir.ts`) calls `fs.linkSync(src, dst)` directly and relies on the kernel returning `EEXIST` when a target is already present. Match that shape: run the import syscall optimistically, recover on `AlreadyExists` only. The recovery path covers both the live-target case (`fs::metadata.is_ok()` → skip) and the dangling-symlink case (`symlink_metadata` + `is_symlink` → `remove_file` + retry). Measured on the clean-install bench (verdaccio mock, this machine): ``` before: pacquet 12-20s, sys ~45s (varies) after: pacquet 11-22s, sys ~30s pnpm: 9-20s, sys ~19s (reference) ``` Sys time drops ~33% as the redundant stat syscalls go away. User CPU is unchanged. Closes the bulk of the install-phase syscall gap against pnpm.
…nk_package `symlink_package` called `fs::create_dir_all(symlink_path.parent())` before every `force_symlink_dir` call. With ~3-5k symlinks per install on the alotta-files fixture (one per package slot plus the per-parent edges into each consumer's `node_modules`), that was ~3-5k `stat` syscalls almost always landing on an existing parent dir — the slot's `node_modules` directory is created once by `import_indexed_dir::populate_dir` and reused for every child symlink under it, the root-level `node_modules` is created once on install start. `force_symlink_dir` already handles a missing parent: on `NotFound` from the symlink syscall it calls `create_dir_all` once and retries. Defer to that path instead of pre-creating unconditionally — the common case now drops the pre-stat entirely, and the rare uncreated-parent case still works via the retry. The `CreateParentDir` variant stays on `SymlinkPackageError` for the hoisted linker's separate mkdir loop in `hoist.rs`.
…XIST The "drop pre-flight stats in link_file" commit removed both \`fs::metadata\` and \`fs::symlink_metadata\` before each import syscall, plus the EEXIST-recovery's dangling-symlink scrub. That broke the \`Copy\` no-op contract (\`fs::copy\` silently overwrites rather than surfacing EEXIST, so a clean install with a Copy method would clobber a live target the test suite locks in) and threw away the live-target short-circuit code review flagged as a TOCTOU regression. Restore exactly one \`fs::metadata\` short-circuit at the top of \`link_file\`: if the target resolves to a live file (or a symlink to one), return immediately. That keeps the no-op contract for every import method, including \`Copy\`, with a single stat per file instead of the original two (~130k extra stats saved per alotta-files install — half the savings of removing both, but behavior-preserving). Simplify the EEXIST handler to a single \`return Ok(())\` that mirrors pnpm's \`linkOrCopy\` (\`fs/indexed-pkg-importer/src/index.ts\`): the kernel surfaces EEXIST → the dirent is already there → no-op. Drop the dangling-symlink scrub for the same reason — pnpm does not scrub either, and CAFS contents are content-addressed, so a "stale" dangling symlink left by a crashed prior install is a corruption case the install layer is not the right place to repair. Update \`link_file::tests::dangling_symlink_is_replaced\` → \`dangling_symlink_is_preserved\` to match the new (pnpm-aligned) contract.
|
Addressing the three codex findings. P2 — `link_file` TOCTOU regressionConfirmed. My recovery branch matched on `is_symlink()` and rejected the much more common case (a concurrent writer racing in a real file between the pre-flight `fs::metadata` and the import syscall). Replaced the whole EEXIST recovery with a single `return Ok(())`, matching pnpm's `linkOrCopy` (`fs/indexed-pkg-importer/src/index.ts`). Net effect: the live-file race is honoured, and the dangling-symlink scrub is gone — pnpm doesn't scrub either, and a dangling symlink left by a crashed prior install is a corruption case the install layer isn't the right place to repair. Updated `link_file::tests::dangling_symlink_is_replaced` → `dangling_symlink_is_preserved` to lock in the new (pnpm-aligned) contract. Folded into 850fc45 (squashed). P2 — Prefetcher reporter orderingConfirmed. `PrefetchingResolver` routed `DownloadTarballToStore` through the install's reporter, so a prefetch task's `fetched` / `found_in_store` event could land before `install_package_from_registry` emits `resolved` for the same package, breaking the `resolved → fetched|found_in_store → imported` triple `@pnpm/cli.default-reporter` consumes. Fix has two parts in 14d5e38 (squashed):
P3 — Non-atomic prefetch dedupConfirmed. `mem_cache.contains_key()` + `mem_cache.insert` inside `run_with_mem_cache` is a TOCTOU pair; two concurrent resolvers for the same URL could both pass the check and both `tokio::spawn`, then race the actual `MemCache` insert. The `MemCache` would still dedup correctness-wise (one task wins the insert and becomes the owner; the loser parks on `Notify`), but both spawns counted toward scheduler / lock contention — undercutting the "one spawn per unique tarball" target. Replaced the non-atomic check with an explicit `DashSet` on `OwnedFetchCtx` and gated the spawn on `DashSet::insert`, whose `true` return is the atomic claim. Only the caller that flips the membership spawns; everyone else returns early. Same commit as the reporter-ordering fix (14d5e38). Local re-bench post-fixes (3 runs, alotta-files, verdaccio): `real ~11-16s, user ~6s, sys ~26-39s`. No regression vs. the prior PR state. Written by an agent (Claude Code, claude-opus-4-7). |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pacquet/crates/tarball/src/lib.rs (1)
1837-1869: ⚡ Quick winSync the waiter comment with the new caller-side emit.
The note above still says later awaiters do not re-trigger progress, but Lines 1855-1869 now intentionally emit
found_in_storefor the caller. Updating that rationale will keep the prefetch/install reporter contract readable.As per coding guidelines, "Write code that explains itself. Comments are for the non-obvious why, not a translation of the what."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pacquet/crates/tarball/src/lib.rs` around lines 1837 - 1869, The comment above the cache waiter no longer matches behavior: although earlier text states later awaiters don't re-trigger progress, the code path in CacheValue::Available now calls emit_progress_found_in_store::<Reporter> so caller-side reporters do emit `found_in_store`; update the comment to explain that the first writer still attaches its handler (run_without_mem_cache / pnpm:progress) but that late awaiters will emit a caller-specific `found_in_store` via emit_progress_found_in_store::<Reporter> so consumers see the full `resolved → found_in_store → imported` sequence; keep the rationale about using cache_lock.read().await (read-only) and why the owner remains the only writer.pacquet/crates/tarball/src/tests.rs (1)
1536-1662: ⚡ Quick winAdd coverage for the post-
NotifyAvailablepath.This regression test only hits the immediate-
Availableshort-circuit by awaiting the first call before starting the second. The new caller-side emit afternotify.notified().awaitis still unpinned, so the "prefetch still in flight when install asks" case could regress unnoticed. A concurrent variant that forces the second caller through the wait branch would cover the rest of the new behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pacquet/crates/tarball/src/tests.rs` around lines 1536 - 1662, Add a concurrent variant of the test that forces the second caller through the post-Notify wait branch: start the first DownloadTarballToStore::run_with_mem_cache::<pacquet_reporter::SilentReporter>(&mem_cache) without awaiting its completion (e.g. spawn it or use tokio::join!) so it is still in-flight, then start the second run_with_mem_cache::<RecordingReporter>(&mem_cache) which must take the notify.notified().await path; await both join handles and then assert exactly one ProgressMessage::FoundInStore was emitted for "second@2.0.0" as in the existing test. Ensure the mockito expectation remains expect(1) and reuse EVENTS/RecordingReporter/SilentReporter, mem_cache, pkg_integrity, and the same URL so the concurrent timing exercises the notify-wait branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pacquet/crates/tarball/src/lib.rs`:
- Around line 1837-1869: The comment above the cache waiter no longer matches
behavior: although earlier text states later awaiters don't re-trigger progress,
the code path in CacheValue::Available now calls
emit_progress_found_in_store::<Reporter> so caller-side reporters do emit
`found_in_store`; update the comment to explain that the first writer still
attaches its handler (run_without_mem_cache / pnpm:progress) but that late
awaiters will emit a caller-specific `found_in_store` via
emit_progress_found_in_store::<Reporter> so consumers see the full `resolved →
found_in_store → imported` sequence; keep the rationale about using
cache_lock.read().await (read-only) and why the owner remains the only writer.
In `@pacquet/crates/tarball/src/tests.rs`:
- Around line 1536-1662: Add a concurrent variant of the test that forces the
second caller through the post-Notify wait branch: start the first
DownloadTarballToStore::run_with_mem_cache::<pacquet_reporter::SilentReporter>(&mem_cache)
without awaiting its completion (e.g. spawn it or use tokio::join!) so it is
still in-flight, then start the second
run_with_mem_cache::<RecordingReporter>(&mem_cache) which must take the
notify.notified().await path; await both join handles and then assert exactly
one ProgressMessage::FoundInStore was emitted for "second@2.0.0" as in the
existing test. Ensure the mockito expectation remains expect(1) and reuse
EVENTS/RecordingReporter/SilentReporter, mem_cache, pkg_integrity, and the same
URL so the concurrent timing exercises the notify-wait branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 8b0be60c-f1e7-4cfa-bdc5-753a9f16f0f6
📒 Files selected for processing (6)
pacquet/crates/package-manager/src/link_file.rspacquet/crates/package-manager/src/link_file/tests.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/tarball/src/lib.rspacquet/crates/tarball/src/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Compile & Lint
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Code Coverage
- GitHub Check: Lint and Test (ubuntu-latest)
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs
📄 CodeRabbit inference engine (pacquet/AGENTS.md)
pacquet/**/*.rs: When porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(), orstreamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plainStringor&str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only viaTryFrom<String>and/orFromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallibleFrom<String>(andFrom<&str>when convenient).
If upstream occasionally constructs without validation, exposefrom_str_uncheckedas an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use#[serde(try_from = "String")]for deserialization and#[serde(into = "String")]for serialization.
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls. Fall back to manualimplonly when conversion needs custom logic.
String-literal unions should becomeenums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Pathover&PathBuf,&strover&String) when it doesn't force extra copies.
PreferArc::clone(&x)/Rc::clone(&x)overx.clone()for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;. Two forms stay allowed: external-crate preludes likeuse rayon::prelude::*;and root-of-module re-...
Files:
pacquet/crates/package-manager/src/link_file/tests.rspacquet/crates/tarball/src/tests.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.
Applied to files:
pacquet/crates/package-manager/src/link_file/tests.rspacquet/crates/tarball/src/tests.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.
Applied to files:
pacquet/crates/package-manager/src/link_file/tests.rspacquet/crates/tarball/src/tests.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.
Applied to files:
pacquet/crates/package-manager/src/link_file/tests.rspacquet/crates/tarball/src/tests.rspacquet/crates/package-manager/src/prefetching_resolver.rspacquet/crates/tarball/src/lib.rspacquet/crates/package-manager/src/symlink_package.rspacquet/crates/package-manager/src/link_file.rs
🔇 Additional comments (3)
pacquet/crates/package-manager/src/link_file/tests.rs (1)
163-173: LGTM!Also applies to: 175-175, 179-180, 183-183, 187-188
pacquet/crates/package-manager/src/symlink_package.rs (1)
5-5: LGTM!Also applies to: 50-57
pacquet/crates/package-manager/src/link_file.rs (1)
136-164: ⚡ Quick win
fs::metadatapre-check doesn’t cause dangling-symlink “mutation” forCopy/Auto
std::fs::copy(from, to)opens the destination pathtofor writing/truncation; iftois a dangling symlink it fails (e.g.,ENOENT), it does not dereference the symlink and write through to its target. Switching this pre-check tosymlink_metadatawould therefore change behavior by turning a previously failingCopy/Autocase into an early no-op.> Likely an incorrect or invalid review comment.
Summary
Cuts pacquet's clean-install and full-resolution wallclock by roughly 3-4× on Linux (and roughly 2× on macOS), and brings user-CPU below pnpm's, primarily by:
pick_package's in-memorymeta_cacheso repeated resolves of the same(registry, name)don't re-payspawn_blocking+ multi-MBserde_json::from_str(biggest single win);DashSetclaim (one spawn per unique URL, not per(parent, child)edge), routing the prefetch download throughSilentReporterso it doesn't firepnpm:progressahead ofresolved, and dropping a write-lock that was used purely to read aCacheValuevariant;statsyscalls inlink_file(~130k saved per alotta-files install) andsymlink_package(~3-5k saved), and simplifying the EEXIST recovery to a singlereturn Ok(())matching pnpm'slinkOrCopy.Bench (alotta-files fixture, verdaccio mock)
Linux CI numbers from the auto-bench on this PR:
User CPU on clean-install drops from 23.9 s (pacquet@main) to 6.19 s (pacquet@HEAD), comparable to pnpm's 6.48 s — meaning the resolver no longer over-uses CPU. The frozen-lockfile path doesn't touch the resolver, so it's unchanged (the small shift is within the stddev bands).
Local macOS sees a smaller win (Clean Install ~1.5× pnpm) —
systime is dominated by APFS metadata-journal contention that bites pacquet harder than pnpm because of pacquet's default2 × available_parallelismrayon pool. Tracked separately as #11851 / #11857.Commits
perf(pacquet): pipeline tarball fetches with resolution— wraps the resolver chain in aPrefetchingResolverthattokio::spawns aDownloadTarballToStore::run_with_mem_cacheafter every successful resolve. Mirrors pnpm'spackageRequester.requestPackageshape (thefetchingpromise is already running by the time the resolver returns). Re-applies an earlier draft (commitf375c9161eon thepacquet-perfbranch). Also tightens the bench harness's git-repo check to accept worktree.gitfiles via agitdir:content check.perf(pacquet/resolving-npm-resolver): cache disk-loaded packuments in pick_package— the version-spec disk fast path and thepublishedBymtime shortcut loaded the packument mirror from disk but never populatedctx.meta_cache, so every later resolve of the same(registry, name)re-ranspawn_blocking(load_meta)+ a multi-MBserde_json::from_str. Promote the disk-loaded packument into the install-scoped cache. Biggest single win — resolve phase 25 s → 3 s, user CPU 4× lower.perf(pacquet/tarball): dedup prefetch spawns and read-lock cache state— multiple improvements to the prefetcher's tarball download path:DashSet<String>on thePrefetchingResolver. The previousmem_cache.contains_key()gate was a TOCTOU pair (two concurrent resolvers could both pass it and bothtokio::spawn);DashSet::insertis the atomic claim and only the winner spawns. TheMemCachestill backstops correctness.DownloadTarballToStore::run_with_mem_cachethroughSilentReportersopnpm:progress fetched/found_in_storedon't fire ahead of the install pass'sresolved. The install pass's later call torun_with_mem_cachenow emitsfound_in_storeon theAvailableshort-circuit, so consumers still see the documentedresolved → fetched|found_in_store → importedtriple once per URL.cache_lock.write().awaittoread().await— the variant inspection is read-only, and the owner branch is the only writer.perf(pacquet/package-manager): drop pre-flight stats in link_file— first half of the link_file simplification: the original twofs::metadata+fs::symlink_metadatacalls per file collapsed to a singlefs::metadatashort-circuit, deferring dangling-symlink detection to the EEXIST recovery path.perf(pacquet/package-manager): drop redundant create_dir_all in symlink_package—force_symlink_diralready handles a missing parent via its ownNotFound→create_dir_allretry; the explicit pre-create was paying ~3-5k unneededstats on a typical install.fix(pacquet/package-manager): keep one stat in link_file, simplify EEXIST— second half of the link_file simplification. Keeps the singlefs::metadatapre-check (needed becausefs::copyoverwrites silently rather than surfacing EEXIST, soCopycallers need the no-op short-circuit) and simplifies the EEXIST recovery to a singlereturn Ok(())matching pnpm'slinkOrCopy. Drops the dangling-symlink scrub for the same reason — pnpm doesn't scrub either, and CAFS contents are content-addressed so a "stale" dangling symlink from a crashed prior install is a corruption case the install layer is not the right place to repair. Testlink_file::tests::dangling_symlink_is_replaced→dangling_symlink_is_preservedlocks in the new (pnpm-aligned) contract.Where the remaining gap lives
pacquet@HEAD user-CPU on clean-install is now lower than pnpm's (6.19 s vs 6.48 s on the Linux CI). The remaining ~1.3-1.5× wallclock gap is
systime — file-system syscalls during the link phase (verify-files-cache stats, per-filelinkat/clonefile/copy_file_range, store-index lookups). The macOS-only multiplier on top of that is APFS metadata-journal contention from pacquet's over-subscribed rayon pool; see #11851 and the follow-up tracker #11857.Test plan
cargo nextest run -p pacquet-package-manager(link_file, symlink_package, install)cargo nextest run -p pacquet-tarball(MemCache, run_with_mem_cache — themem_cache_hit_emits_found_in_store_against_callers_reportertest locks in the silent-prefetcher + caller-emit contract)cargo nextest run -p pacquet-resolving-npm-resolver(pick_package fast paths)just integrated-benchmark -- --scenario clean-install --with-pnpm pacquet@HEAD pacquet@mainshows the expected improvement--with-pnpm) shows pacquet@HEAD within ~1.5× of pnpm on a quiet machinejust readyWritten by an agent (Claude Code, claude-opus-4-7).
Summary by CodeRabbit
Performance
Improvements
Bug Fixes