Skip to content

perf(pacquet): lazy packument hydration, sharded meta cache, and an indexed metadata mirror#12322

Merged
zkochan merged 6 commits into
mainfrom
manifest-lazy-hydration
Jun 11, 2026
Merged

perf(pacquet): lazy packument hydration, sharded meta cache, and an indexed metadata mirror#12322
zkochan merged 6 commits into
mainfrom
manifest-lazy-hydration

Conversation

@zkochan

@zkochan zkochan commented Jun 11, 2026

Copy link
Copy Markdown
Member

Why

Profiling a warm babylon resolve (metadata mirrors hot, ~520 MB across 1,476 packuments) showed the dominant cost was not I/O but hydration: every version of every packument was deserialized into typed manifests — maps, strings, and serde_json::Value trees built, hashed, and dropped — even though a pick consults only the version strings plus the handful of manifests it actually considers. typescript ships 3,800 versions; a pick needs one. The #[serde(flatten)] catch-all on PackageVersion compounded it by routing the whole struct through serde's buffering deserializer. The same hydration cost was paid on cold resolves inside the spawn_blocking parses from #12318.

What

Three changes, applied in profile-driven order:

1. Lazy packument hydration (df6f70eb57). Package::versions becomes a PackageVersions map whose entries hold the raw JSON fragment serde captured (Arc<RawValue>, shared not copied) and hydrate into Arc<PackageVersion> on first access, cached per slot. Key scans never hydrate; pinned_version hydrates only the winner; the publish-date filter moves slots without touching manifests; undecodable fragments degrade to "version absent" (matching the JS implementation's tolerance); serialization re-emits raw fragments verbatim. Picked manifests travel as Arc<PackageVersion>. Enables serde_json's raw_value feature (already a workspace dep).

2. Sharded in-memory packument cache (4c28c6679e). Every resolve edge consults the shared meta cache, and its single Mutex<HashMap> was the top mutex-wait site in the profile; it is now the same DashMap shape the crate's other shared maps use. Honest note: wall time was unchanged by this alone — the post-hydration profile shows the warm resolve is critical-path-bound, not lock-bound — but the contention disappears and the cache no longer serializes workers under load.

3. Indexed on-disk metadata mirror (126a416ae8, maintainer-approved cache-format break). The two-line NDJSON mirror (headers + verbatim body) becomes:

pacquet-meta-v1 <headers_len> <index_len>\n
<headers JSON>     etag, modified
<index JSON>       name, dist-tags, time, homepage, versions: [version, offset, len]
<fragments>        concatenated raw per-version JSON

The loader reads the file once and hands PackageVersions byte spans into that buffer — no structural re-scan, hydration parses a slice in place. The writer streams the registry's own bytes (fragments borrow from the lazily-parsed response body), so the cold-install cost stays one temp-file + rename per package. Old-format files read as cache misses and are rewritten on the next 200. A span-per-fragment pread variant was tried first and measured worse (sys 0.4 s → 4.5 s; the pick paths probe many candidate fragments per package), hence the single-buffer design.

Measurements (warm babylon --lockfile-only resolve, 10-core M-series)

build wall notes
before this PR ~9.1 s malloc/free + deserialize_any dominate the profile
+ lazy hydration ~7.7–8 s parse microbench 3–3.8× (typescript 36.1→9.5 ms)
+ indexed mirror ~5.5–6.7 s sys 0.4 s; no whole-body scan

Cold resolves keep the same hydration savings inside the parse tasks; cold write volume and syscall pattern are unchanged.

Evaluated and deliberately not included

Consolidating the install-phase thread pools (tokio + global rayon + the dedicated CAS-write pool + capped blocking threads show ~100k involuntary context switches on cold installs vs ~750 for the pnpm CLI). After the resolution fixes, repeated container A/Bs show no measurable wall-clock cost from the churn — it hides entirely behind network time — and history warns that speculative concurrency reshuffles here regress badly (see the #11903 prefetch revert). Deferred until a benchmark shows it on the critical path.

Tests

New package_versions unit tests (hydrate-on-demand + Arc-identity caching, undecodable-fragment-as-absent, verbatim raw round-trip incl. unknown keys, eager construction, hydration-free filtering) and rewritten mirror tests (headers/index/fragment round-trip, span hydration, truncation → cache miss, old-format → cache miss, atomic overwrite). Full suites green: pacquet-registry (23), pacquet-resolving-npm-resolver (235), pacquet-package-manager + pacquet-cli (768); workspace clippy -D warnings (pedantic set), dylint, fmt.


Written by an agent (Claude Code, claude-fable-5).

Summary by CodeRabbit

  • Chores

    • Added lazy, fragment-backed package version storage and an indexed on-disk mirror format for metadata.
    • Enabled additional JSON features to improve handling of raw fragments.
  • Bug Fixes

    • Safer handling of undecodable or truncated version data — failures are treated as absent without crashing.
    • Stronger mirror validation and atomic writes to avoid corrupt reads.
  • Performance

    • Reduced memory churn and faster, shared version selection during resolution.
  • Tests

    • Expanded tests for hydration, fallback selection, trust checks, and mirror behaviors.

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (9) 📘 Rule violations (0)

Grey Divider


Action required

1. Trust scan hides evidence 🐞 Bug ⛨ Security
Description
detect_strongest_trust_evidence_before iterates via meta.versions.iter(), but
PackageVersions::iter() drops any version whose manifest fails to hydrate. This can miss prior
trust evidence and lower prior_rank, allowing a trust downgrade to pass undetected when some prior
version fragments are undecodable.
Code

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[224]

+    for (version, manifest) in meta.versions.iter() {
Evidence
The trust-downgrade logic depends on scanning all prior versions to compute the strongest prior
trust evidence; after this PR it uses meta.versions.iter(). But PackageVersions::iter() is
explicitly defined to skip undecodable fragments (via filter_map over hydrate(version)?), so any
prior version whose manifest fails to decode is silently omitted from the scan, potentially lowering
prior_rank.

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[164-189]
pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[214-259]
pacquet/crates/registry/src/package_versions.rs[108-165]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The trust-downgrade scan walks prior versions using `meta.versions.iter()`, but the new lazy-hydration iterator skips undecodable manifests. That means earlier versions with trust evidence can be silently ignored, reducing the strongest prior trust rank and potentially allowing a downgrade to pass.
### Issue Context
- `PackageVersions::hydrate()` returns `None` on decode error.
- `PackageVersions::iter()` uses `filter_map(... hydrate(version)? ...)`, so decode failures are indistinguishable from absence and are silently dropped.
- Trust downgrade detection explicitly aims to avoid skipping data in ways that could mask earlier evidence.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[164-258]
- pacquet/crates/registry/src/package_versions.rs[108-165]
### Suggested fix
In `detect_strongest_trust_evidence_before`, iterate version strings (e.g., `meta.versions.keys()`) and attempt hydration per version.
- If `meta.versions.get(version)` is `None` **and** `meta.versions.contains_key(version)` is `true`, treat that as a *decode failure* and return an error (e.g., `TrustViolation::TrustCheckFailed { reason: ... }`) rather than skipping it.
- This likely requires changing `detect_strongest_trust_evidence_before` to return `Result<Option<TrustEvidence>, TrustViolation>` and updating its caller in `fail_if_trust_downgraded` to propagate the error.
This makes the trust check fail closed when it cannot reliably determine prior trust evidence.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Unbounded mirror header allocation 🐞 Bug ☼ Reliability
Description
read_mirror_headers() allocates a Vec sized from the on-disk headers_len field before
validating it against the actual file size, so a malformed/corrupted mirror can trigger an OOM
during load_meta_headers(_async) and crash warm resolves. This is reachable because
fetch_full_metadata_cached() calls load_meta_headers_async() on the hot path before issuing
requests.
Code

pacquet/crates/resolving-npm-resolver/src/mirror.rs[R284-310]

+fn read_mirror_headers(file: &mut File) -> Option<MetaHeaders> {
+    // Magic + two decimal lengths fit well inside this; the headers
+    // record is ~100 bytes of etag + timestamp.
+    let mut buf = [0u8; 1024];
+    let mut filled = 0usize;
+    while filled < buf.len() {
+        let n = file.read(&mut buf[filled..]).ok()?;
+        if n == 0 {
+            break;
+        }
+        filled += n;
+    }
+    let chunk = &buf[..filled];
+    let newline = chunk.iter().position(|&byte| byte == b'\n')?;
+    let line = std::str::from_utf8(&chunk[..newline]).ok()?;
+    let (headers_len, _) = parse_mirror_magic(line)?;
+    let headers_start = newline + 1;
+    let headers_end = headers_start.checked_add(headers_len)?;
+    let headers_json: std::borrow::Cow<'_, [u8]> = if headers_end <= chunk.len() {
+        std::borrow::Cow::Borrowed(&chunk[headers_start..headers_end])
+    } else {
+        // Headers record larger than the probe buffer — read the rest.
+        let mut rest = vec![0u8; headers_end - chunk.len()];
+        file.read_exact(&mut rest).ok()?;
+        let mut whole = chunk[headers_start..].to_vec();
+        whole.extend_from_slice(&rest);
+        std::borrow::Cow::Owned(whole)
Evidence
The mirror parser computes headers_end from the parsed headers_len and allocates rest of size
headers_end - chunk.len() without checking against file size or any cap; the header read is
invoked from the cached metadata fetcher before conditional GETs.

pacquet/crates/resolving-npm-resolver/src/mirror.rs[284-313]
pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs[120-136]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`read_mirror_headers()` trusts the mirror file’s declared `headers_len` and may allocate an arbitrarily large buffer (`vec![0u8; ...]`) before confirming the file actually contains that many bytes. A corrupted/malicious cache entry can therefore cause excessive allocation/OOM and crash the resolver.
## Issue Context
This is on the warm path: `fetch_full_metadata_cached()` calls `load_meta_headers_async()` which delegates to `load_meta_headers()`/`read_mirror_headers()`.
## Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[274-313]
- pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs[120-136]
### Suggested fix approach
- Before allocating for `headers_len`, validate it is within a reasonable maximum (e.g., a small constant like 8–64 KiB).
- Also validate `headers_len` against the file size (`file.metadata()?.len()`) and return `None` early if the declared record would exceed file length.
- Avoid allocating based solely on declared lengths; prefer bounded incremental reads or a `take()`-limited reader.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Pinned version no fallback 🐞 Bug ≡ Correctness
Description
Package::pinned_version picks the highest satisfying version string and then hydrates only that one;
if hydration fails, it returns None even when lower satisfying versions exist. This breaks the
intended “undecodable behaves as absent” semantics and can incorrectly report no match for a valid
range.
Code

pacquet/crates/registry/src/package.rs[R146-160]

+    pub fn pinned_version(&self, version_range: &str) -> Option<Arc<PackageVersion>> {
    let range: node_semver::Range = version_range.parse().unwrap(); // TODO: this step should have happened in PackageManifest
-        let mut satisfied_versions = self
+        // Match on the version *strings* so only the winning manifest
+        // hydrates from its raw fragment.
+        let highest = self
        .versions
-            .values()
-            .filter(|version| version.version.satisfies(&range))
-            .collect::<Vec<&PackageVersion>>();
-
-        satisfied_versions.sort_by(|a, b| a.version.partial_cmp(&b.version).unwrap());
-
-        // Optimization opportunity:
-        // We can store this in a cache to remove filter operation and make this a O(1) operation.
-        satisfied_versions.last().copied()
+            .keys()
+            .filter_map(|key| {
+                key.parse::<node_semver::Version>()
+                    .ok()
+                    .filter(|version| version.satisfies(&range))
+                    .map(|version| (version, key))
+            })
+            .max_by(|(left, _), (right, _)| left.partial_cmp(right).unwrap())?;
+        self.versions.get(highest.1)
Evidence
PackageVersions defines decode failure as “absent” and get returns None on decode failure; tests
also show a version can be present in key scans but absent on hydration. pinned_version currently
selects one highest candidate and does not retry when get returns None.

pacquet/crates/registry/src/package_versions.rs[13-16]
pacquet/crates/registry/src/package_versions.rs[82-89]
pacquet/crates/registry/src/package_versions/tests.rs[44-48]
pacquet/crates/registry/src/package.rs[145-161]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Package::pinned_version` selects a single “highest” satisfying version string and then calls `self.versions.get(...)`. With lazy hydration, `get` returns `None` when the manifest fragment is undecodable, causing `pinned_version` to incorrectly return `None` even if other satisfying versions are present.
### Issue Context
`PackageVersions` explicitly defines undecodable fragments as behaving “as if the version were absent”, so callers that implement picking should treat hydration failures as a signal to try the next candidate.
### Fix Focus Areas
- pacquet/crates/registry/src/package.rs[145-161]
- pacquet/crates/registry/src/package_versions.rs[82-89]
### Suggested fix
Compute satisfying candidates in descending semver order and attempt hydration in that order, returning the first `Some(Arc<PackageVersion>)`. If the best candidate fails to hydrate, continue to the next-best rather than returning `None` immediately.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
4. Picker stops on decode fail 🐞 Bug ≡ Correctness
Description
pick_package_from_meta returns Ok(None) immediately when the chosen version’s manifest fails to
hydrate, instead of skipping it and retrying other candidates. This can cause npm resolution to
return None even when other satisfying versions exist.
Code

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[R229-240]

let Some(version) = picked_version else { return Ok(None) };
-    let Some(manifest) = meta_ref.versions.get(&version).cloned() else { return Ok(None) };
-    let mut manifest = manifest;
-    if !meta_ref.name.is_empty() {
+    let Some(manifest) = meta_ref.versions.get(&version) else { return Ok(None) };
+    if !meta_ref.name.is_empty() && manifest.name != meta_ref.name {
    // GitHub registry quirk: a scoped package can be published as
    // `@owner/foo` while the per-version `name` is just `foo`.
    // Match upstream's shim that pins the manifest name to the
    // packument-level name.
-        manifest.name.clone_from(&meta_ref.name);
+        let mut pinned = (*manifest).clone();
+        pinned.name.clone_from(&meta_ref.name);
+        return Ok(Some(Arc::new(pinned)));
}
Ok(Some(manifest))
Evidence
The module contract says undecodable versions should behave as absent, but pick_package_from_meta
treats a hydration failure as terminal Ok(None). npm_resolver then treats picked_package == None
as “no resolution” and returns Ok(None).

pacquet/crates/registry/src/package_versions.rs[13-16]
pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[216-241]
pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs[393-401]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pick_package_from_meta` picks a version *string* and then hydrates it via `meta_ref.versions.get(&version)`. Under lazy hydration, a malformed manifest yields `None`, and the function returns `Ok(None)` rather than treating that specific version as absent and attempting the next-best candidate.
### Issue Context
The lazy-hydration contract states undecodable fragments should behave as if absent. The resolver treats `picked_package: None` as “no resolution”, so an undecodable top candidate can incorrectly make the whole dependency unresolvable.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[216-241]
- pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs[393-401]
- pacquet/crates/registry/src/package_versions.rs[13-16]
### Suggested fix
For range picks, loop: pick the best candidate string; attempt hydration; if hydration fails, record the version in an exclusion set and re-run the picker excluding those versions, until a hydratable manifest is found or no candidates remain. (Optional: `tracing::warn` once per excluded version to aid debugging.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

5. Mirror write doubles memory 🐞 Bug ➹ Performance
Description
save_meta_indexed() builds fragment_bytes containing all version fragments and then copies it
again into a second bytes buffer (bytes.extend_from_slice(&fragment_bytes)), doubling peak
memory for large packuments during mirror writes. This can significantly increase allocator pressure
and risk OOM when caching very large packuments.
Code

pacquet/crates/resolving-npm-resolver/src/mirror.rs[R238-267]

+    let mut fragment_bytes = Vec::new();
+    let mut spans = Vec::with_capacity(meta.versions.len());
+    for (version, json) in meta.versions.fragments() {
+        let offset = fragment_bytes.len() as u64;
+        let len = u32::try_from(json.len()).unwrap_or(u32::MAX);
+        if len as usize != json.len() {
+            // A single >4 GiB version manifest is not a thing the npm
+            // registry produces; skip it rather than corrupt the index.
+            continue;
+        }
+        fragment_bytes.extend_from_slice(json.as_bytes());
+        spans.push((version.clone(), offset, len));
+    }
+
+    let index = serde_json::to_string(&MirrorIndex {
+        name: meta.name.clone(),
+        dist_tags: meta.dist_tags.clone(),
+        time: meta.time.clone(),
+        homepage: meta.homepage.clone(),
+        versions: spans,
+    })
+    .map_err(|error| SaveMetaError::Encode(EncodeMetaError(error)))?;
+
+    let mut contents = String::with_capacity(headers.len() + index.len() + 64);
+    let _ = writeln!(contents, "{MIRROR_MAGIC} {} {}", headers.len(), index.len());
+    contents.push_str(&headers);
+    contents.push_str(&index);
+    let mut bytes = contents.into_bytes();
+    bytes.extend_from_slice(&fragment_bytes);
+    save_meta(pkg_mirror, &bytes)
Evidence
The function first accumulates all fragment bytes into fragment_bytes, then constructs bytes
from the header/index string and appends fragment_bytes via copying; save_meta currently
requires a full &[u8], encouraging this extra buffering.

pacquet/crates/resolving-npm-resolver/src/mirror.rs[227-268]
pacquet/crates/resolving-npm-resolver/src/mirror.rs[417-437]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`save_meta_indexed()` materializes the fragment section twice in memory: once in `fragment_bytes`, then again when copying into `bytes` for `save_meta()`. For very large packuments, this amplifies peak memory usage unnecessarily.
## Issue Context
The new mirror format is meant to improve warm-load performance; mirror writes happen on 200 responses and on explicit persist paths.
## Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[227-268]
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[417-443]
### Suggested fix approach
- Avoid the double buffer by either:
- Writing directly to the temp file in `save_meta` (change `save_meta` to accept a closure/writer and stream headers/index/fragments sequentially), or
- Building a single `Vec<u8>`: push the magic line + headers + index into one `Vec`, `reserve_exact` for fragment total size (if computed) and append fragments directly without an intermediate `fragment_bytes`.
- Keep the existing atomic temp+rename semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Deprecated fallback hydrates all 🐞 Bug ➹ Performance
Description
In pick_version_by_version_range, the deprecated-version fallback builds the non-deprecated
candidate list using opts.meta.versions.iter(), which hydrates every version fragment into a full
PackageVersion. When the initially selected max satisfying version is deprecated, this can trigger
full-packument hydration work and negate the lazy-hydration performance win for that pick.
Code

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[R305-310]

 if let Some(ref picked) = max_pick {
     let picked_meta = opts.meta.versions.get(picked);
-        let picked_is_deprecated =
-            picked_meta.and_then(|version| version.deprecated.as_ref()).is_some();
+        let picked_is_deprecated = picked_meta.is_some_and(|version| version.deprecated.is_some());
     if picked_is_deprecated && all_versions.len() > 1 {
         let non_deprecated: Vec<&str> = opts
             .meta
Evidence
The deprecated fallback computes non_deprecated by calling opts.meta.versions.iter(), and
PackageVersions::iter() is defined to hydrate every fragment; therefore, this fallback branch
forces full hydration across all versions when it triggers.

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[298-316]
pacquet/crates/registry/src/package_versions.rs[112-118]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pick_version_by_version_range` uses `opts.meta.versions.iter()` inside the deprecated-fallback branch, which now hydrates *all* version manifests. This is avoidable because the fallback only needs to find the highest satisfying **non-deprecated** version.
### Issue Context
- `PackageVersions::iter()` hydrates every raw fragment (and is documented as suitable only for cold paths).
- The deprecated-fallback branch can run during normal picking when the best satisfying version is deprecated.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[298-320]
- pacquet/crates/registry/src/package_versions.rs[112-118]
### Suggested fix approach
- Replace the `versions.iter().filter(|(_, manifest)| manifest.deprecated.is_none())...` full scan with a descending semver scan over candidate version **strings**.
- Hydrate only as needed:
- If `max_pick` is deprecated, iterate versions from highest to lowest (restricted to versions satisfying the range) and call `meta.versions.get(candidate)` until you find one with `deprecated.is_none()`.
- This should hydrate only a handful of candidates in typical cases rather than every version.
- Treat undecodable fragments (`get` returns `None`) as “absent”/skip when searching for a non-deprecated candidate, consistent with the PR’s semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


7. Tarball projection hydrates all 🐞 Bug ➹ Performance
Description
project_abbreviated_meta uses meta.versions.iter() to build a version -> dist.tarball map,
which forces hydration of every version into a full PackageVersion even though only dist.tarball
is needed. For large packuments, this reintroduces substantial CPU/memory cost into the verifier
path and undermines the lazy-hydration benefit.
Code

pacquet/crates/registry/src/package_versions.rs[R112-118]

+    /// Iterate `(version, manifest)` pairs, hydrating every fragment.
+    /// Undecodable fragments are skipped. Full walks defeat the lazy
+    /// representation, so this belongs only on cold paths (the trust
+    /// verifier's history scan, tests).
+    pub fn iter(&self) -> impl Iterator<Item = (&String, Arc<PackageVersion>)> {
+        self.slots.iter().filter_map(|(version, slot)| Some((version, slot.hydrate(version)?)))
+    }
Evidence
PackageVersions::iter() is implemented to hydrate every version fragment, and
project_abbreviated_meta uses it to read only dist.tarball, so it necessarily performs full
hydration work for all versions despite only needing a small field.

pacquet/crates/registry/src/package_versions.rs[112-118]
pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs[1018-1034]
pacquet/crates/resolving-npm-resolver/src/lookup_context.rs[30-40]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The verifier’s `project_abbreviated_meta` builds `version_tarballs` by iterating `meta.versions.iter()`, which hydrates every raw fragment into a full `PackageVersion`. This is unnecessary because the projection only needs `dist.tarball` strings.
### Issue Context
- `PackageVersions::iter()` explicitly hydrates every fragment.
- The verifier projection is explicitly trying to keep only tarball URLs to avoid holding full packuments in memory.
### Fix Focus Areas
- pacquet/crates/registry/src/package_versions.rs[82-135]
- pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs[1018-1034]
- pacquet/crates/resolving-npm-resolver/src/lookup_context.rs[30-48]
### Suggested fix approach
- Add a non-hydrating iterator/accessor on `PackageVersions` that exposes raw fragments (e.g. `iter_raw()` yielding `(&str, &Arc<RawValue>)`), or a dedicated helper like `iter_dist_tarballs()`.
- In `project_abbreviated_meta`, iterate raw fragments and deserialize into a *minimal* struct that only contains `dist.tarball` (and maybe `deprecated` if needed later), instead of deserializing `PackageVersion`.
- Keep the existing `iter()` for truly cold paths that need full manifests, but avoid it for projections that only need a small subset of fields.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (2)
8. Publish-date tags can point invalid 🐞 Bug ≡ Correctness
Description
filter_pkg_metadata_by_publish_date may rewrite a dist-tag to a version key whose manifest cannot
decode, because it chooses candidates from keys/contains_key without requiring hydration success.
This can produce filtered metadata whose dist-tags later fail to resolve to a manifest.
Code

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[R410-419]

            Some((ref best, best_raw)) => {
                let best_deprecated = versions_within_date
                    .get(best_raw)
-                        .and_then(|manifest| manifest.deprecated.as_ref())
-                        .is_some();
+                        .is_some_and(|manifest| manifest.deprecated.is_some());
                let candidate_deprecated = versions_within_date
                    .get(candidate_raw)
-                        .and_then(|manifest| manifest.deprecated.as_ref())
-                        .is_some();
+                        .is_some_and(|manifest| manifest.deprecated.is_some());
                let candidate_wins = (candidate > *best
                    && best_deprecated == candidate_deprecated)
                    || (best_deprecated && !candidate_deprecated);
Evidence
Key scans can include versions whose manifests fail to decode (contains_key true while get is
None). The publish-date filter’s tag rewrite chooses candidates from keys and may select a version
that can’t be hydrated later.

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[378-429]
pacquet/crates/registry/src/package_versions/tests.rs[44-48]
pacquet/crates/registry/src/package_versions.rs[82-89]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`filter_pkg_metadata_by_publish_date` selects dist-tag replacements by iterating `versions_within_date.keys()` and can choose a candidate even if its manifest fragment is undecodable. Because `contains_key`/`keys` include undecodable entries, tags can end up pointing at a version that later yields `PackageVersions::get(...) == None`.
### Issue Context
Undecodable fragments are intended to behave as absent, so dist-tag rewrite should avoid choosing them as winners.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[378-429]
- pacquet/crates/registry/src/package_versions.rs[82-89]
- pacquet/crates/registry/src/package_versions/tests.rs[44-48]
### Suggested fix
When evaluating candidates for `best_version`, require `versions_within_date.get(candidate_raw).is_some()` before (a) seeding `best_version = Some(...)` and (b) considering a candidate as a winner. Also consider changing the early `contains_key(version)` tag preservation check to verify `get(version).is_some()` if you want “absent on decode fail” semantics for tags too.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


9. latest() may panic 🐞 Bug ☼ Reliability
Description
Package::latest uses expect() on PackageVersions::get, but with lazy hydration invalid manifests no
longer fail whole-packument deserialization and instead return None on access. If dist-tags.latest
points at an undecodable manifest, latest() will panic.
Code

pacquet/crates/registry/src/package.rs[R164-168]

+    pub fn latest(&self) -> Arc<PackageVersion> {
    let version =
        self.dist_tags.get("latest").expect("latest tag is expected but not found for package");
-        self.versions.get(version).unwrap()
+        self.versions.get(version).expect("manifest of the latest version should decode")
}
Evidence
PackageVersions::get returns None on decode failure, and the module explicitly allows
undecodable fragments; latest() currently panics if that happens for the latest dist-tag target.

pacquet/crates/registry/src/package_versions.rs[13-16]
pacquet/crates/registry/src/package_versions.rs[82-89]
pacquet/crates/registry/src/package.rs[163-168]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Package::latest` calls `self.versions.get(version).expect(...)`. With lazy hydration, `get` can return `None` for undecodable fragments, so `latest()` can now panic at runtime.
### Issue Context
The `PackageVersions` contract allows undecodable fragments and treats them as absent; `latest()` should either propagate that absence or fail in a controlled, non-panicking way.
### Fix Focus Areas
- pacquet/crates/registry/src/package.rs[163-168]
- pacquet/crates/registry/src/package_versions.rs[82-89]
### Suggested fix
Change `latest()` to return `Option<Arc<PackageVersion>>` (or `Result<Arc<PackageVersion>, ...>`), and update callers/tests accordingly. If the API must remain infallible, consider selecting an alternative (e.g., the highest hydratable version) rather than panicking.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Previous review results

Review updated until commit f6b75fc

Results up to commit ae28135


🐞 Bugs (9) 📘 Rule violations (0)


Action required
1. Trust scan hides evidence 🐞 Bug ⛨ Security ⭐ New
Description
detect_strongest_trust_evidence_before iterates via meta.versions.iter(), but
PackageVersions::iter() drops any version whose manifest fails to hydrate. This can miss prior
trust evidence and lower prior_rank, allowing a trust downgrade to pass undetected when some prior
version fragments are undecodable.
Code

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[224]

+    for (version, manifest) in meta.versions.iter() {
Evidence
The trust-downgrade logic depends on scanning all prior versions to compute the strongest prior
trust evidence; after this PR it uses meta.versions.iter(). But PackageVersions::iter() is
explicitly defined to skip undecodable fragments (via filter_map over hydrate(version)?), so any
prior version whose manifest fails to decode is silently omitted from the scan, potentially lowering
prior_rank.

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[164-189]
pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[214-259]
pacquet/crates/registry/src/package_versions.rs[108-165]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The trust-downgrade scan walks prior versions using `meta.versions.iter()`, but the new lazy-hydration iterator skips undecodable manifests. That means earlier versions with trust evidence can be silently ignored, reducing the strongest prior trust rank and potentially allowing a downgrade to pass.

### Issue Context
- `PackageVersions::hydrate()` returns `None` on decode error.
- `PackageVersions::iter()` uses `filter_map(... hydrate(version)? ...)`, so decode failures are indistinguishable from absence and are silently dropped.
- Trust downgrade detection explicitly aims to avoid skipping data in ways that could mask earlier evidence.

### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/trust_checks.rs[164-258]
- pacquet/crates/registry/src/package_versions.rs[108-165]

### Suggested fix
In `detect_strongest_trust_evidence_before`, iterate version strings (e.g., `meta.versions.keys()`) and attempt hydration per version.
- If `meta.versions.get(version)` is `None` **and** `meta.versions.contains_key(version)` is `true`, treat that as a *decode failure* and return an error (e.g., `TrustViolation::TrustCheckFailed { reason: ... }`) rather than skipping it.
- This likely requires changing `detect_strongest_trust_evidence_before` to return `Result<Option<TrustEvidence>, TrustViolation>` and updating its caller in `fail_if_trust_downgraded` to propagate the error.

This makes the trust check fail closed when it cannot reliably determine prior trust evidence.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Unbounded mirror header allocation 🐞 Bug ☼ Reliability
Description
read_mirror_headers() allocates a Vec sized from the on-disk headers_len field before
validating it against the actual file size, so a malformed/corrupted mirror can trigger an OOM
during load_meta_headers(_async) and crash warm resolves. This is reachable because
fetch_full_metadata_cached() calls load_meta_headers_async() on the hot path before issuing
requests.
Code

pacquet/crates/resolving-npm-resolver/src/mirror.rs[R284-310]

+fn read_mirror_headers(file: &mut File) -> Option<MetaHeaders> {
+    // Magic + two decimal lengths fit well inside this; the headers
+    // record is ~100 bytes of etag + timestamp.
+    let mut buf = [0u8; 1024];
+    let mut filled = 0usize;
+    while filled < buf.len() {
+        let n = file.read(&mut buf[filled..]).ok()?;
+        if n == 0 {
+            break;
+        }
+        filled += n;
+    }
+    let chunk = &buf[..filled];
+    let newline = chunk.iter().position(|&byte| byte == b'\n')?;
+    let line = std::str::from_utf8(&chunk[..newline]).ok()?;
+    let (headers_len, _) = parse_mirror_magic(line)?;
+    let headers_start = newline + 1;
+    let headers_end = headers_start.checked_add(headers_len)?;
+    let headers_json: std::borrow::Cow<'_, [u8]> = if headers_end <= chunk.len() {
+        std::borrow::Cow::Borrowed(&chunk[headers_start..headers_end])
+    } else {
+        // Headers record larger than the probe buffer — read the rest.
+        let mut rest = vec![0u8; headers_end - chunk.len()];
+        file.read_exact(&mut rest).ok()?;
+        let mut whole = chunk[headers_start..].to_vec();
+        whole.extend_from_slice(&rest);
+        std::borrow::Cow::Owned(whole)
Evidence
The mirror parser computes headers_end from the parsed headers_len and allocates rest of size
headers_end - chunk.len() without checking against file size or any cap; the header read is
invoked from the cached metadata fetcher before conditional GETs.

pacquet/crates/resolving-npm-resolver/src/mirror.rs[284-313]
pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs[120-136]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`read_mirror_headers()` trusts the mirror file’s declared `headers_len` and may allocate an arbitrarily large buffer (`vec![0u8; ...]`) before confirming the file actually contains that many bytes. A corrupted/malicious cache entry can therefore cause excessive allocation/OOM and crash the resolver.
## Issue Context
This is on the warm path: `fetch_full_metadata_cached()` calls `load_meta_headers_async()` which delegates to `load_meta_headers()`/`read_mirror_headers()`.
## Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[274-313]
- pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs[120-136]
### Suggested fix approach
- Before allocating for `headers_len`, validate it is within a reasonable maximum (e.g., a small constant like 8–64 KiB).
- Also validate `headers_len` against the file size (`file.metadata()?.len()`) and return `None` early if the declared record would exceed file length.
- Avoid allocating based solely on declared lengths; prefer bounded incremental reads or a `take()`-limited reader.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Pinned version no fallback 🐞 Bug ≡ Correctness
Description
Package::pinned_version picks the highest satisfying version string and then hydrates only that one;
if hydration fails, it returns None even when lower satisfying versions exist. This breaks the
intended “undecodable behaves as absent” semantics and can incorrectly report no match for a valid
range.
Code

pacquet/crates/registry/src/package.rs[R146-160]

+    pub fn pinned_version(&self, version_range: &str) -> Option<Arc<PackageVersion>> {
     let range: node_semver::Range = version_range.parse().unwrap(); // TODO: this step should have happened in PackageManifest
-        let mut satisfied_versions = self
+        // Match on the version *strings* so only the winning manifest
+        // hydrates from its raw fragment.
+        let highest = self
         .versions
-            .values()
-            .filter(|version| version.version.satisfies(&range))
-            .collect::<Vec<&PackageVersion>>();
-
-        satisfied_versions.sort_by(|a, b| a.version.partial_cmp(&b.version).unwrap());
-
-        // Optimization opportunity:
-        // We can store this in a cache to remove filter operation and make this a O(1) operation.
-        satisfied_versions.last().copied()
+            .keys()
+            .filter_map(|key| {
+                key.parse::<node_semver::Version>()
+                    .ok()
+                    .filter(|version| version.satisfies(&range))
+                    .map(|version| (version, key))
+            })
+            .max_by(|(left, _), (right, _)| left.partial_cmp(right).unwrap())?;
+        self.versions.get(highest.1)
Evidence
PackageVersions defines decode failure as “absent” and get returns None on decode failure; tests
also show a version can be present in key scans but absent on hydration. pinned_version currently
selects one highest candidate and does not retry when get returns None.

pacquet/crates/registry/src/package_versions.rs[13-16]
pacquet/crates/registry/src/package_versions.rs[82-89]
pacquet/crates/registry/src/package_versions/tests.rs[44-48]
pacquet/crates/registry/src/package.rs[145-161]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`Package::pinned_version` selects a single “highest” satisfying version string and then calls `self.versions.get(...)`. With lazy hydration, `get` returns `None` when the manifest fragment is undecodable, causing `pinned_version` to incorrectly return `None` even if other satisfying versions are present.
### Issue Context
`PackageVersions` explicitly defines undecodable fragments as behaving “as if the version were absent”, so callers that implement picking should treat hydration failures as a signal to try the next candidate.
### Fix Focus Areas
- pacquet/crates/registry/src/package.rs[145-161]
- pacquet/crates/registry/src/package_versions.rs[82-89]
### Suggested fix
Compute satisfying candidates in descending semver order and attempt hydration in that order, returning the first `Some(Arc<PackageVersion>)`. If the best candidate fails to hydrate, continue to the next-best rather than returning `None` immediately.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
4. Picker stops on decode fail 🐞 Bug ≡ Correctness
Description
pick_package_from_meta returns Ok(None) immediately when the chosen version’s manifest fails to
hydrate, instead of skipping it and retrying other candidates. This can cause npm resolution to
return None even when other satisfying versions exist.
Code

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[R229-240]

 let Some(version) = picked_version else { return Ok(None) };
-    let Some(manifest) = meta_ref.versions.get(&version).cloned() else { return Ok(None) };
-    let mut manifest = manifest;
-    if !meta_ref.name.is_empty() {
+    let Some(manifest) = meta_ref.versions.get(&version) else { return Ok(None) };
+    if !meta_ref.name.is_empty() && manifest.name != meta_ref.name {
     // GitHub registry quirk: a scoped package can be published as
     // `@owner/foo` while the per-version `name` is just `foo`.
     // Match upstream's shim that pins the manifest name to the
     // packument-level name.
-        manifest.name.clone_from(&meta_ref.name);
+        let mut pinned = (*manifest).clone();
+        pinned.name.clone_from(&meta_ref.name);
+        return Ok(Some(Arc::new(pinned)));
 }
 Ok(Some(manifest))
Evidence
The module contract says undecodable versions should behave as absent, but pick_package_from_meta
treats a hydration failure as terminal Ok(None). npm_resolver then treats picked_package == None
as “no resolution” and returns Ok(None).

pacquet/crates/registry/src/package_versions.rs[13-16]
pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[216-241]
pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs[393-401]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pick_package_from_meta` picks a version *string* and then hydrates it via `meta_ref.versions.get(&version)`. Under lazy hydration, a malformed manifest yields `None`, and the function returns `Ok(None)` rather than treating that specific version as absent and attempting the next-best candidate.
### Issue Context
The lazy-hydration contract states undecodable fragments should behave as if absent. The resolver treats `picked_package: None` as “no resolution”, so an undecodable top candidate can incorrectly make the whole dependency unresolvable.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[216-241]
- pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs[393-401]
- pacquet/crates/registry/src/package_versions.rs[13-16]
### Suggested fix
For range picks, loop: pick the best candidate string; attempt hydration; if hydration fails, record the version in an exclusion set and re-run the picker excluding those versions, until a hydratable manifest is found or no candidates remain. (Optional: `tracing::warn` once per excluded version to aid debugging.)

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended
5. Mirror write doubles memory 🐞 Bug ➹ Performance
Description
save_meta_indexed() builds fragment_bytes containing all version fragments and then copies it
again into a second bytes buffer (bytes.extend_from_slice(&fragment_bytes)), doubling peak
memory for large packuments during mirror writes. This can significantly increase allocator pressure
and risk OOM when caching very large packuments.
Code

pacquet/crates/resolving-npm-resolver/src/mirror.rs[R238-267]

+    let mut fragment_bytes = Vec::new();
+    let mut spans = Vec::with_capacity(meta.versions.len());
+    for (version, json) in meta.versions.fragments() {
+        let offset = fragment_bytes.len() as u64;
+        let len = u32::try_from(json.len()).unwrap_or(u32::MAX);
+        if len as usize != json.len() {
+            // A single >4 GiB version manifest is not a thing the npm
+            // registry produces; skip it rather than corrupt the index.
+            continue;
+        }
+        fragment_bytes.extend_from_slice(json.as_bytes());
+        spans.push((version.clone(), offset, len));
+    }
+
+    let index = serde_json::to_string(&MirrorIndex {
+        name: meta.name.clone(),
+        dist_tags: meta.dist_tags.clone(),
+        time: meta.time.clone(),
+        homepage: meta.homepage.clone(),
+        versions: spans,
+    })
+    .map_err(|error| SaveMetaError::Encode(EncodeMetaError(error)))?;
+
+    let mut contents = String::with_capacity(headers.len() + index.len() + 64);
+    let _ = writeln!(contents, "{MIRROR_MAGIC} {} {}", headers.len(), index.len());
+    contents.push_str(&headers);
+    contents.push_str(&index);
+    let mut bytes = contents.into_bytes();
+    bytes.extend_from_slice(&fragment_bytes);
+    save_meta(pkg_mirror, &bytes)
Evidence
The function first accumulates all fragment bytes into fragment_bytes, then constructs bytes
from the header/index string and appends fragment_bytes via copying; save_meta currently
requires a full &[u8], encouraging this extra buffering.

pacquet/crates/resolving-npm-resolver/src/mirror.rs[227-268]
pacquet/crates/resolving-npm-resolver/src/mirror.rs[417-437]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`save_meta_indexed()` materializes the fragment section twice in memory: once in `fragment_bytes`, then again when copying into `bytes` for `save_meta()`. For very large packuments, this amplifies peak memory usage unnecessarily.
## Issue Context
The new mirror format is meant to improve warm-load performance; mirror writes happen on 200 responses and on explicit persist paths.
## Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[227-268]
- pacquet/crates/resolving-npm-resolver/src/mirror.rs[417-443]
### Suggested fix approach
- Avoid the double buffer by either:
- Writing directly to the temp file in `save_meta` (change `save_meta` to accept a closure/writer and stream headers/index/fragments sequentially), or
- Building a single `Vec<u8>`: push the magic line + headers + index into one `Vec`, `reserve_exact` for fragment total size (if computed) and append fragments directly without an intermediate `fragment_bytes`.
- Keep the existing atomic temp+rename semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Deprecated fallback hydrates all 🐞 Bug ➹ Performance
Description
In pick_version_by_version_range, the deprecated-version fallback builds the non-deprecated
candidate list using opts.meta.versions.iter(), which hydrates every version fragment into a full
PackageVersion. When the initially selected max satisfying version is deprecated, this can trigger
full-packument hydration work and negate the lazy-hydration performance win for that pick.
Code

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[R305-310]

  if let Some(ref picked) = max_pick {
      let picked_meta = opts.meta.versions.get(picked);
-        let picked_is_deprecated =
-            picked_meta.and_then(|version| version.deprecated.as_ref()).is_some();
+        let picked_is_deprecated = picked_meta.is_some_and(|version| version.deprecated.is_some());
      if picked_is_deprecated && all_versions.len() > 1 {
          let non_deprecated: Vec<&str> = opts
              .meta
Evidence
The deprecated fallback computes non_deprecated by calling opts.meta.versions.iter(), and
PackageVersions::iter() is defined to hydrate every fragment; therefore, this fallback branch
forces full hydration across all versions when it triggers.

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[298-316]
pacquet/crates/registry/src/package_versions.rs[112-118]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pick_version_by_version_range` uses `opts.meta.versions.iter()` inside the deprecated-fallback branch, which now hydrates *all* version manifests. This is avoidable because the fallback only needs to find the highest satisfying **non-deprecated** version.
### Issue Context
- `PackageVersions::iter()` hydrates every raw fragment (and is documented as suitable only for cold paths).
- The deprecated-fallback branch can run during normal picking when the best satisfying version is deprecated.
### Fix Focus Areas
- pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs[298-320]
- pacquet/crates/registry/src/package_versions.rs[112-118]
### Suggested fix approach
- Replace the `versions.iter().filter(|(_, manifest)| manifest.deprecated.is_none())...` full scan with a descending semver scan over candidate version **strings**.
- Hydrate only as needed:
- If `max_pick` is deprecated, iterate versions from highest to lowest (restricted to versions satisfying the range) and call `meta.versions.get(candidate)` until you find one with `deprecated.is_none()`.
- This should hydrate only a handful of candidates in typical cases rather than every version.
- Treat undecodable fragments (`get` returns `None`) as “absent”/skip when searching for a non-deprecated candidate, consistent with the PR’s semantics.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


7. Tarball projection hydrates all 🐞 Bug ➹ Performance
Description
project_abbreviated_meta uses meta.versions.iter() to build a version -> dist.tarball map,
which forces hydration of every version into a full PackageVersion even though only dist.tarball
is needed. For large packuments, this reintroduces substantial CPU/memory cost into the verifier
path and undermines the lazy-hydration benefit.
Code

pacquet/crates/registry/src/package_versions.rs[R112-118]

+    /// Iterate `(version, manifest)` pairs, hydrating every fragment.
+    /// Undecodable fragments are skipped. Full walks defeat the lazy
+    /// representation, so this belongs only on cold paths (the trust
+    /// verifier's history scan, tests).
+    pub fn iter(&self) -> impl Iterator<Item = (&String, Arc<PackageVersion>)> {
+        self.slots.iter().filter_map(|(version, slot)| Some((version, slot.hydrate(version)?)))
+    }
Evidence
PackageVersions::iter() is implemented t...

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 10b35404-de7d-47d2-9a13-de2b1a484b64

📥 Commits

Reviewing files that changed from the base of the PR and between ae28135 and f6b75fc.

📒 Files selected for processing (2)
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks/tests.rs

📝 Walkthrough

Walkthrough

Adds a lazily-hydrated PackageVersions container, an indexed pacquet-meta-v1 mirror format, and propagates shared Arc across registry, resolver, cache, trust checks, mirror persistence, and CLI; updates related tests.

Changes

Lazy version hydration with Arc refactor

Layer / File(s) Summary
Enable serde_json raw_value & tracing
Cargo.toml, pacquet/crates/registry/Cargo.toml
serde_json workspace dependency gains raw_value; tracing dependency is added to registry crate.
PackageVersions core storage, serde, API
pacquet/crates/registry/src/package_versions.rs
Adds PackageVersions and VersionSlot with fragment-backed lazy hydration (OnceLock), FragmentSource variants, hydrate, lazy get/iter/filtered, buffer-span mirror support, and custom serde that prefers raw fragments when serializing.
Package struct updates and tests
pacquet/crates/registry/src/package.rs, pacquet/crates/registry/src/package_versions/tests.rs
Replace Package.versions with PackageVersions; pinned_version/latest now return Option<Arc<PackageVersion>>; tests verify hydration caching, undecodable-fragment fail-closed behavior, serialization round-trips, typed construction, filtered semantics, pinned-version fallback, and latest() None behavior.
Indexed mirror format and API
pacquet/crates/resolving-npm-resolver/src/mirror.rs, .../mirror/tests.rs
Switch mirror to pacquet-meta-v1 indexed format, add save_meta_indexed, header/index/fragment parsing and span validation, update save_meta to accept bytes, and add tests for headers-only load, fragment hydration, truncated-file rejection, NDJSON-as-miss, and atomic overwrite.
Mirror tests (indexed format)
pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
Replace NDJSON-shape tests with indexed-mirror tests validating headers-only round-trip, full hydration, truncated-fragment miss, NDJSON-as-miss, and atomic overwrite semantics.
Fetch persistence updates
pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
Persist fetched packument metadata using save_meta_indexed and update imports/docs for the indexed mirror layout; remove prior serialize-then-write branch.
Resolver pickers, cache, and persistence
pacquet/crates/resolving-npm-resolver/src/pick_package.rs
Propagate Arc<PackageVersion> through picker return types and helpers, switch InMemoryPackageMetaCache from Mutex<HashMap> to DashMap, and persist metadata via save_meta_indexed.
pick_package_from_meta: borrow and Arc return; filtering
pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
Return Result<Option<Arc<PackageVersion>>, ...>, borrow manifests instead of cloning (clone only when pinning), add retry-on-undecodable-winner logic via without_version, mark filter_pkg_metadata_by_publish_date #[must_use], and use PackageVersions::filtered for date-based filtering.
pick_package_from_meta tests
pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta/tests.rs
Add tests covering retry when the winning manifest is undecodable and exact-version returning None for undecodable-only entries.
Trust projection and checks
pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs, .../trust_checks.rs, .../npm_resolver.rs
Pass manifest references in trust projection, borrow manifests in trust-evidence walks, change strongest-evidence scan to return Result<Option<_>, TrustViolation>, and set PickedFromRegistry.version to Arc<PackageVersion>.
CLI outdated helper
pacquet/crates/cli/src/cli_args/outdated.rs
Use fetched package homepage directly (remove clone); resolve_target now returns Option<Arc<PackageVersion>>.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

Possibly related PRs

  • pnpm/pnpm#12112: Touches the same CLI outdated helper and related resolve_target changes.
  • pnpm/pnpm#11710: Contains tests exercising Package::latest and pinned-version behavior adjusted here.
  • pnpm/pnpm#11794: Overlaps resolver fetch/persistence changes and save_meta_indexed adoption.

Poem

🐰 I nibble raw JSON in a guarded spot,
OnceLock keeps secrets till the code asks a lot.
Arcs link the manifests, shared and light,
Indexed mirrors hum through day and night.
A lazy rabbit hops — hydrate only what's right.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the three main performance optimizations: lazy packument hydration, sharded meta cache, and indexed metadata mirror.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch manifest-lazy-hydration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-free-for-open-source-projects

Copy link
Copy Markdown

PR Summary by Qodo

perf(registry): lazily hydrate packument versions from raw JSON fragments
✨ Enhancement 🧪 Tests ⚙️ Configuration changes 🕐 40+ Minutes

Grey Divider

Walkthroughs

Description
• Store per-version manifests as raw JSON and hydrate typed manifests only on first access.
• Cache hydrated manifests per version slot and tolerate undecodable fragments as absent.
• Thread shared Arc through resolver, trust checks, and outdated CLI.
Diagram
graph TD
  R{{"NPM registry"}} --> P["Package parse"] --> V["PackageVersions (raw slots)"] --> L[("Lazy hydrate cache")] --> M["Arc<PackageVersion>"] --> C["Resolver / trust / CLI"]
  V -->|"keys()/contains"| C
  V --> S["Mirror serialize (verbatim)"]
  subgraph Legend
    direction LR
    _ext{{"External"}} ~~~ _mod["Module"] ~~~ _cache[("Cache")]
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Represent versions as serde_json::Value + on-demand typed decode
  • ➕ Avoids enabling serde_json raw_value feature
  • ➕ Simpler custom Deserialize implementation
  • ➖ Still eagerly builds full Value trees during packument parse (loses most CPU/allocation win)
  • ➖ Does not preserve raw bytes for exact mirror round-trips without extra work
2. Separate index pass: parse only keys first, fetch per-version JSON later
  • ➕ Minimizes initial parse to just version strings
  • ➕ Could avoid holding raw fragments for all versions in memory
  • ➖ Requires additional network/filesystem fetches or a different packument wire format
  • ➖ Breaks current assumption that one packument response contains all version manifests
3. Global LRU cache keyed by (package, version) instead of per-slot OnceLock
  • ➕ Can bound memory across many packages and evict cold manifests
  • ➕ Potentially reduces per-packument retained state
  • ➖ Adds concurrency complexity and tuning (capacity, eviction policy)
  • ➖ Per-slot OnceLock already matches access patterns and keeps lookup O(1) with minimal contention

Recommendation: Current approach (RawValue fragments + per-slot OnceLock + Arc manifests) is the best fit: it removes the dominant eager-hydration cost without changing I/O patterns, preserves registry bytes for mirror round-trips, and keeps concurrency/ownership simple. The per-slot caching also matches resolver access patterns (many key scans, few manifest hydrations).

Grey Divider

File Changes

Enhancement (2)
package.rs Replace eager versions map with lazy PackageVersions +19/-15

Replace eager versions map with lazy PackageVersions

• Changes 'Package.versions' from 'HashMap<String, PackageVersion>' to 'PackageVersions'. Reworks 'pinned_version' and 'latest' to hydrate only the selected manifest and return 'Arc<PackageVersion>'.

pacquet/crates/registry/src/package.rs


package_versions.rs Implement lazy-hydrated per-version manifest storage +190/-0

Implement lazy-hydrated per-version manifest storage

• Introduces 'PackageVersions', storing each version entry as an 'Arc<RawValue>' and hydrating into 'Arc<PackageVersion>' on first access with 'OnceLock' caching. Serialization round-trips raw fragments verbatim and undecodable fragments behave as absent with a single warning.

pacquet/crates/registry/src/package_versions.rs


Refactor (7)
outdated.rs Adapt outdated target resolution to Arc<PackageVersion> +4/-4

Adapt outdated target resolution to Arc<PackageVersion>

• Updates 'resolve_target' to return 'Arc<PackageVersion>' from the new lazy-hydrated versions map. Avoids cloning 'homepage' by moving the 'Option<String>' out of the fetched 'Package'.

pacquet/crates/cli/src/cli_args/outdated.rs


lib.rs Wire in new PackageVersions module +2/-0

Wire in new PackageVersions module

• Registers the 'package_versions' module and re-exports 'PackageVersions' as part of the crate public API.

pacquet/crates/registry/src/lib.rs


create_npm_resolution_verifier.rs Adjust trust-meta projection for Arc manifests +1/-1

Adjust trust-meta projection for Arc manifests

• Updates the trust-meta projection to handle iterator values as 'Arc<PackageVersion>' and pass references into the projection function while keeping the minimized trust-only shape.

pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs


npm_resolver.rs Store picked manifest as Arc to avoid cloning +1/-1

Store picked manifest as Arc to avoid cloning

• Changes 'PickedFromRegistry.version' to 'Arc<PackageVersion>' so picks reuse the lazily hydrated shared manifest rather than cloning full structs.

pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs


pick_package.rs Thread Arc<PackageVersion> through pick results +12/-8

Thread Arc<PackageVersion> through pick results

• Updates pick helpers and 'PickPackageResult' to return 'Option<Arc<PackageVersion>>' and adjusts 'pick_max' accordingly, aligning resolver plumbing with shared manifests.

pacquet/crates/resolving-npm-resolver/src/pick_package.rs


pick_package_from_meta.rs Use lazy versions filtering and return shared manifests +16/-18

Use lazy versions filtering and return shared manifests

• Changes 'pick_package_from_meta' to return 'Arc<PackageVersion>' directly from 'PackageVersions::get' (hydrating only the chosen version). Rewrites publish-date filtering to move version slots without hydration via 'PackageVersions::filtered', and updates deprecated checks to work with Arc-based access.

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs


trust_checks.rs Iterate versions via lazy iterator and borrow manifests +3/-3

Iterate versions via lazy iterator and borrow manifests

• Updates trust checks to iterate 'meta.versions.iter()' (hydrating each manifest only on this cold path) and to pass references where the picked manifest is now Arc-backed.

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs


Tests (1)
tests.rs Add unit tests for lazy hydration, caching, and round-trips +95/-0

Add unit tests for lazy hydration, caching, and round-trips

• Covers on-demand hydration with Arc identity caching, undecodable-fragment-as-absent behavior, verbatim serialization of unknown keys, eager construction from typed manifests, and hydration-free filtering.

pacquet/crates/registry/src/package_versions/tests.rs


Other (2)
Cargo.toml Enable serde_json raw_value feature +1/-1

Enable serde_json raw_value feature

• Turns on serde_json's 'raw_value' feature to allow capturing per-version JSON fragments as 'RawValue' for lazy hydration.

Cargo.toml


Cargo.toml Add tracing dependency for decode warnings +1/-0

Add tracing dependency for decode warnings

• Adds 'tracing' so lazy hydration can warn once per undecodable version fragment.

pacquet/crates/registry/Cargo.toml


Grey Divider

Qodo Logo

Comment thread pacquet/crates/registry/src/package.rs Outdated
Comment thread pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs Outdated
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.00      5.9±0.04ms   739.4 KB/sec    1.02      6.0±0.16ms   723.8 KB/sec

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pacquet/crates/registry/src/package_versions/tests.rs (1)

31-50: ⚡ Quick win

Add selector-level regression coverage for undecodable highest versions.

This suite proves get/iter fail closed, but it doesn’t guard Package::pinned_version/latest behavior when the highest semver key (or dist-tags.latest) points to an undecodable fragment. A targeted regression test would lock this contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pacquet/crates/registry/src/package_versions/tests.rs` around lines 31 - 50,
Add a regression test that ensures selector-level APIs treat undecodable
highest-version fragments as absent: create a Package like in the existing test
but with the highest semver key (and separately a case where
"dist-tags"."latest" points) to an undecodable manifest, then assert
Package::pinned_version(...) (or the crate's pinned_version method) and
Package::latest (or latest method) return None/absent rather than panicking or
returning an invalid value; reuse the existing parse_package setup and the
"9.9.9" undecodable fragment, calling the package.pinned_version(...) and
package.latest() helpers and adding assertions that they are None and do not
change iteration/get behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pacquet/crates/registry/src/package.rs`:
- Around line 146-160: The current pinned_version parses keys to find the
highest satisfying version string but then looks up that key and returns None if
the stored manifest is undecodable, dropping valid lower matches; change the
logic in pinned_version so you iterate over self.versions entries (use
self.versions.iter()), parse each entry's key with
key.parse::<node_semver::Version>(), filter to those that parse and satisfy the
range, and keep the associated value (Arc<PackageVersion>) at selection time;
then pick the maximum parsed Version (or sort) among those already-decoded
entries and return its value, ensuring undecodable slots are treated as absent
and lower valid manifests are returned.
- Around line 164-167: The current latest() method panics when the dist_tags
"latest" target points to a missing/undecodable entry; change latest() to avoid
expect() by returning a non-panicking result (e.g., Option<Arc<PackageVersion>>
or Result<Arc<PackageVersion>, RegistryError>), look up the "latest" tag via
self.dist_tags.get("latest") and then use
self.versions.get(version).cloned().ok_or(...) or .cloned() wrapped in
Some/None, propagate the error type up (or convert to Option) and update all
callers of latest() to handle the new return type; keep references to dist_tags,
versions, and the latest() function name to locate and fix the code.

---

Nitpick comments:
In `@pacquet/crates/registry/src/package_versions/tests.rs`:
- Around line 31-50: Add a regression test that ensures selector-level APIs
treat undecodable highest-version fragments as absent: create a Package like in
the existing test but with the highest semver key (and separately a case where
"dist-tags"."latest" points) to an undecodable manifest, then assert
Package::pinned_version(...) (or the crate's pinned_version method) and
Package::latest (or latest method) return None/absent rather than panicking or
returning an invalid value; reuse the existing parse_package setup and the
"9.9.9" undecodable fragment, calling the package.pinned_version(...) and
package.latest() helpers and adding assertions that they are None and do not
change iteration/get behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 2c274600-f8f7-4ce7-b325-3b67c1056fe5

📥 Commits

Reviewing files that changed from the base of the PR and between ac367fc and 1681623.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • Cargo.toml
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/Cargo.toml
  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (2)
pacquet/**/Cargo.toml

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/Cargo.toml: Check whether the workspace already depends on something suitable in [workspace.dependencies] in the root Cargo.toml before adding a new dependency
Keep dependencies at the right level — add a new dependency to the specific crate that needs it, not to the workspace root or shared crate unless multiple crates depend on it

Files:

  • pacquet/crates/registry/Cargo.toml
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: Log emissions are part of matching pnpm — when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing the brand into a plain String or &str in Rust
If upstream TypeScript always validates before construction of a branded string, validate in the Rust wrapper too via TryFrom<String> and/or FromStr and do not provide an infallible public constructor
If upstream TypeScript never validates a branded string, just brand for type-safety in Rust by exposing an infallible From<String> constructor
If upstream TypeScript occasionally constructs a branded string without validation, expose from_str_unchecked in Rust as an escape hatch alongside the validating constructor
Match upstream serde behavior for branded strings crossing JSON, YAML, or INI boundaries by using #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization
Derive simple conversions for branded strings using #[derive(derive_more::From)] and #[derive(derive_more::Into)] instead of handwriting impl blocks; use manual impl only when conversion needs custom logic
Model TypeScript string literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers, since the set of valid values is closed
Treat TypeScript string template literal types (like `${string}@${string}`) the same as branded string types in Rust, using a newtype wrapper with validation
Follow the code style guide in CODE_STYLE_GUIDE.md — imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Choose owned vs. borrowed parameters to minimize copies; widen to t...

Files:

  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
🧠 Learnings (4)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
📚 Learning: 2026-06-06T18:58:37.156Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 12243
File: pacquet/crates/package-manager/src/install_package_by_snapshot.rs:319-322
Timestamp: 2026-06-06T18:58:37.156Z
Learning: When reviewing Rust code, do not assume `matches!(expr, Pattern(_))` will move out of `expr` if `Pattern(_)` contains no by-value bindings. `matches!` desugars to a `match` that auto-borrows the scrutinee for discrimination, so even if `expr` is a non-`Copy` value behind a shared reference (e.g., `&T`), the macro should not move-out of the borrowed data purely due to `matches!`. Treat `matches!(&expr, Pattern(_))` as a readability/clarity improvement, not a correctness requirement. Only flag potential move-out-of-borrow risks when the pattern includes by-value bindings that would require moving the matched value.

Applied to files:

  • pacquet/crates/registry/src/lib.rs
  • pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs
  • pacquet/crates/registry/src/package_versions/tests.rs
  • pacquet/crates/cli/src/cli_args/outdated.rs
  • pacquet/crates/registry/src/package.rs
  • pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
  • pacquet/crates/resolving-npm-resolver/src/trust_checks.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs
🔇 Additional comments (6)
pacquet/crates/resolving-npm-resolver/src/pick_package.rs (1)

307-307: LGTM!

Also applies to: 746-752, 764-783, 794-822, 827-849, 855-876, 882-897

pacquet/crates/resolving-npm-resolver/src/pick_package_from_meta.rs (1)

142-152: LGTM!

Also applies to: 230-241, 307-307, 367-389, 411-416

pacquet/crates/resolving-npm-resolver/src/npm_resolver.rs (1)

535-538: LGTM!

pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs (1)

951-955: LGTM!

pacquet/crates/resolving-npm-resolver/src/trust_checks.rs (1)

178-178: LGTM!

Also applies to: 224-224, 243-243

pacquet/crates/cli/src/cli_args/outdated.rs (1)

149-149: LGTM!

Also applies to: 159-178

Comment thread pacquet/crates/registry/src/package.rs Outdated
Comment thread pacquet/crates/registry/src/package.rs Outdated
@codecov-commenter

codecov-commenter commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 91.96429% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.73%. Comparing base (657d322) to head (f6b75fc).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pacquet/crates/registry/src/package_versions.rs 89.20% 15 Missing ⚠️
...acquet/crates/resolving-npm-resolver/src/mirror.rs 92.00% 8 Missing ⚠️
...solving-npm-resolver/src/pick_package_from_meta.rs 93.33% 3 Missing ⚠️
...ing-npm-resolver/src/fetch_full_metadata_cached.rs 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #12322      +/-   ##
==========================================
+ Coverage   87.69%   87.73%   +0.03%     
==========================================
  Files         290      291       +1     
  Lines       35809    36034     +225     
==========================================
+ Hits        31404    31613     +209     
- Misses       4405     4421      +16     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 4c28c66

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pacquet/crates/resolving-npm-resolver/src/pick_package.rs (1)

88-91: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update stale cache backend docs to match the new implementation.

Line 90 still says the default InMemoryPackageMetaCache uses a std Mutex, but the implementation now uses DashMap (Lines 174-194). Please align this trait doc comment so it reflects the current concurrency model.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pacquet/crates/resolving-npm-resolver/src/pick_package.rs` around lines 88 -
91, The doc comment stating that the default InMemoryPackageMetaCache uses a std
Mutex is stale; update the trait/class doc (the comment starting
"Implementations must be safe to call concurrently...") to reflect that the
current default InMemoryPackageMetaCache implementation uses DashMap for
concurrent access (see InMemoryPackageMetaCache implementation) and describe the
resulting concurrency characteristics instead of mentioning a std Mutex or a
pending tokio-aware variant.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pacquet/crates/resolving-npm-resolver/src/pick_package.rs`:
- Around line 88-91: The doc comment stating that the default
InMemoryPackageMetaCache uses a std Mutex is stale; update the trait/class doc
(the comment starting "Implementations must be safe to call concurrently...") to
reflect that the current default InMemoryPackageMetaCache implementation uses
DashMap for concurrent access (see InMemoryPackageMetaCache implementation) and
describe the resulting concurrency characteristics instead of mentioning a std
Mutex or a pending tokio-aware variant.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 12060a3e-f406-4059-981a-a1c9a35f2557

📥 Commits

Reviewing files that changed from the base of the PR and between 1681623 and 4c28c66.

📒 Files selected for processing (1)
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Dylint
  • GitHub Check: Doc
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: Log emissions are part of matching pnpm — when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing the brand into a plain String or &str in Rust
If upstream TypeScript always validates before construction of a branded string, validate in the Rust wrapper too via TryFrom<String> and/or FromStr and do not provide an infallible public constructor
If upstream TypeScript never validates a branded string, just brand for type-safety in Rust by exposing an infallible From<String> constructor
If upstream TypeScript occasionally constructs a branded string without validation, expose from_str_unchecked in Rust as an escape hatch alongside the validating constructor
Match upstream serde behavior for branded strings crossing JSON, YAML, or INI boundaries by using #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization
Derive simple conversions for branded strings using #[derive(derive_more::From)] and #[derive(derive_more::Into)] instead of handwriting impl blocks; use manual impl only when conversion needs custom logic
Model TypeScript string literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers, since the set of valid values is closed
Treat TypeScript string template literal types (like `${string}@${string}`) the same as branded string types in Rust, using a newtype wrapper with validation
Follow the code style guide in CODE_STYLE_GUIDE.md — imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Choose owned vs. borrowed parameters to minimize copies; widen to t...

Files:

  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
🧠 Learnings (4)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
📚 Learning: 2026-06-06T18:58:37.156Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 12243
File: pacquet/crates/package-manager/src/install_package_by_snapshot.rs:319-322
Timestamp: 2026-06-06T18:58:37.156Z
Learning: When reviewing Rust code, do not assume `matches!(expr, Pattern(_))` will move out of `expr` if `Pattern(_)` contains no by-value bindings. `matches!` desugars to a `match` that auto-borrows the scrutinee for discrimination, so even if `expr` is a non-`Copy` value behind a shared reference (e.g., `&T`), the macro should not move-out of the borrowed data purely due to `matches!`. Treat `matches!(&expr, Pattern(_))` as a readability/clarity improvement, not a correctness requirement. Only flag potential move-out-of-borrow risks when the pattern includes by-value bindings that would require moving the matched value.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
🔇 Additional comments (1)
pacquet/crates/resolving-npm-resolver/src/pick_package.rs (1)

174-185: LGTM!

Also applies to: 188-194

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Integrated-Benchmark Report (Linux)

Each scenario reports direct installs and pnpr installs. Bencher consumes pacquet@HEAD and pnpr@HEAD.

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 10.149 ± 0.135 10.011 10.400 1.89 ± 0.05
pacquet@main 10.177 ± 0.188 10.044 10.576 1.89 ± 0.05
pnpr@HEAD 5.374 ± 0.110 5.274 5.657 1.00
pnpr@main 5.378 ± 0.077 5.286 5.549 1.00 ± 0.02
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 10.14895907022,
      "stddev": 0.13535932001226222,
      "median": 10.101273604420001,
      "user": 3.484995799999999,
      "system": 3.31729076,
      "min": 10.011422428420001,
      "max": 10.40017307842,
      "times": [
        10.123563926420001,
        10.249765678420001,
        10.283154000420001,
        10.046638165420001,
        10.078983282420001,
        10.242341663420001,
        10.03434345642,
        10.019205022420001,
        10.011422428420001,
        10.40017307842
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 10.177395697120001,
      "stddev": 0.18841431008561577,
      "median": 10.098530082420002,
      "user": 3.603321799999999,
      "system": 3.4358805599999998,
      "min": 10.04369395142,
      "max": 10.57592223642,
      "times": [
        10.07411984842,
        10.13612290342,
        10.070103016420001,
        10.10404078642,
        10.06754050842,
        10.093019378420001,
        10.48067891442,
        10.57592223642,
        10.128715427420001,
        10.04369395142
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 5.3739660669200005,
      "stddev": 0.11005569832868796,
      "median": 5.33922932642,
      "user": 2.6232628,
      "system": 2.8919311600000004,
      "min": 5.27414200742,
      "max": 5.65716592642,
      "times": [
        5.38708023542,
        5.3534034864199995,
        5.331558054419999,
        5.44500857442,
        5.3469005984199995,
        5.65716592642,
        5.30563927142,
        5.32401458142,
        5.31474793342,
        5.27414200742
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 5.37791434542,
      "stddev": 0.07693020410749572,
      "median": 5.35248788642,
      "user": 2.6032692999999996,
      "system": 2.90310306,
      "min": 5.28637444142,
      "max": 5.54892284842,
      "times": [
        5.47858373342,
        5.34809436542,
        5.54892284842,
        5.35641388842,
        5.36100003342,
        5.34393415242,
        5.37197507242,
        5.28637444142,
        5.34856188442,
        5.33528303442
      ]
    }
  ]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 663.7 ± 13.2 646.9 683.6 1.00
pacquet@main 698.7 ± 42.8 648.8 795.7 1.05 ± 0.07
pnpr@HEAD 786.0 ± 23.3 761.5 830.5 1.18 ± 0.04
pnpr@main 802.9 ± 33.7 768.8 864.8 1.21 ± 0.06
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6637450609000002,
      "stddev": 0.013180780607185168,
      "median": 0.6651824353000001,
      "user": 0.38407778000000004,
      "system": 1.3056872,
      "min": 0.6469065303000001,
      "max": 0.6836442153000001,
      "times": [
        0.6830866403000001,
        0.6683059963000001,
        0.6836442153000001,
        0.6480208333,
        0.6551512273000001,
        0.6529429953000001,
        0.6469065303000001,
        0.6684339343000001,
        0.6620588743000001,
        0.6688993623000001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6986614629000002,
      "stddev": 0.04283027328249936,
      "median": 0.6870851588000001,
      "user": 0.39904338000000006,
      "system": 1.3458992999999997,
      "min": 0.6487529123000001,
      "max": 0.7956709373000002,
      "times": [
        0.6590283323000001,
        0.6487529123000001,
        0.7956709373000002,
        0.7442229363000001,
        0.6820792333000001,
        0.6864469653,
        0.7090320423,
        0.6877233523000001,
        0.6829440803000001,
        0.6907138373000001
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.7860048464000001,
      "stddev": 0.023321894011010345,
      "median": 0.7768085003,
      "user": 0.41237308000000006,
      "system": 1.3352523999999995,
      "min": 0.7614944493000001,
      "max": 0.8304547153000001,
      "times": [
        0.7748881773000001,
        0.7778336723000001,
        0.7733739733000001,
        0.8064063843000001,
        0.7792175883000001,
        0.8304547153000001,
        0.7641721813000001,
        0.7614944493000001,
        0.8164239943000001,
        0.7757833283000001
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.8028894355000002,
      "stddev": 0.03373847223617222,
      "median": 0.7862647298000001,
      "user": 0.42310598,
      "system": 1.3410217999999998,
      "min": 0.7688092133000001,
      "max": 0.8648411883000001,
      "times": [
        0.8159633223000001,
        0.8582764933000001,
        0.7837910943000002,
        0.7793350673000001,
        0.7887383653000001,
        0.8648411883000001,
        0.8052862013000001,
        0.7829272903000001,
        0.7809261193000001,
        0.7688092133000001
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 8.695 ± 0.066 8.590 8.803 1.71 ± 0.02
pacquet@main 8.783 ± 0.029 8.718 8.814 1.72 ± 0.02
pnpr@HEAD 5.129 ± 0.128 5.022 5.419 1.01 ± 0.03
pnpr@main 5.094 ± 0.053 5.039 5.219 1.00
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 8.69509315152,
      "stddev": 0.06565295926042186,
      "median": 8.68818638192,
      "user": 3.5854102999999995,
      "system": 3.30741786,
      "min": 8.59009770042,
      "max": 8.80333055142,
      "times": [
        8.62154689942,
        8.68934699642,
        8.68702576742,
        8.80333055142,
        8.75290833142,
        8.66625430942,
        8.77260804142,
        8.59009770042,
        8.67638476742,
        8.69142815042
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 8.78289499182,
      "stddev": 0.02902244516861471,
      "median": 8.79123591242,
      "user": 3.9435722000000006,
      "system": 3.3759075600000004,
      "min": 8.71828869342,
      "max": 8.81406782942,
      "times": [
        8.80298746542,
        8.77289720942,
        8.77746487542,
        8.808364430420001,
        8.81406782942,
        8.75361017042,
        8.78990741342,
        8.71828869342,
        8.79879741942,
        8.79256441142
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 5.12924214162,
      "stddev": 0.12772183203561363,
      "median": 5.07021094992,
      "user": 2.4274785,
      "system": 2.78094516,
      "min": 5.02246941842,
      "max": 5.41944841642,
      "times": [
        5.05467387742,
        5.02246941842,
        5.05326839442,
        5.41944841642,
        5.24245348442,
        5.09250299342,
        5.08574802242,
        5.22934556942,
        5.04393922342,
        5.04857201642
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 5.09422230592,
      "stddev": 0.053246322689229936,
      "median": 5.08105667192,
      "user": 2.4310724,
      "system": 2.81011656,
      "min": 5.03921762442,
      "max": 5.21929737442,
      "times": [
        5.13556548542,
        5.07895215642,
        5.0568734304200005,
        5.03921762442,
        5.21929737442,
        5.12300109442,
        5.08733708542,
        5.08316118742,
        5.06603993042,
        5.05277769042
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 1.192 ± 0.045 1.150 1.311 1.72 ± 0.07
pacquet@main 1.506 ± 0.031 1.474 1.583 2.17 ± 0.06
pnpr@HEAD 0.694 ± 0.013 0.670 0.716 1.00
pnpr@main 0.699 ± 0.017 0.676 0.729 1.01 ± 0.03
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 1.19244784868,
      "stddev": 0.04524514694559847,
      "median": 1.1849458083800002,
      "user": 1.1714078399999999,
      "system": 1.7402268200000002,
      "min": 1.15021030438,
      "max": 1.3111284113800001,
      "times": [
        1.2116949143800002,
        1.15021030438,
        1.3111284113800001,
        1.1652188093800002,
        1.18179784138,
        1.1913870423800001,
        1.16897941938,
        1.16493649738,
        1.19103147138,
        1.18809377538
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 1.50606618148,
      "stddev": 0.03106317473053343,
      "median": 1.5005556548800003,
      "user": 1.6329497400000001,
      "system": 1.83477672,
      "min": 1.47378697738,
      "max": 1.58333078838,
      "times": [
        1.58333078838,
        1.5186828073800003,
        1.49379966538,
        1.4871332543800002,
        1.5080467423800001,
        1.5071438133800001,
        1.51612326838,
        1.47378697738,
        1.4939674963800003,
        1.4786470013800002
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.6938548239800002,
      "stddev": 0.013490312991107714,
      "median": 0.69520321388,
      "user": 0.35663203999999993,
      "system": 1.2828899200000001,
      "min": 0.6698163313800001,
      "max": 0.71584310738,
      "times": [
        0.6970342353800001,
        0.68544542838,
        0.71247267138,
        0.6698163313800001,
        0.69549757538,
        0.68667182338,
        0.6845030563800001,
        0.6963551583800001,
        0.71584310738,
        0.69490885238
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.69908728138,
      "stddev": 0.0168330519148571,
      "median": 0.6960881883800001,
      "user": 0.35024714,
      "system": 1.2924232199999997,
      "min": 0.67587454638,
      "max": 0.72932897738,
      "times": [
        0.72932897738,
        0.69230397138,
        0.7050045633800001,
        0.68295670438,
        0.72267373638,
        0.7040005543800001,
        0.6960596853800001,
        0.67587454638,
        0.69611669138,
        0.68655338338
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + hot store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 4.893 ± 0.027 4.839 4.924 6.99 ± 0.14
pacquet@main 5.070 ± 0.090 4.958 5.199 7.24 ± 0.19
pnpr@HEAD 0.729 ± 0.017 0.702 0.766 1.04 ± 0.03
pnpr@main 0.700 ± 0.013 0.679 0.722 1.00
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 4.89301711724,
      "stddev": 0.02722789499164887,
      "median": 4.9049695705400005,
      "user": 1.5899087399999998,
      "system": 1.9773394399999997,
      "min": 4.83939035454,
      "max": 4.92379187554,
      "times": [
        4.857428752540001,
        4.90370017554,
        4.90623896554,
        4.83939035454,
        4.91464366854,
        4.92379187554,
        4.87473761754,
        4.90672187754,
        4.910457360540001,
        4.89306052454
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 5.069503105640001,
      "stddev": 0.08954544430737345,
      "median": 5.05763078954,
      "user": 2.06637104,
      "system": 2.1002849400000008,
      "min": 4.9582244735400005,
      "max": 5.19864888554,
      "times": [
        4.97075187554,
        4.9582244735400005,
        4.97939061054,
        5.04973502554,
        5.18140542554,
        5.0437483395400005,
        5.19864888554,
        5.07021109354,
        5.06552655354,
        5.177388773540001
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.7292415136400001,
      "stddev": 0.016778757615346866,
      "median": 0.72922503204,
      "user": 0.36008063999999995,
      "system": 1.3028663399999998,
      "min": 0.7017121255400001,
      "max": 0.7659922235400001,
      "times": [
        0.72414327654,
        0.73452336854,
        0.73691882054,
        0.7341149945400001,
        0.7302877545400001,
        0.7017121255400001,
        0.72816230954,
        0.7134783865400001,
        0.72308187654,
        0.7659922235400001
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.7002026153400001,
      "stddev": 0.013182275802987331,
      "median": 0.6994050380400001,
      "user": 0.35404124,
      "system": 1.27546034,
      "min": 0.6792600725400001,
      "max": 0.7223846795400001,
      "times": [
        0.71248425354,
        0.71336379254,
        0.6893137865400001,
        0.7223846795400001,
        0.68974750354,
        0.69284610254,
        0.70381588654,
        0.6968614145400001,
        0.6792600725400001,
        0.70194866154
      ]
    }
  ]
}

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchpr/12322
Testbedpacquet
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
isolated-linker.fresh-install.cold-cache.cold-store📈 view plot
🚷 view threshold
8,695.09 ms
(-5.88%)Baseline: 9,238.07 ms
11,085.68 ms
(78.44%)
isolated-linker.fresh-install.cold-cache.hot-store📈 view plot
🚷 view threshold
4,893.02 ms
(-2.41%)Baseline: 5,014.09 ms
6,016.91 ms
(81.32%)
isolated-linker.fresh-install.hot-cache.hot-store📈 view plot
🚷 view threshold
1,192.45 ms
(-16.58%)Baseline: 1,429.38 ms
1,715.26 ms
(69.52%)
isolated-linker.fresh-restore.cold-cache.cold-store📈 view plot
🚷 view threshold
10,148.96 ms
(+0.55%)Baseline: 10,093.26 ms
12,111.91 ms
(83.79%)
isolated-linker.fresh-restore.hot-cache.hot-store📈 view plot
🚷 view threshold
663.75 ms
(+1.54%)Baseline: 653.68 ms
784.41 ms
(84.62%)
🐰 View full continuous benchmarking report in Bencher

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchpr/12322
Testbedpnpr

⚠️ WARNING: No Threshold found!

Without a Threshold, no Alerts will ever be generated.

Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds flag.

Click to view all benchmark results
BenchmarkLatencymilliseconds (ms)
isolated-linker.fresh-install.cold-cache.cold-store📈 view plot
⚠️ NO THRESHOLD
5,129.24 ms
isolated-linker.fresh-install.cold-cache.hot-store📈 view plot
⚠️ NO THRESHOLD
729.24 ms
isolated-linker.fresh-install.hot-cache.hot-store📈 view plot
⚠️ NO THRESHOLD
693.85 ms
isolated-linker.fresh-restore.cold-cache.cold-store📈 view plot
⚠️ NO THRESHOLD
5,373.97 ms
isolated-linker.fresh-restore.hot-cache.hot-store📈 view plot
⚠️ NO THRESHOLD
786.00 ms
🐰 View full continuous benchmarking report in Bencher

@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit 126a416

@zkochan zkochan changed the title perf(pacquet): hydrate packument versions lazily from raw JSON fragments perf(pacquet): lazy packument hydration, sharded meta cache, and an indexed metadata mirror Jun 11, 2026
Comment thread pacquet/crates/resolving-npm-resolver/src/mirror.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pacquet/crates/registry/src/package_versions.rs`:
- Around line 287-305: The loop over self.slots can silently drop a slot when
serde_json::from_str::<&RawValue>(&json) fails; update the branch in the
serialization code that handles slot.source.json() so that on Err(e) you emit a
tracing::warn (including context: the version key and the error) before
continuing, mirroring the other error paths (see the hydration path and
fragments() handling) and keeping the subsequent continue and behavior for
slot.parsed.get() unchanged; locate the logic around slot.source.json(),
serde_json::from_str::<&RawValue>, and map.serialize_entry to add the warning.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 890ddeda-fceb-4ef7-9168-4cdb499a140f

📥 Commits

Reviewing files that changed from the base of the PR and between 4c28c66 and 126a416.

📒 Files selected for processing (5)
  • pacquet/crates/registry/src/package_versions.rs
  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • pacquet/crates/resolving-npm-resolver/src/pick_package.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Doc
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Dylint
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Analyze (javascript)
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: Log emissions are part of matching pnpm — when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing the brand into a plain String or &str in Rust
If upstream TypeScript always validates before construction of a branded string, validate in the Rust wrapper too via TryFrom<String> and/or FromStr and do not provide an infallible public constructor
If upstream TypeScript never validates a branded string, just brand for type-safety in Rust by exposing an infallible From<String> constructor
If upstream TypeScript occasionally constructs a branded string without validation, expose from_str_unchecked in Rust as an escape hatch alongside the validating constructor
Match upstream serde behavior for branded strings crossing JSON, YAML, or INI boundaries by using #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization
Derive simple conversions for branded strings using #[derive(derive_more::From)] and #[derive(derive_more::Into)] instead of handwriting impl blocks; use manual impl only when conversion needs custom logic
Model TypeScript string literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers, since the set of valid values is closed
Treat TypeScript string template literal types (like `${string}@${string}`) the same as branded string types in Rust, using a newtype wrapper with validation
Follow the code style guide in CODE_STYLE_GUIDE.md — imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Choose owned vs. borrowed parameters to minimize copies; widen to t...

Files:

  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/registry/src/package_versions.rs
🧠 Learnings (4)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/registry/src/package_versions.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/registry/src/package_versions.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/registry/src/package_versions.rs
📚 Learning: 2026-06-06T18:58:37.156Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 12243
File: pacquet/crates/package-manager/src/install_package_by_snapshot.rs:319-322
Timestamp: 2026-06-06T18:58:37.156Z
Learning: When reviewing Rust code, do not assume `matches!(expr, Pattern(_))` will move out of `expr` if `Pattern(_)` contains no by-value bindings. `matches!` desugars to a `match` that auto-borrows the scrutinee for discrimination, so even if `expr` is a non-`Copy` value behind a shared reference (e.g., `&T`), the macro should not move-out of the borrowed data purely due to `matches!`. Treat `matches!(&expr, Pattern(_))` as a readability/clarity improvement, not a correctness requirement. Only flag potential move-out-of-borrow risks when the pattern includes by-value bindings that would require moving the matched value.

Applied to files:

  • pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs
  • pacquet/crates/resolving-npm-resolver/src/mirror.rs
  • pacquet/crates/registry/src/package_versions.rs
🔇 Additional comments (4)
pacquet/crates/registry/src/package_versions.rs (1)

18-27: LGTM!

Also applies to: 42-127, 129-181, 184-244, 246-308

pacquet/crates/resolving-npm-resolver/src/mirror.rs (1)

1-464: LGTM!

pacquet/crates/resolving-npm-resolver/src/mirror/tests.rs (1)

1-198: LGTM!

pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs (1)

15-17: LGTM!

Also applies to: 31-31, 192-207

Comment thread pacquet/crates/registry/src/package_versions.rs
zkochan added 5 commits June 11, 2026 08:26
…ents

Profiling a warm babylon resolve (metadata mirrors hot, ~520 MB across
1,476 packuments) showed the dominant cost was not I/O but hydration:
every version of every packument was deserialized into typed manifests
— maps, strings, and `serde_json::Value` trees built, hashed, and
dropped — even though a pick consults only the version strings plus
the handful of manifests it actually considers. High-release-cadence
packages make this quadratic-feeling: typescript ships 3,800 versions
and a pick needs one.

`Package::versions` is now a `PackageVersions` map whose entries hold
the raw JSON fragment serde captured (`Arc<RawValue>`, shared rather
than copied) and hydrate into `Arc<PackageVersion>` on first access,
caching per slot. Key scans (range matching, preferred-version
prioritization, the publish-date filter, existence probes) never
hydrate; `pinned_version` matches on version strings and hydrates only
the winner; the publish-date filter moves slots without touching
manifests. A fragment that fails to decode behaves as if the version
were absent (with a warning), matching the tolerance of the JavaScript
implementation, which never validates entries it doesn't pick.
Serialization emits raw fragments verbatim, so mirror round-trips keep
the registry's exact bytes. Picked manifests now travel as
`Arc<PackageVersion>` instead of by-value clones.

Parsing the three largest packuments in the babylon graph improves
3-3.8x (e.g. typescript 36.1 ms -> 9.5 ms, 905 MB/s), and a warm
babylon `--lockfile-only` resolve drops from ~9.1 s to ~7.7-8 s wall
with allocator pressure largely gone from the profile; the same
hydration savings apply to cold resolves. The remaining warm-resolve
cost shifts to lock contention in the resolve machinery, which is a
separate follow-up.

Requires the `raw_value` feature of serde_json (already a workspace
dependency).
Every resolve edge consults the shared meta cache before any other
work, so a large graph drives tens of thousands of lookups through it
from all runtime workers at once. The warm-resolve time profile showed
the cache's single `Mutex<HashMap>` as the top mutex-wait site; back
it with the sharded `DashMap` the crate already uses for the fetch
locker and the picked-manifest cache. Wall time on the warm babylon
resolve is unchanged (the walk is critical-path-bound after the lazy
hydration change), but the contention disappears from the profile and
the cache no longer serializes workers under load.
…ssed version fragments

Replace the mirror's two-line NDJSON shape (headers line + verbatim
packument body) with an indexed format:

    pacquet-meta-v1 <headers_len> <index_len>\n
    <headers JSON>     etag, modified
    <index JSON>       name, dist-tags, time, homepage,
                       versions: [version, offset, len]
    <fragments>        concatenated raw per-version JSON

Loading a packument from the cache previously re-read and
structurally re-scanned the whole body to recover the version
fragments. The index records each fragment's byte span, so the loader
now reads the file once and hands `PackageVersions` spans into that
buffer — hydration parses a slice in place, with no scan and no
re-allocation. Conditional-GET headers still come from a fixed-size
probe of the file head. The writer streams the registry's own bytes
(fragments borrow from the lazily-parsed response body), so the
cold-install cost stays one temp-file + rename per package.

Span-addressed slots were first implemented as open-per-fragment
reads; the pick paths can probe many candidate fragments per package,
and the measured syscall cost (sys 0.4 s -> 4.5 s on a warm babylon
resolve) outweighed the bytes saved, so the loader reads the file
into one shared buffer instead.

Warm babylon `--lockfile-only` resolves drop from ~8 s to ~5.5-6.7 s
on top of the lazy-hydration change (~9.1 s before this PR). Files in
the previous NDJSON format read as cache misses and are rewritten in
the indexed format on the next 200 response; top-level packument keys
outside the index were never read back by the resolver, so nothing is
lost by not persisting them. The on-disk metadata cache is now
pacquet-specific rather than byte-shared with the pnpm CLI's mirror
(agreed maintainer decision).
…agments

Review findings on the lazy-hydration contract ("an undecodable
fragment behaves as if the version were absent"):

- `Package::pinned_version` walks satisfying candidates from highest
  to lowest and hydrates until one decodes, instead of giving up when
  the single highest match fails.
- `pick_package_from_meta` retries the pick against a clone that
  excludes an undecodable winner (dropping dist-tags that point at
  it, so the latest-tag fast path can't re-pick it); an exact-version
  spec naming an undecodable manifest yields no match. Previously the
  first undecodable winner ended resolution for that dependency.
- `Package::latest` returns `Option` — a dangling `dist-tags.latest`
  or an undecodable manifest must not be able to panic the process
  from registry-served data.
- `read_mirror_headers` bounds the declared headers length before
  allocating from it, so a corrupted mirror can't trigger an
  arbitrarily large allocation on the warm path.
- the serializer warns (instead of silently skipping) when a fragment
  is corrupt, matching the other error paths.

Also fixes the doc build: an intra-doc link still referenced the
`FileSpan` variant that became `BufferSpan`.
@zkochan zkochan force-pushed the manifest-lazy-hydration branch from 126a416 to ae28135 Compare June 11, 2026 06:54
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit ae28135

Comment thread pacquet/crates/resolving-npm-resolver/src/trust_checks.rs Outdated
The trust-downgrade scan walked prior versions with
`PackageVersions::iter()`, which silently skips fragments that fail
to hydrate — an undecodable prior manifest could hide the strongest
prior trust evidence and let a downgrade pass undetected.

`detect_strongest_trust_evidence_before` now walks version keys,
applies the cheap timestamp/prerelease filters first, and surfaces
`TrustViolation::TrustCheckFailed` when an in-scope version is listed
but its manifest does not decode. The early return on `StagedPublish`
stays: once max-rank evidence is found, no skipped fragment can hide
anything stronger.
@qodo-free-for-open-source-projects

qodo-free-for-open-source-projects Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code review by qodo was updated up to the latest commit f6b75fc

@zkochan zkochan merged commit cb18695 into main Jun 11, 2026
19 of 21 checks passed
@zkochan zkochan deleted the manifest-lazy-hydration branch June 11, 2026 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants