perf(pacquet): port peekManifestFromStore fast path to skip the picker on hot-cache resolves #11843

Description

@zkochan

Summary

The dominant cost on a warm-cache pacquet install is the resolve walk: every node still routes through pick_package, even when the wanted lockfile already pins (integrity, name@version) to a row that's right there in index.db. Upstream pnpm short-circuits this via peekManifestFromStore — assembling a ResolveResult straight from the store-index row and skipping the registry metadata fetch entirely.

Pacquet's npm_resolver crate-level docs already mark this fast path as out of scope; this issue tracks closing that gap.

Why this matters

On the alotta-files warm-cache fixture (1362 nodes, lockfile unchanged from the previous run) pacquet currently sits at ~5 s wall vs pnpm's ~4.16 s. Per-phase trace:

phase: "resolve_importer"        elapsed_ms: 3125  nodes: 1362
phase: "prefetch_cas_paths"      elapsed_ms: 69
phase: "build_fresh_lockfile"    elapsed_ms: 3
phase: "virtual_store_layout_new" elapsed_ms: 11
phase: "install_subtree"         elapsed_ms: 1440

resolve_importer is 62% of wall time. peekManifestFromStore is the single biggest unimplemented optimization left, and the only one positioned to materially close the gap to pnpm. Concretely it would let ~95% of nodes (every unchanged one on a hot lockfile) skip:

  • The packument fetch / conditional-GET (even with the in-memory + on-disk packument cache, it still costs a lookup + dedup-locker round-trip per name).
  • The picker's version-selection walk (pickVersionByVersionRange / selectVersionByPreferred).
  • Serializing the picked manifest into ResolveResult.manifest — the row's bundled manifest is already serde_json::Value-shaped (see PackageFilesIndex.manifest at store_index.rs:756-757).

What's available in the store-index row

PackageFilesIndex carries:

  • manifest: Option<serde_json::Value> — the tarball's package.json (name, version, dependencies, peerDependencies, bin, engines, cpu, os, libc, etc.)
  • requires_build: Option<bool>
  • algo, files, side_effects — used by the install / build phases, not by the resolver

The dist.integrity and dist.tarball are not stored on the row directly, but the row is addressed by the <integrity>\t<pkg_id> key — so the integrity is implicit in the lookup, and the tarball URL is in the wanted lockfile entry the resolver was about to resolve anyway.
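As a minimal sketch of that addressing scheme: the helper below joins the two parts with a tab, following the `<integrity>\t<pkg_id>` shape described above. The function name matches the `store_index_key` mentioned later in this issue, but the signature and the `name@version` form of `pkg_id` are assumptions for illustration, not pacquet's actual API.

```rust
/// Illustrative only: builds a store-index key of the form `<integrity>\t<pkg_id>`.
/// The real helper in pacquet may take different types or normalize its inputs.
fn store_index_key(integrity: &str, pkg_id: &str) -> String {
    // The integrity hash is part of the key, so a successful lookup
    // implies the row matches the integrity pinned in the lockfile.
    format!("{integrity}\t{pkg_id}")
}
```

Because the integrity is baked into the key, a hit on this key is enough to recover `dist.integrity` without storing it on the row.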

What's NOT in the row

  • time / publish timestamps. These live on the packument document, not on the per-version package.json.
  • dist-tags.latest. Same reason.

That sets a clean gating condition.

Gating conditions

The fast path is safe iff all of the following hold:

  1. opts.published_by is None (no --publish-by / minimumReleaseAge policy in effect — the policy check at detect_min_release_age_violation needs published_at).
  2. config.minimum_release_age is None for the same reason.
  3. opts.update == UpdateBehavior::None (we're honoring an existing pin, not chasing the registry's latest).
  4. The wanted dependency carries an (integrity, name@version) pair from a previous lockfile (i.e. preferFrozenLockfile-style rewrite or frozen-lockfile install).
  5. The store-index row exists for that (integrity, pkg_id) and carries a non-None manifest.

Any miss → fall through to today's pick_package flow unchanged.
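The five conditions can be read as a single predicate. The sketch below is hypothetical: `GateInputs`, its field types, and `fast_path_allowed` are stand-ins invented for illustration, and only the `UpdateBehavior::None` variant is taken from this issue. In the real resolver, condition 5 would come from the store lookup itself rather than a precomputed flag.

```rust
use std::time::Duration;

// Only `None` appears in this issue; `Other` is a placeholder for every other mode.
#[allow(dead_code)]
enum UpdateBehavior {
    None,
    Other,
}

// Hypothetical bundle of everything the gate needs to inspect.
struct GateInputs<'a> {
    published_by: Option<u64>,             // stand-in type for opts.published_by
    minimum_release_age: Option<Duration>, // stand-in type for config.minimum_release_age
    update: UpdateBehavior,
    pinned: Option<(&'a str, &'a str)>,    // (integrity, name@version) from the prior lockfile
    row_has_manifest: bool,                // store-index row exists and manifest is Some
}

fn fast_path_allowed(g: &GateInputs<'_>) -> bool {
    g.published_by.is_none()                          // condition 1
        && g.minimum_release_age.is_none()            // condition 2
        && matches!(g.update, UpdateBehavior::None)   // condition 3
        && g.pinned.is_some()                         // condition 4
        && g.row_has_manifest                         // condition 5
}
```

Any `false` here corresponds to "fall through to pick_package", so the predicate can only ever remove work, never change the outcome of the slow path.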

Implementation sketch

  • Thread SharedReadonlyStoreIndex and the prior Lockfile (already passed to InstallWithFreshLockfile) into the npm resolver chain.
  • Add a peek_manifest_from_store(integrity, pkg_id) -> Option<ResolveResult> helper that:
    • Computes store_index_key(integrity, pkg_id).
    • Calls the existing StoreIndex::get (the one prefetch_cas_paths already uses).
    • Returns None on missing row, missing manifest, or stale row.
    • On hit: builds ResolveResult { resolution: Tarball(...), manifest: Some(Arc::new(row.manifest)), name_ver, ... } directly, with published_at: None and latest: None.
  • At the resolver entry point (NpmResolver::resolve_impl), check the gating conditions and try peek_manifest_from_store first; on None, fall through to pick_from_registry.
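A rough, std-only sketch of the helper and its fall-through follows. Everything here is simplified for illustration: a `HashMap` stands in for the store index, a `String` stands in for the `serde_json::Value` manifest, `Row` and this `ResolveResult` are reduced stand-ins for the real types, `resolve_with_fast_path` is a hypothetical name, and the stale-row check is omitted because this issue does not define it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for PackageFilesIndex; the real manifest is a serde_json::Value.
struct Row {
    manifest: Option<String>,
}

// Reduced stand-in for ResolveResult (the field set is an assumption).
struct ResolveResult {
    tarball_url: String,
    manifest: Option<Arc<String>>,
    published_at: Option<String>,
    latest: Option<String>,
}

fn peek_manifest_from_store(
    index: &HashMap<String, Row>,
    integrity: &str,
    pkg_id: &str,
    tarball_url: &str, // taken from the wanted lockfile entry, not from the row
) -> Option<ResolveResult> {
    let key = format!("{integrity}\t{pkg_id}");
    let row = index.get(&key)?;           // missing row => None
    let manifest = row.manifest.clone()?; // missing manifest => None
    Some(ResolveResult {
        tarball_url: tarball_url.to_string(),
        manifest: Some(Arc::new(manifest)),
        published_at: None, // not available on the row
        latest: None,       // not available on the row
    })
}

// Tries the fast path only when the gate passed; otherwise runs the slow path.
fn resolve_with_fast_path(
    index: &HashMap<String, Row>,
    gate_ok: bool,
    integrity: &str,
    pkg_id: &str,
    tarball_url: &str,
    pick_from_registry: impl FnOnce() -> ResolveResult,
) -> ResolveResult {
    if gate_ok {
        if let Some(hit) = peek_manifest_from_store(index, integrity, pkg_id, tarball_url) {
            return hit;
        }
    }
    pick_from_registry()
}
```

Passing the slow path as a closure keeps it untouched: it runs only when the gate fails or the peek returns `None`.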

The store-index reader is already wrapped in Arc<Mutex<_>> for the batched-prefetch path, but resolve happens before prefetch_cas_paths runs. Either (a) move the prefetch earlier so the resolver can reuse its output as a name-keyed map (cheap, since the prefetch is already running per-install), or (b) do per-node StoreIndex::get from the resolver — the underlying SQLite reader is cheap and the resolver walk is already concurrent.

Expected impact

If ~95% of 1362 nodes hit the fast path on warm cache, the resolve_importer phase should drop from ~3.1 s to a few hundred ms. That would put pacquet at parity with — or under — pnpm on the alotta-files benchmark.
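A back-of-envelope model of that estimate, under loud assumptions: it treats the 3125 ms as spread evenly across nodes and ignores that the resolve walk is concurrent, so it is only a sanity check, not a prediction. The function name and the per-hit cost parameter are hypothetical.

```rust
/// Crude linear model: nodes that miss the fast path keep today's average
/// per-node cost, nodes that hit it pay `fast_path_ms_per_node` instead.
fn estimate_resolve_ms(
    current_ms: f64,
    nodes: f64,
    hit_rate: f64,
    fast_path_ms_per_node: f64, // assumed value, not measured
) -> f64 {
    let per_node_ms = current_ms / nodes;
    let hits = nodes * hit_rate;
    let misses = nodes - hits;
    misses * per_node_ms + hits * fast_path_ms_per_node
}
```

With the figures from the trace (3125 ms, 1362 nodes, 95% hits), a zero-cost fast path gives about 156 ms, and an assumed 0.1 ms per hit gives about 286 ms, both consistent with "a few hundred ms".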

Written by an agent (Claude Code, claude-opus-4-7).
