pacquet: pipeline tarball fetch with resolution to close the 3-5x gap on resolution-heavy installs #11832

@zkochan

Summary

When the install path involves resolution (no lockfile, stale lockfile, or update), pacquet is roughly 3-5× slower than the TypeScript pnpm CLI. When a lockfile is available and resolution is skipped, pacquet is 2-3× faster. The gap appears only in scenarios that resolve.

Benchmark numbers

From pnpm.io/benchmarks/results/ on the alotta-files fixture, pacquet 0.2.3 vs pnpm 11.2.2:

| Scenario | pnpm 11.2.2 | pacquet 0.2.3 | Ratio |
| --- | --- | --- | --- |
| withWarmCacheAndLockfile (no resolve) | 2319ms | 721ms | pacquet 3.2× faster |
| withLockfile (no resolve) | 7000ms | 2825ms | pacquet 2.5× faster |
| withWarmModulesAndLockfile (no resolve) | 489ms | 222ms | pacquet 2.2× faster |
| firstInstall (resolve + fetch + import) | 7737ms | 22461ms | pacquet 2.9× slower |
| withWarmCache (resolve only; tarballs already on disk) | 4149ms | 22474ms | pacquet 5.4× slower |
| withWarmModules (resolve, modules already on disk) | 7980ms | 24592ms | pacquet 3.1× slower |
| updatedDependencies (re-resolve) | 3842ms | 20179ms | pacquet 5.3× slower |

withWarmCache is the most damning: tarballs are already in the store, modules and lockfile are absent, so the run is essentially just resolve → import. pacquet spends 5.4× longer here.
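
As a sanity check on the Ratio column, the figures can be recomputed from the raw timings. The snippet below is only a small arithmetic sketch: the `Row` type and `describeRatio` helper are hypothetical names introduced here, and the timings are copied from the table above.

```typescript
interface Row {
  scenario: string;
  pnpmMs: number;
  pacquetMs: number;
}

// Timings in milliseconds, copied from the benchmark table.
const rows: Row[] = [
  { scenario: "withWarmCacheAndLockfile", pnpmMs: 2319, pacquetMs: 721 },
  { scenario: "withLockfile", pnpmMs: 7000, pacquetMs: 2825 },
  { scenario: "withWarmModulesAndLockfile", pnpmMs: 489, pacquetMs: 222 },
  { scenario: "firstInstall", pnpmMs: 7737, pacquetMs: 22461 },
  { scenario: "withWarmCache", pnpmMs: 4149, pacquetMs: 22474 },
  { scenario: "withWarmModules", pnpmMs: 7980, pacquetMs: 24592 },
  { scenario: "updatedDependencies", pnpmMs: 3842, pacquetMs: 20179 },
];

// Always divide the larger time by the smaller one, then say which side won.
function describeRatio({ pnpmMs, pacquetMs }: Row): string {
  const pacquetFaster = pacquetMs <= pnpmMs;
  const ratio = pacquetFaster ? pnpmMs / pacquetMs : pacquetMs / pnpmMs;
  return `pacquet ${ratio.toFixed(1)}× ${pacquetFaster ? "faster" : "slower"}`;
}

const summary: string[] = rows.map(
  (row) => `${row.scenario}: ${describeRatio(row)}`,
);
```

Every scenario that skips resolution comes out on the "faster" side and every scenario that resolves comes out on the "slower" side, which is the split the summary describes.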

Root cause: pacquet resolves the whole tree before fetching anything

pacquet uses a strict two-phase pipeline. In install_with_fresh_lockfile.rs:

let importer_result =
    resolve_importer(&*resolver, manifest, dependency_groups, importer_opts)
        .await
        .map_err(...)?;                       // every transitive dep is walked here
drop(resolver);
drop(npm_resolver);
// ...
peers_result.direct_dependencies_by_alias.iter()
    .map(|(alias, dep_path)| install_subtree::<Reporter>(...))   // fetches start here
    .pipe(future::try_join_all).await?;

The Resolver trait returns only a ResolveResult carrying the picked manifest + tarball URL + integrity; npm_resolver::build_resolve_result does not start any download. The tarball is only fetched inside InstallPackageFromRegistry::run, which only runs once the entire resolve_importer pass has returned.

Wall-clock shape:

[ resolve every transitive dep (packuments) ][ fetch every tarball ][ import ]

The pnpm CLI pipelines them instead. packageRequester.ts calls fetchPackageToStore({ ... }) synchronously, and the call returns immediately with a { fetching: Promise<PkgRequestFetchResult> } handle. The download is already on the wire by the time requestPackage returns. resolveDependencies.ts stashes that promise on the ResolvedPackage:

```ts
ctx.resolvedPkgsById[pkgResponse.body.id] = getResolvedPackage({
  // ...
  fetching: pkgResponse.fetching, // not awaited here
})
```

Resolution of children, siblings, and the rest of the tree continues in parallel with that download. By the time the install/link pass awaits each package's fetching promise, most are already in the store.
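
The pattern can be illustrated with a minimal, self-contained sketch. This is not pnpm's code: `REGISTRY`, `resolveManifest`, `fetchTarball`, `resolveTree`, and `install` are hypothetical stand-ins, and the delays are arbitrary. The only point is that the fetch promise is created during resolution and awaited later.

```typescript
// Hypothetical in-memory "registry"; a real resolver would fetch packuments.
type Manifest = { id: string; deps: string[] };
const REGISTRY: Record<string, Manifest> = {
  a: { id: "a@1.0.0", deps: ["b"] },
  b: { id: "b@1.0.0", deps: [] },
};

// Records the order in which things happen, for inspection.
const events: string[] = [];
const sleep = (ms: number) => new Promise<void>((done) => setTimeout(done, ms));

async function resolveManifest(name: string): Promise<Manifest> {
  await sleep(5); // pretend packument round-trip
  const manifest = REGISTRY[name];
  if (!manifest) throw new Error(`unknown package ${name}`);
  events.push(`resolved ${name}`);
  return manifest;
}

async function fetchTarball(id: string): Promise<string> {
  events.push(`fetch started ${id}`); // runs synchronously at call time
  await sleep(20); // pretend tarball download
  events.push(`fetch done ${id}`);
  return `/store/${id}`;
}

interface ResolvedPackage {
  id: string;
  fetching: Promise<string>;
}

async function resolveTree(
  name: string,
  out: Map<string, ResolvedPackage>,
): Promise<void> {
  const manifest = await resolveManifest(name);
  if (out.has(manifest.id)) return;
  // Start the download now and stash the promise; it is NOT awaited here.
  out.set(manifest.id, { id: manifest.id, fetching: fetchTarball(manifest.id) });
  // Children resolve while the parent's tarball is still downloading.
  await Promise.all(manifest.deps.map((dep) => resolveTree(dep, out)));
}

async function install(root: string): Promise<string[]> {
  const resolved = new Map<string, ResolvedPackage>();
  await resolveTree(root, resolved);
  // Link pass: only now are the stashed fetch promises awaited.
  return Promise.all(Array.from(resolved.values()).map((pkg) => pkg.fetching));
}
```

Calling `install("a")` and then inspecting `events` shows `fetch started a@1.0.0` recorded before `resolved b`: the parent's download is in flight while its child is still being resolved.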

Wall-clock shape:

[ resolve A ][ resolve A.deps ][ resolve A.deps.deps ]
       \\          \\                   \\
        [fetch A][fetch A.deps      ][fetch A.deps.deps]
                                                       \\
                                            [ import & symlink ]

Pipelined wall-clock time is roughly max(per-package resolve+fetch chain) + import, instead of sum(resolve) + sum(fetch) + import. That difference is consistent with the 3-5× factor pacquet is currently paying.

Proposed fix

Port pnpm's pipelined model into pacquet:

  1. Extend ResolveResult (or a sibling envelope returned to extend_tree / resolve_node) with a fetching: Shared<Future<Result<CasPaths, TarballError>>> field: a future that is started immediately, runs at most once, and can be shared across visitors of the same (name, version) slot.

  2. Move the DownloadTarballToStore invocation from install_package_from_registry.rs into resolve_dependency_tree::resolve_node, right after the resolver returns (mirroring packageRequester.ts:266). Dedupe via a fetchingLocker-style DashMap<PkgId, Shared<Future>> so the hoist loop's repeated visits do not refire downloads.

  3. install_subtree becomes "await the prefetched fetching future + do the per-edge import + symlink." The existing resolved_packages: DashMap<String, watch::Sender<bool>> dedupe stays — it still gates the import/symlink step, not the download.

The only non-mechanical piece is the hoist loop calling extend_tree multiple times. pnpm's own loop does not kick off duplicate fetchPackageToStore calls per visit because fetchingLocker is keyed by pkg.id; pacquet should do the same.
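
The keyed dedupe can be sketched in a few lines. The example is written in TypeScript for brevity and is an assumption-laden illustration, not pnpm's fetchingLocker or pacquet's eventual Rust code: `createFetchingLocker`, `downloadsStarted`, and the `foo@1.0.0` id are hypothetical. A Rust version would presumably hold a shared future in the map instead of a promise.

```typescript
type StorePath = string;

// Counts how many real downloads were kicked off (for illustration only).
let downloadsStarted = 0;

// Returns a fetch function that starts at most one download per package id.
function createFetchingLocker(
  download: (pkgId: string) => Promise<StorePath>,
): (pkgId: string) => Promise<StorePath> {
  const inFlight = new Map<string, Promise<StorePath>>();
  return (pkgId) => {
    let fetching = inFlight.get(pkgId);
    if (fetching === undefined) {
      // First visit for this id: start the download and remember the promise.
      downloadsStarted += 1;
      fetching = download(pkgId);
      inFlight.set(pkgId, fetching);
    }
    // Later visits get the same promise, whether pending or already settled.
    return fetching;
  };
}

// Hypothetical downloader that just maps an id to a store path.
const fetchOnce = createFetchingLocker(async (pkgId) => `/store/${pkgId}`);

// Simulate the hoist loop visiting the same package on three passes.
const visits = ["foo@1.0.0", "foo@1.0.0", "foo@1.0.0"].map((id) => fetchOnce(id));
```

All three entries in `visits` are the same promise object and `downloadsStarted` stays at 1, so repeated extend_tree-style visits would share one download rather than refiring it.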

This should bring pacquet's resolve-heavy installs roughly in line with — and likely past — the TypeScript pnpm CLI, the same way the no-resolve cases already are.

Scope

Pacquet-only change. No user-visible behavior change. Lockfile format, error codes, and CLI surface unaffected.


Written by an agent (Claude Code, claude-opus-4-7).
