Summary
fetch_full_metadata in pacquet-resolving-npm-resolver issues exactly one reqwest::Client::send().await and treats any error as fatal. pnpm's TypeScript implementation routes every registry fetch through make-fetch-happen, which retries transient network errors (fetchRetries, default 2, exponential backoff). The gap is a known follow-up; pacquet's own Config::fetch_retries doc comment at 8695496 says:
Today this only gates the pacquet-tarball download path; crates/registry's metadata fetches still issue a single request. Threading the same retry policy through the registry client is a follow-up.
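This issue tracks that follow-up.
Where the gap is
The metadata fetcher at pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rs:73-78 (8695496) sends the request directly: no wrapper, no retry, no backoff. The two install_package_* call sites that download tarballs do honor retries (they pass retry_opts_from_config(config) into the tarball path), so the infrastructure exists; it just isn't threaded through to the metadata client.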
For comparison, pnpm's fetchFromRegistry.ts at 2a9bd897bf wraps every registry call (metadata and tarball) in make-fetch-happen with the same fetchRetries/fetchRetryFactor/fetchRetryMintimeout/fetchRetryMaxtimeout policy.
Symptom
The user-visible failure is a Failed to fetch metadata from <url>: error sending request for url (<url>) error that aborts the entire install. The wrapped reqwest error covers transient TCP-level failures the make-fetch-happen equivalent would silently retry — chiefly stale keep-alive sockets, which pacquet's network crate documents as a known mode at pacquet/crates/network/src/lib.rs:130-140 (8695496):
pool_idle_timeout(4s) matches agentkeepalive's default freeSocketTimeout. Most CDN / load-balancer edges in front of registry.npmjs.org close idle sockets after 5–15s without sending FIN that hyper notices; a pool TTL above that lets pacquet reuse a half-dead socket and surface the next request as a generic "error sending request for url".
The 4 s pool TTL narrows the race window but doesn't close it; on the tarball download path the RetryOpts wrapper absorbs the same class of error transparently. The metadata path doesn't, so a single stale-socket reuse fails the install.
Reproduction
# Build pacquet@main (8695496 or later)
cargo build --release --bin=pacquet

# Stand up @pnpm/registry-mock 6.0.0's verdaccio (or any verdaccio with
# `proxy: npmjs` against registry.npmjs.org)
node node_modules/.pnpm/node_modules/verdaccio/bin/verdaccio \
  --config registry-mock-config.yaml --listen 4873 &

# Set up the integrated-benchmark `alotta-files` fixture as a clean
# install (no lockfile)
cp pacquet/tasks/integrated-benchmark/src/fixtures/package.json .
cat > .npmrc <<EOF
registry=http://localhost:4873/
auto-install-peers=true
ignore-scripts=true
lockfile=false
EOF
cat > pnpm-workspace.yaml <<EOF
storeDir: ./store-dir
registry: http://localhost:4873/
autoInstallPeers: true
ignoreScripts: true
lockfile: false
EOF

# Run; flakes intermittently with `error sending request for url`
target/release/pacquet install
In a 16-run sample against a freshly started verdaccio, 1 cold run failed with the symptom above; the 15 subsequent warm-pool runs all succeeded with rc=0. The bug isn't deterministic (it's a TCP-pool race), but the per-request failure probability, compounded across the hundreds of metadata fetches that a clean install of alotta-files makes, is high enough to flake CI scenarios that go through this path.
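To make the compounding argument concrete, here is a rough sketch of the arithmetic. It assumes each metadata fetch fails independently with the same probability, which is a simplification (stale-socket failures tend to cluster around a cold pool), and the numbers in the usage note are illustrative rather than measured. Both function names are hypothetical.

```rust
/// Probability that at least one of `fetches` requests fails, assuming each
/// request fails independently with probability `p` (a simplifying assumption).
pub fn install_failure_probability(p: f64, fetches: u32) -> f64 {
    1.0 - (1.0 - p).powi(fetches as i32)
}

/// Same as above, but each request is retried up to `retries` times, so a
/// request only fails if the initial attempt and every retry all fail.
pub fn install_failure_probability_with_retries(p: f64, fetches: u32, retries: u32) -> f64 {
    let per_request = p.powi((retries + 1) as i32);
    install_failure_probability(per_request, fetches)
}
```

For example, with a hypothetical per-request failure rate of 0.1% and 300 fetches, the no-retry install fails roughly a quarter of the time, while two retries per request push the failure rate below one in a million under the same independence assumption.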
Scenarios this affects
Scenarios that don't use a lockfile (clean install, full resolution from package.json) issue hundreds of metadata fetches. Frozen-lockfile scenarios fetch zero metadata — they consume the lockfile's pinned snapshots directly — so they don't expose the gap. This is why the existing CI runs frozen-lockfile and frozen-lockfile-hot-cache cleanly, while attempts to add clean-install / full-resolution to the per-PR integrated-benchmark workflow flake on Benchmark 1: pacquet@HEAD's first hyperfine command.
Suggested shape of the fix
Reuse RetryOpts (or factor it up) so the metadata client gets the same fetch_retries / fetch_retry_factor / fetch_retry_mintimeout / fetch_retry_maxtimeout policy. Default = 2 retries, factor 10, mintimeout 10000 ms, maxtimeout 60000 ms — matching pnpm.
Retries should be scoped to genuinely transient errors (network-level reqwest::Error whose source is a hyper/IO error, plus 5xx / 408 / 429). 4xx responses other than 408/429 should still be fatal so a misspelled package name fails fast.
Honor Retry-After on 429 / 503 if pnpm does (worth a quick check of make-fetch-happen's policy).
Thread retries through both the full and the abbreviated metadata path (both currently bypass it).
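The points above could be sketched roughly as follows. This is a std-only illustration, not pacquet's code: RetryPolicy, Failure, is_retryable and fetch_with_retry are hypothetical names, the real RetryOpts may be shaped differently, and the loop is synchronous with an injected sleep so it can run without tokio or reqwest. The delay formula (min timeout times factor to the power of the attempt, capped at max timeout, no jitter) is an assumption about the intended backoff.

```rust
use std::time::Duration;

/// Hypothetical mirror of the four retry knobs; not pacquet's actual `RetryOpts`.
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    pub retries: u32,
    pub factor: u32,
    pub min_timeout: Duration,
    pub max_timeout: Duration,
}

impl Default for RetryPolicy {
    /// Defaults as listed in this issue: 2 retries, factor 10, 10000 ms, 60000 ms.
    fn default() -> Self {
        Self {
            retries: 2,
            factor: 10,
            min_timeout: Duration::from_millis(10_000),
            max_timeout: Duration::from_millis(60_000),
        }
    }
}

impl RetryPolicy {
    /// Delay before retry number `attempt` (0-based), capped at `max_timeout`.
    pub fn delay(&self, attempt: u32) -> Duration {
        let multiplier = self.factor.saturating_pow(attempt);
        self.min_timeout.saturating_mul(multiplier).min(self.max_timeout)
    }
}

/// One failed attempt, reduced to what the retry decision needs.
#[derive(Debug, PartialEq)]
pub enum Failure {
    /// No response at all (connection reset, stale keep-alive socket, ...).
    Network,
    /// The server answered with this HTTP status code.
    Status(u16),
}

/// Transient failures: network-level errors plus 408, 429 and any 5xx.
pub fn is_retryable(failure: &Failure) -> bool {
    match failure {
        Failure::Network => true,
        Failure::Status(code) => matches!(*code, 408 | 429 | 500..=599),
    }
}

/// Calls `send` until it succeeds, fails fatally, or the retries run out.
/// `sleep` is injected so callers (and tests) control how waiting happens.
pub fn fetch_with_retry<T>(
    policy: &RetryPolicy,
    mut send: impl FnMut() -> Result<T, Failure>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, Failure> {
    let mut attempt = 0;
    loop {
        match send() {
            Ok(value) => return Ok(value),
            Err(failure) => {
                if attempt >= policy.retries || !is_retryable(&failure) {
                    return Err(failure);
                }
                sleep(policy.delay(attempt));
                attempt += 1;
            }
        }
    }
}
```

In a real implementation the send closure would wrap the reqwest call for both the full and the abbreviated metadata path, and the mapping from reqwest::Error to a retryable category would need to inspect the error's source. Retry-After handling is left out here because the issue leaves open whether pnpm honors it.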
Out of scope
The pool_idle_timeout value itself. Reducing it narrows the race but trades TCP-handshake overhead for robustness; the right fix is the retry layer that pnpm has. Tracking that here would conflate two changes.
Written by an agent (Claude Code, claude-opus-4-7).