fix: retry pacquet metadata fetches with a shared, config-driven retry policy#12029
Conversation
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR implements a retry mechanism for npm registry metadata fetches to handle transient network errors and retryable status codes. It adds Changesnpm metadata fetch retry mechanism
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoAdd retry logic for pacquet metadata fetches with exponential backoff
WalkthroughsDescription• Add retry logic for pacquet metadata fetches with exponential backoff - Retries on transient errors (408, 429, 5xx) and transport failures - Uses pnpm-compatible defaults: 2 retries, 10x factor, 10-60s delays • Refactor metadata request handling into shared send_metadata_request_with_retry helper - Preserves 304 Not Modified cache path and fail-fast for non-retryable 4xx • Update all metadata fetch call sites to pass retry options • Add comprehensive test coverage for retry behavior with mock registry Diagramflowchart LR
A["Metadata Request"] --> B["send_metadata_request_with_retry"]
B --> C{"Retryable Status?"}
C -->|408/429/5xx| D["Exponential Backoff"]
D --> B
C -->|304| E["Return Cached"]
C -->|Other| F["Return Response"]
B --> G["fetch_full_metadata"]
B --> H["fetch_full_metadata_cached"]
File Changes1. pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rs
|
Code Review by Qodo
1. Sleep holds Response open
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rs`:
- Around line 85-100: The retry branch in fetch_full_metadata.rs keeps the HTTP
Response alive across tokio::time::sleep, holding sockets; before sleeping in
the branch that matches Ok(response) if should_retry_status(...) { ... },
explicitly drop the response (call drop(response)) or otherwise release its
body/connection before calling tokio::time::sleep(delay).await so the socket is
returned to the pool during backoff; keep the rest of the logic (status, delay,
tracing::warn, attempt increment) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 34b9c665-c0c4-4309-acf8-24de18cfc93d
📒 Files selected for processing (6)
pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata/tests.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached/tests.rspacquet/crates/resolving-npm-resolver/src/pick_package.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Code Coverage
- GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs
📄 CodeRabbit inference engine (pacquet/AGENTS.md)
pacquet/**/*.rs: When porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(), orstreamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plainStringor&str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only viaTryFrom<String>and/orFromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallibleFrom<String>(andFrom<&str>when convenient).
If upstream occasionally constructs without validation, exposefrom_str_uncheckedas an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use#[serde(try_from = "String")]for deserialization and#[serde(into = "String")]for serialization.
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls. Fall back to manualimplonly when conversion needs custom logic.
String-literal unions should becomeenums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Pathover&PathBuf,&strover&String) when it doesn't force extra copies.
PreferArc::clone(&x)/Rc::clone(&x)overx.clone()for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;. Two forms stay allowed: external-crate preludes likeuse rayon::prelude::*;and root-of-module re-...
Files:
pacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached/tests.rspacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata/tests.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
🧠 Learnings (3)
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.
Applied to files:
pacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached/tests.rspacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata/tests.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.
Applied to files:
pacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached/tests.rspacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata/tests.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.
Applied to files:
pacquet/crates/resolving-npm-resolver/src/pick_package.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached/tests.rspacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata/tests.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata.rspacquet/crates/resolving-npm-resolver/src/fetch_full_metadata_cached.rs
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12029 +/- ##
=======================================
Coverage 87.60% 87.61%
=======================================
Files 267 267
Lines 30661 30709 +48
=======================================
+ Hits 26860 26905 +45
- Misses 3801 3804 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Integrated-Benchmark Report (Linux)Scenario: Isolated linker: fresh restore, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.7008565663800002,
"stddev": 0.04048105207355616,
"median": 1.68733072318,
"user": 2.79752924,
"system": 2.0523605799999998,
"min": 1.65988682968,
"max": 1.76853877568,
"times": [
1.72587487968,
1.67216560168,
1.66581042368,
1.71199931568,
1.76853877568,
1.67102130268,
1.76214369668,
1.66862899368,
1.70249584468,
1.65988682968
]
},
{
"command": "pacquet@main",
"mean": 1.7940758516800002,
"stddev": 0.21079636886243083,
"median": 1.74607191918,
"user": 2.77313004,
"system": 2.0470192799999998,
"min": 1.66572127968,
"max": 2.37832811068,
"times": [
2.37832811068,
1.76127987868,
1.7496689166800001,
1.66572127968,
1.7852488016799999,
1.67206970168,
1.68756941368,
1.69422044068,
1.74247492168,
1.80417705168
]
}
]
}Scenario: Isolated linker: fresh restore, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.50676252356,
"stddev": 0.02377596452403384,
"median": 0.49770013116,
"user": 0.3804862799999999,
"system": 0.80807026,
"min": 0.49132852616,
"max": 0.57222759516,
"times": [
0.57222759516,
0.49668310216,
0.50928405116,
0.50242525716,
0.49588329716,
0.49474445416,
0.49860687816000004,
0.49132852616,
0.49679338416,
0.50964869016
]
},
{
"command": "pacquet@main",
"mean": 0.5094340914600001,
"stddev": 0.01890631277828608,
"median": 0.50728175316,
"user": 0.37489607999999996,
"system": 0.8132736599999999,
"min": 0.49124066016,
"max": 0.55727800516,
"times": [
0.55727800516,
0.50798754516,
0.50857796516,
0.49574118616,
0.51428841016,
0.50657596116,
0.49810742516,
0.49124066016,
0.49637924716000004,
0.51816450916
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.13610412836,
"stddev": 0.030450503527684827,
"median": 2.1413204393599994,
"user": 4.13886742,
"system": 1.9874393400000003,
"min": 2.09161885286,
"max": 2.1884890118599998,
"times": [
2.1499522718599997,
2.1430291098599996,
2.1593694168599997,
2.1339080998599997,
2.1036169648599996,
2.09161885286,
2.15358775186,
2.0978580348599998,
2.1884890118599998,
2.1396117688599996
]
},
{
"command": "pacquet@main",
"mean": 2.14563044846,
"stddev": 0.04797184534212599,
"median": 2.1359132708599997,
"user": 4.13206152,
"system": 2.02120714,
"min": 2.08951863786,
"max": 2.24220965386,
"times": [
2.1359675988599998,
2.08951863786,
2.1358589428599997,
2.11656991886,
2.2161868388599997,
2.24220965386,
2.1474125148599996,
2.13987467786,
2.13173744586,
2.1009682548599997
]
}
]
}Scenario: Isolated linker: fresh install, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.3696735833200002,
"stddev": 0.019117814286176447,
"median": 1.3695060262200003,
"user": 1.79752686,
"system": 1.1512684199999998,
"min": 1.3486341452200001,
"max": 1.41160602022,
"times": [
1.38575341922,
1.3562154212200002,
1.3699812932200002,
1.3486341452200001,
1.35263347922,
1.37278896322,
1.41160602022,
1.3690307592200002,
1.35269014922,
1.37740218322
]
},
{
"command": "pacquet@main",
"mean": 1.3844166993199998,
"stddev": 0.032273480836876096,
"median": 1.3785970602200002,
"user": 1.8052295599999997,
"system": 1.15744412,
"min": 1.34358925022,
"max": 1.4657012202200002,
"times": [
1.36478212522,
1.3696866872200002,
1.37408545122,
1.37021090022,
1.34358925022,
1.38310866922,
1.39690142522,
1.3864498952200002,
1.4657012202200002,
1.38965136922
]
}
]
} |
|
| Branch | pr/12029 |
| Testbed | pacquet |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %) |
|---|---|---|---|
| isolated-linker.fresh-install.cold-cache.cold-store | 📈 view plot 🚷 view threshold | 2,136.10 ms(-9.17%)Baseline: 2,351.68 ms | 2,822.02 ms (75.69%) |
| isolated-linker.fresh-install.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 1,369.67 ms(-10.23%)Baseline: 1,525.81 ms | 1,830.97 ms (74.81%) |
| isolated-linker.fresh-restore.cold-cache.cold-store | 📈 view plot 🚷 view threshold | 1,700.86 ms(-17.69%)Baseline: 2,066.37 ms | 2,479.64 ms (68.59%) |
| isolated-linker.fresh-restore.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 506.76 ms(-22.65%)Baseline: 655.16 ms | 786.20 ms (64.46%) |
4b766a2 to
bf402b3
Compare
7169a3b to
8c8eca5
Compare
…l paths The metadata retry helper duplicated tarball's RetryOpts and hardcoded the default budget at every call site. Consolidate the retry primitive in pacquet-network and drive it from config on the install paths, matching pnpm — which wraps every registry request in @zkochan/retry under one fetch-retries budget. - Move RetryOpts (budget + exponential-backoff math) into pacquet-network alongside should_retry_status and a generic send_with_retry helper that returns (guard, Response). pacquet-tarball re-exports RetryOpts; the metadata fetchers use send_with_retry. Deletes the duplicate MetadataRetryOpts. - send_with_retry acquires the network permit per attempt and drops the response and guard before each backoff sleep, so a flapping registry can't pin sockets or park a concurrency permit during the wait. - Thread a config-sourced RetryOpts (via retry_opts_from_config) through the resolution verifier, NpmResolver, NamedRegistryResolver, and PickPackageContext, so the metadata and verify paths honor fetch-retries / fetch-retry-factor / fetch-retry-mintimeout / fetch-retry-maxtimeout instead of a hardcoded default. This also fixes a parity bug where a 5xx flap hung for 70s (10s + 60s default backoff) regardless of the user's fetch-retries setting. Test harnesses use a zero-retry budget so failure-path cases don't wait out the default backoff. pnpr is intentionally untouched: retry belongs to the install path (its /v1/install accelerator already retries through pacquet's resolver and tarball fetcher), while the verdaccio-style registry-proxy path must fail fast.
8c8eca5 to
4f98ca1
Compare
|
Congrats on merging your first pull request! 🎉🎉🎉 |
Problem
pacquetmetadata fetches made a single registry request, so a transientnetwork failure or a retryable HTTP response (
408/429/5xx) abortedresolution — even though the tarball path already followed pnpm's retry policy.
pnpm wraps every registry request, metadata and tarball, in
@zkochan/retryunder one
fetch-retriesbudget; pacquet only retried tarballs, and themetadata retry that was added in the first cut duplicated the tarball budget and
ignored the user's config.
Fixes #11841.
Solution
A single retry primitive now lives in
pacquet-networkand is driven by configon the install paths, matching pnpm exactly.
RetryOpts. The retry budget (fetch-retries/-factor/-mintimeout/-maxtimeout) plus its exponential-backoff mathmoves into
pacquet-network, next toshould_retry_statusand a genericsend_with_retry(client, url, opts, build_request) -> (guard, Response)helper.
pacquet-tarballre-exportsRetryOpts; the metadata fetchers usesend_with_retry. This removes the duplicateMetadataRetryOpts.send_with_retryacquires thenetwork permit per attempt and drops the response and its guard before each
backoff sleep, so a flapping registry can't pin sockets or hold a concurrency
permit while it waits. On the winning attempt the guard rides out with the
Responseso the caller keeps the permit through body streaming.RetryOpts(via the existingretry_opts_from_config) is threaded through theresolution verifier,
NpmResolver,NamedRegistryResolver, andPickPackageContext, so the metadata and verify paths honor the user'sfetch-retries*settings instead of a hardcoded default — the same way thetarball path already did.
Parity bug fixed
Because the verifier and resolver hardcoded the default retry budget, a
5xxflap on the main resolve/verify paths hung for ~70 s (
10 s + 60 sdefaultbackoff) regardless of the user's
fetch-retriessetting — a user who setfetch-retries=0to fail fast would still wait. Driving the budget from configfixes this; test harnesses use a zero-retry budget so failure-path cases don't
wait out the backoff.
Not pnpr
pnpris intentionally untouched. Retry is an install-time concern: pnpr's/v1/installaccelerator already retries through pacquet's resolver and tarballfetcher, while the verdaccio-style registry-proxy path must fail fast (and
serve stale on failure) rather than block a client request behind a 70 s
backoff. The shared
RetryOptsre-export keeps the accelerator's existingtarball-download retry compiling unchanged.
Testing
cargo check/cargo clippy --workspace --all-targets -- --deny warningscargo fmt --check,taplo format --checkRUSTDOCFLAGS="-D warnings" cargo doc --no-depscargo nextest run— network, tarball, resolving-npm-resolver, pnpr, and theaffected package-manager tests all pass with no slow cases.
Written by an agent (Claude Code, claude-opus-4-8).