fix(package-manager): stop pnpr install re-downloading prefetched tarballs#12245
Conversation
…balls
On the pnpr client path the resolve-time `TarballPrefetcher` fires a
background download for every `package` frame into the shared `MemCache`,
but the frozen materialization install (`InstallPackageBySnapshot`) fetched
each registry/tarball through `run_without_mem_cache`, so it never consulted
that cache. The two fan-outs ran concurrently with no in-flight
coordination: the materializer only avoided a second download when the
prefetch had already finished writing the tarball to the store before the
materializer reached it. On a fresh install of a ~1300-package fixture this
re-downloaded ~48% of tarballs — 1933 downloads for 1308 packages, 107 MB
transferred instead of 66 MB — and the redundant fetches thrashed the
network/extract slots so the download phase dribbled out to ~15s instead of
saturating the link.
Thread the install-scoped `tarball_mem_cache` down through
`InstallFrozenLockfile` -> `CreateVirtualStore` -> `InstallPackageBySnapshot`
and route the registry/tarball branch through `run_with_mem_cache` when a
cache is provided. The frozen path passes `Some` so the materializer parks
on (or reuses) the prefetcher's in-flight download; the fresh-lockfile path
passes `None` and keeps the standalone `run_without_mem_cache` (its
downloads already resolve through the resolve-time prefetcher into the store
and dedup against on-disk state). After the change the same install does
1308 downloads (0 duplicates) and the download phase saturates the link and
completes in ~2-5s.
Note: this only removes redundant work; it does not by itself make pnpr
faster than a local install on a fast/low-latency registry, where the
dominant cost is the remote resolve round-trip plus the serial
resolve -> write-lockfile -> materialize barrier rather than the tarball
bytes.
Known follow-up (reporting label): because the materializer now consumes the
prefetcher's downloads and the prefetcher reports through `SilentReporter`,
a fresh pnpr install now emits `pnpm:progress found_in_store` for the
prefetched packages instead of `fetched` (it labels freshly-downloaded
tarballs as "reused"). It is a label-only issue — no double-count, no
missing event — but it should be corrected so the summary line is accurate.
Two viable fixes, both more invasive than this change:
1. Share one `SharedReportedProgressKeys` between the prefetcher (emitting
`fetched` via the real reporter) and the materializer so the cache-hit
`found_in_store` is deduped away. Requires plumbing the set from the
CLI through `Install`, whose struct literal has ~70 construction sites.
2. Carry a `network_fetched` flag on `CacheValue::Available` (set by the
owner when `run_without_mem_cache` actually hit the network) so the
waiting cache-hit reports `fetched` vs `found_in_store` correctly.
Requires changing `run_without_mem_cache`'s return type (~16 call
sites).
---
Written by an agent (Claude Code, claude-opus-4-8).
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📜 Recent review details🧰 Additional context used📓 Path-based instructions (1)pacquet/**/*.rs📄 CodeRabbit inference engine (pacquet/AGENTS.md)
Files:
🧠 Learnings (23)📓 Common learnings📚 Learning: 2026-05-23T09:14:43.635ZApplied to files:
📚 Learning: 2026-06-04T14:40:29.444ZApplied to files:
📚 Learning: 2026-05-23T09:14:43.635ZApplied to files:
📚 Learning: 2026-05-25T14:58:11.105ZApplied to files:
📚 Learning: 2026-05-29T18:03:15.372ZApplied to files:
📚 Learning: 2026-05-23T16:55:36.507ZApplied to files:
📚 Learning: 2026-06-04T14:55:52.163ZApplied to files:
📚 Learning: 2026-05-29T18:03:15.372ZApplied to files:
📚 Learning: 2026-05-20T19:40:55.051ZApplied to files:
📚 Learning: 2026-05-22T00:08:44.646ZApplied to files:
📚 Learning: 2026-05-20T23:07:58.444ZApplied to files:
📚 Learning: 2026-05-29T18:03:24.797ZApplied to files:
📚 Learning: 2026-05-24T16:07:54.784ZApplied to files:
📚 Learning: 2026-06-02T13:18:30.659ZApplied to files:
📚 Learning: 2026-05-24T21:11:04.272ZApplied to files:
📚 Learning: 2026-05-24T21:11:04.272ZApplied to files:
📚 Learning: 2026-05-29T18:03:15.372ZApplied to files:
📚 Learning: 2026-06-04T20:24:32.096ZApplied to files:
📚 Learning: 2026-05-20T10:06:55.749ZApplied to files:
📚 Learning: 2026-05-19T19:23:00.981ZApplied to files:
📚 Learning: 2026-06-02T16:13:39.456ZApplied to files:
📚 Learning: 2026-05-20T20:41:50.322ZApplied to files:
🔇 Additional comments (7)
📝 WalkthroughWalkthroughThe PR threads a shared in-memory tarball cache ( ChangesTarball memory cache threading
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Micro-Benchmark ResultsLinux |
Integrated-Benchmark Report (Linux)Each scenario has pacquet rows (direct install) and pnpr rows (the same client through the pnpr install accelerator), so pnpr@HEAD vs pacquet@HEAD is the pnpr-vs-direct ratio. Cold-store scenarios wipe the client store between runs (warm server); hot-store scenarios keep it warm. The pacquet@HEAD rows feed the pacquet Bencher testbed; the pnpr@HEAD rows feed the pnpr testbed. Scenario: Isolated linker: fresh restore, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 9.976332235700001,
"stddev": 0.17666419189957885,
"median": 9.9313940981,
"user": 3.4983488599999992,
"system": 2.7236304399999995,
"min": 9.8338984851,
"max": 10.4518794111,
"times": [
10.0125223861,
9.992676285100002,
9.917582276100001,
9.8598999281,
10.4518794111,
9.9323609221,
9.8338984851,
9.8682719081,
9.930427274100001,
9.963803481100001
]
},
{
"command": "pacquet@main",
"mean": 10.043622227,
"stddev": 0.20972996987073758,
"median": 9.976091083100002,
"user": 3.45489176,
"system": 2.7816032399999995,
"min": 9.863337576100001,
"max": 10.428059649100001,
"times": [
9.8642694371,
10.0770491301,
9.884249286100001,
10.428059649100001,
10.027184375100001,
9.863337576100001,
10.108382760100001,
10.3780102411,
9.924997791100001,
9.8806820241
]
},
{
"command": "pnpr@HEAD",
"mean": 4.9914931869,
"stddev": 0.13593902258364335,
"median": 4.9470517261,
"user": 2.5210640599999996,
"system": 2.4521815399999998,
"min": 4.8613399121,
"max": 5.3023598381,
"times": [
5.1556012921,
4.9526372511,
4.8613399121,
5.3023598381,
4.9961597571,
4.9414662011,
4.9306087821,
4.9729588621,
4.9264576961,
4.8753422771
]
},
{
"command": "pnpr@main",
"mean": 4.960569936000001,
"stddev": 0.1259227377219873,
"median": 4.9071291531,
"user": 2.5205619599999993,
"system": 2.4209554399999997,
"min": 4.8502153301,
"max": 5.2104599611,
"times": [
5.2104599611,
4.8712999651,
4.8833086241,
4.9296748151,
5.1644120791,
4.9095048541,
4.9918030151,
4.8502153301,
4.8902672641,
4.9047534521000005
]
}
]
}Scenario: Isolated linker: fresh restore, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.5064935650800001,
"stddev": 0.031120830554653707,
"median": 0.49742577728000004,
"user": 0.38426522,
"system": 0.8033639599999999,
"min": 0.48531513128000003,
"max": 0.59178103428,
"times": [
0.59178103428,
0.51346389228,
0.50417347328,
0.49786142728000005,
0.48820101528000004,
0.49393778128000004,
0.50321956028,
0.48531513128000003,
0.49699012728,
0.48999220828000006
]
},
{
"command": "pacquet@main",
"mean": 0.49980125078000004,
"stddev": 0.008204428031389283,
"median": 0.49868174978,
"user": 0.38853731999999996,
"system": 0.80038386,
"min": 0.49164504628000005,
"max": 0.52113858328,
"times": [
0.52113858328,
0.49794244328000004,
0.50119815828,
0.49641630828000005,
0.50036135128,
0.49942105628000005,
0.49164504628000005,
0.49669602628000004,
0.50086911228,
0.49232442228000006
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6173668904800002,
"stddev": 0.026864353841048374,
"median": 0.61156101078,
"user": 0.39931651999999995,
"system": 0.81074006,
"min": 0.58938044628,
"max": 0.6799440732800001,
"times": [
0.6799440732800001,
0.62398740628,
0.61112640428,
0.61199561728,
0.60614313028,
0.58938044628,
0.59224759028,
0.64298511928,
0.61475228128,
0.6011068362800001
]
},
{
"command": "pnpr@main",
"mean": 0.6515821297800001,
"stddev": 0.0947076080265515,
"median": 0.6114931797800001,
"user": 0.39271692,
"system": 0.8189499599999998,
"min": 0.58985799428,
"max": 0.9057812562800001,
"times": [
0.69161852228,
0.61543555228,
0.60503502828,
0.65995954228,
0.58985799428,
0.60755080728,
0.59815452628,
0.60517183628,
0.9057812562800001,
0.6372562322800001
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + cold store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 5.167117290919999,
"stddev": 0.058402750885731984,
"median": 5.156676325020001,
"user": 3.9166654999999997,
"system": 2.0842009599999995,
"min": 5.09098846952,
"max": 5.29770129152,
"times": [
5.167026137520001,
5.29770129152,
5.12006428752,
5.2185132985200005,
5.09098846952,
5.169854321520001,
5.18930449952,
5.1423350585200005,
5.146326512520001,
5.129059032520001
]
},
{
"command": "pacquet@main",
"mean": 5.14284741242,
"stddev": 0.05839915577141343,
"median": 5.13084404502,
"user": 3.8650611999999995,
"system": 2.0623847599999996,
"min": 5.063181230520001,
"max": 5.272376754520001,
"times": [
5.11044046652,
5.13147792052,
5.13086582052,
5.121779749520001,
5.13082226952,
5.272376754520001,
5.14049264752,
5.063181230520001,
5.1145391805200004,
5.212498084520001
]
},
{
"command": "pnpr@HEAD",
"mean": 1.68264532882,
"stddev": 0.03474871442743423,
"median": 1.68764721452,
"user": 2.6333425,
"system": 1.9245152599999997,
"min": 1.6291404145200001,
"max": 1.73593873652,
"times": [
1.63534707952,
1.6925070985200001,
1.68207885952,
1.70596832852,
1.6291404145200001,
1.69698832952,
1.6506010075200002,
1.68278733052,
1.73593873652,
1.71509610352
]
},
{
"command": "pnpr@main",
"mean": 1.69318142662,
"stddev": 0.056936378535371285,
"median": 1.68604865602,
"user": 2.6286283,
"system": 1.9724707599999998,
"min": 1.6012787615200001,
"max": 1.78262499352,
"times": [
1.78262499352,
1.6012787615200001,
1.7355737385199999,
1.63842215152,
1.73640304452,
1.68155807452,
1.6709132175199999,
1.74918626752,
1.64531477952,
1.69053923752
]
}
]
}Scenario: Isolated linker: fresh install, hot cache + hot store
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.25285844718,
"stddev": 0.015075952954788948,
"median": 1.25038394058,
"user": 1.59288284,
"system": 1.10989096,
"min": 1.23292559508,
"max": 1.28215570508,
"times": [
1.23659377708,
1.2452845400800001,
1.26663183308,
1.25429451308,
1.23292559508,
1.28215570508,
1.25180334508,
1.24376072208,
1.24896453608,
1.26616990508
]
},
{
"command": "pacquet@main",
"mean": 1.28262580688,
"stddev": 0.06718525284220123,
"median": 1.26027393058,
"user": 1.6148672400000001,
"system": 1.1018432599999997,
"min": 1.23097245808,
"max": 1.45977287908,
"times": [
1.26463835608,
1.25448195608,
1.23570689308,
1.27666572308,
1.45977287908,
1.3196041090800001,
1.27953003208,
1.24897615708,
1.25590950508,
1.23097245808
]
},
{
"command": "pnpr@HEAD",
"mean": 0.52157912498,
"stddev": 0.053938835324674944,
"median": 0.50363808308,
"user": 0.33747314,
"system": 0.75728246,
"min": 0.48705369808000004,
"max": 0.6707499130800001,
"times": [
0.49926782608000003,
0.50530648508,
0.51076178908,
0.5019696810800001,
0.5086871740800001,
0.50090700808,
0.49541765608000005,
0.48705369808000004,
0.6707499130800001,
0.53567001908
]
},
{
"command": "pnpr@main",
"mean": 0.5065021159800001,
"stddev": 0.008489111748034931,
"median": 0.50683951708,
"user": 0.33593884,
"system": 0.7677413599999999,
"min": 0.49193544808,
"max": 0.51713133508,
"times": [
0.51713133508,
0.5166256450800001,
0.49747675708000005,
0.49193544808,
0.50156451508,
0.50495534208,
0.50108399208,
0.5128897100800001,
0.50872369208,
0.51263472308
]
}
]
}Scenario: Isolated linker: fresh install, cold cache + hot storeResolution-only: cold packument cache (full re-resolve over the registry link) with a hot store (no tarball download), so this isolates pnpr offloading the client resolution to its warm server.
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 4.9385919974800006,
"stddev": 0.05597386995902876,
"median": 4.94110802598,
"user": 1.8946239200000001,
"system": 1.26454108,
"min": 4.87023677598,
"max": 5.06711869998,
"times": [
4.96466058098,
4.87939957398,
4.87023677598,
4.93671084998,
4.94550520198,
5.06711869998,
4.95441468898,
4.945972966979999,
4.88928800998,
4.93261262598
]
},
{
"command": "pacquet@main",
"mean": 4.94217820938,
"stddev": 0.034347822509429055,
"median": 4.94103661648,
"user": 1.89510522,
"system": 1.2672648800000001,
"min": 4.88727593498,
"max": 5.00365121598,
"times": [
4.88727593498,
4.93135496598,
4.94921558998,
4.91386009898,
4.90944117898,
4.93285764298,
4.97460705598,
4.95603199398,
4.96348641598,
5.00365121598
]
},
{
"command": "pnpr@HEAD",
"mean": 0.5183429866800001,
"stddev": 0.00841767306127987,
"median": 0.51729428748,
"user": 0.34274612000000004,
"system": 0.7843087799999999,
"min": 0.50707910898,
"max": 0.53758219998,
"times": [
0.51736658098,
0.51112800698,
0.51885485798,
0.51722199398,
0.53758219998,
0.52558788498,
0.50707910898,
0.51502044498,
0.51978108598,
0.51380770198
]
},
{
"command": "pnpr@main",
"mean": 0.5184197432800002,
"stddev": 0.005555037502617592,
"median": 0.51697366498,
"user": 0.34945152,
"system": 0.7769385799999998,
"min": 0.51235982098,
"max": 0.53183294998,
"times": [
0.52156776698,
0.51235982098,
0.52009982098,
0.51547717898,
0.51726813598,
0.51971291598,
0.53183294998,
0.51667919398,
0.51310211998,
0.51609752898
]
}
]
} |
Review Summary by QodoEliminate duplicate tarball downloads in pnpr frozen materialization
WalkthroughsDescription• Thread install-scoped tarball cache through frozen lockfile materialization • Reuse prefetcher's in-flight downloads instead of re-fetching tarballs • Eliminate ~48% duplicate downloads on pnpr fresh installs (~625 redundant fetches) • Saturate network link during download phase, reducing time from ~15s to ~2-5s Diagramflowchart LR
A["TarballPrefetcher<br/>background downloads"] -->|"shared MemCache"| B["InstallFrozenLockfile"]
B -->|"tarball_mem_cache: Some"| C["CreateVirtualStore"]
C -->|"tarball_mem_cache"| D["InstallPackageBySnapshot"]
D -->|"run_with_mem_cache"| E["DownloadTarballToStore<br/>reuses in-flight downloads"]
F["InstallWithFreshLockfile"] -->|"tarball_mem_cache: None"| C
F -->|"resolve-time prefetch"| E
File Changes1. pacquet/crates/package-manager/src/install.rs
|
…e prefetcher On the fresh-resolve install path the same tarball could be downloaded twice: once by the resolve-time `PrefetchingResolver` and again by `CreateVirtualStore`'s cold batch. The shared `tarball_mem_cache` infrastructure already exists (added in #12245 for the pnpr/frozen path), but the fresh-lockfile call site threaded `None`, so the cold batch went through `run_without_mem_cache` and never observed the in-flight prefetch — its only on-disk dedup is the store-index row, which the prefetcher's writer commits asynchronously, so a snapshot whose row hasn't committed yet re-downloaded the same bytes. Thread `Some(&tarball_mem_cache)` from the fresh-lockfile path so the cold batch reuses a finished prefetch immediately (`CacheValue::Available`) or briefly parks on the per-URL `Notify` for an in-flight one. Installs with no prefetcher keep passing `None`. Closes #12241
…e prefetcher On the fresh-resolve install path the same registry tarball could be downloaded twice: once by the resolve-time `PrefetchingResolver` and again by `CreateVirtualStore`'s cold batch. The shared `tarball_mem_cache` infrastructure already exists (added in #12245 for the pnpr/frozen path), but the fresh-lockfile call site threaded `None`, so the cold batch went through `run_without_mem_cache` and never observed the in-flight prefetch — its only on-disk dedup is the store-index row, which the prefetcher's writer commits asynchronously, so a snapshot whose row hasn't committed yet re-downloaded the same bytes. Thread `Some(&tarball_mem_cache)` from the fresh-lockfile path so the cold batch reuses a finished prefetch (`CacheValue::Available`) or briefly parks on the per-URL `Notify` for an in-flight one. Two refinements keep this safe: - Coordinate only registry resolutions. The `PrefetchingResolver` skips remote tarballs (they resolve with no `name_ver`); their only mem-cache entry comes from the resolver's download-to-resolve, keyed by `name@version`, while the lockfile and this pass address them by `name@<url>`. Reusing that entry would skip writing the `name@<url>` store-index row a later re-resolve needs to reuse the warm store (regressed `remote_tarball_reresolves_from_warm_store_without_refetch`). - Fall back to a standalone retried download when the best-effort prefetch failed (`SiblingFetchFailed`) instead of inheriting the failure, so a transient prefetch error no longer fails the install. Closes #12241
…ce fix) (#12243) * fix(package-manager): coordinate the fresh-resolve cold batch with the prefetcher On the fresh-resolve install path the same registry tarball could be downloaded twice: once by the resolve-time `PrefetchingResolver` and again by `CreateVirtualStore`'s cold batch. The shared `tarball_mem_cache` infrastructure already exists (added in #12245 for the pnpr/frozen path), but the fresh-lockfile call site threaded `None`, so the cold batch went through `run_without_mem_cache` and never observed the in-flight prefetch — its only on-disk dedup is the store-index row, which the prefetcher's writer commits asynchronously, so a snapshot whose row hasn't committed yet re-downloaded the same bytes. Thread `Some(&tarball_mem_cache)` from the fresh-lockfile path so the cold batch reuses a finished prefetch (`CacheValue::Available`) or briefly parks on the per-URL `Notify` for an in-flight one. Two refinements keep this safe: - Coordinate only registry resolutions. The `PrefetchingResolver` skips remote tarballs (they resolve with no `name_ver`); their only mem-cache entry comes from the resolver's download-to-resolve, keyed by `name@version`, while the lockfile and this pass address them by `name@<url>`. Reusing that entry would skip writing the `name@<url>` store-index row a later re-resolve needs to reuse the warm store (regressed `remote_tarball_reresolves_from_warm_store_without_refetch`). - Fall back to a standalone retried download when the best-effort prefetch failed (`SiblingFetchFailed`) instead of inheriting the failure, so a transient prefetch error no longer fails the install. Closes #12241 * refactor(package-manager): match resolution by reference in mem-cache guard Borrow `metadata.resolution` in the registry-only mem-cache guard so the non-binding `matches!` reads through the shared reference explicitly, addressing a review comment on #12243. No behavior change.
Problem
On the pnpr client path the resolve-time
TarballPrefetcherfires a background download for everypackageframe into the sharedMemCache, but the frozen materialization install (InstallPackageBySnapshot) fetched each registry/tarball throughrun_without_mem_cache— so it never consulted that cache. The two fan-outs ran concurrently with no in-flight coordination: the materializer only avoided a second download when the prefetch had already finished writing the tarball to the store before the materializer reached it.On a fresh install of a ~1300-package fixture this re-downloaded ~48% of tarballs:
The redundant fetches also thrashed the network/extract slots, so the download phase never saturated the link.
Fix
Thread the install-scoped
tarball_mem_cachedown throughInstallFrozenLockfile→CreateVirtualStore→InstallPackageBySnapshot, and route the registry/tarball branch throughrun_with_mem_cachewhen a cache is provided. The frozen path passesSome(materializer parks on / reuses the prefetcher's in-flight download); the fresh-lockfile path passesNoneand keepsrun_without_mem_cache(its downloads already resolve through the resolve-time prefetcher into the store and dedup against on-disk state).Scope / caveat
This removes redundant work; it does not by itself make pnpr faster than a local install on a fast/low-latency registry. There the dominant cost is the remote resolve round-trip plus the serial
resolve → write-lockfile → materializebarrier, not the tarball bytes. Measured wall-clock on a fast CDN: interleaved pnpr-on ~16.2s vs off ~14.4s (the gap narrowed ~1-2s but did not close). The bigger lever — letting materialization start before the full resolve stream lands — is a separate change.Known follow-up: progress label
Because the materializer now consumes the prefetcher's downloads and the prefetcher reports through
SilentReporter, a fresh pnpr install emitspnpm:progress found_in_storefor the prefetched packages instead offetched(it labels freshly-downloaded tarballs as "reused"). Label-only — no double-count, no missing event — but it should be corrected so the summary line is accurate. Two viable fixes, both more invasive than this PR:SharedReportedProgressKeysbetween the prefetcher (emittingfetchedvia the real reporter) and the materializer, so the cache-hitfound_in_storeis deduped away. Requires plumbing the set from the CLI throughInstall, whose struct literal has ~70 construction sites.network_fetchedflag onCacheValue::Available(set by the owner whenrun_without_mem_cacheactually hit the network) so the waiting cache-hit reportsfetchedvsfound_in_storecorrectly. Requires changingrun_without_mem_cache's return type (~16 call sites).Tests
All 51 frozen/snapshot +
install_package_by_snapshottests pass, plus bothpnpr_installintegration tests (install_via_pnpr_links_node_modulesexercises the changed path).cargo clippyclean.Written by an agent (Claude Code, claude-opus-4-8).
Summary by CodeRabbit