perf(tarball): parallelize per-file CAS writes within a tarball #12247
`extract_tarball_entries` walked the tar in a single serial loop, hashing
and writing each file into the CAS one at a time on the one `spawn_blocking`
thread the extraction runs on. A package with many files (e.g. `core-js`,
which unpacks to thousands) therefore pinned a single core for the whole
extraction while the rest of the machine sat idle — most visibly at the
makespan tail, when one big package is the last extraction still running and
every other core is free.
Split extraction into two phases: a serial pass that walks the seekable tar
stream to validate + clean each regular-file path and capture a borrow of its
payload, then a parallel pass that hashes and writes each file into the CAS
across the rayon pool. `StoreDir::write_cas_file` is content-addressed and
already documented as safe to call concurrently (its shard-creation cache is
race-tolerant), so the output — the CAS files, the `{path → cafs path}` map,
and the `PackageFilesIndex` row — is byte-identical; result order is
preserved so the last-entry-wins behavior for duplicate paths is unchanged.
Small tarballs (under 32 files) stay on the serial path to avoid rayon's
per-job dispatch cost when there's nothing to gain.
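The two-phase shape described above can be sketched with the standard library alone. This is a minimal illustration, not the code in the PR: `std::thread::scope` stands in for the rayon pool, `DefaultHasher` stands in for the SHA-512 CAS write, and `Pending`, `write_stub`, `extract_staged`, and `build_index` are hypothetical names. Only `PARALLEL_EXTRACT_THRESHOLD` and its value of 32 come from the review discussion.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::thread;

/// Below this many staged files the serial path is used.
const PARALLEL_EXTRACT_THRESHOLD: usize = 32;

/// Hypothetical staged entry: a cleaned path plus a borrow of the payload.
struct Pending<'a> {
    path: String,
    payload: &'a [u8],
}

/// Stand-in for the CAS write: returns the path and a content-derived key.
fn write_stub(entry: &Pending<'_>) -> (String, u64) {
    let mut hasher = DefaultHasher::new();
    entry.payload.hash(&mut hasher);
    (entry.path.clone(), hasher.finish())
}

/// Phase 2: process the staged entries, in parallel when the batch is large.
/// The result order always matches the order of `pending`.
fn extract_staged(pending: &[Pending<'_>], workers: usize) -> Vec<(String, u64)> {
    if pending.len() < PARALLEL_EXTRACT_THRESHOLD || workers < 2 {
        return pending.iter().map(write_stub).collect();
    }
    let chunk_len = (pending.len() + workers - 1) / workers;
    thread::scope(|scope| {
        // Spawn every chunk first so the workers actually overlap.
        let handles: Vec<_> = pending
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(write_stub).collect::<Vec<_>>()))
            .collect();
        // Joining in spawn order concatenates the chunks in input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}

/// Phase 3: rebuild the path map; a later duplicate overwrites an earlier one.
fn build_index(results: Vec<(String, u64)>) -> HashMap<String, u64> {
    results.into_iter().collect()
}
```

Because the parallel branch returns results in input order, feeding them to `build_index` gives the same map as the serial branch, including for duplicate paths.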
On a fresh install of a ~1300-package fixture this cut the extraction tail
roughly in half: the largest package (`core-js@3`) finished extracting at
~10.7s before and ~5.5s after, and all extractions completed by ~5.5s instead
of ~10.7s. (Total install time on that fixture is dominated by the downstream
hardlink/import phase, so this speeds up extraction specifically rather than
the whole install.)
---
Written by an agent (Claude Code, claude-opus-4-8).
📝 Walkthrough
The refactoring defers CAS writes and per-entry indexing until after tar walking completes. Phase 1 stages entries and captures the bundled manifest; phase 2 batch-processes staged files using Rayon parallelism when beneficial; phase 3 reconstructs the output maps, preserving last-wins semantics for duplicates.
Changes: Tarball extraction staging and batch processing
🧹 Nitpick comments (1)
pacquet/crates/tarball/src/lib.rs (1)
673-680: ⚡ Quick win: Add a regression test that forces the parallel branch with duplicate paths.
This refactor’s last-wins guarantee now depends on the batched write path preserving `pending` order through the Rayon collection. A `>= 32` entry fixture with a duplicate filename would lock that contract down.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pacquet/crates/tarball/src/lib.rs` around lines 673 - 680, Add a regression test that exercises the parallel branch (use at least PARALLEL_EXTRACT_THRESHOLD entries, i.e. 32) and includes duplicate filenames in the pending list to assert the "last-wins" outcome; construct a pending Vec with ordered entries where the later duplicate should overwrite the earlier one, call the code path that invokes write_cas_entry (so the conditional using PARALLEL_EXTRACT_THRESHOLD triggers the par_iter branch), then assert the resulting written collection/store reflects the last entry for the duplicate path (compare file content or CafsFileInfo) to lock down the ordering contract.
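A regression check along the lines suggested here might look like the sketch below. It is an assumption-laden illustration rather than a test against the real crate: `ordered_parallel_map` is a std-only stand-in for an order-preserving `par_iter().map(...).collect()`, and `fixture_with_duplicate` and `index_last_wins` are hypothetical helpers. The real test would go through `extract_tarball_entries` and compare `CafsFileInfo` values.

```rust
use std::collections::HashMap;
use std::thread;

/// Threshold named in the review comment; the fixture must meet or exceed it.
const PARALLEL_EXTRACT_THRESHOLD: usize = 32;

/// Order-preserving parallel map built on scoped threads.
fn ordered_parallel_map<T: Sync, R: Send>(
    items: &[T],
    workers: usize,
    f: impl Fn(&T) -> R + Sync,
) -> Vec<R> {
    let workers = workers.max(1);
    let chunk_len = ((items.len() + workers - 1) / workers).max(1);
    let f = &f;
    thread::scope(|scope| {
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Join in spawn order so the output order equals the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}

/// `count` unique files, with "package/index.js" appearing first and again last.
fn fixture_with_duplicate(count: usize) -> Vec<(String, Vec<u8>)> {
    let mut entries = vec![("package/index.js".to_string(), b"first".to_vec())];
    entries.extend((0..count).map(|i| (format!("package/file-{i}.js"), format!("body {i}").into_bytes())));
    entries.push(("package/index.js".to_string(), b"second".to_vec()));
    entries
}

/// Processes entries in parallel, then builds the path map (last entry wins).
fn index_last_wins(entries: &[(String, Vec<u8>)], workers: usize) -> HashMap<String, Vec<u8>> {
    ordered_parallel_map(entries, workers, |(path, body)| (path.clone(), body.clone()))
        .into_iter()
        .collect()
}
```

A test would build `fixture_with_duplicate(40)`, confirm its length is at least `PARALLEL_EXTRACT_THRESHOLD`, and assert that the map's value for `package/index.js` is the later payload.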
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs
📄 CodeRabbit inference engine (pacquet/AGENTS.md)
pacquet/**/*.rs: Log emissions are part of matching pnpm: when porting a function that fires `pnpm:<channel>` events through `globalLogger`, `logger.debug(...)`, or `streamParser.write(...)`, mirror the call site, payload, and ordering so `@pnpm/cli.default-reporter` parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing the brand into a plain `String` or `&str` in Rust
If upstream TypeScript always validates before construction of a branded string, validate in the Rust wrapper too via `TryFrom<String>` and/or `FromStr` and do not provide an infallible public constructor
If upstream TypeScript never validates a branded string, just brand for type-safety in Rust by exposing an infallible `From<String>` constructor
If upstream TypeScript occasionally constructs a branded string without validation, expose `from_str_unchecked` in Rust as an escape hatch alongside the validating constructor
Match upstream serde behavior for branded strings crossing JSON, YAML, or INI boundaries by using `#[serde(try_from = "String")]` for deserialization and `#[serde(into = "String")]` for serialization
Derive simple conversions for branded strings using `#[derive(derive_more::From)]` and `#[derive(derive_more::Into)]` instead of handwriting `impl` blocks; use manual `impl` only when conversion needs custom logic
Model TypeScript string literal unions (like `'auto' | 'always' | 'never'`) as Rust `enum`s instead of newtype wrappers, since the set of valid values is closed
Treat TypeScript string template literal types (like `${string}@${string}`) the same as branded string types in Rust, using a newtype wrapper with validation
Follow the code style guide in `CODE_STYLE_GUIDE.md`: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, `pipe-trait`, error handling, test layout, and cloning of `Arc` and `Rc`
Choose owned vs. borrowed parameters to minimize copies; widen to t...
Files:
pacquet/crates/tarball/src/lib.rs
🧠 Learnings (10)
📓 Common learnings
Learnt from: zkochan
Repo: pnpm/pnpm PR: 12181
File: worker/src/start.ts:504-520
Timestamp: 2026-06-04T06:04:05.107Z
Learning: In pnpm/pnpm's pnpr install accelerator, the `/v1/install` response has a two-level framing structure:
1. **Outer layer** (full HTTP body): `[u32 outer header length][outer header JSON][files payload]` — `fetchFromPnpmRegistry` (pnpr/client/src/fetchFromPnpmRegistry.ts) strips the outer layer with `body.subarray(4 + headerLength)` and passes the remaining bytes to `writeCafsFiles`.
2. **Inner layer** (files payload): the files payload itself starts with its own `[u32 inner json length][inner header JSON]` prefix (built by the server's `build_files_payload` / `empty_files_payload_prefix`), followed by `[64-byte digest][u32 size][1-byte exec][content]` frames and a 64-zero-byte end marker.
`writeCafsFiles` in `worker/src/start.ts` is correct to read `jsonLen = payload.readUInt32BE(0)` and start frames at `offset = 4 + jsonLen` — this skips the inner header. The same two-level structure is mirrored in the Rust reference client (`parse_inline_response` + `write_files_payload`). Do not fla...
Learnt from: CR
Repo: pnpm/pnpm PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-05-25T12:36:42.202Z
Learning: User-visible changes (CLI flags, defaults, environment variables, lockfile/manifest/state-file formats, error codes/messages, log emissions, store layout, hook semantics) in pnpm must be mirrored to pacquet in the same PR
📚 Learning: 2026-05-29T18:03:24.797Z
Learnt from: CR
Repo: pnpm/pnpm PR: 0
File: pnpr/AGENTS.md:0-0
Timestamp: 2026-05-29T18:03:24.797Z
Learning: Prefer existing pacquet-* crates over writing new code; check pacquet-tarball, pacquet-crypto-hash, pacquet-crypto-shasums-file, pacquet-package-manifest, pacquet-network, pacquet-registry, pacquet-fs, and pacquet-diagnostics before implementing non-trivial functionality
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-06-04T20:24:32.096Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 12198
File: pnpr/crates/pnpr/src/storage.rs:469-477
Timestamp: 2026-06-04T20:24:32.096Z
Learning: In `pnpr/crates/pnpr/src/storage.rs` (pnpm/pnpm repo, Rust), `Store::list_package_names` intentionally uses `fs::try_exists(...).await.unwrap_or(false)` and `if let Ok(mut inner) = fs::read_dir(...)` — NOT `?`-propagation — for per-entry checks. This is deliberate best-effort / verdaccio-style search behavior: (1) `try_exists(stray_file/package.json)` returns `ENOTDIR` (not `NotFound`) for a stray non-package file in the store root, so `?` would fail the entire search; (2) the `@`-scope `read_dir` would fail on a non-directory `@`-named entry; (3) switching to `DirEntry::file_type()` would stop following symlinked package dirs. Failures that DO propagate are preserved: opening the store root itself, and `next_entry()` during the walk. Do not suggest blanket `?`-propagation for these per-entry checks.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-25T14:58:11.105Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11931
File: pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs:560-589
Timestamp: 2026-05-25T14:58:11.105Z
Learning: In `pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs`, all per-`(registry, name[, version])` caches in `NpmResolutionVerifier` (`published_at`, `full_meta`, `full_meta_for_trust`, `abbreviated_meta`, `local_meta`) intentionally use the same pattern: lock → miss-check → release lock → await fetch/load → re-acquire lock → insert. This uniform pattern is deliberate; do not flag individual caches for using it. The known follow-up improvement (replacing the pattern with `tokio::sync::OnceCell` per key inside a `Mutex<HashMap<…>>`) is tracked as a future structural change to cover all five caches simultaneously.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-23T09:14:43.635Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11867
File: pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs:726-730
Timestamp: 2026-05-23T09:14:43.635Z
Learning: In `pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs`, the fresh-lockfile path intentionally does not invoke `BuildModules` and discards `side_effects_maps_by_snapshot` from `CreateVirtualStoreOutput`. This is pre-existing, documented behavior (mirroring upstream `link.ts:167-170`): `importing_done` fires once extraction and symlink linking are complete, and the fresh-lockfile path does not run lifecycle scripts. The frozen-lockfile path wires `BuildModules` end-to-end as normal. Do not flag this omission as a bug; wiring lifecycle scripts into the fresh-lockfile path is tracked as future work separate from perf refactors.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-20T21:18:55.266Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11778
File: pacquet/crates/resolving-local-resolver/src/parse_bare_specifier.rs:253-278
Timestamp: 2026-05-20T21:18:55.266Z
Learning: In `pacquet/crates/resolving-local-resolver/src/parse_bare_specifier.rs`, the `resolve_path` function intentionally short-circuits absolute specifiers verbatim (returns them unchanged without normalizing `..` components), mirroring the upstream TypeScript `resolvePath` in `resolving/local-resolver/src/parseBareSpecifier.ts` at ef87f3ccff. The OS resolves `..` at `fs.read` time. Do not suggest normalizing the absolute branch — it would invent behavior pnpm doesn't have, violating the pacquet AGENTS.md cardinal rule of fidelity to upstream.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-20T19:40:55.051Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-22T00:08:44.646Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.
Applied to files:
pacquet/crates/tarball/src/lib.rs
📚 Learning: 2026-05-20T23:07:58.444Z
Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.
Applied to files:
pacquet/crates/tarball/src/lib.rs
🔇 Additional comments (1)
pacquet/crates/tarball/src/lib.rs (1)
438-474: LGTM! Also applies to: 524-662, 682-703
Integrated-Benchmark Report (Linux)
Each scenario has pacquet rows (direct install) and pnpr rows (the same client through the pnpr install accelerator), so pnpr@HEAD vs pacquet@HEAD is the pnpr-vs-direct ratio. Cold-store scenarios wipe the client store between runs (warm server); hot-store scenarios keep it warm. The pacquet@HEAD rows feed the pacquet Bencher testbed; the pnpr@HEAD rows feed the pnpr testbed.
Scenario: Isolated linker: fresh restore, cold cache + cold store
BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 10.15305668296,
"stddev": 0.10182340400711702,
"median": 10.11216729186,
"user": 3.6059113799999993,
"system": 4.395553719999999,
"min": 10.03211567836,
"max": 10.33094130036,
"times": [
10.19475652136,
10.07486729836,
10.32629012236,
10.33094130036,
10.14884015836,
10.11019093836,
10.09677898936,
10.10164217736,
10.11414364536,
10.03211567836
]
},
{
"command": "pacquet@main",
"mean": 10.02193943276,
"stddev": 0.12395331725879401,
"median": 9.978084820860001,
"user": 3.36657918,
"system": 4.267394819999999,
"min": 9.90642244536,
"max": 10.22277175536,
"times": [
9.92146659636,
10.22277175536,
9.95855224736,
9.99836254936,
9.90642244536,
9.93528379936,
9.92017655236,
10.14848433336,
9.99761739436,
10.21025665436
]
},
{
"command": "pnpr@HEAD",
"mean": 5.1626692061599995,
"stddev": 0.08371392044020967,
"median": 5.12696918386,
"user": 2.73484238,
"system": 4.042505220000001,
"min": 5.09856722536,
"max": 5.3479248433599995,
"times": [
5.13400173036,
5.28622033836,
5.15013158036,
5.3479248433599995,
5.11883498236,
5.09856722536,
5.11455629636,
5.1259512663599995,
5.12798710136,
5.12251669736
]
},
{
"command": "pnpr@main",
"mean": 5.04852455836,
"stddev": 0.05253407501085945,
"median": 5.03217273486,
"user": 2.4778984799999995,
"system": 3.8919188199999994,
"min": 5.00134041536,
"max": 5.17182077936,
"times": [
5.03149747536,
5.00792122836,
5.00615765436,
5.01710413236,
5.03284799436,
5.05052028636,
5.08849977636,
5.17182077936,
5.07753584136,
5.00134041536
]
}
]
}
Scenario: Isolated linker: fresh restore, hot cache + hot store
BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.6727161420200002,
"stddev": 0.01600701583256559,
"median": 0.6713466896200001,
"user": 0.38210624000000004,
"system": 1.31715764,
"min": 0.6493276551200001,
"max": 0.6927292181200001,
"times": [
0.6896765321200001,
0.6927292181200001,
0.6781179801200001,
0.68320303412,
0.66361954712,
0.68976473512,
0.6645753991200001,
0.6493276551200001,
0.6636945151200001,
0.6524528041200001
]
},
{
"command": "pacquet@main",
"mean": 0.6790251945200001,
"stddev": 0.029567044382201157,
"median": 0.66924675062,
"user": 0.37753994,
"system": 1.31800454,
"min": 0.6497485161200001,
"max": 0.7374711111200001,
"times": [
0.7374711111200001,
0.6497485161200001,
0.7266614651200001,
0.6787752181200001,
0.6704182911200001,
0.6541672781200001,
0.6630080061200001,
0.67893630512,
0.66807521012,
0.66299054412
]
},
{
"command": "pnpr@HEAD",
"mean": 0.8034518379200002,
"stddev": 0.07656192707562987,
"median": 0.7692472856200001,
"user": 0.39216274,
"system": 1.3236017399999997,
"min": 0.7457819901200001,
"max": 0.9996492901200001,
"times": [
0.8541020881200001,
0.7617983531200001,
0.7659824901200001,
0.9996492901200001,
0.8224780311200001,
0.76437665212,
0.7725120811200001,
0.7457819901200001,
0.7918260581200001,
0.7560113451200001
]
},
{
"command": "pnpr@main",
"mean": 0.78653043812,
"stddev": 0.07034894497147766,
"median": 0.7551890911200001,
"user": 0.37835064000000007,
"system": 1.32344304,
"min": 0.74216280312,
"max": 0.9201077241200001,
"times": [
0.9201077241200001,
0.7512044571200001,
0.7676084561200001,
0.7460703381200001,
0.9184218371200001,
0.7605075181200001,
0.7499792891200001,
0.74216280312,
0.75917372512,
0.75006823312
]
}
]
}
Scenario: Isolated linker: fresh install, cold cache + cold store
BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 5.3403243291,
"stddev": 0.03418814181014108,
"median": 5.3357790312999995,
"user": 3.7151861399999992,
"system": 3.3741702600000005,
"min": 5.2926780143,
"max": 5.3963232093,
"times": [
5.3497050703,
5.3054369653,
5.3781563753,
5.2926780143,
5.3727343723,
5.3301329333,
5.3401711823,
5.3065182883,
5.3313868803,
5.3963232093
]
},
{
"command": "pacquet@main",
"mean": 5.314160601999999,
"stddev": 0.040144265262532695,
"median": 5.308580321300001,
"user": 3.6935934399999995,
"system": 3.29832976,
"min": 5.2619899803,
"max": 5.3998589333,
"times": [
5.2619899803,
5.3118421733000005,
5.3290790373,
5.3324252763,
5.3998589333,
5.3053184693,
5.3453233363,
5.2746111173,
5.2992647023,
5.2818929943
]
},
{
"command": "pnpr@HEAD",
"mean": 2.0183562845000003,
"stddev": 0.02740664470225432,
"median": 2.0191614538,
"user": 2.53259494,
"system": 3.3599185599999997,
"min": 1.9735928763000001,
"max": 2.0579608283,
"times": [
2.0335341213,
1.9735928763000001,
2.0065524863,
2.0450049773,
2.0115940643,
2.0010406473,
2.0425031853,
2.0579608283,
1.9850508153000002,
2.0267288433
]
},
{
"command": "pnpr@main",
"mean": 2.0055636841,
"stddev": 0.0559111107042792,
"median": 1.9881875273,
"user": 2.44694124,
"system": 3.19423916,
"min": 1.9567277953000002,
"max": 2.1215397503,
"times": [
1.9567277953000002,
2.1215397503,
1.9672740953,
2.0278262913000002,
1.9598114903000001,
1.9961370703,
1.9802379843000002,
1.9642380293000001,
2.0815767653,
2.0002675693
]
}
]
}
Scenario: Isolated linker: fresh install, hot cache + hot store
BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.40421160456,
"stddev": 0.023036377926355215,
"median": 1.40735798036,
"user": 1.5628231599999998,
"system": 1.7630129999999997,
"min": 1.35198998636,
"max": 1.43166520336,
"times": [
1.43166520336,
1.41897878736,
1.4043142013599998,
1.40172526436,
1.40300474336,
1.41040175936,
1.4106411723599999,
1.42693961736,
1.38245531036,
1.35198998636
]
},
{
"command": "pacquet@main",
"mean": 1.3972593978599999,
"stddev": 0.04890659554488921,
"median": 1.39307946186,
"user": 1.5455735599999998,
"system": 1.7752575,
"min": 1.34595938736,
"max": 1.52126646336,
"times": [
1.35852936036,
1.36045741736,
1.52126646336,
1.34595938736,
1.37987386836,
1.40846910136,
1.40604421236,
1.40583524436,
1.38811584836,
1.39804307536
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6641498885600001,
"stddev": 0.037137718026627624,
"median": 0.6513343608600001,
"user": 0.32664326,
"system": 1.2511532,
"min": 0.6360337823600001,
"max": 0.7661683243600002,
"times": [
0.67004396736,
0.6665771433600001,
0.6502201043600001,
0.6360337823600001,
0.7661683243600002,
0.6481242333600001,
0.6510423363600001,
0.6463756023600001,
0.6516263853600001,
0.6552870063600001
]
},
{
"command": "pnpr@main",
"mean": 0.6670260052600001,
"stddev": 0.05815157499711235,
"median": 0.64638563386,
"user": 0.32426736,
"system": 1.2486983999999999,
"min": 0.6329316713600001,
"max": 0.8283806463600001,
"times": [
0.6482891353600001,
0.6329316713600001,
0.6424540753600001,
0.6396802453600001,
0.6444821323600001,
0.65022772636,
0.8283806463600001,
0.67750801936,
0.6643083833600001,
0.6419980173600001
]
}
]
}
Scenario: Isolated linker: fresh install, cold cache + hot store
Resolution-only: cold packument cache (full re-resolve over the registry link) with a hot store (no tarball download), so this isolates pnpr offloading the client resolution to its warm server.
BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 4.9404086301,
"stddev": 0.021164435910854266,
"median": 4.9369614887,
"user": 1.6711996,
"system": 1.8634854399999998,
"min": 4.9146140527,
"max": 4.9943750867,
"times": [
4.9407150937,
4.9358562297,
4.9220174177,
4.9380667477,
4.9462401447,
4.9417158127,
4.9351079657,
4.9943750867,
4.9353777497,
4.9146140527
]
},
{
"command": "pacquet@main",
"mean": 4.982855574099999,
"stddev": 0.0396661248571493,
"median": 4.9693784802,
"user": 1.7421063,
"system": 1.89191494,
"min": 4.9342591927,
"max": 5.0451760377,
"times": [
4.9512037437,
4.9478029107,
4.9820144177,
5.0041191487,
4.9567425427,
5.0314896177,
4.9342591927,
4.9552071177,
5.0205410117,
5.0451760377
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6604425195,
"stddev": 0.01576705755338339,
"median": 0.6605498462000001,
"user": 0.3320259,
"system": 1.2606485399999998,
"min": 0.6378699937,
"max": 0.6960283617,
"times": [
0.6378699937,
0.6463218087,
0.6593177907000001,
0.6711741007,
0.6639390847000001,
0.6617819017000001,
0.6515148397,
0.6536415857,
0.6960283617,
0.6628357277
]
},
{
"command": "pnpr@main",
"mean": 0.6665280901999999,
"stddev": 0.04129406951800342,
"median": 0.6549629047000001,
"user": 0.3220881,
"system": 1.2585762399999998,
"min": 0.6312259277000001,
"max": 0.7609289407000001,
"times": [
0.6398787587,
0.6536712627,
0.6383717657,
0.6381312247,
0.6312259277000001,
0.7609289407000001,
0.7174773717,
0.6664194087,
0.6629216947000001,
0.6562545467
]
}
]
}
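To compare a HEAD row against its main row, one option is to recompute the mean from the `times` array and take the relative change. The sketch below does that for the pacquet rows of the first scenario (fresh restore, cold cache + cold store). The function and constant names are illustrative, and the sample values are copied from the report above.

```rust
/// `times` for pacquet@HEAD, fresh restore, cold cache + cold store.
const PACQUET_HEAD_TIMES: [f64; 10] = [
    10.19475652136, 10.07486729836, 10.32629012236, 10.33094130036, 10.14884015836,
    10.11019093836, 10.09677898936, 10.10164217736, 10.11414364536, 10.03211567836,
];

/// `times` for pacquet@main in the same scenario.
const PACQUET_MAIN_TIMES: [f64; 10] = [
    9.92146659636, 10.22277175536, 9.95855224736, 9.99836254936, 9.90642244536,
    9.93528379936, 9.92017655236, 10.14848433336, 9.99761739436, 10.21025665436,
];

/// Arithmetic mean of the samples (seconds).
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Relative change of `head` against `base`; positive means HEAD is slower.
fn relative_change(head: f64, base: f64) -> f64 {
    (head - base) / base
}
```

Calling `relative_change(mean(&PACQUET_HEAD_TIMES), mean(&PACQUET_MAIN_TIMES))` for this scenario gives a difference of a little over 1%, which should be read against the reported standard deviations before drawing conclusions.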
The per-file CAS-write parallelism added in #12247 ran on rayon's global pool. But the install pipeline overlaps tarball extraction with linking each resolved package into `node_modules`, and the linker drives its per-package work through `rayon::join` / `par_iter` on that same global pool. When a batch of downloads finished at once (hundreds of tarballs entering extraction together), the extraction work queued ahead of the linker's jobs and stalled linking for seconds.
Aligning the download/extract trace with the `imported` progress events on a ~1300-package fresh install showed the linker dropping to zero completions for ~1s right as an extraction surge landed, then grinding the rest out afterward. Extraction had gotten faster, but it stuttered the concurrent linker, so the net win on the pipeline was lost.
Route the parallel CAS writes through a dedicated rayon pool (sized to the core count; the work is CPU-bound SHA-512 + CAFS write) so an extraction burst can't monopolize the global pool the linker uses. The two phases now run concurrently without one starving the other: on the same fixture the linker no longer stalls (continuous completions through the extraction window) and the big-package extraction tail stays parallelized. Falls back to the global pool if the dedicated pool can't be built.
---
Written by an agent (Claude Code, claude-opus-4-8).
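The isolation idea in that follow-up, a pool reserved for extraction work with a fallback when it cannot be built, can be illustrated without rayon. The sketch below is a std-only approximation under stated assumptions: `ExtractPool`, `extract_pool`, `checksum`, and `process_batch` are hypothetical names, the checksum is a placeholder for the SHA-512 + CAFS write, and the fallback here runs the work inline, where the actual change falls back to rayon's global pool.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex, OnceLock};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// A fixed-size pool that only extraction jobs are submitted to.
struct ExtractPool {
    sender: Mutex<Sender<Job>>,
}

impl ExtractPool {
    /// Spawns `workers` threads; fails if the OS refuses to create one.
    fn build(workers: usize) -> std::io::Result<Self> {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        for i in 0..workers.max(1) {
            let receiver = Arc::clone(&receiver);
            thread::Builder::new().name(format!("extract-{i}")).spawn(move || loop {
                // The lock is held only while dequeuing, not while the job runs.
                let job = receiver.lock().unwrap().recv();
                match job {
                    Ok(job) => job(),
                    Err(_) => break, // every sender was dropped
                }
            })?;
        }
        Ok(Self { sender: Mutex::new(sender) })
    }
}

/// Lazily builds the dedicated pool, sized to the core count.
fn extract_pool() -> Option<&'static ExtractPool> {
    static POOL: OnceLock<Option<ExtractPool>> = OnceLock::new();
    POOL.get_or_init(|| {
        let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
        ExtractPool::build(cores).ok()
    })
    .as_ref()
}

/// Placeholder for the per-file hash + write.
fn checksum(payload: &[u8]) -> u64 {
    payload.iter().fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

/// Runs the batch on the dedicated pool, returning results in input order.
fn process_batch(payloads: Vec<Vec<u8>>) -> Vec<u64> {
    let Some(pool) = extract_pool() else {
        // Fallback path when the dedicated pool could not be built.
        return payloads.iter().map(|p| checksum(p)).collect();
    };
    let total = payloads.len();
    let (tx, rx) = channel::<(usize, u64)>();
    for (index, payload) in payloads.into_iter().enumerate() {
        let tx = tx.clone();
        let job: Job = Box::new(move || {
            let _ = tx.send((index, checksum(&payload)));
        });
        pool.sender.lock().unwrap().send(job).expect("pool workers alive");
    }
    drop(tx); // the receive loop ends once every job has dropped its sender
    let mut results = vec![0u64; total];
    for (index, value) in rx {
        results[index] = value;
    }
    results
}
```

Because these workers are separate from whatever executor the linker uses, a burst of extraction jobs queues only behind other extraction jobs.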
Problem
`extract_tarball_entries` walked the tar in a single serial loop, hashing and writing each file into the content-addressed store one at a time, on the one `spawn_blocking` thread an extraction runs on. A package with many files (e.g. `core-js`, which unpacks to thousands) pinned a single core for its whole extraction while the rest of the machine sat idle, most visibly at the makespan tail, when one big package is the last extraction still running and every other core is free. This is why pnpm (which parallelizes extraction across workers) finishes big tarballs faster.
Fix
Split extraction into two phases:
- A serial pass that walks the seekable tar stream to validate and clean each regular-file path and capture a borrow of its payload.
- A parallel pass that hashes and writes each file into the CAS across the rayon pool.
`StoreDir::write_cas_file` is content-addressed and already documented as safe to call concurrently (its shard-creation cache is race-tolerant), so the output (CAS files, the `{path → cafs path}` map, and the `PackageFilesIndex` row) is byte-identical. Result order is preserved, so last-entry-wins for duplicate paths is unchanged. Small tarballs (< 32 files) stay serial to skip rayon's per-job dispatch cost.
Measured
Fresh install of a ~1300-package fixture (10-core machine):
Before: `core-js@3` @ ~10.7s. After: `core-js@3` parallelized away.
The extraction tail roughly halved. Note: total install time on this fixture is dominated by the downstream hardlink/import phase, so this speeds up the extraction phase specifically rather than the whole install, but it's a clear win on machines/fixtures where extraction is a larger share (more cores, more big packages).
Tests
All 54 `pacquet-tarball` tests pass (extraction output is byte-identical); clippy clean.
Written by an agent (Claude Code, claude-opus-4-8).
Summary by CodeRabbit