perf(tarball): parallelize per-file CAS writes within a tarball by zkochan · Pull Request #12247 · pnpm/pnpm

zkochan · 2026-06-06T17:18:27Z

Problem

extract_tarball_entries walked the tar in a single serial loop, hashing and writing each file into the content-addressed store one at a time, on the one spawn_blocking thread an extraction runs on. A package with many files (e.g. core-js, which unpacks to thousands) pinned a single core for its whole extraction while the rest of the machine sat idle — most visibly at the makespan tail, when one big package is the last extraction still running and every other core is free. This is why pnpm (which parallelizes extraction across workers) finishes big tarballs faster.

Fix

Split extraction into two phases:

1. Serial pass: walk the seekable tar stream to validate and clean each regular-file path and capture a borrow of its payload (cheap: header parsing only, no hashing or I/O).
2. Parallel pass: hash and write each file into the CAS across the rayon pool.

StoreDir::write_cas_file is content-addressed and already documented as safe to call concurrently (its shard-creation cache is race-tolerant), so the output — CAS files, the {path → cafs path} map, and the PackageFilesIndex row — is byte-identical. Result order is preserved, so last-entry-wins for duplicate paths is unchanged. Small tarballs (< 32 files) stay serial to skip rayon's per-job dispatch cost.
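
The two-phase shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Pending`, `ToyCas`, `write_all`, and `PARALLEL_THRESHOLD` are hypothetical names, an in-memory map keyed by `DefaultHasher` stands in for the real store and its content hash, and `std::thread::scope` stands in for the rayon pool so the sketch needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;
use std::thread;

/// Hypothetical name; the value 32 is the cutoff given in the description.
const PARALLEL_THRESHOLD: usize = 32;

/// A staged entry from the serial pass: cleaned path plus a borrow of the payload.
struct Pending<'a> {
    path: String,
    payload: &'a [u8],
}

/// Toy content-addressed store (assumption: the real store writes files on disk).
#[derive(Default)]
struct ToyCas {
    blobs: Mutex<HashMap<u64, Vec<u8>>>,
}

impl ToyCas {
    /// Identical content always maps to the same key, so concurrent writers
    /// of the same payload converge on one entry.
    fn write(&self, payload: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        payload.hash(&mut hasher);
        let key = hasher.finish();
        self.blobs
            .lock()
            .unwrap()
            .entry(key)
            .or_insert_with(|| payload.to_vec());
        key
    }
}

/// Hash and write every staged entry, returning results in input order.
fn write_all(cas: &ToyCas, pending: &[Pending<'_>]) -> Vec<(String, u64)> {
    let write_one = |p: &Pending<'_>| (p.path.clone(), cas.write(p.payload));

    // Small inputs stay serial to avoid paying for thread dispatch.
    if pending.len() < PARALLEL_THRESHOLD {
        return pending.iter().map(write_one).collect();
    }

    let workers = thread::available_parallelism().map_or(4, |n| n.get());
    let chunk_len = pending.len().div_ceil(workers).max(1);
    thread::scope(|scope| {
        // Spawn one worker per contiguous chunk.
        let handles: Vec<_> = pending
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(write_one).collect::<Vec<_>>()))
            .collect();
        // Join in spawn order, so output order matches input order
        // no matter which worker finishes first.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().unwrap())
            .collect::<Vec<_>>()
    })
}
```

Because the chunks are contiguous and joined in spawn order, a later fold over the results sees entries in tar order, which is what keeps last-entry-wins intact in this sketch.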

Measured

Fresh install of a ~1300-package fixture (10-core machine):

	extraction tail	all extractions done
before	`core-js@3` @ ~10.7s	~10.7s
after	`core-js@3` parallelized away	~5.5s

The extraction tail roughly halved. Note: total install time on this fixture is dominated by the downstream hardlink/import phase, so this speeds up the extraction phase specifically rather than the whole install — but it's a clear win on machines/fixtures where extraction is a larger share (more cores, more big packages).

Tests

All 54 pacquet-tarball tests pass (extraction output is byte-identical); clippy clean.

Written by an agent (Claude Code, claude-opus-4-8).

Summary by CodeRabbit

Refactor
- Optimized tarball extraction process to handle large package files more efficiently through improved batch processing.
- Enhanced package.json manifest parsing and normalization during extraction for better reliability.

`extract_tarball_entries` walked the tar in a single serial loop, hashing and writing each file into the CAS one at a time on the one `spawn_blocking` thread the extraction runs on. A package with many files (e.g. `core-js`, which unpacks to thousands) therefore pinned a single core for the whole extraction while the rest of the machine sat idle, most visibly at the makespan tail, when one big package is the last extraction still running and every other core is free.

Split extraction into two phases: a serial pass that walks the seekable tar stream to validate and clean each regular-file path and capture a borrow of its payload, then a parallel pass that hashes and writes each file into the CAS across the rayon pool.

`StoreDir::write_cas_file` is content-addressed and already documented as safe to call concurrently (its shard-creation cache is race-tolerant), so the output (the CAS files, the `{path → cafs path}` map, and the `PackageFilesIndex` row) is byte-identical; result order is preserved so the last-entry-wins behavior for duplicate paths is unchanged. Small tarballs (under 32 files) stay on the serial path to avoid rayon's per-job dispatch cost when there's nothing to gain.

On a fresh install of a ~1300-package fixture this cut the extraction tail roughly in half: the largest package (`core-js@3`) finished extracting at ~10.7s before and ~5.5s after, and all extractions completed by ~5.5s instead of ~10.7s. (Total install time on that fixture is dominated by the downstream hardlink/import phase, so this speeds up extraction specifically rather than the whole install.)

---

Written by an agent (Claude Code, claude-opus-4-8).

coderabbitai · 2026-06-06T17:18:34Z

📝 Walkthrough

Walkthrough

The refactoring defers CAS writes and per-entry indexing until after tar walking completes. Phase 1 stages entries and captures the bundled manifest; phase 2 batch-processes staged files using Rayon parallelism when beneficial; phase 3 reconstructs the output maps, preserving last-wins semantics for duplicates.

Changes

Tarball extraction staging and batch processing

Layer / File(s)	Summary
Staging structures and write helper `pacquet/crates/tarball/src/lib.rs`	New `PendingFile` struct holds validated path, payload slice, and metadata; `write_cas_entry` helper writes one payload to CAS and produces `CafsFileInfo` with `checked_at` timestamp.
Phase 1: Tar walking with staging and manifest capture `pacquet/crates/tarball/src/lib.rs`	Setup introduces `pending` staging buffer and separate `manifest` variable; in-loop CAS write removed; `package.json` parsed as JSON, normalized via `normalize_bundled_manifest`, and regular files staged as `PendingFile` entries with payload slices.
Phase 2 & 3: Batch extraction and index assembly `pacquet/crates/tarball/src/lib.rs`	Phase 2 collects `write_cas_entry` results using Rayon `par_iter` when `pending.len() ≥ 32` else serial `iter`; phase 3 reconstructs `cas_paths` and `PackageFilesIndex.files` from results with last-wins semantics and duplicate warnings, then constructs final `PackageFilesIndex` with captured manifest.
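
As a rough illustration of the phase 3 step summarized above, the following sketch rebuilds a path-keyed map from ordered results with last-wins semantics and collects the duplicate paths that would trigger a warning. The function name `assemble_last_wins` and the generic value type are assumptions made for the example; the real code builds `cas_paths` and `PackageFilesIndex.files`.

```rust
use std::collections::HashMap;

/// Rebuild a path -> value map from results that are in tar order.
/// A later entry overwrites an earlier one with the same path; every
/// overwritten path is also reported so the caller can log a warning.
fn assemble_last_wins<V>(results: Vec<(String, V)>) -> (HashMap<String, V>, Vec<String>) {
    let mut map = HashMap::with_capacity(results.len());
    let mut duplicates = Vec::new();
    for (path, value) in results {
        // `insert` returns the previous value when the key already existed.
        if map.insert(path.clone(), value).is_some() {
            duplicates.push(path);
        }
    }
    (map, duplicates)
}
```

The fold only gives last-wins if `results` arrives in the same order the entries appeared in the tar, which is why phase 2 has to preserve order.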

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

pnpm/pnpm#12131: Prior refactor of extract_tarball_entries that modified tar entry payload handling and CAS write behavior; this PR builds on the same extraction-path refactoring.

Poem

🐰 Three phases dance with tarball grace,
Stage the files, then write with pace,
Rayon speeds the work along,
Bundle manifests both right and strong,
Pack it tight, the CAS is blessed!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'perf(tarball): parallelize per-file CAS writes within a tarball' directly and specifically describes the main change: parallelizing CAS writes for tarball extraction.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


github-actions · 2026-06-06T17:23:31Z

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.01      7.8±0.27ms   555.4 KB/sec    1.00      7.7±0.32ms   559.9 KB/sec

coderabbitai

🧹 Nitpick comments (1)

pacquet/crates/tarball/src/lib.rs (1)
673-680: ⚡ Quick win

Add a regression test that forces the parallel branch with duplicate paths.

This refactor’s last-wins guarantee now depends on the batched write path preserving pending order through the Rayon collection. A >= 32 entry fixture with a duplicate filename would lock that contract down.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pacquet/crates/tarball/src/lib.rs` around lines 673 - 680, Add a regression
test that exercises the parallel branch (use at least PARALLEL_EXTRACT_THRESHOLD
entries, i.e. 32) and includes duplicate filenames in the pending list to assert
the "last-wins" outcome; construct a pending Vec with ordered entries where the
later duplicate should overwrite the earlier one, call the code path that
invokes write_cas_entry (so the conditional using PARALLEL_EXTRACT_THRESHOLD
triggers the par_iter branch), then assert the resulting written
collection/store reflects the last entry for the duplicate path (compare file
content or CafsFileInfo) to lock down the ordering contract.
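
A regression test along these lines might look roughly like the sketch below. It does not use pacquet's real test helpers or `write_cas_entry`: `par_map_ordered`, `duplicate_fixture`, and `last_wins` are hypothetical, std threads stand in for Rayon's indexed `par_iter`, and the payload length stands in for the `CafsFileInfo` that the real test would compare.

```rust
use std::collections::HashMap;
use std::thread;

/// Value taken from the review comment; the constant name here is illustrative.
const THRESHOLD: usize = 32;

/// Order-preserving parallel map over contiguous chunks.
fn par_map_ordered<T: Sync, R: Send>(items: &[T], f: impl Fn(&T) -> R + Sync) -> Vec<R> {
    let chunk_len = items.len().div_ceil(4).max(1);
    let f = &f;
    thread::scope(|scope| {
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Joining in spawn order keeps results in input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().unwrap())
            .collect::<Vec<R>>()
    })
}

/// `THRESHOLD + 1` entries; the last one reuses the first entry's path
/// with different content, so it must be the one that survives.
fn duplicate_fixture() -> Vec<(String, Vec<u8>)> {
    let mut entries: Vec<(String, Vec<u8>)> = (0..THRESHOLD)
        .map(|i| (format!("package/file-{i}.js"), format!("body {i}").into_bytes()))
        .collect();
    entries.push(("package/file-0.js".to_string(), b"final body".to_vec()));
    entries
}

/// Process entries in parallel, then fold into a map where later keys overwrite earlier ones.
fn last_wins(entries: &[(String, Vec<u8>)]) -> HashMap<String, usize> {
    let results = par_map_ordered(entries, |(path, body)| (path.clone(), body.len()));
    results.into_iter().collect()
}
```

A test built on this would assert that the entry stored under `package/file-0.js` reflects the final payload rather than the first one.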


ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between c199198 and 15f2686.

📒 Files selected for processing (1)

pacquet/crates/tarball/src/lib.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Run benchmark on ubuntu-latest

🧰 Additional context used

📓 Path-based instructions (1)

pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: Log emissions are part of matching pnpm — when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing the brand into a plain String or &str in Rust
If upstream TypeScript always validates before construction of a branded string, validate in the Rust wrapper too via TryFrom<String> and/or FromStr and do not provide an infallible public constructor
If upstream TypeScript never validates a branded string, just brand for type-safety in Rust by exposing an infallible From<String> constructor
If upstream TypeScript occasionally constructs a branded string without validation, expose from_str_unchecked in Rust as an escape hatch alongside the validating constructor
Match upstream serde behavior for branded strings crossing JSON, YAML, or INI boundaries by using #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization
Derive simple conversions for branded strings using #[derive(derive_more::From)] and #[derive(derive_more::Into)] instead of handwriting impl blocks; use manual impl only when conversion needs custom logic
Model TypeScript string literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers, since the set of valid values is closed
Treat TypeScript string template literal types (like `${string}@${string}`) the same as branded string types in Rust, using a newtype wrapper with validation
Follow the code style guide in CODE_STYLE_GUIDE.md — imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Choose owned vs. borrowed parameters to minimize copies; widen to t...

Files:

pacquet/crates/tarball/src/lib.rs

🧠 Learnings (10)

📓 Common learnings

Learnt from: zkochan
Repo: pnpm/pnpm PR: 12181
File: worker/src/start.ts:504-520
Timestamp: 2026-06-04T06:04:05.107Z
Learning: In pnpm/pnpm's pnpr install accelerator, the `/v1/install` response has a two-level framing structure:
1. **Outer layer** (full HTTP body): `[u32 outer header length][outer header JSON][files payload]` — `fetchFromPnpmRegistry` (pnpr/client/src/fetchFromPnpmRegistry.ts) strips the outer layer with `body.subarray(4 + headerLength)` and passes the remaining bytes to `writeCafsFiles`.
2. **Inner layer** (files payload): the files payload itself starts with its own `[u32 inner json length][inner header JSON]` prefix (built by the server's `build_files_payload` / `empty_files_payload_prefix`), followed by `[64-byte digest][u32 size][1-byte exec][content]` frames and a 64-zero-byte end marker.

`writeCafsFiles` in `worker/src/start.ts` is correct to read `jsonLen = payload.readUInt32BE(0)` and start frames at `offset = 4 + jsonLen` — this skips the inner header. The same two-level structure is mirrored in the Rust reference client (`parse_inline_response` + `write_files_payload`). Do not fla...

Learnt from: CR
Repo: pnpm/pnpm PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-05-25T12:36:42.202Z
Learning: User-visible changes (CLI flags, defaults, environment variables, lockfile/manifest/state-file formats, error codes/messages, log emissions, store layout, hook semantics) in pnpm must be mirrored to pacquet in the same PR

📚 Learning: 2026-05-29T18:03:24.797Z

Learnt from: CR
Repo: pnpm/pnpm PR: 0
File: pnpr/AGENTS.md:0-0
Timestamp: 2026-05-29T18:03:24.797Z
Learning: Prefer existing pacquet-* crates over writing new code; check pacquet-tarball, pacquet-crypto-hash, pacquet-crypto-shasums-file, pacquet-package-manifest, pacquet-network, pacquet-registry, pacquet-fs, and pacquet-diagnostics before implementing non-trivial functionality

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-06-04T20:24:32.096Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 12198
File: pnpr/crates/pnpr/src/storage.rs:469-477
Timestamp: 2026-06-04T20:24:32.096Z
Learning: In `pnpr/crates/pnpr/src/storage.rs` (pnpm/pnpm repo, Rust), `Store::list_package_names` intentionally uses `fs::try_exists(...).await.unwrap_or(false)` and `if let Ok(mut inner) = fs::read_dir(...)` — NOT `?`-propagation — for per-entry checks. This is deliberate best-effort / verdaccio-style search behavior: (1) `try_exists(stray_file/package.json)` returns `ENOTDIR` (not `NotFound`) for a stray non-package file in the store root, so `?` would fail the entire search; (2) the `@`-scope `read_dir` would fail on a non-directory `@`-named entry; (3) switching to `DirEntry::file_type()` would stop following symlinked package dirs. Failures that DO propagate are preserved: opening the store root itself, and `next_entry()` during the walk. Do not suggest blanket `?`-propagation for these per-entry checks.

Applied to files:

pacquet/crates/tarball/src/lib.rs


📚 Learning: 2026-05-25T14:58:11.105Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11931
File: pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs:560-589
Timestamp: 2026-05-25T14:58:11.105Z
Learning: In `pacquet/crates/resolving-npm-resolver/src/create_npm_resolution_verifier.rs`, all per-`(registry, name[, version])` caches in `NpmResolutionVerifier` (`published_at`, `full_meta`, `full_meta_for_trust`, `abbreviated_meta`, `local_meta`) intentionally use the same pattern: lock → miss-check → release lock → await fetch/load → re-acquire lock → insert. This uniform pattern is deliberate; do not flag individual caches for using it. The known follow-up improvement (replacing the pattern with `tokio::sync::OnceCell` per key inside a `Mutex<HashMap<…>>`) is tracked as a future structural change to cover all five caches simultaneously.

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-05-23T09:14:43.635Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11867
File: pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs:726-730
Timestamp: 2026-05-23T09:14:43.635Z
Learning: In `pacquet/crates/package-manager/src/install_with_fresh_lockfile.rs`, the fresh-lockfile path intentionally does not invoke `BuildModules` and discards `side_effects_maps_by_snapshot` from `CreateVirtualStoreOutput`. This is pre-existing, documented behavior (mirroring upstream `link.ts:167-170`): `importing_done` fires once extraction and symlink linking are complete, and the fresh-lockfile path does not run lifecycle scripts. The frozen-lockfile path wires `BuildModules` end-to-end as normal. Do not flag this omission as a bug; wiring lifecycle scripts into the fresh-lockfile path is tracked as future work separate from perf refactors.

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-05-20T21:18:55.266Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11778
File: pacquet/crates/resolving-local-resolver/src/parse_bare_specifier.rs:253-278
Timestamp: 2026-05-20T21:18:55.266Z
Learning: In `pacquet/crates/resolving-local-resolver/src/parse_bare_specifier.rs`, the `resolve_path` function intentionally short-circuits absolute specifiers verbatim (returns them unchanged without normalizing `..` components), mirroring the upstream TypeScript `resolvePath` in `resolving/local-resolver/src/parseBareSpecifier.ts` at ef87f3ccff. The OS resolves `..` at `fs.read` time. Do not suggest normalizing the absolute branch — it would invent behavior pnpm doesn't have, violating the pacquet AGENTS.md cardinal rule of fidelity to upstream.

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-05-20T19:40:55.051Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11774
File: pacquet/crates/resolving-deps-resolver/src/resolve_peers.rs:0-0
Timestamp: 2026-05-20T19:40:55.051Z
Learning: In the pacquet Rust code, ensure the semver implementation uses the `node-semver` crate (not `nodejs-semver`). `node-semver`’s public API does not include a `satisfies_with_prerelease`-style method; prerelease-tolerant matching should be implemented inline by first calling `Range::satisfies`, and when it rejects a prerelease version, retry matching against a stripped `MAJOR.MINOR.PATCH` base of the prerelease version.

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-05-22T00:08:44.646Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11837
File: pacquet/crates/resolving-npm-resolver/src/pick_package.rs:33-51
Timestamp: 2026-05-22T00:08:44.646Z
Learning: In the pnpm/pnpm repo’s pacquet Rust crates, do not flag Unicode ellipsis characters (U+2026, `…`) in Rust doc comments (`///` / `/** */`) as a lint violation. The pacquet crate’s `dylint.toml` only enables `perfectionist::derive_ordering`, and the Dylint `unicode-ellipsis` rule is not enabled for this project—so `…` in doc comments is an intentional, repo-consistent style.

Applied to files:

pacquet/crates/tarball/src/lib.rs

📚 Learning: 2026-05-20T23:07:58.444Z

Learnt from: zkochan
Repo: pnpm/pnpm PR: 11784
File: pacquet/crates/resolving-deps-resolver/src/hoist_peers.rs:120-133
Timestamp: 2026-05-20T23:07:58.444Z
Learning: When reviewing code in this pacquet Rust port, follow the upstream pnpm compatibility rule: only match pnpm’s behavior exactly. Do not propose review changes that intentionally deviate from pnpm’s documented/observed behavior, even if pnpm appears buggy. If you identify a real bug in pnpm behavior, the review should prioritize fixing it upstream in pnpm first, and avoid implementing a pnpm-behavior workaround here unless the same fix has already landed upstream.

Applied to files:

pacquet/crates/tarball/src/lib.rs

🔇 Additional comments (1)

pacquet/crates/tarball/src/lib.rs (1)

438-474: LGTM!

Also applies to: 524-662, 682-703

github-actions · 2026-06-06T17:54:16Z

Integrated-Benchmark Report (Linux)

Each scenario has pacquet rows (direct install) and pnpr rows (the same client through the pnpr install accelerator), so pnpr@HEAD vs pacquet@HEAD is the pnpr-vs-direct ratio. Cold-store scenarios wipe the client store between runs (warm server); hot-store scenarios keep it warm. The pacquet@HEAD rows feed the pacquet Bencher testbed; the pnpr@HEAD rows feed the pnpr testbed.

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	10.153 ± 0.102	10.032	10.331	2.01 ± 0.03
`pacquet@main`	10.022 ± 0.124	9.906	10.223	1.99 ± 0.03
`pnpr@HEAD`	5.163 ± 0.084	5.099	5.348	1.02 ± 0.02
`pnpr@main`	5.049 ± 0.053	5.001	5.172	1.00
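
For readers checking the Relative column, it appears to be each command's mean divided by the fastest mean in the scenario. The sketch below recomputes it from the unrounded means in the JSON report for this scenario; the names `COLD_RESTORE_MEANS` and `relative_to_fastest` are made up for the example, and the uncertainty term after the ± is not reproduced.

```rust
/// Unrounded means (seconds) copied from the JSON report for this scenario.
const COLD_RESTORE_MEANS: [(&str, f64); 4] = [
    ("pacquet@HEAD", 10.15305668296),
    ("pacquet@main", 10.02193943276),
    ("pnpr@HEAD", 5.1626692061599995),
    ("pnpr@main", 5.04852455836),
];

/// Ratio of each mean to the fastest (smallest) mean.
fn relative_to_fastest(means: &[f64]) -> Vec<f64> {
    let fastest = means.iter().copied().fold(f64::INFINITY, f64::min);
    means.iter().map(|mean| mean / fastest).collect()
}
```

Read this way, `pacquet@HEAD` and `pacquet@main` differ by roughly 0.13 s on a ~10 s run, which is in line with the description's note that total install time on this kind of fixture is dominated by phases other than extraction.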

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 10.15305668296,
      "stddev": 0.10182340400711702,
      "median": 10.11216729186,
      "user": 3.6059113799999993,
      "system": 4.395553719999999,
      "min": 10.03211567836,
      "max": 10.33094130036,
      "times": [
        10.19475652136,
        10.07486729836,
        10.32629012236,
        10.33094130036,
        10.14884015836,
        10.11019093836,
        10.09677898936,
        10.10164217736,
        10.11414364536,
        10.03211567836
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 10.02193943276,
      "stddev": 0.12395331725879401,
      "median": 9.978084820860001,
      "user": 3.36657918,
      "system": 4.267394819999999,
      "min": 9.90642244536,
      "max": 10.22277175536,
      "times": [
        9.92146659636,
        10.22277175536,
        9.95855224736,
        9.99836254936,
        9.90642244536,
        9.93528379936,
        9.92017655236,
        10.14848433336,
        9.99761739436,
        10.21025665436
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 5.1626692061599995,
      "stddev": 0.08371392044020967,
      "median": 5.12696918386,
      "user": 2.73484238,
      "system": 4.042505220000001,
      "min": 5.09856722536,
      "max": 5.3479248433599995,
      "times": [
        5.13400173036,
        5.28622033836,
        5.15013158036,
        5.3479248433599995,
        5.11883498236,
        5.09856722536,
        5.11455629636,
        5.1259512663599995,
        5.12798710136,
        5.12251669736
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 5.04852455836,
      "stddev": 0.05253407501085945,
      "median": 5.03217273486,
      "user": 2.4778984799999995,
      "system": 3.8919188199999994,
      "min": 5.00134041536,
      "max": 5.17182077936,
      "times": [
        5.03149747536,
        5.00792122836,
        5.00615765436,
        5.01710413236,
        5.03284799436,
        5.05052028636,
        5.08849977636,
        5.17182077936,
        5.07753584136,
        5.00134041536
      ]
    }
  ]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`pacquet@HEAD`	672.7 ± 16.0	649.3	692.7	1.00
`pacquet@main`	679.0 ± 29.6	649.7	737.5	1.01 ± 0.05
`pnpr@HEAD`	803.5 ± 76.6	745.8	999.6	1.19 ± 0.12
`pnpr@main`	786.5 ± 70.3	742.2	920.1	1.17 ± 0.11

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6727161420200002,
      "stddev": 0.01600701583256559,
      "median": 0.6713466896200001,
      "user": 0.38210624000000004,
      "system": 1.31715764,
      "min": 0.6493276551200001,
      "max": 0.6927292181200001,
      "times": [
        0.6896765321200001,
        0.6927292181200001,
        0.6781179801200001,
        0.68320303412,
        0.66361954712,
        0.68976473512,
        0.6645753991200001,
        0.6493276551200001,
        0.6636945151200001,
        0.6524528041200001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6790251945200001,
      "stddev": 0.029567044382201157,
      "median": 0.66924675062,
      "user": 0.37753994,
      "system": 1.31800454,
      "min": 0.6497485161200001,
      "max": 0.7374711111200001,
      "times": [
        0.7374711111200001,
        0.6497485161200001,
        0.7266614651200001,
        0.6787752181200001,
        0.6704182911200001,
        0.6541672781200001,
        0.6630080061200001,
        0.67893630512,
        0.66807521012,
        0.66299054412
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.8034518379200002,
      "stddev": 0.07656192707562987,
      "median": 0.7692472856200001,
      "user": 0.39216274,
      "system": 1.3236017399999997,
      "min": 0.7457819901200001,
      "max": 0.9996492901200001,
      "times": [
        0.8541020881200001,
        0.7617983531200001,
        0.7659824901200001,
        0.9996492901200001,
        0.8224780311200001,
        0.76437665212,
        0.7725120811200001,
        0.7457819901200001,
        0.7918260581200001,
        0.7560113451200001
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.78653043812,
      "stddev": 0.07034894497147766,
      "median": 0.7551890911200001,
      "user": 0.37835064000000007,
      "system": 1.32344304,
      "min": 0.74216280312,
      "max": 0.9201077241200001,
      "times": [
        0.9201077241200001,
        0.7512044571200001,
        0.7676084561200001,
        0.7460703381200001,
        0.9184218371200001,
        0.7605075181200001,
        0.7499792891200001,
        0.74216280312,
        0.75917372512,
        0.75006823312
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	5.340 ± 0.034	5.293	5.396	2.66 ± 0.08
`pacquet@main`	5.314 ± 0.040	5.262	5.400	2.65 ± 0.08
`pnpr@HEAD`	2.018 ± 0.027	1.974	2.058	1.01 ± 0.03
`pnpr@main`	2.006 ± 0.056	1.957	2.122	1.00

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 5.3403243291,
      "stddev": 0.03418814181014108,
      "median": 5.3357790312999995,
      "user": 3.7151861399999992,
      "system": 3.3741702600000005,
      "min": 5.2926780143,
      "max": 5.3963232093,
      "times": [
        5.3497050703,
        5.3054369653,
        5.3781563753,
        5.2926780143,
        5.3727343723,
        5.3301329333,
        5.3401711823,
        5.3065182883,
        5.3313868803,
        5.3963232093
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 5.314160601999999,
      "stddev": 0.040144265262532695,
      "median": 5.308580321300001,
      "user": 3.6935934399999995,
      "system": 3.29832976,
      "min": 5.2619899803,
      "max": 5.3998589333,
      "times": [
        5.2619899803,
        5.3118421733000005,
        5.3290790373,
        5.3324252763,
        5.3998589333,
        5.3053184693,
        5.3453233363,
        5.2746111173,
        5.2992647023,
        5.2818929943
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 2.0183562845000003,
      "stddev": 0.02740664470225432,
      "median": 2.0191614538,
      "user": 2.53259494,
      "system": 3.3599185599999997,
      "min": 1.9735928763000001,
      "max": 2.0579608283,
      "times": [
        2.0335341213,
        1.9735928763000001,
        2.0065524863,
        2.0450049773,
        2.0115940643,
        2.0010406473,
        2.0425031853,
        2.0579608283,
        1.9850508153000002,
        2.0267288433
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 2.0055636841,
      "stddev": 0.0559111107042792,
      "median": 1.9881875273,
      "user": 2.44694124,
      "system": 3.19423916,
      "min": 1.9567277953000002,
      "max": 2.1215397503,
      "times": [
        1.9567277953000002,
        2.1215397503,
        1.9672740953,
        2.0278262913000002,
        1.9598114903000001,
        1.9961370703,
        1.9802379843000002,
        1.9642380293000001,
        2.0815767653,
        2.0002675693
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	1.404 ± 0.023	1.352	1.432	2.11 ± 0.12
`pacquet@main`	1.397 ± 0.049	1.346	1.521	2.10 ± 0.14
`pnpr@HEAD`	0.664 ± 0.037	0.636	0.766	1.00
`pnpr@main`	0.667 ± 0.058	0.633	0.828	1.00 ± 0.10

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 1.40421160456,
      "stddev": 0.023036377926355215,
      "median": 1.40735798036,
      "user": 1.5628231599999998,
      "system": 1.7630129999999997,
      "min": 1.35198998636,
      "max": 1.43166520336,
      "times": [
        1.43166520336,
        1.41897878736,
        1.4043142013599998,
        1.40172526436,
        1.40300474336,
        1.41040175936,
        1.4106411723599999,
        1.42693961736,
        1.38245531036,
        1.35198998636
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 1.3972593978599999,
      "stddev": 0.04890659554488921,
      "median": 1.39307946186,
      "user": 1.5455735599999998,
      "system": 1.7752575,
      "min": 1.34595938736,
      "max": 1.52126646336,
      "times": [
        1.35852936036,
        1.36045741736,
        1.52126646336,
        1.34595938736,
        1.37987386836,
        1.40846910136,
        1.40604421236,
        1.40583524436,
        1.38811584836,
        1.39804307536
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.6641498885600001,
      "stddev": 0.037137718026627624,
      "median": 0.6513343608600001,
      "user": 0.32664326,
      "system": 1.2511532,
      "min": 0.6360337823600001,
      "max": 0.7661683243600002,
      "times": [
        0.67004396736,
        0.6665771433600001,
        0.6502201043600001,
        0.6360337823600001,
        0.7661683243600002,
        0.6481242333600001,
        0.6510423363600001,
        0.6463756023600001,
        0.6516263853600001,
        0.6552870063600001
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.6670260052600001,
      "stddev": 0.05815157499711235,
      "median": 0.64638563386,
      "user": 0.32426736,
      "system": 1.2486983999999999,
      "min": 0.6329316713600001,
      "max": 0.8283806463600001,
      "times": [
        0.6482891353600001,
        0.6329316713600001,
        0.6424540753600001,
        0.6396802453600001,
        0.6444821323600001,
        0.65022772636,
        0.8283806463600001,
        0.67750801936,
        0.6643083833600001,
        0.6419980173600001
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + hot store

Resolution-only: cold packument cache (full re-resolve over the registry link) with a hot store (no tarball download), so this isolates pnpr offloading the client resolution to its warm server.

Command	Mean [s]	Min [s]	Max [s]	Relative
`pacquet@HEAD`	4.940 ± 0.021	4.915	4.994	7.48 ± 0.18
`pacquet@main`	4.983 ± 0.040	4.934	5.045	7.54 ± 0.19
`pnpr@HEAD`	0.660 ± 0.016	0.638	0.696	1.00
`pnpr@main`	0.667 ± 0.041	0.631	0.761	1.01 ± 0.07

BENCHMARK_REPORT.json

{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 4.9404086301,
      "stddev": 0.021164435910854266,
      "median": 4.9369614887,
      "user": 1.6711996,
      "system": 1.8634854399999998,
      "min": 4.9146140527,
      "max": 4.9943750867,
      "times": [
        4.9407150937,
        4.9358562297,
        4.9220174177,
        4.9380667477,
        4.9462401447,
        4.9417158127,
        4.9351079657,
        4.9943750867,
        4.9353777497,
        4.9146140527
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 4.982855574099999,
      "stddev": 0.0396661248571493,
      "median": 4.9693784802,
      "user": 1.7421063,
      "system": 1.89191494,
      "min": 4.9342591927,
      "max": 5.0451760377,
      "times": [
        4.9512037437,
        4.9478029107,
        4.9820144177,
        5.0041191487,
        4.9567425427,
        5.0314896177,
        4.9342591927,
        4.9552071177,
        5.0205410117,
        5.0451760377
      ]
    },
    {
      "command": "pnpr@HEAD",
      "mean": 0.6604425195,
      "stddev": 0.01576705755338339,
      "median": 0.6605498462000001,
      "user": 0.3320259,
      "system": 1.2606485399999998,
      "min": 0.6378699937,
      "max": 0.6960283617,
      "times": [
        0.6378699937,
        0.6463218087,
        0.6593177907000001,
        0.6711741007,
        0.6639390847000001,
        0.6617819017000001,
        0.6515148397,
        0.6536415857,
        0.6960283617,
        0.6628357277
      ]
    },
    {
      "command": "pnpr@main",
      "mean": 0.6665280901999999,
      "stddev": 0.04129406951800342,
      "median": 0.6549629047000001,
      "user": 0.3220881,
      "system": 1.2585762399999998,
      "min": 0.6312259277000001,
      "max": 0.7609289407000001,
      "times": [
        0.6398787587,
        0.6536712627,
        0.6383717657,
        0.6381312247,
        0.6312259277000001,
        0.7609289407000001,
        0.7174773717,
        0.6664194087,
        0.6629216947000001,
        0.6562545467
      ]
    }
  ]
}
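
The "Relative" column in the table above can be reproduced from the `mean` and `stddev` fields of the JSON report. The sketch below assumes the benchmark tool divides each mean by the fastest mean and applies standard first-order error propagation for a ratio; that assumption is not stated in the report, but it matches the published figures. The function name `relative_to_fastest` is hypothetical.

```rust
/// Ratio of one command's mean to the reference (fastest) mean, with the
/// uncertainty propagated from both standard deviations.
/// Assumption: first-order error propagation for a quotient of
/// independent measurements.
fn relative_to_fastest(mean: f64, stddev: f64, ref_mean: f64, ref_stddev: f64) -> (f64, f64) {
    let ratio = mean / ref_mean;
    // The relative errors of numerator and denominator add in quadrature.
    let relative_error = ((stddev / mean).powi(2) + (ref_stddev / ref_mean).powi(2)).sqrt();
    (ratio, ratio * relative_error)
}
```

Feeding in the `pacquet@HEAD` row (mean 4.9404086301, stddev 0.021164435910854266) against the `pnpr@HEAD` reference (mean 0.6604425195, stddev 0.01576705755338339) gives a ratio close to the 7.48 ± 0.18 shown in the table.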

github-actions · 2026-06-06T17:54:26Z

Bencher Report

Branch	pr/12247
Testbed	pacquet

🚨 2 Alerts

Benchmark	Measure	Result (Δ%)	Baseline	Upper boundary (limit %)
isolated-linker.fresh-install.cold-cache.cold-store	Latency (s)	5.34 s (+132.25%)	2.30 s	2.76 s (193.54%)
isolated-linker.fresh-restore.cold-cache.cold-store	Latency (s)	10.15 s (+126.60%)	4.48 s	5.38 s (188.83%)

All benchmark results

Benchmark	Result [ms] (Δ%)	Baseline [ms]	Upper boundary [ms] (limit %)	Alert
isolated-linker.fresh-install.cold-cache.cold-store	5,340.32 (+132.25%)	2,299.43	2,759.32 (193.54%)	🚨
isolated-linker.fresh-install.cold-cache.hot-store	4,940.41	-	-	
isolated-linker.fresh-install.hot-cache.hot-store	1,404.21 (+5.63%)	1,329.33	1,595.19 (88.03%)	
isolated-linker.fresh-restore.cold-cache.cold-store	10,153.06 (+126.60%)	4,480.61	5,376.73 (188.83%)	🚨
isolated-linker.fresh-restore.hot-cache.hot-store	672.72 (-1.08%)	680.04	816.05 (82.44%)	
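
The percentages in the report follow from the raw latencies. A minimal sketch of the arithmetic, with hypothetical function names: Δ% is the change relative to the baseline, and Limit % is the result as a share of the upper boundary. The boundaries shown are also consistent with baseline × 1.2, though the report does not say how the threshold is configured, so treat that factor as an inference.

```rust
/// Percent change of `result` relative to `baseline` (the "Result Δ%" figure).
fn delta_percent(result_ms: f64, baseline_ms: f64) -> f64 {
    (result_ms - baseline_ms) / baseline_ms * 100.0
}

/// Result expressed as a percentage of the upper boundary (the "Limit %" figure).
/// Values above 100 mean the boundary was exceeded.
fn limit_percent(result_ms: f64, upper_boundary_ms: f64) -> f64 {
    result_ms / upper_boundary_ms * 100.0
}

/// Whether a result should raise an alert under an upper-boundary threshold.
fn exceeds_boundary(result_ms: f64, upper_boundary_ms: f64) -> bool {
    result_ms > upper_boundary_ms
}
```

For the cold-cache, cold-store fresh install, `delta_percent(5340.32, 2299.43)` is about 132.25 and `limit_percent(5340.32, 2759.32)` is about 193.54, which is why that row is flagged. The hot-cache, hot-store fresh install stays under its boundary at about 88%.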


The per-file CAS-write parallelism added in #12247 ran on rayon's global pool. But the install pipeline overlaps tarball extraction with linking each resolved package into `node_modules`, and the linker drives its per-package work through `rayon::join` / `par_iter` on that same global pool. When a batch of downloads finished at once (hundreds of tarballs entering extraction together), the extraction work queued ahead of the linker's jobs and stalled linking for seconds.

Aligning the download/extract trace with the `imported` progress events on a ~1300-package fresh install showed the linker dropping to zero completions for ~1s right as an extraction surge landed, then grinding the rest out afterward. Extraction had gotten faster, but it stuttered the concurrent linker, so the net win on the pipeline was lost.

Route the parallel CAS writes through a dedicated rayon pool (sized to the core count; the work is CPU-bound SHA-512 + CAFS write) so an extraction burst can't monopolize the global pool the linker uses. The two phases now run concurrently without one starving the other: on the same fixture the linker no longer stalls (continuous completions through the extraction window) and the big-package extraction tail stays parallelized.

Falls back to the global pool if the dedicated pool can't be built.

---
Written by an agent (Claude Code, claude-opus-4-8).
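
The isolation idea described above can be sketched without external crates. This is not the PR's code: the change itself uses rayon (presumably via `rayon::ThreadPoolBuilder` and `ThreadPool::install`), whereas the sketch below substitutes a small hand-rolled pool built on `std::sync::mpsc` so that it compiles on its own. `DedicatedPool`, `cas_pool`, `write_all_on_dedicated_pool`, and `placeholder_digest` are all hypothetical names; the digest is a simple FNV-1a stand-in, not SHA-512; and the fallback here runs serially on the caller's thread rather than on a global pool.

```rust
use std::sync::{mpsc, Arc, Mutex, OnceLock};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical dedicated pool: a fixed set of workers fed by one channel.
struct DedicatedPool {
    sender: Mutex<mpsc::Sender<Job>>,
}

impl DedicatedPool {
    fn build(threads: usize) -> std::io::Result<Self> {
        let (sender, receiver) = mpsc::channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        for index in 0..threads {
            let receiver = Arc::clone(&receiver);
            thread::Builder::new()
                .name(format!("cas-writer-{index}"))
                .spawn(move || loop {
                    // The lock guard is dropped at the end of this statement,
                    // so other workers can receive while this job runs.
                    let next = receiver.lock().unwrap().recv();
                    match next {
                        Ok(job) => job(),
                        Err(_) => break, // every sender was dropped
                    }
                })?;
        }
        Ok(Self { sender: Mutex::new(sender) })
    }

    fn spawn(&self, job: Job) {
        if self.sender.lock().unwrap().send(job).is_err() {
            panic!("dedicated pool workers have exited");
        }
    }
}

/// Built once, sized to the core count; `None` if thread creation failed.
static CAS_POOL: OnceLock<Option<DedicatedPool>> = OnceLock::new();

fn cas_pool() -> Option<&'static DedicatedPool> {
    CAS_POOL
        .get_or_init(|| {
            let threads = thread::available_parallelism().map_or(1, |n| n.get());
            DedicatedPool::build(threads).ok()
        })
        .as_ref()
}

/// Placeholder for "hash and write into the CAS" (FNV-1a, not SHA-512).
fn placeholder_digest(bytes: &[u8]) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325_u64, |hash, byte| {
        (hash ^ u64::from(*byte)).wrapping_mul(0x0000_0100_0000_01b3)
    })
}

/// Processes every file on the dedicated pool and returns results in input order.
fn write_all_on_dedicated_pool(files: Vec<Vec<u8>>) -> Vec<u64> {
    let Some(pool) = cas_pool() else {
        // Fallback when the pool could not be built (serial in this sketch).
        return files.iter().map(|file| placeholder_digest(file)).collect();
    };
    let (results_tx, results_rx) = mpsc::channel();
    let count = files.len();
    for (index, file) in files.into_iter().enumerate() {
        let results_tx = results_tx.clone();
        pool.spawn(Box::new(move || {
            let _ = results_tx.send((index, placeholder_digest(&file)));
        }));
    }
    drop(results_tx); // the receive loop ends once every job has reported
    let mut digests = vec![0u64; count];
    for (index, digest) in results_rx {
        digests[index] = digest; // slot by index to preserve entry order
    }
    digests
}
```

Because the workers belong only to this pool, a burst of extraction jobs queues behind other extraction jobs and never ahead of work submitted to a different pool, which is the property the change above relies on.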

zkochan marked this pull request as ready for review June 6, 2026 17:45

coderabbitai Bot reviewed Jun 6, 2026

coderabbitai Bot approved these changes Jun 6, 2026

zkochan merged commit bea64b2 into main Jun 6, 2026
27 of 28 checks passed

zkochan deleted the perf-parallel-tarball-extraction branch June 6, 2026 17:55

zkochan mentioned this pull request Jun 6, 2026

perf(tarball): run parallel CAS writes on a dedicated rayon pool #12248

Merged
