perf(pnpr): collapse the install-accelerator cold path to one round trip #12178
Walkthrough

This PR implements a one-round-trip protocol redesign for the pnpr install accelerator, replacing a two-request exchange (install + separate file fetch) with a single combined response containing an inline JSON header and gzip-compressed binary file frames.
Codecov Report

❌ Patch coverage is …

Coverage diff, main vs #12178:

| Metric | main | #12178 | +/- |
|---|---|---|---|
| Coverage | 87.58% | 87.47% | -0.11% |
| Files | 269 | 269 | |
| Lines | 30814 | 30848 | +34 |
| Hits | 26987 | 26983 | -4 |
| Misses | 3827 | 3865 | +38 |
Integrated-Benchmark Report (Linux)

Each scenario has pacquet rows (direct install) and pnpr rows (the same client through the pnpr install accelerator), so pnpr@HEAD vs pacquet@HEAD is the pnpr-vs-direct ratio. Cold-store scenarios wipe the client store between runs (warm server); hot-store scenarios keep it warm. The pacquet@HEAD rows feed the pacquet Bencher testbed; the pnpr@HEAD rows feed the pnpr testbed.

Scenario: Isolated linker: fresh restore, cold cache + cold store

BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 4.8108497562600006,
"stddev": 0.0660192894710442,
"median": 4.79593533296,
"user": 2.52109074,
"system": 3.7050714000000005,
"min": 4.72008294246,
"max": 4.97434067146,
"times": [
4.79843669146,
4.72008294246,
4.82780433746,
4.79089227646,
4.81959774046,
4.79343397446,
4.788421452460001,
4.97434067146,
4.764631946460001,
4.83085552946
]
},
{
"command": "pacquet@main",
"mean": 4.81887776776,
"stddev": 0.08010608452704074,
"median": 4.8114920019600005,
"user": 2.51504884,
"system": 3.7055178,
"min": 4.72170864946,
"max": 4.99609992546,
"times": [
4.84212845146,
4.7930794714600005,
4.8299045324600005,
4.99609992546,
4.764125866460001,
4.83630287546,
4.78601897646,
4.72170864946,
4.73545423846,
4.88395469046
]
},
{
"command": "pnpr@HEAD",
"mean": 2.00710041246,
"stddev": 0.043549508187170106,
"median": 2.00524857446,
"user": 2.6792136399999995,
"system": 3.2302145,
"min": 1.95771586146,
"max": 2.0837555004599997,
"times": [
2.0474821934599996,
1.96393236246,
2.02328779646,
1.97566959246,
1.98720935246,
1.96203317946,
2.0310109494599997,
2.03890733646,
1.95771586146,
2.0837555004599997
]
},
{
"command": "pnpr@main",
"mean": 2.0141192374599997,
"stddev": 0.05809513350638525,
"median": 2.0044372279599996,
"user": 2.66271184,
"system": 3.2493279,
"min": 1.91592678846,
"max": 2.1177160364599996,
"times": [
2.00764991946,
2.1177160364599996,
1.98471888346,
2.00016841846,
2.0012245364599996,
1.96797142346,
2.09303407446,
2.0202330744599997,
1.91592678846,
2.03254921946
]
}
]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.6755881103199999,
"stddev": 0.036486399163733606,
"median": 0.66528869632,
"user": 0.37408302,
"system": 1.3232981799999999,
"min": 0.65147287582,
"max": 0.77798704882,
"times": [
0.77798704882,
0.66062973482,
0.66932140582,
0.66736731482,
0.66321007782,
0.67193833382,
0.66156986382,
0.67088314882,
0.66150129882,
0.65147287582
]
},
{
"command": "pacquet@main",
"mean": 0.68665188552,
"stddev": 0.06613271726144067,
"median": 0.66603082732,
"user": 0.3770805199999999,
"system": 1.33122678,
"min": 0.65057836282,
"max": 0.87374747082,
"times": [
0.67562961482,
0.66460419582,
0.66745745882,
0.65057836282,
0.66304280182,
0.67224706582,
0.67392602182,
0.66281238582,
0.66247347682,
0.87374747082
]
},
{
"command": "pnpr@HEAD",
"mean": 0.6835262657200001,
"stddev": 0.04688303765875721,
"median": 0.66618790282,
"user": 0.36823821999999995,
"system": 1.32630098,
"min": 0.64791634082,
"max": 0.80081605882,
"times": [
0.72244101982,
0.66889021682,
0.65832030982,
0.69427660782,
0.66348558882,
0.67046992982,
0.65437142482,
0.65427515982,
0.64791634082,
0.80081605882
]
},
{
"command": "pnpr@main",
"mean": 0.7382476039200001,
"stddev": 0.07727440499550689,
"median": 0.71597193982,
"user": 0.39034262000000003,
"system": 1.3109495800000002,
"min": 0.67942669482,
"max": 0.93955736182,
"times": [
0.76851654082,
0.68541026982,
0.71261966482,
0.67942669482,
0.75751603882,
0.71932421482,
0.69791068682,
0.73903460082,
0.93955736182,
0.68315996582
]
}
]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.20051870156,
"stddev": 0.03806395691216713,
"median": 2.19915778006,
"user": 3.52249176,
"system": 2.9745311,
"min": 2.14701857006,
"max": 2.27018531006,
"times": [
2.27018531006,
2.1487907600600002,
2.19798224706,
2.22195095406,
2.2003333130600002,
2.21890382806,
2.14701857006,
2.23506118106,
2.18299958406,
2.1819612680600002
]
},
{
"command": "pacquet@main",
"mean": 2.1832347667600005,
"stddev": 0.027604929881300987,
"median": 2.1781010095599997,
"user": 3.53364756,
"system": 2.9653338,
"min": 2.15275805406,
"max": 2.2263232090600003,
"times": [
2.15275805406,
2.1931049270600003,
2.15920226906,
2.16863724306,
2.1597834000600002,
2.15668577206,
2.21461816106,
2.2263232090600003,
2.21366985606,
2.18756477606
]
},
{
"command": "pnpr@HEAD",
"mean": 2.1660056197600004,
"stddev": 0.024000803077224263,
"median": 2.16327776156,
"user": 3.5253906600000002,
"system": 2.941669,
"min": 2.11625871306,
"max": 2.19191666606,
"times": [
2.18186724606,
2.19191666606,
2.16149921506,
2.11625871306,
2.15782689206,
2.18919037006,
2.16318318906,
2.16337233406,
2.14356793006,
2.1913736420600003
]
},
{
"command": "pnpr@main",
"mean": 2.21066815246,
"stddev": 0.022931452272449936,
"median": 2.20813315456,
"user": 3.5498807599999993,
"system": 2.9737479999999996,
"min": 2.18560263806,
"max": 2.26230642306,
"times": [
2.19368093806,
2.18906622106,
2.19575215606,
2.2229912510600003,
2.21720665006,
2.22380893806,
2.21445902606,
2.20180728306,
2.26230642306,
2.18560263806
]
}
]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

BENCHMARK_REPORT.json
{
"results": [
{
"command": "pacquet@HEAD",
"mean": 1.3131113968800001,
"stddev": 0.013723077729394993,
"median": 1.31233022258,
"user": 1.42181854,
"system": 1.71113218,
"min": 1.28520247858,
"max": 1.33236761258,
"times": [
1.3288113685799998,
1.32413430158,
1.31337711658,
1.31783167358,
1.3069655945799998,
1.28520247858,
1.31128332858,
1.30555543458,
1.33236761258,
1.3055850595799998
]
},
{
"command": "pacquet@main",
"mean": 1.36440192348,
"stddev": 0.04309798029904765,
"median": 1.3518748340799998,
"user": 1.4701926399999998,
"system": 1.7531281799999998,
"min": 1.31998220558,
"max": 1.46106682558,
"times": [
1.32647274358,
1.31998220558,
1.39202496058,
1.38092520058,
1.46106682558,
1.3471934665799998,
1.39069172458,
1.35655620158,
1.34141855258,
1.32768735358
]
},
{
"command": "pnpr@HEAD",
"mean": 1.32454432278,
"stddev": 0.018098513176581787,
"median": 1.32109996958,
"user": 1.4145899400000002,
"system": 1.71765238,
"min": 1.3028236305799998,
"max": 1.35588239058,
"times": [
1.31704780858,
1.30648292658,
1.32420509558,
1.3048966175799999,
1.33522587258,
1.31799484358,
1.35588239058,
1.3028236305799998,
1.3460597915799999,
1.3348242505799999
]
},
{
"command": "pnpr@main",
"mean": 1.33749999048,
"stddev": 0.03772052130560947,
"median": 1.3212697050799997,
"user": 1.4591723399999998,
"system": 1.7225953800000002,
"min": 1.31077939758,
"max": 1.43890743658,
"times": [
1.3220864315799998,
1.3165202215799998,
1.34104195358,
1.43890743658,
1.31669959758,
1.3204529785799999,
1.3426163195799998,
1.32016251158,
1.31077939758,
1.3457330565799999
]
}
]
}
| Branch | pr/12178 |
| Testbed | pacquet |
🚨 1 Alert
| Benchmark | Measure (Units) | View | Benchmark Result (Result Δ%) | Upper Boundary (Limit %) |
|---|---|---|---|---|
| isolated-linker.fresh-restore.cold-cache.cold-store | Latency, seconds (s) | 📈 plot 🚷 threshold 🚨 alert (🔔) | 4.81 s (+107.60%); Baseline: 2.32 s | 2.78 s (173.00%) |
All benchmark results

| Benchmark | Latency | Benchmark Result, milliseconds (ms) (Result Δ%) | Upper Boundary, milliseconds (ms) (Limit %) |
|---|---|---|---|
| isolated-linker.fresh-install.cold-cache.cold-store | 📈 view plot 🚷 view threshold | 2,200.52 ms (-4.48%); Baseline: 2,303.67 ms | 2,764.41 ms (79.60%) |
| isolated-linker.fresh-install.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 1,313.11 ms (-11.23%); Baseline: 1,479.28 ms | 1,775.13 ms (73.97%) |
| isolated-linker.fresh-restore.cold-cache.cold-store | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 4,810.85 ms (+107.60%); Baseline: 2,317.41 ms | 2,780.89 ms (173.00%) |
| isolated-linker.fresh-restore.hot-cache.hot-store | 📈 view plot 🚷 view threshold | 675.59 ms (+4.01%); Baseline: 649.57 ms | 779.48 ms (86.67%) |
Collapse the install-accelerator cold path from three sequential round trips (handshake + /v1/install + /v1/files) to one. With a new `inlineFiles` request flag, /v1/install returns a single gzipped body — a length-prefixed JSON header (lockfile, stats, store-index entries, or verification violations) followed by the missing files' contents as the binary frames /v1/files already serves. The pacquet client sends the flag, skips the handshake, and writes the inlined files straight to its CAFS with no follow-up fetch. The legacy NDJSON + /v1/files path is unchanged for clients that don't set the flag. At 50ms one-way latency a warm-store install drops from ~381ms to ~135ms (2.8x). See #12165.
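The combined body described above (a length-prefixed JSON header followed by file frames, all inside one gzip stream) can be sketched as follows. This is a minimal illustration, not the pnpr wire format: the 4-byte big-endian length prefixes, the per-file frame layout, and the names `build_inline_body` and `parse_inline_body` are assumptions made for the example.

```python
import gzip
import json
import struct

# Assumed layout (illustrative only, not taken from the PR):
#   [4-byte big-endian header length][JSON header]
#   then, per file: [4-byte big-endian length][raw file bytes]
# The whole sequence is compressed as a single gzip stream.


def build_inline_body(header: dict, files: list[bytes], level: int = 6) -> bytes:
    """Serialize the header and file frames, then gzip them as one body."""
    header_bytes = json.dumps(header).encode("utf-8")
    parts = [struct.pack(">I", len(header_bytes)), header_bytes]
    for content in files:
        parts.append(struct.pack(">I", len(content)))
        parts.append(content)
    return gzip.compress(b"".join(parts), compresslevel=level)


def parse_inline_body(body: bytes) -> tuple[dict, list[bytes]]:
    """Decompress the body and split it back into header and file contents."""
    raw = gzip.decompress(body)
    (header_len,) = struct.unpack_from(">I", raw, 0)
    offset = 4
    header = json.loads(raw[offset:offset + header_len])
    offset += header_len
    files = []
    while offset < len(raw):
        (size,) = struct.unpack_from(">I", raw, offset)
        offset += 4
        if offset + size > len(raw):
            raise ValueError("truncated file frame")
        files.append(raw[offset:offset + size])
        offset += size
    return header, files
```

Because the header and the files share one body, a client needs only the single `POST /v1/install` response to obtain both the lockfile metadata and the missing file contents.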
The install accelerator gzipped its file payloads (both `/v1/files` and the inlined install body) at level 1. Level 6 — the gzip default — shrinks the payload ~16% on real package contents (51.1% -> 43.0% of the uncompressed size in a fixture measurement), while level 9 buys under a percent more for several times the CPU. Fewer bytes means fewer TCP slow-start round trips once the server is across a latency link, which is the cost this accelerator exists to cut. Experiment 2 of #12165.
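A quick way to repeat this kind of level comparison is to compress the same payload at each level and report the ratio. The sketch below uses Python's standard `gzip` module and a synthetic, JavaScript-like sample; the sample data and the helper name `compressed_sizes` are assumptions, so the percentages it prints will not match the 51.1% and 43.0% fixture figures quoted above.

```python
import gzip


def compressed_sizes(payload: bytes, levels=(1, 6, 9)) -> dict[int, int]:
    """Return the gzip-compressed size of `payload` at each level."""
    return {
        level: len(gzip.compress(payload, compresslevel=level))
        for level in levels
    }


# Synthetic stand-in for package contents (hypothetical data).
sample = b"".join(
    f"export function handler{i}(req, res) {{ return res.json({{ id: {i} }}); }}\n".encode()
    for i in range(2000)
)

sizes = compressed_sizes(sample)
for level, size in sizes.items():
    share = 100 * size / len(sample)
    print(f"level {level}: {size} bytes ({share:.1f}% of uncompressed)")
```

Timing each call alongside the size (for example with `time.perf_counter`) would show the CPU side of the trade-off that made level 9 unattractive.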
The inlineFiles install response previously buffered: the server fetched every uncached tarball into its store, computed the diff, built the whole body, then replied.

Now the response is a gzip stream of length-prefixed frames (lockfile, store-index entries, files, stats, error). A producer resolves each client-missing package (reading the file index straight from the download for uncached packages, or from a store snapshot for ones the server already holds) and streams its frames the moment that package is ready, while later tarballs are still downloading. The client parses frames and writes files to its CAFS as they arrive.

This lets the server's upstream tarball fetch overlap the transfer to the client instead of running strictly before it, which is the win the install accelerator targets for large cold-server installs. On the small in-repo fixtures the overlap is within noise (too little to transfer); the real effect is measured by the CI install-accelerator benchmark. No regression on warm or cold paths in local measurement.

Drops the per-entry base64 on store-index entries (now sent as raw bytes in `I` frames). Adds `DownloadTarballToStore::run_without_mem_cache_with_index` so the streaming path gets the file index a fresh fetch already computed without reading it back through the open writer.

Experiment 4 of #12165.
Reverts the streaming install response from bb7d652.

Experiment 4 overlapped the server's upstream tarball fetch with streaming files to the client. That only pays off on a bandwidth-constrained link, where transferring files takes enough wall-clock time to hide under the fetch. The public npm registry is CDN-backed and does not throttle bandwidth (it is latency-bound), and a pnpr accelerator is typically deployed close to its clients, so neither the fetch nor the transfer is bandwidth-bound in practice. There is essentially nothing to overlap.

Measured against a synthetic large fixture behind an artificial 8 MiB/s cap (the only way to even create the scenario), streaming was *slower* than the buffered response: per-package gzip flushing transfers more bytes, which is strictly worse on the bandwidth-limited link it is meant to help, and the producer stalled fetches while blocked on backpressure.

The win that matters against npm is round-trip reduction (the inlined single-round-trip response), not fetch/transfer overlap. Reverting keeps that win and drops the streaming complexity (the tarball API addition, the store-index Clone derives, the streaming module, and the framed client protocol). See #12165.
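The byte cost of per-package flushing can be reproduced in isolation. The sketch below compresses the same chunks twice with Python's `zlib`: once as a single gzip stream, and once with a `Z_SYNC_FLUSH` after every chunk, which is roughly what a streaming producer must do to make each package's frames available immediately. The chunk contents and function names are assumptions for illustration; the server's actual flushing strategy is not shown in the PR text.

```python
import zlib

GZIP_WBITS = 16 + zlib.MAX_WBITS  # the +16 selects the gzip container


def gzip_single(chunks: list[bytes], level: int = 6) -> bytes:
    """Compress all chunks as one stream, flushing only at the end."""
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    out = b"".join(comp.compress(chunk) for chunk in chunks)
    return out + comp.flush()


def gzip_flushed(chunks: list[bytes], level: int = 6) -> bytes:
    """Compress the same chunks, forcing a sync flush after each one."""
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    out = bytearray()
    for chunk in chunks:
        out += comp.compress(chunk)
        out += comp.flush(zlib.Z_SYNC_FLUSH)  # emit everything buffered so far
    out += comp.flush()
    return bytes(out)


# Hypothetical per-package payloads.
packages = [
    f'{{"name": "pkg-{i}", "version": "1.0.{i}", "main": "index.js"}}\n'.encode() * 20
    for i in range(200)
]

single = gzip_single(packages)
flushed = gzip_flushed(packages)
print(f"single stream: {len(single)} bytes, flushed per package: {len(flushed)} bytes")
```

Each sync flush ends the current deflate block and appends a marker, so the flushed stream carries fixed overhead per package on top of the less favorable block boundaries.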
Review Summary by Qodo

Collapse pnpr install to single round trip with inlined files

Walkthroughs

Description

• Collapse install-accelerator cold path from 3 round trips to 1
  - Eliminates handshake and separate /v1/files fetch
  - Server inlines missing files directly in install response
• Implement inlineFiles request flag for combined response protocol
  - Single gzipped body with length-prefixed JSON header + binary frames
  - Header carries lockfile, stats, store-index entries, violations
• Increase gzip compression level from 1 to 6 for file payloads
  - Reduces payload size ~16% with minimal CPU overhead
• Refactor response building to support both NDJSON and inline protocols
  - Legacy NDJSON path unchanged for backward compatibility
  - Shared build_files_payload helper frames files identically

Diagram

flowchart LR
Client["Client Request<br/>inlineFiles: true"]
Server["Server<br/>Resolve + Diff"]
Header["JSON Header<br/>lockfile, stats, index"]
Files["Binary Frames<br/>missing files"]
Response["Gzipped Response<br/>header + files"]
Client -->|POST /v1/install| Server
Server -->|build| Header
Server -->|build| Files
Header -->|combine| Response
Files -->|combine| Response
Response -->|1 RTT| Client
File Changes

1. pacquet/crates/pnpr-client/src/lib.rs
…icated /v1/files (#12181)

* fix(pnpr): authorize served packages against pnpr's policy in /v1/install

A content-addressed digest in the install-accelerator store is shared across packages and says nothing about access, so the store's possession of a package's bytes is not a capability to receive them. `/v1/install` served files for any package found in the store, including ones reached only on the cache-hit / frozen-lockfile path where no access check happened. That let a caller who knows a private package's digest pull bytes the registry routes would 401 on.

Check every served package against pnpr's own `packages:` policy before serving: the same decision `serve_packument` / `serve_tarball` make, in process, with no network round trip (so a warm shared server keeps its resolution advantage). `serve_install` resolves the caller's identity from `Authorization`; `deny_unauthorized_packages` denies the install (401 anonymous / 403 authenticated-but-outside-the-allowed-set) when any served package is not readable by the caller.

This authorizes against pnpr's own surface, the authority for everything the store can hold today (pnpr fetches anonymously, so cached content is pnpr-hosted or publicly fetchable). When credential forwarding lands, packages the client resolved from external registries under its own token carry no pnpr policy and will need per-caller re-verification against the owning registry (TTL-cached); this is noted at the check and tracked in #12184.

The raw `/v1/files` endpoint is still unauthenticated; removing it (it is superseded by the inline single-response path) is a follow-up (#12184) that also ports the TS `@pnpm/pnpr.client` + worker off the two-trip path.

---
Written by an agent (Claude Code, claude-opus-4-8).
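The 401/403 decision described for `deny_unauthorized_packages` can be modeled in a few lines. This is a sketch under stated assumptions, not the server's Rust code: the policy is represented as a hypothetical mapping from package name to the set of users allowed to read it (a missing entry means public), and `Caller` and `deny_status` are illustrative names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caller:
    # None models an anonymous request (no usable Authorization header).
    name: Optional[str]


def deny_status(
    caller: Caller,
    served_packages: list[str],
    readable_by: dict[str, set[str]],
) -> Optional[int]:
    """Return None when every served package is readable by the caller,
    401 for an anonymous caller, or 403 for an authenticated caller
    outside the allowed set. `readable_by` is a hypothetical policy model."""
    for package in served_packages:
        allowed = readable_by.get(package)
        if allowed is None:
            continue  # no restriction recorded: treated as public here
        if caller.name is None:
            return 401
        if caller.name not in allowed:
            return 403
    return None


policy = {"@acme/private": {"alice"}}
print(deny_status(Caller(None), ["left-pad", "@acme/private"], policy))    # 401
print(deny_status(Caller("bob"), ["left-pad", "@acme/private"], policy))   # 403
print(deny_status(Caller("alice"), ["left-pad", "@acme/private"], policy)) # None
```

The check runs over every package in the response, so a single unreadable package denies the whole install, matching the behavior the commit message describes.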
* fix(pnpr): remove the unauthenticated /v1/files endpoint

`POST /v1/files` served any CAFS file by digest with no authentication and no package identity, so the access gate on `/v1/install` (which is per package) couldn't cover it. It had to be removed, not gated. It was already superseded by the single-response inline path (#12178).

* Server: `/v1/install` always answers with the inline gzipped body (lockfile + stats + store-index entries + the missing files' contents); the NDJSON two-trip path, the `/v1/files` route, `handle_files`, and the `FilesRequest`/`is_valid_sha512_hex` helpers are gone.
* TS client + worker: `@pnpm/pnpr.client` now does the one inline request and hands the file frames to `@pnpm/worker`'s `writeCafsFiles`, which writes them to the CAFS; the `fetchAndWriteCafsFiles` /v1/files fetcher is replaced. Error bodies are decompressed before being surfaced, since the server also gzips its JSON error responses (e.g. an access denial).

Verified end to end by `pnpm/test/install/pnpmRegistry.ts` (11 tests: install / add / remove / workspace through a real pnpr server).

Closes the second half of the install-accelerator access work (#12184); file-bearing responses are now both inline-only and access-gated.
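Since error responses are gzipped as well, a client has to decompress the body before it can show a readable message. A minimal sketch of that step, assuming the error payload is a JSON object with an `error` field (the field name and the helper `error_message` are hypothetical, not taken from `@pnpm/pnpr.client`):

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def error_message(body: bytes) -> str:
    """Turn a possibly gzipped error body into a readable message."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    text = body.decode("utf-8", errors="replace")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return text  # not JSON: surface the raw text
    # The "error" field is an assumed shape for illustration only.
    if isinstance(payload, dict) and "error" in payload:
        return str(payload["error"])
    return text
```

Checking the magic bytes rather than trusting a header keeps the helper usable even when the content encoding is not reported reliably.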
What
Reduces the cost of a pnpr install-accelerator install against a remote server, where latency — not bandwidth — dominates (#12165). Lands two of the issue's four experiments and records the other two as evaluated-and-declined.
The old flow was three sequential round trips:
1. `GET /-/pnpr`: handshake
2. `POST /v1/install`: resolve, return NDJSON (`D` digests, `I` index entries, `L` lockfile)
3. `POST /v1/files`: fetch the missing files' contents

At 100ms RTT the bulk of an install is spent waiting on round trips, not transferring data. Two of the three trips are avoidable.
How
- Exp 1 (ad788234c9). A new `inlineFiles` request flag: when set, `POST /v1/install` returns a single gzipped body, consisting of a length-prefixed JSON header (lockfile, stats, store-index entries, or verification violations) followed by the missing files' contents as the same binary frames `/v1/files` serves. The Rust client (`pacquet-pnpr-client`) drops the handshake from the hot path, sends `inlineFiles: true`, parses the combined response, and writes the files straight into its CAFS, with no follow-up fetch. A shared `build_files_payload` frames files identically for both endpoints; the previously-uncompressed lockfile is now compressed with the rest.
- Exp 2 (550177c768). The file-bearing responses (`/v1/files` and the inlined install body) move from gzip level 1 to level 6 (the gzip default).

The legacy NDJSON + `/v1/files` path is left untouched for clients that don't set the flag (e.g. the TypeScript client), so nothing breaks.

Results
A/B latency probe (same server, same latency-injecting TCP proxy at 50ms one-way, warm server store, 5-run average): a warm-store install drops from ~381ms on the legacy three-trip path to ~135ms on the inline path.

≈2.8× faster, ~245ms saved, matching the ~2 round trips eliminated. The gain scales with network latency, which is the remote-server cost the issue targets.
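The arithmetic behind these figures can be checked directly, using the ~381ms and ~135ms averages reported in the commit message and a simple model in which each sequential round trip costs one RTT. The model and the helper name `round_trip_cost_ms` are assumptions; the model ignores per-request overhead such as server processing time.

```python
def round_trip_cost_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent purely waiting on sequential round trips."""
    return round_trips * rtt_ms


rtt_ms = 2 * 50                      # 50ms one-way latency => 100ms RTT
legacy_ms, inline_ms = 381, 135      # reported 5-run averages

saved_ms = legacy_ms - inline_ms
speedup = legacy_ms / inline_ms
predicted_saving_ms = round_trip_cost_ms(3, rtt_ms) - round_trip_cost_ms(1, rtt_ms)

print(f"measured saving: {saved_ms} ms, speedup: {speedup:.1f}x")
print(f"saving predicted from two fewer round trips: {predicted_saving_ms} ms")
```

The measured saving is somewhat larger than the pure-latency prediction of two RTTs, which is plausible if each eliminated request also carried some fixed overhead beyond the wire delay.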
Exp 2's gzip bump shrinks the payload ~16% on real package contents (51.1% → 43.0% of uncompressed); level 9 added under a percent for several times the CPU, so level 6 is the sweet spot.
Experiments evaluated
- Exp 4, streaming install response (bb7d652b90 → ee1a17bd2a): overlap only pays off on a bandwidth-constrained link; the public npm registry is CDN-backed and latency-bound, and a pnpr accelerator sits close to its clients, so there is nothing to overlap. Under an artificial bandwidth cap the streamed response was slower (per-package gzip flushing transfers more bytes; backpressure stalled fetches).

The throughline: against the real npm registry, latency and round trips are the bottleneck, so the round-trip elimination (Exp 1) is the win that matters. Exp 3 and Exp 4 target bandwidth/upload-size costs that don't bind in practice.
Tests
- `pnpr` + `pacquet-pnpr-client` tests pass, including the end-to-end integration tests (resolve+download, multi-file, warm-store, lockfile-only, input-lockfile verification accept/reject, trustLockfile) exercising the inline protocol through a real server.
- `pnpr_install` tests pass (full install + node_modules linking).
- `cargo clippy --workspace --all-targets -- --deny warnings` and the pre-push dylint/fmt/taplo checks are clean.

Written by an agent (Claude Code, claude-opus-4-8).