Summary
pnpr's install-accelerator protocol was first prototyped in TypeScript as pnpm-agent (RFC pnpm/rfcs#9) and benchmarked on Node.js. The wire protocol we ported to Rust still carries the remote-npm-registry framing of that prototype, and that framing leaves a lot on the table for the case pnpr is actually built for: a remote server across a network from the client.
This issue lays out a measured plan to redesign the protocol. Breaking changes are acceptable — pnpr is pre-prod/experimental, and we're free to drop pnpm-CLI compatibility and support only the pacquet client.
The problem
For a remote pnpr, one pacquet install makes three sequential round trips, each paying full RTT:
| Step | Uploads | Downloads |
| --- | --- | --- |
| GET /-/pnpr (handshake) | none | tiny |
| POST /v1/install | full integrity list + full lockfile (JSON) | lockfile (JSON) + base64'd index entries + missing digests (NDJSON) |
| POST /v1/files | all missing digests | all file bytes (gzip level 1, buffered into one batch) |
Structural issues, all inherited from the Node.js remote-registry framing:
- ~3 RTTs before/around the payload. Two of them are avoidable.
- No overlap. The server fully resolves → fetches every uncached tarball into its store → computes the whole diff → builds the entire response → then replies. Then /v1/files reads all files → gzips the whole batch in memory → then replies. The client waits idle, decompresses everything, writes everything, then links. Server-disk-read, network, and client-disk-write never overlap.
- Wire bloat. gzip level 1 (low ratio) on the file payload; base64 (+33%) on the index entries; the full integrity list re-uploaded every install (hundreds of KB on a big tree); the lockfile sent in both directions.
Relevant code (at 2b788d53fd):
- pacquet/crates/pnpr-client/src/lib.rs
- /v1/files: pnpr/crates/pnpr/src/install_accelerator.rs
Evidence: a network-injecting benchmark
pnpr's tests run on loopback, where RTT ≈ 0 and bandwidth ≈ ∞ — which hides exactly these costs. So the first deliverable is a benchmark harness (pacquet/tasks/pnpr-benchmark) that puts a latency/bandwidth-injecting TCP proxy between the pnpr client and an in-process pnpr server, counts bytes each way, and sweeps RTTs. The slope of wall-time vs. RTT measures the number of serial round trips the protocol costs.
Baseline of the current /v1 protocol (hermetic fixtures, warm server store, unlimited bandwidth):
client rtt wall(ms) up(KiB) down(KiB) files
cold 0ms 28.4 1.5 5.0 4
cold 20ms 101.7 1.5 5.0 4
cold 50ms 207.4 1.5 5.0 4
cold 100ms 372.5 1.5 5.0 4
cold → wall-time rises ~3.43 ms per ms of RTT (≈ 3.4 serial round trips)
warm 0ms 25.0 1.8 1.3 0
warm 20ms 74.4 1.8 1.3 0
warm 50ms 137.0 1.8 1.3 0
warm 100ms 247.4 1.8 1.3 0
warm → wall-time rises ~2.21 ms per ms of RTT (≈ 2.2 serial round trips)
So at a 100 ms WAN RTT a cold install spends ~340 ms of its ~372 ms purely on round trips. The cold path costs ~3.4 RTTs (handshake + install + files + TCP connect); the warm path ~2.2 (no files to fetch, so /v1/files is skipped). This is the cost the redesign targets.
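The slopes quoted in the table can be checked by hand. The sketch below fits an ordinary least-squares line to the (RTT, wall-time) pairs copied from the baseline; that the harness itself uses a least-squares fit is an assumption, though the published ~3.43 and ~2.21 figures are consistent with one. `fit_line`, `COLD`, and `WARM` are names invented for this illustration.

```rust
/// Ordinary least-squares fit of wall time against RTT.
/// Returns (slope, intercept). The slope is ms of wall time per ms of RTT,
/// so it reads directly as "serial round trips per install".
fn fit_line(points: &[(f64, f64)]) -> (f64, f64) {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;
    let sxy: f64 = points.iter().map(|p| (p.0 - mean_x) * (p.1 - mean_y)).sum();
    let sxx: f64 = points.iter().map(|p| (p.0 - mean_x).powi(2)).sum();
    let slope = sxy / sxx;
    (slope, mean_y - slope * mean_x)
}

/// (rtt_ms, wall_ms) pairs copied from the baseline table above.
const COLD: [(f64, f64); 4] = [(0.0, 28.4), (20.0, 101.7), (50.0, 207.4), (100.0, 372.5)];
const WARM: [(f64, f64); 4] = [(0.0, 25.0), (20.0, 74.4), (50.0, 137.0), (100.0, 247.4)];
```

Calling `fit_line(&COLD)` and `fit_line(&WARM)` gives slopes of roughly 3.43 and 2.21. The intercept is the RTT-independent work (resolution, disk I/O), which is the part the redesign does not target.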
Byte/compression deltas don't show on the tiny fixtures; the harness takes --registry/--deps to drive a real tree for production-scale byte numbers.
Run it with:
cargo run --release --bin pnpr-benchmark -- --rtt-ms 0,20,50,100 --iterations 7
The plan — four experiments, each measured vs. baseline
1. One streaming round trip (headline)
Fold /v1/files into the install response and stream file blobs inline (framed: file bytes, index entries, then lockfile + stats), and negotiate the protocol via a request header instead of the GET /-/pnpr handshake. Removes 2 RTTs and the redundant digest re-upload, and lets the client write files as bytes arrive. Expected: cold path slope ~3.4 → ~1 RTT.
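As a rough illustration of what "framed" could mean here, the sketch below uses a 1-byte tag, a 4-byte big-endian length, and then the payload. The tag values, the length width, and the names `FrameKind`, `write_frame`, and `read_frame` are all assumptions made for this example; the issue does not specify a frame layout.

```rust
use std::io::{self, Read, Write};

/// Hypothetical frame kinds, in the order the plan proposes to stream them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FrameKind {
    FileBytes = 1,
    IndexEntry = 2,
    Lockfile = 3,
    Stats = 4,
}

impl FrameKind {
    fn from_tag(tag: u8) -> io::Result<Self> {
        match tag {
            1 => Ok(Self::FileBytes),
            2 => Ok(Self::IndexEntry),
            3 => Ok(Self::Lockfile),
            4 => Ok(Self::Stats),
            other => Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("unknown frame tag {other}"),
            )),
        }
    }
}

/// Writes one frame: 1-byte tag, 4-byte big-endian length, then the payload.
fn write_frame<W: Write>(out: &mut W, kind: FrameKind, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    out.write_all(&[kind as u8])?;
    out.write_all(&len.to_be_bytes())?;
    out.write_all(payload)
}

/// Reads the next frame, or `None` at a clean end of stream.
/// A real reader would cap the length before allocating.
fn read_frame<R: Read>(input: &mut R) -> io::Result<Option<(FrameKind, Vec<u8>)>> {
    let mut tag = [0u8; 1];
    if input.read(&mut tag)? == 0 {
        return Ok(None);
    }
    let mut len = [0u8; 4];
    input.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    input.read_exact(&mut payload)?;
    Ok(Some((FrameKind::from_tag(tag[0])?, payload)))
}
```

Because each frame is self-delimiting, a client can hand a `FileBytes` payload to its store as soon as that frame is complete, without waiting for the trailing lockfile and stats frames.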
2. Better compression
Whole-stream compression instead of per-batch gzip level 1. Test higher gzip levels first (no new dep), then zstd (approved for the workspace) and compare ratio/speed. Targets wire bytes (the bandwidth-bound regime).
3. Bloom-filter store-state upload
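Replace the full integrity-list upload (~90 bytes/entry) with a compact probabilistic filter (~1–2 bytes/entry) of the client's store integrities. The server streams every blob the filter reports as absent. A Bloom filter has no false negatives, so this never sends a duplicate; the cost of a rare false positive is a blob the server wrongly skips, which the client would have to detect and fetch separately. Big upload win for warm clients with large stores.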
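A minimal sketch of such a filter, using only the standard library, might look like the following. The `BloomFilter` type, the double-hashing scheme, and the `to_stream` helper are illustrative assumptions, not pnpr's wire format; 8 to 16 bits per entry corresponds to the ~1–2 bytes/entry budget.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter over integrity strings (illustrative only).
struct BloomFilter {
    bits: Vec<u8>,
    num_bits: u64,
    num_hashes: u32,
}

impl BloomFilter {
    fn new(expected_entries: usize, bits_per_entry: usize, num_hashes: u32) -> Self {
        let num_bits = (expected_entries.max(1) * bits_per_entry.max(1)) as u64;
        let bits = vec![0u8; ((num_bits + 7) / 8) as usize];
        Self { bits, num_bits, num_hashes }
    }

    /// Bit positions from two base hashes combined as h1 + i * h2 (double hashing).
    fn positions(&self, integrity: &str) -> Vec<u64> {
        let mut a = DefaultHasher::new();
        integrity.hash(&mut a);
        let h1 = a.finish();
        let mut b = DefaultHasher::new();
        (integrity, 0x9e37_79b9u32).hash(&mut b);
        let h2 = b.finish() | 1; // odd, so successive probes differ
        (0..u64::from(self.num_hashes))
            .map(|i| h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits)
            .collect()
    }

    fn insert(&mut self, integrity: &str) {
        for pos in self.positions(integrity) {
            self.bits[(pos / 8) as usize] |= 1 << (pos % 8);
        }
    }

    /// `false` means definitely absent; `true` means present or a false positive.
    fn may_contain(&self, integrity: &str) -> bool {
        self.positions(integrity)
            .into_iter()
            .all(|pos| (self.bits[(pos / 8) as usize] & (1 << (pos % 8))) != 0)
    }
}

/// Server side: keep every needed integrity the client's filter reports as absent.
/// A false positive here makes the server skip a blob the client lacks, so a
/// real protocol needs some fallback for that case (not shown).
fn to_stream<'a>(client_store: &BloomFilter, needed: &[&'a str]) -> Vec<&'a str> {
    needed.iter().copied().filter(|i| !client_store.may_contain(i)).collect()
}
```

For a store of 1000 entries at 10 bits each, the filter is 1250 bytes on the wire, against roughly 90 KB for the full integrity list at ~90 bytes/entry.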
4. Pipeline resolve → fetch → stream
Overlap the server's own upstream tarball fetch with streaming to the client, instead of fetching every tarball into the store before responding. Biggest cold-install win (when the server cache is also cold), most invasive.
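The overlap can be pictured as a bounded producer/consumer pair: one side fetches tarballs, the other streams whatever has already arrived. The sketch below uses plain threads and a bounded channel purely to stay within the standard library; `fetch_tarball` is a stand-in with no network I/O, and `pipeline` and the channel depth of 4 are hypothetical choices, not pnpr's server design.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Stand-in for the server's upstream tarball fetch (hypothetical, no network I/O).
fn fetch_tarball(name: &str) -> Vec<u8> {
    format!("tarball bytes for {name}").into_bytes()
}

/// Fetches packages on one thread while `send_to_client` drains them on the
/// calling thread, so the first bytes can leave before the last fetch finishes.
fn pipeline<F: FnMut(&str, &[u8])>(packages: Vec<String>, mut send_to_client: F) {
    // A small bound gives backpressure: the fetcher stalls if the client is slow.
    let (tx, rx) = sync_channel::<(String, Vec<u8>)>(4);
    let fetcher = thread::spawn(move || {
        for name in packages {
            let bytes = fetch_tarball(&name);
            if tx.send((name, bytes)).is_err() {
                break; // receiver hung up, e.g. the client disconnected
            }
        }
        // `tx` is dropped here, which ends the receiving loop below.
    });
    for (name, bytes) in rx {
        send_to_client(&name, &bytes);
    }
    fetcher.join().expect("fetcher thread panicked");
}
```

The bounded channel is the design point worth keeping in any real version: it caps how far the fetcher can run ahead of the client, so server memory stays proportional to the channel depth and not to the size of the dependency tree.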
Notes
- The local two-store byte-copy is a test artifact; production is remote, so shared-store/hardlink shortcuts are out of scope.
- Compatibility with the pnpm (TS) CLI and the /v1 protocol may be dropped; pacquet is the only client we need to keep working.
Written by an agent (Claude Code, claude-opus-4-8).