feat: pnpm agent — server-side resolution for faster installs#11251
Add an opt-in pnpm agent server that resolves dependencies server-side and streams only the files missing from the client's store.

Server (@pnpm/agent.server):
- Multi-process HTTP server (Node.js cluster, 9 workers)
- SQLite-backed metadata cache: resolution in ~1s vs ~3.4s with .jsonl
- Streaming NDJSON /v1/install: file digests emitted as packages resolve
- Gzip-compressed streaming /v1/files: no buffering on server or worker
- Binary protocol with server-provided digests (no client rehashing)

Client (@pnpm/agent.client):
- Streaming NDJSON parser dispatches worker batches during resolution
- Worker-thread streaming HTTP + gzip decompress + CAFS writes
- Pre-packed msgpack store index entries written directly to SQLite
- Pipelined headless install via wrapped store controller

Config: `agent: "http://host:port"` in pnpm-workspace.yaml
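The client-side streaming parser mentioned above has to cope with records that are split across network chunks. The following is a minimal sketch of that idea, not the PR's actual parser: `createNdjsonParser` and the record shape (`{ type: 'D', ... }`) are hypothetical names chosen for illustration.

```typescript
import { StringDecoder } from 'node:string_decoder'

// Hypothetical helper: buffers incoming chunks and emits one parsed
// record per complete line, so work can start before the stream ends.
function createNdjsonParser (onRecord: (record: unknown) => void) {
  const decoder = new StringDecoder('utf8') // keeps split multi-byte chars intact
  let pending = ''

  const drain = (): void => {
    let newline = pending.indexOf('\n')
    while (newline !== -1) {
      const line = pending.slice(0, newline).trim()
      pending = pending.slice(newline + 1)
      if (line !== '') onRecord(JSON.parse(line))
      newline = pending.indexOf('\n')
    }
  }

  return {
    push (chunk: Buffer): void {
      pending += decoder.write(chunk)
      drain()
    },
    end (): void {
      // Flush a final record that was not newline-terminated.
      pending += decoder.end()
      const line = pending.trim()
      pending = ''
      if (line !== '') onRecord(JSON.parse(line))
    },
  }
}

// Usage: the second record is split across two chunks.
const seen: unknown[] = []
const parser = createNdjsonParser((record) => seen.push(record))
parser.push(Buffer.from('{"type":"D","n":1}\n{"ty'))
parser.push(Buffer.from('pe":"L"}\n'))
parser.end()
```

Emitting records as soon as a line completes is what lets file downloads be dispatched while resolution is still running on the server.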
Two bugs that caused headless install to re-download from npm:
1. Client parsed I line key incorrectly: the key format is "integrity\tpkgId\tbase64" but the parser split at the first tab, giving only the integrity hash as the key. Fixed to split at the last tab so key = "integrity\tpkgId".
2. After writing index entries, the WAL wasn't checkpointed. Other SQLite connections (in worker threads) used stale WAL indexes. Added StoreIndex.checkpoint() which runs PRAGMA wal_checkpoint.

Also removed gzip from /v1/files streaming (was causing corrupt header errors) and removed debug logging.
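The first fix can be illustrated with a small sketch. It assumes the text passed in is the `integrity\tpkgId\tbase64` portion of an `I` line with any leading line marker already stripped; `parseIndexLine` and `IndexLine` are hypothetical names, not the PR's.

```typescript
interface IndexLine {
  key: string // "integrity\tpkgId"
  payload: Buffer // decoded base64 payload
}

// Split at the LAST tab: the key itself contains a tab, so splitting
// at the first tab would truncate it to the integrity hash alone.
function parseIndexLine (body: string): IndexLine {
  const lastTab = body.lastIndexOf('\t')
  if (lastTab === -1) throw new Error('malformed I line: no tab separator')
  return {
    key: body.slice(0, lastTab),
    payload: Buffer.from(body.slice(lastTab + 1), 'base64'),
  }
}

const parsed = parseIndexLine('sha512-abc\tfoo@1.0.0\taGk=')
```

Splitting from the right works here because base64 output never contains a tab, whereas the key does.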
…rectly The wrapped fetchPackage now calls readPkgFromCafs with verifyStoreIntegrity: false instead of going through the real fetchPackage (which has verifyStoreIntegrity baked in at creation time). This skips stat + hash checks for all 33K files that were just written by the agent.
Pipe the whole response through createGzip(level 1) on the server, createGunzip on the client. Better compression ratio (cross-file context), simpler code (no per-file gzip flag), and one decompress pipe instead of thousands of gunzipSync calls. 274MB uncompressed → ~80MB over the wire. Negligible overhead on localhost, significant savings on remote servers.
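As a rough illustration of whole-stream compression, the sketch below pipes several chunks through one `createGzip({ level: 1 })` and one `createGunzip()`, using only Node's built-in `zlib` and `stream` modules. The `roundTrip` function and the in-memory source and sink are assumptions for the demo; the real code sits between an HTTP response and CAFS writes.

```typescript
import { createGzip, createGunzip } from 'node:zlib'
import { Readable, Writable } from 'node:stream'
import { pipeline } from 'node:stream/promises'

// Hypothetical demo: all "files" share one gzip stream, so there is a
// single compressor on the sending side and a single decompressor on
// the receiving side instead of one gzip call per file.
async function roundTrip (chunks: Buffer[]): Promise<Buffer> {
  const received: Buffer[] = []
  const sink = new Writable({
    write (chunk: Buffer, _encoding, callback) {
      received.push(chunk)
      callback()
    },
  })
  await pipeline(
    Readable.from(chunks),
    createGzip({ level: 1 }), // fastest compression level
    createGunzip(),
    sink
  )
  return Buffer.concat(received)
}
```

Calling `roundTrip([Buffer.from('file-a'), Buffer.from('file-b')])` resolves to the concatenated original bytes, which is the property the binary framing on top of the stream relies on.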
The resolver can skip npm registry requests when cached metadata satisfies the minimum release age constraint.
The `|| importerId === '.'` fallback in the importer matching predicate caused any project to match when iterating the root importer, so in a workspace where the root project was not the first entry in `projects[]` the root snapshot could be written under a non-root project's key. Lockfile importer IDs are already the relative paths from lockfileDir (tmpDir) to each project dir, which equal the requested project.dir values by construction — no remapping is needed.
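To make the failure mode concrete, here is a reconstruction of the predicate's behaviour. Only the `|| importerId === '.'` fragment comes from the description above; the `Project` type, the function names, and the surrounding lookup are assumptions.

```typescript
interface Project {
  dir: string
}

// Reconstructed buggy form: the '.' fallback makes EVERY project match
// whenever the importer being processed is the root importer.
function matchesBuggy (project: Project, importerId: string): boolean {
  return project.dir === importerId || importerId === '.'
}

// Fixed form: importer IDs already equal the requested project dirs.
function matchesFixed (project: Project, importerId: string): boolean {
  return project.dir === importerId
}

// Root project is NOT the first entry, which is the triggering case.
const projects: Project[] = [{ dir: 'packages/a' }, { dir: '.' }]
const buggyOwner = projects.find((p) => matchesBuggy(p, '.'))
const fixedOwner = projects.find((p) => matchesFixed(p, '.'))
```

With the buggy predicate the root snapshot is attributed to `packages/a`; with the strict comparison it goes to the root project.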
Pull request overview
This PR introduces an opt-in “pnpm agent” architecture that moves dependency resolution to a local/remote server and overlaps resolution with store-aware file transfers, aiming to reduce install time by downloading only missing CAFS content and pre-seeding the client store index.
Changes:
- Adds new `@pnpm/agent.server` (HTTP server) and `@pnpm/agent.client` (NDJSON-based install protocol + parallel file fetching via worker threads).
- Integrates the agent into the standard install flow behind a new `agent` config option, plus supporting config plumbing.
- Extends store index + resolver plumbing (raw index access/keys/checkpoint; injectable npm-resolver meta cache) and adds tests.
Reviewed changes
Copilot reviewed 46 out of 47 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| worker/src/types.ts | Adds worker message types for CAFS writes and agent file fetching. |
| worker/src/start.ts | Implements worker-side /v1/files fetch + CAFS write path and direct CAFS writes. |
| worker/src/index.ts | Exposes writeCafsFiles, fetchAndWriteCafsFiles, and import concurrency setter. |
| store/index/src/index.ts | Adds raw reads, key iteration, and WAL checkpoint helper for agent workflows. |
| store/cafs/src/index.ts | Exports contentPathFromHex for CAFS path derivation in workers. |
| resolving/npm-resolver/src/index.ts | Allows injecting a pre-populated metadata cache (e.g. SQLite-backed) into resolver. |
| pnpm/tsconfig.json | Adds TS project reference to agent/server. |
| pnpm/test/install/pnpmRegistry.ts | Adds e2e-ish install tests validating agent usage via --config.agent. |
| pnpm/package.json | Adds @pnpm/agent.server dependency for tests. |
| pnpm-workspace.yaml | Includes agent/* packages in the workspace. |
| pnpm-lock.yaml | Adds lockfile entries for new agent packages and deps-installer changes. |
| installing/deps-installer/tsconfig.json | Adds TS project reference to agent/client. |
| installing/deps-installer/src/install/index.ts | Adds agent-enabled install path (installFromPnpmRegistry) and workspace support. |
| installing/deps-installer/src/install/extendInstallOptions.ts | Adds agent?: string to install options. |
| installing/deps-installer/package.json | Adds dependencies needed for agent install path (@pnpm/agent.client, @pnpm/store.index). |
| installing/commands/src/recursive.ts | Plumbs agent through recursive command options. |
| installing/commands/src/installDeps.ts | Plumbs agent through install-deps command options. |
| installing/commands/src/install.ts | Adds agent to rc option types and install command options. |
| core/types/src/package.ts | Adds agent?: string to PnpmSettings. |
| config/reader/src/types.ts | Registers config key type for agent. |
| config/reader/src/getOptionsFromRootManifest.ts | Allows reading agent from pnpm settings. |
| config/reader/src/configFileKey.ts | Excludes agent from global config file keys. |
| config/reader/src/Config.ts | Adds agent?: string to config shape. |
| agent/server/tsconfig.lint.json | Adds lint tsconfig for agent server. |
| agent/server/tsconfig.json | Adds TS config + references for agent server package. |
| agent/server/test/tsconfig.json | Adds test tsconfig for agent server. |
| agent/server/test/integration.ts | Adds integration tests for server/client interaction. |
| agent/server/test/diff.ts | Adds unit tests for diff computation logic. |
| agent/server/src/protocol.ts | Implements (exported) binary protocol encoder (currently not wired to server endpoints). |
| agent/server/src/metadataStore.ts | Implements SQLite-backed metadata cache for resolver. |
| agent/server/src/index.ts | Exports server APIs. |
| agent/server/src/fileStore.ts | Adds SQLite-backed file store for bulk file reads. |
| agent/server/src/diff.ts | Adds integrity index builder + missing-file diff computation. |
| agent/server/src/createRegistryServer.ts | Implements /v1/install (NDJSON) and /v1/files (gzip stream) endpoints. |
| agent/server/src/bin.ts | Adds clustered CLI entrypoint for the agent server. |
| agent/server/package.json | Defines new @pnpm/agent.server package. |
| agent/server/README.md | Documents agent server (currently diverges from implementation). |
| agent/client/tsconfig.lint.json | Adds lint tsconfig for agent client. |
| agent/client/tsconfig.json | Adds TS config + references for agent client package. |
| agent/client/test/tsconfig.json | Adds test tsconfig for agent client. |
| agent/client/test/protocol.ts | Adds tests for binary protocol decoding. |
| agent/client/src/protocol.ts | Adds binary protocol decoding implementation (currently not used by install flow). |
| agent/client/src/index.ts | Exports client APIs. |
| agent/client/src/fetchFromPnpmRegistry.ts | Implements NDJSON /v1/install streaming + worker-dispatched /v1/files fetching. |
| agent/client/package.json | Defines new @pnpm/agent.client package. |
| agent/client/README.md | Documents agent client (currently diverges from implementation). |
| agent/client/.gitignore | Ignores local store/ under client package. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
    const headlessOpts = {
      ...opts,
      // Skip re-verifying files just written from the agent — they're
      // guaranteed correct (server verified, no rehashing needed).
      verifyStoreIntegrity: false,
      storeController: wrappedStoreController,
headlessOpts sets verifyStoreIntegrity: false for the entire install when opts.agent is configured. This disables integrity verification even for packages/files that were already present in the store before the agent run, expanding the trust boundary to the whole store. Consider scoping the skip to only the files written by the agent (or keeping global verification on and selectively short-circuiting verification for freshly downloaded digests).
Same reasoning as the comment on line 1939 — this is part of the agent-mode trust boundary tracked by the TODO in the PR description. Agent mode assumes a trusted server and a content-addressed store where file paths encode their sha512. Narrowing verification to only agent-written digests requires threading that set into headless install; doing it in a follow-up rather than expanding this PR further.
- Delete dead binary-protocol code (`encodeResponse`, `decodeResponse`) and related types; the agent uses NDJSON + gzip-streamed /v1/files now
- Remove unused `port` option from `RegistryServerOptions`
- Delay `writeHead(200)` until after JSON parsing on /v1/install and /v1/files so parse failures return a 400 with a JSON error body
- On late failures, emit `E`-line NDJSON instead of a bogus JSON body
- Use real `computeDiff` stats in the final `L` line
- Validate sha512 digests on /v1/files before trusting them in binary headers (rejects invalid hex with 400)
- Normalize agent base URL with trailing `/` + relative paths so path prefixes are preserved (client + worker)
- Client: handle server-emitted `E` lines and reject on unexpected stream end without a lockfile
- Worker: reject non-2xx /v1/files responses; fail if the stream ends without the 64-byte end marker or leaves unparsed bytes
- Worker: validate `setImportConcurrency` input (positive integer)
- Close `realServer` and clean up tmp dir in pnpmRegistry e2e afterAll
- Update agent server and client READMEs (config key `agent`, NDJSON+gzip protocol)
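The base-URL normalization in the list above matters because of how the standard `URL` constructor resolves references. A minimal sketch, with `agentEndpoint` as a hypothetical helper name and the host and port invented for the example:

```typescript
// Ensure the base ends with '/', then resolve a RELATIVE path against it.
// Without the trailing slash the last path segment is replaced; with a
// leading '/' on the path the whole prefix is discarded.
function agentEndpoint (agentUrl: string, relativePath: string): URL {
  const base = agentUrl.endsWith('/') ? agentUrl : `${agentUrl}/`
  return new URL(relativePath, base)
}

const good = agentEndpoint('http://host:4000/pnpm', 'v1/install').href
const lostSegment = new URL('v1/install', 'http://host:4000/pnpm').href
const lostPrefix = new URL('/v1/install', 'http://host:4000/pnpm/').href
```

Here `good` keeps the `/pnpm` prefix, while both `lostSegment` and `lostPrefix` resolve to `http://host:4000/v1/install`, which would miss an agent mounted behind a path prefix.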
Pull request overview
Copilot reviewed 44 out of 45 changed files in this pull request and generated 7 comments.
- Reject YAML-unsafe characters (control chars, quotes, backtick, backslash) in `project.dir` on /v1/install; a crafted dir could otherwise break out of the single-quoted scalar emitted into pnpm-workspace.yaml
- Close the agent-install StoreIndex in a finally so a failing fetchFromPnpmRegistry doesn't leak a SQLite handle (on Windows that also blocks store cleanup)
- Validate /v1/files stream entries against the set of digests the worker actually requested, so a misbehaving agent can't stream extra or unrelated entries and write unbounded files into CAFS
- Wrap the /v1/files stream's `processBuffer()` in try/catch so write/mkdir errors reject the worker Promise instead of surfacing as uncaughtException and crashing the worker thread
- Retry `StoreIndex.checkpoint()` under SQLITE_BUSY for consistency with the rest of the DB ops in this file
- Counting proxy in pnpmRegistry e2e test now guards undefined `req.url` and handles `proxyReq` errors with a 502 response
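The `project.dir` check in the first item could look roughly like the sketch below. The exact character set and the name `isSafeProjectDir` are assumptions derived from the categories listed above (control chars, quotes, backtick, backslash), not the PR's actual regular expression.

```typescript
// Assumed character class: ASCII control characters, single and double
// quotes, backtick, and backslash.
const UNSAFE_DIR_CHARS = /[\u0000-\u001f\u007f'"`\\]/

function isSafeProjectDir (dir: string): boolean {
  return !UNSAFE_DIR_CHARS.test(dir)
}

const accepted = isSafeProjectDir('packages/a')
const quoteRejected = isSafeProjectDir("packages/a' # injected")
const backslashRejected = isSafeProjectDir('packages\\a')
const newlineRejected = isSafeProjectDir('packages/a\nfoo: bar')
```

A server would answer with a 400 when the check fails, before the value is ever written into a quoted YAML scalar.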
Pull request overview
Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.
- Close `opts.storeController` in the agent install path's finally so pending StoreIndex writes are flushed on both success and failure, matching the normal install lifecycle
- Reject non-array `digests` in /v1/files with a 400 before the iteration so a malformed request body can't reach the stream loop
- Normalize workspace project `dir` to POSIX separators on the client: `path.relative()` returns backslashes on Windows, which the agent server now rejects as unsafe YAML characters
- Worker /v1/files stream now fails if the server ended cleanly but omitted any requested digest (tracked via `requestedDigests`); previously this silently succeeded and left the CAFS incomplete
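The request/response bookkeeping described for the worker (reject entries that were never requested, and fail when a requested digest never arrives) can be sketched with a single set. `DigestTracker` is a hypothetical class; the PR tracks this through `requestedDigests` inside the worker's stream handling.

```typescript
// Hypothetical tracker: every streamed entry must consume one requested
// digest, and nothing may be left over when the stream ends.
class DigestTracker {
  private readonly pending: Set<string>

  constructor (requested: Iterable<string>) {
    this.pending = new Set(requested)
  }

  // Called for each entry parsed from the /v1/files stream.
  accept (digest: string): void {
    if (!this.pending.delete(digest)) {
      throw new Error(`unexpected or duplicate digest: ${digest}`)
    }
  }

  // Called once the end-of-stream marker has been seen.
  assertComplete (): void {
    if (this.pending.size > 0) {
      throw new Error(`stream ended with ${this.pending.size} requested digest(s) missing`)
    }
  }
}
```

Using one set for both checks keeps the worker's memory bounded by the size of its own request, regardless of what the server sends.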
Pull request overview
Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.
- Filter `readStoreIntegrities()` to actual SRI prefixes (sha512-, sha256-, sha1-) so non-integrity StoreIndex keys (e.g. git-hosted entries) aren't sent over to the agent server
- Worker /v1/files: when EEXIST hits an existing CAFS file, verify the on-disk size matches the received content and atomically rewrite if it doesn't. This guards against truncated files left by a crashed previous process (which would otherwise be silently trusted by the agent path that skips integrity verification)
- `setImportConcurrency` disposers now restore only when their own limiter is still active, so overlapping installs in the same process don't clobber each other's overrides
- Plumb real `InstallationResultStats` from `headlessInstall()` out of the agent install path; `mutateModules` now returns those instead of all-zeros + a non-existent `updated` field
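The EEXIST handling in the second item can be sketched as follows. This is an illustration under assumptions, not the worker's code: `writeCafsFile`, its return values, and the temp-file naming are invented for the example.

```typescript
import { mkdtempSync, readFileSync, renameSync, statSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Fast path: exclusive create ('wx') fails with EEXIST if the file is
// already there. On EEXIST, trust the file only if its size matches;
// otherwise replace it via temp file + rename.
function writeCafsFile (filePath: string, content: Buffer): 'created' | 'kept' | 'rewritten' {
  try {
    writeFileSync(filePath, content, { flag: 'wx' })
    return 'created'
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== 'EEXIST') throw err
  }
  if (statSync(filePath).size === content.length) return 'kept'
  const tmpPath = `${filePath}.${process.pid}.tmp` // hypothetical naming scheme
  writeFileSync(tmpPath, content)
  renameSync(tmpPath, filePath)
  return 'rewritten'
}

// Usage against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), 'cafs-sketch-'))
const target = join(dir, 'blob')
const content = Buffer.from('hello world')
const first = writeCafsFile(target, content) // file does not exist yet
const second = writeCafsFile(target, content) // same size: left alone
writeFileSync(target, 'hel') // simulate a truncated leftover
const third = writeCafsFile(target, content) // size mismatch: rewritten
const finalContent = readFileSync(target, 'utf8')
```

A size comparison is much cheaper than rehashing and catches truncation, which is the failure this item targets; it does not detect same-size corruption.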
Pull request overview
Copilot reviewed 44 out of 45 changed files in this pull request and generated 2 comments.
    for (const file of files) {
      if (!file.endsWith('.jsonl')) continue
      const pkgName = decodeURIComponent(file.replace('.jsonl', ''))
      const cacheKey = isFullMeta ? `${pkgName}:full` : pkgName

      if (this.has(cacheKey)) continue
decodeURIComponent() can throw if it encounters a malformed percent-encoding in the filename. Since this code is scanning the filesystem (which may contain partially-written or unexpected filenames), a single bad .jsonl filename would currently abort the whole import. Consider wrapping the decode in a try/catch (or falling back to the raw filename) so importFromCacheDir() remains robust against corrupt entries.
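One way to apply this suggestion is to isolate the decode in a helper that reports failure instead of throwing. This sketch skips undecodable names rather than falling back to the raw filename; `pkgNameFromCacheFile` is a hypothetical name and either policy would satisfy the comment.

```typescript
// Returns the decoded package name, or undefined when the entry should
// be skipped (wrong extension or malformed percent-encoding).
function pkgNameFromCacheFile (file: string): string | undefined {
  if (!file.endsWith('.jsonl')) return undefined
  // slice() removes only the suffix; replace('.jsonl', '') would remove
  // the first occurrence anywhere in the name.
  const encoded = file.slice(0, -'.jsonl'.length)
  try {
    return decodeURIComponent(encoded)
  } catch {
    return undefined // URIError from a malformed escape such as "%E0%A4%A"
  }
}

const scoped = pkgNameFromCacheFile('%40scope%2Fpkg.jsonl')
const malformed = pkgNameFromCacheFile('%E0%A4%A.jsonl')
const otherFile = pkgNameFromCacheFile('readme.txt')
```

In the import loop, a `continue` on `undefined` keeps one corrupt filename from aborting the whole directory scan.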
Reading d.digest/d.executable on a null or malformed entry would throw and surface as a 500 after headers may have been written. Now each entry is validated (must be an object with a valid sha512 hex digest and boolean executable) and any mismatch is rejected with a 400.
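The per-entry validation described here can be sketched as a type guard. The request body shape `{ digests: [...] }` and the field names `digest` and `executable` follow the text above; requiring lowercase hex and the helper names are assumptions.

```typescript
interface FileRequestEntry {
  digest: string // sha512 as 128 hex characters
  executable: boolean
}

const SHA512_HEX = /^[0-9a-f]{128}$/ // lowercase-only is an assumption

function isFileRequestEntry (value: unknown): value is FileRequestEntry {
  if (typeof value !== 'object' || value === null) return false
  const entry = value as Record<string, unknown>
  return typeof entry.digest === 'string' &&
    SHA512_HEX.test(entry.digest) &&
    typeof entry.executable === 'boolean'
}

// Returns the validated list, or undefined when the caller should
// respond with a 400 before writing any response headers.
function validateDigests (body: unknown): FileRequestEntry[] | undefined {
  if (typeof body !== 'object' || body === null) return undefined
  const digests = (body as Record<string, unknown>).digests
  if (!Array.isArray(digests) || !digests.every(isFileRequestEntry)) return undefined
  return digests
}
```

Validating the whole array up front means a bad entry is reported as a client error rather than surfacing mid-stream.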
Previously the agent path only handled `mutation === 'install'`; `pnpm add` and `pnpm remove` would fall through to the local resolver. Extend it to cover all three mutation types:
- `install`: unchanged.
- `uninstallSome` (pnpm remove): apply `removeDeps()` to the manifest client-side before sending it to the agent, so the resulting lockfile naturally excludes the dropped deps.
- `installSome` (pnpm add): parse `dependencySelectors` and merge them into the manifest (unspecified versions default to "latest"), send to the agent, then copy the resolved specifier from each lockfile importer entry back into the client manifest. The save prefix, catalog substitution, and normalization all happen on the server during its resolution pass; we just adopt the result.

Update flags (`update`/`updateMatching`/`updateToLatest`) still fall through to the normal client-side resolver since they need client-side lockfile manipulation that the agent protocol doesn't expose yet.

For partial workspace operations (e.g. `pnpm --filter X add foo`) the agent now receives every workspace project (mutated ones with their pre-processed manifest, the rest with their current manifest) so the returned lockfile contains every importer and headlessInstall doesn't crash on missing entries.
The agent server's plain-install path writes the user's raw spec
('latest') into the lockfile importer's specifier rather than the
normalized save-prefix range. For 'pnpm add foo' via the agent this
meant the manifest ended up with 'foo: "latest"' instead of
'foo: "^X.Y.Z"'.
Fix: track whether each installSome selector had a user-provided
version. For deps where the user didn't specify a version, compute
the save-prefix spec from the resolved version in the lockfile
importer's dependencies map using the mutation's pinnedVersion. Deps
with an explicit user spec keep the user's spec verbatim.
Adds the following e2e tests for the agent path:
- pnpm add without a version (exercises the round-trip)
- pnpm add -D (devDependencies targeting)
- pnpm add with multiple selectors
- pnpm --filter remove in a workspace
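The save-prefix fix described in this commit can be approximated as follows. The mapping from `pinnedVersion` to a range prefix (`major` to `^`, `minor` to `~`, `patch` to an exact version) is an assumption about pnpm's convention, and `specFromResolved`, `manifestSpec`, and the `Selector` shape are hypothetical.

```typescript
type PinnedVersion = 'none' | 'major' | 'minor' | 'patch'

interface Selector {
  alias: string
  userSpec?: string // set only when the user typed a version or range
}

// Assumed mapping from pinnedVersion to the saved range.
function specFromResolved (version: string, pinnedVersion: PinnedVersion): string {
  switch (pinnedVersion) {
    case 'minor': return `~${version}`
    case 'patch': return version
    default: return `^${version}`
  }
}

// An explicit user spec is kept verbatim; otherwise the spec is derived
// from the version the lockfile importer resolved to.
function manifestSpec (selector: Selector, resolvedVersion: string, pinnedVersion: PinnedVersion): string {
  return selector.userSpec ?? specFromResolved(resolvedVersion, pinnedVersion)
}

const implicit = manifestSpec({ alias: 'foo' }, '1.2.3', 'major')
const explicit = manifestSpec({ alias: 'foo', userSpec: '~1.0.0' }, '1.0.4', 'major')
```

Tracking whether the version was user-provided is what distinguishes `pnpm add foo` from `pnpm add foo@latest`, since both reach the server as the spec `latest`.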
Drop the workspace-internal name and dotted suffix in favor of an unscoped name that doubles as the bin name. Set initial version 0.0.1-0 and remove `private: true` so it can be published.
- Update workspace consumers (pnpm CLI, agent/server tests, e2e test) to import from 'pnpm-agent'.
- Update @pnpm/agent.client README JSDoc reference.
- Add 'pnpm-agent' to the meta-updater's with-registry preset list so its tests still get the registry-mock setup.
- Add agent/server/Dockerfile that installs pnpm-agent globally via pnpm and runs it. Image only works once pnpm-agent is published to npm.
      storeIndex.close()
    })

    return server
random question: what's the performance like? how many install requests per second can the server handle concurrently?
I don't know yet. I have only tested with a single client, and only the worst-case scenario.
      }

      // Create package.json for each project
      await Promise.all(projects.map(async (project) => {
One of my concerns is that each install request could now cause a significant amount of I/O on the server side 🤔, which could limit how many requests the server can handle concurrently.
We store the files in a SQLite DB, so the amount of I/O should be minimal with a hot cache. But I don't have an answer right now; I have only been concentrating on single-client performance so far.
Swap node:22-slim + corepack for the official pnpm base image. Node is now installed inside the container via `pnpm runtime set node 22 -g`, matching the intended flow of the GHCR image (which ships pnpm but no Node runtime). Drops the manual PNPM_HOME/PATH setup — the base image already sets PNPM_HOME=/pnpm and puts /pnpm/bin on PATH, so both `node` and `pnpm-agent` are resolved out of the same global bin dir.
Summary
Adds an opt-in pnpm agent server that resolves dependencies server-side and streams only the files missing from the client's content-addressable store.
- `@pnpm/agent.server`: multi-process HTTP server (Node.js `cluster`) with SQLite-backed metadata and file caches
- `@pnpm/agent.client`: streams an NDJSON response, dispatches worker threads to fetch files while the server is still resolving
- `agent` in `pnpm-workspace.yaml` (opt-in)
How it works
- Client sends `POST /v1/install` with dependencies + store integrities
- Server resolves via `install({ lockfileOnly: true })`, with a SQLite-backed `PackageMetaCache` for fast repeat resolution
- `storeController.requestPackage` looks up its files and immediately streams digests the client is missing (NDJSON `D` lines)
- Worker threads call `POST /v1/files`: file downloads overlap with server-side resolution
- Server then sends store index entries (`I` lines) and lockfile (`L` line)
- Headless install uses a wrapped `fetchPackage` that calls `readPkgFromCafs` with `verifyStoreIntegrity: false` (files are trusted from the agent)
- `/v1/files` response is gzip-streamed (274MB → ~80MB): server pipes through `createGzip`, worker pipes through `createGunzip`, parsing and writing files to CAFS as data arrives
Performance
1351-package project, cold local store, warm server (localhost):
Key optimizations
- `/v1/install`: file digests stream during resolution, downloads start before resolution finishes
- `/v1/files`: whole-stream gzip (274MB → ~80MB), significant savings on remote servers
- `fetchPackage` calls `readPkgFromCafs` with `verifyStoreIntegrity: false`
- `writeFileSync` with `wx`: no stat + temp + rename
Usage
Start the server:
Configure in `pnpm-workspace.yaml`:
See `RFC.md` for the full design document.
Test plan
TODO
- `verifyStoreIntegrity: false` is passed to `readPkgFromCafs` for all packages, including ones that existed before the agent run.