feat: pnpm agent — server-side resolution for faster installs#11251

Merged
zkochan merged 31 commits into main from pnpm-agent on Apr 20, 2026

@zkochan zkochan commented Apr 13, 2026

Member

Summary

Adds an opt-in pnpm agent server that resolves dependencies server-side and streams only the files missing from the client's content-addressable store.

  • @pnpm/agent.server — multi-process HTTP server (Node.js cluster) with SQLite-backed metadata and file caches
  • @pnpm/agent.client — streams an NDJSON response, dispatches worker threads to fetch files while the server is still resolving
  • New config: agent in pnpm-workspace.yaml (opt-in)

How it works

  1. Client reads integrity hashes from its local store index
  2. Sends POST /v1/install with dependencies + store integrities
  3. Server resolves the dependency tree using pnpm's install({ lockfileOnly: true }), with a SQLite-backed PackageMetaCache for fast repeat resolution
  4. As each package resolves, a wrapped storeController.requestPackage looks up its files and immediately streams digests the client is missing (NDJSON D lines)
  5. Client reads the stream line by line; digest batches fill up and dispatch worker threads to POST /v1/files — file downloads overlap with server-side resolution
  6. After resolution, server sends index entries (I lines) and lockfile (L line)
  7. Client writes index entries to store, then runs headless install with a wrapped fetchPackage that calls readPkgFromCafs with verifyStoreIntegrity: false (files are trusted from the agent)
  8. /v1/files response is gzip-streamed (274MB → ~80MB) — server pipes through createGzip, worker pipes through createGunzip, parsing and writing files to CAFS as data arrives
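
The client-side stream handling in steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the PR's parser: the framing used here (a one-letter tag, a tab, then the payload) and the `LineSplitter` name are assumptions made for the example. Only the tag letters D, I, L and E come from this PR.

```typescript
// Hypothetical framing for this sketch: "<tag>\t<payload>\n".
const TAGS = ['D', 'I', 'L', 'E'] as const
type Tag = typeof TAGS[number]

interface AgentLine { tag: Tag, payload: string }

function isTag (value: string): value is Tag {
  return (TAGS as readonly string[]).includes(value)
}

class LineSplitter {
  private pending = ''

  // Feed one network chunk; returns every complete line that chunk closes.
  push (chunk: string): AgentLine[] {
    this.pending += chunk
    const parts = this.pending.split('\n')
    // The last element is an unfinished line ('' if the chunk ended on '\n').
    this.pending = parts.pop() ?? ''
    const lines: AgentLine[] = []
    for (const raw of parts) {
      if (raw === '') continue
      const tag = raw.charAt(0)
      if (!isTag(tag) || raw.charAt(1) !== '\t') {
        throw new Error(`Malformed line: ${JSON.stringify(raw)}`)
      }
      lines.push({ tag, payload: raw.slice(2) })
    }
    return lines
  }

  // True if the stream stopped in the middle of a line.
  hasPartialLine (): boolean {
    return this.pending !== ''
  }
}
```

A caller would hand D payloads to the worker batches as soon as `push` returns them, which is what lets downloads overlap with resolution.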

Performance

1351-package project, cold local store, warm server (localhost):

| Scenario | Time |
| --- | --- |
| Vanilla pnpm install (cold OS cache) | ~48s |
| Vanilla pnpm install (warm OS cache) | ~34s |
| With pnpm agent (consistent) | ~33s |

Key optimizations

  1. SQLite metadata cache — server-side resolution drops from ~3.4s to ~0.9s
  2. SQLite file store — consistent read performance regardless of OS file cache state
  3. Streaming /v1/install — file digests stream during resolution, downloads start before resolution finishes
  4. Gzip-streamed /v1/files — whole-stream gzip (274MB → ~80MB), significant savings on remote servers
  5. Worker-thread streaming HTTP — workers pipe gzip → parse → write to CAFS as data arrives, no buffering
  6. No rehashing — server-provided digests used directly, skipping 33K SHA-512 computations
  7. No re-verification — wrapped fetchPackage calls readPkgFromCafs with verifyStoreIntegrity: false
  8. Direct writeFileSync with wx — no stat + temp + rename
  9. Pre-packed msgpack — server sends raw store index buffers, client writes directly to SQLite
  10. WAL checkpoint — ensures store index entries written by agent are visible to headless install's worker threads
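
Optimization 8 relies on Node's exclusive-create flag. The sketch below shows the idea under stated assumptions: `writeIfAbsent` is a hypothetical helper, the path layout is made up, and creating the parent directory up front is a simplification of this example rather than what the worker does.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { dirname, join } from 'node:path'

// Writes `data` to `filePath` only if nothing exists there yet.
// Returns true if the file was created, false if it already existed.
function writeIfAbsent (filePath: string, data: Uint8Array): boolean {
  mkdirSync(dirname(filePath), { recursive: true })
  try {
    // 'wx' opens for writing and fails with EEXIST if the path exists,
    // so there is no separate stat call and no temp file to rename.
    writeFileSync(filePath, data, { flag: 'wx' })
    return true
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'EEXIST') return false
    throw err
  }
}

// Usage: the second write is a no-op because the file is already there.
const sketchDir = mkdtempSync(join(tmpdir(), 'cafs-sketch-'))
const target = join(sketchDir, 'ab', 'cdef')
const created = writeIfAbsent(target, Buffer.from('one'))
const createdAgain = writeIfAbsent(target, Buffer.from('two'))
const onDisk = readFileSync(target, 'utf8')
```

In a content-addressed store an existing path normally already holds the right bytes, which is why EEXIST can be treated as success here.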

Usage

Start the server:

node agent/server/lib/bin.js

Configure in pnpm-workspace.yaml:

agent: http://localhost:4873

See RFC.md for the full design document.

Test plan

  • 3 client protocol round-trip tests (encode/decode binary format)
  • 6 server diff computation unit tests
  • 4 server integration tests (full server+client flow with registry-mock)
  • 1 CLI e2e test (starts server, runs pnpm install, verifies server was used)
  • Manual benchmarking on real projects (localhost and remote server)

TODO

  • Only skip store integrity verification for files just written by the agent, not pre-existing files. Currently verifyStoreIntegrity: false is passed to readPkgFromCafs for all packages, including ones that existed before the agent run.

zkochan added 14 commits April 17, 2026 17:53
Add an opt-in pnpm agent server that resolves dependencies server-side
and streams only the files missing from the client's store.

Server (@pnpm/agent.server):
- Multi-process HTTP server (Node.js cluster, 9 workers)
- SQLite-backed metadata cache — resolution in ~1s vs ~3.4s with .jsonl
- Streaming NDJSON /v1/install — file digests emitted as packages resolve
- Gzip-compressed streaming /v1/files — no buffering on server or worker
- Binary protocol with server-provided digests (no client rehashing)

Client (@pnpm/agent.client):
- Streaming NDJSON parser dispatches worker batches during resolution
- Worker-thread streaming HTTP + gzip decompress + CAFS writes
- Pre-packed msgpack store index entries written directly to SQLite
- Pipelined headless install via wrapped store controller

Config: `agent: "http://host:port"` in pnpm-workspace.yaml

Two bugs that caused headless install to re-download from npm:

1. Client parsed I line key incorrectly: the key format is
   "integrity\tpkgId\tbase64" but the parser split at the first tab,
   giving only the integrity hash as the key. Fixed to split at the
   last tab so key = "integrity\tpkgId".

2. After writing index entries, the WAL wasn't checkpointed. Other
   SQLite connections (in worker threads) used stale WAL indexes.
   Added StoreIndex.checkpoint() which runs PRAGMA wal_checkpoint.

Also removed gzip from /v1/files streaming (was causing corrupt
header errors) and removed debug logging.
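
The first fix can be illustrated with a small sketch. It assumes the payload of an I line is exactly the "integrity\tpkgId\tbase64" string described above; `parseIndexEntry` and `IndexEntry` are hypothetical names for this example.

```typescript
interface IndexEntry {
  key: string // "integrity\tpkgId"
  packed: Buffer // decoded store index buffer
}

function parseIndexEntry (payload: string): IndexEntry {
  // The key itself contains a tab, so split at the LAST tab.
  // Splitting at the first tab would leave only the integrity hash as key.
  const lastTab = payload.lastIndexOf('\t')
  if (lastTab === -1) {
    throw new Error('Malformed index entry: no tab separator')
  }
  return {
    key: payload.slice(0, lastTab),
    packed: Buffer.from(payload.slice(lastTab + 1), 'base64'),
  }
}
```
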
…rectly

The wrapped fetchPackage now calls readPkgFromCafs with
verifyStoreIntegrity: false instead of going through the real
fetchPackage (which has verifyStoreIntegrity baked in at creation
time). This skips stat + hash checks for all 33K files that were
just written by the agent.

Pipe the whole response through createGzip(level 1) on the server,
createGunzip on the client. Better compression ratio (cross-file
context), simpler code (no per-file gzip flag), and one decompress
pipe instead of thousands of gunzipSync calls.

274MB uncompressed → ~80MB over the wire. Negligible overhead on
localhost, significant savings on remote servers.
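
A minimal sketch of whole-stream compression with Node's zlib, assuming in-memory buffers stand in for the HTTP response. `compressAll`, `decompressAll` and `collector` are hypothetical helpers for this example; the real server and worker pipe to and from sockets and the CAFS instead.

```typescript
import { Readable, Writable } from 'node:stream'
import { pipeline } from 'node:stream/promises'
import { createGunzip, createGzip } from 'node:zlib'

// Collects everything written to it into one Buffer.
function collector (): { sink: Writable, result: () => Buffer } {
  const chunks: Buffer[] = []
  const sink = new Writable({
    write (chunk: Buffer, _encoding, callback) {
      chunks.push(chunk)
      callback()
    },
  })
  return { sink, result: () => Buffer.concat(chunks) }
}

// "Server" side: a single gzip stream wraps the whole multi-file response,
// so the compressor can reuse context across file boundaries.
async function compressAll (files: Buffer[]): Promise<Buffer> {
  const { sink, result } = collector()
  await pipeline(Readable.from(files), createGzip({ level: 1 }), sink)
  return result()
}

// "Client" side: one gunzip stream for the whole response.
async function decompressAll (wire: Buffer): Promise<Buffer> {
  const { sink, result } = collector()
  await pipeline(Readable.from([wire]), createGunzip(), sink)
  return result()
}
```
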
The resolver can skip npm registry requests when cached metadata
satisfies the minimum release age constraint.

The `|| importerId === '.'` fallback in the importer matching predicate
caused any project to match when iterating the root importer, so in a
workspace where the root project was not the first entry in `projects[]`
the root snapshot could be written under a non-root project's key.

Lockfile importer IDs are already the relative paths from lockfileDir
(tmpDir) to each project dir, which equal the requested project.dir
values by construction — no remapping is needed.
@zkochan zkochan marked this pull request as ready for review April 17, 2026 16:37
Copilot AI review requested due to automatic review settings April 17, 2026 16:37

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces an opt-in “pnpm agent” architecture that moves dependency resolution to a local/remote server and overlaps resolution with store-aware file transfers, aiming to reduce install time by downloading only missing CAFS content and pre-seeding the client store index.

Changes:

  • Adds new @pnpm/agent.server (HTTP server) and @pnpm/agent.client (NDJSON-based install protocol + parallel file fetching via worker threads).
  • Integrates the agent into the standard install flow behind a new agent config option, plus supporting config plumbing.
  • Extends store index + resolver plumbing (raw index access/keys/checkpoint; injectable npm-resolver meta cache) and adds tests.

Reviewed changes

Copilot reviewed 46 out of 47 changed files in this pull request and generated 17 comments.

Show a summary per file
| File | Description |
| --- | --- |
| worker/src/types.ts | Adds worker message types for CAFS writes and agent file fetching. |
| worker/src/start.ts | Implements worker-side /v1/files fetch + CAFS write path and direct CAFS writes. |
| worker/src/index.ts | Exposes writeCafsFiles, fetchAndWriteCafsFiles, and import concurrency setter. |
| store/index/src/index.ts | Adds raw reads, key iteration, and WAL checkpoint helper for agent workflows. |
| store/cafs/src/index.ts | Exports contentPathFromHex for CAFS path derivation in workers. |
| resolving/npm-resolver/src/index.ts | Allows injecting a pre-populated metadata cache (e.g. SQLite-backed) into resolver. |
| pnpm/tsconfig.json | Adds TS project reference to agent/server. |
| pnpm/test/install/pnpmRegistry.ts | Adds e2e-ish install tests validating agent usage via --config.agent. |
| pnpm/package.json | Adds @pnpm/agent.server dependency for tests. |
| pnpm-workspace.yaml | Includes agent/* packages in the workspace. |
| pnpm-lock.yaml | Adds lockfile entries for new agent packages and deps-installer changes. |
| installing/deps-installer/tsconfig.json | Adds TS project reference to agent/client. |
| installing/deps-installer/src/install/index.ts | Adds agent-enabled install path (installFromPnpmRegistry) and workspace support. |
| installing/deps-installer/src/install/extendInstallOptions.ts | Adds agent?: string to install options. |
| installing/deps-installer/package.json | Adds dependencies needed for agent install path (@pnpm/agent.client, @pnpm/store.index). |
| installing/commands/src/recursive.ts | Plumbs agent through recursive command options. |
| installing/commands/src/installDeps.ts | Plumbs agent through install-deps command options. |
| installing/commands/src/install.ts | Adds agent to rc option types and install command options. |
| core/types/src/package.ts | Adds agent?: string to PnpmSettings. |
| config/reader/src/types.ts | Registers config key type for agent. |
| config/reader/src/getOptionsFromRootManifest.ts | Allows reading agent from pnpm settings. |
| config/reader/src/configFileKey.ts | Excludes agent from global config file keys. |
| config/reader/src/Config.ts | Adds agent?: string to config shape. |
| agent/server/tsconfig.lint.json | Adds lint tsconfig for agent server. |
| agent/server/tsconfig.json | Adds TS config + references for agent server package. |
| agent/server/test/tsconfig.json | Adds test tsconfig for agent server. |
| agent/server/test/integration.ts | Adds integration tests for server/client interaction. |
| agent/server/test/diff.ts | Adds unit tests for diff computation logic. |
| agent/server/src/protocol.ts | Implements (exported) binary protocol encoder (currently not wired to server endpoints). |
| agent/server/src/metadataStore.ts | Implements SQLite-backed metadata cache for resolver. |
| agent/server/src/index.ts | Exports server APIs. |
| agent/server/src/fileStore.ts | Adds SQLite-backed file store for bulk file reads. |
| agent/server/src/diff.ts | Adds integrity index builder + missing-file diff computation. |
| agent/server/src/createRegistryServer.ts | Implements /v1/install (NDJSON) and /v1/files (gzip stream) endpoints. |
| agent/server/src/bin.ts | Adds clustered CLI entrypoint for the agent server. |
| agent/server/package.json | Defines new @pnpm/agent.server package. |
| agent/server/README.md | Documents agent server (currently diverges from implementation). |
| agent/client/tsconfig.lint.json | Adds lint tsconfig for agent client. |
| agent/client/tsconfig.json | Adds TS config + references for agent client package. |
| agent/client/test/tsconfig.json | Adds test tsconfig for agent client. |
| agent/client/test/protocol.ts | Adds tests for binary protocol decoding. |
| agent/client/src/protocol.ts | Adds binary protocol decoding implementation (currently not used by install flow). |
| agent/client/src/index.ts | Exports client APIs. |
| agent/client/src/fetchFromPnpmRegistry.ts | Implements NDJSON /v1/install streaming + worker-dispatched /v1/files fetching. |
| agent/client/package.json | Defines new @pnpm/agent.client package. |
| agent/client/README.md | Documents agent client (currently diverges from implementation). |
| agent/client/.gitignore | Ignores local store/ under client package. |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread worker/src/start.ts
Comment thread agent/server/src/protocol.ts Outdated
Comment thread agent/server/src/createRegistryServer.ts Outdated
Comment thread pnpm/test/install/pnpmRegistry.ts Outdated
Comment thread agent/client/src/fetchFromPnpmRegistry.ts Outdated
Comment on lines +1953 to +1958
const headlessOpts = {
  ...opts,
  // Skip re-verifying files just written from the agent — they're
  // guaranteed correct (server verified, no rehashing needed).
  verifyStoreIntegrity: false,
  storeController: wrappedStoreController,

Copilot AI Apr 17, 2026

headlessOpts sets verifyStoreIntegrity: false for the entire install when opts.agent is configured. This disables integrity verification even for packages/files that were already present in the store before the agent run, expanding the trust boundary to the whole store. Consider scoping the skip to only the files written by the agent (or keeping global verification on and selectively short-circuiting verification for freshly downloaded digests).

Member Author

Same reasoning as the comment on line 1939 — this is part of the agent-mode trust boundary tracked by the TODO in the PR description. Agent mode assumes a trusted server and a content-addressed store where file paths encode their sha512. Narrowing verification to only agent-written digests requires threading that set into headless install; doing it in a follow-up rather than expanding this PR further.

Comment thread agent/server/README.md
Comment thread agent/client/src/fetchFromPnpmRegistry.ts
Comment thread agent/server/src/createRegistryServer.ts
Comment thread agent/client/README.md Outdated
zkochan added 2 commits April 17, 2026 19:18
- Delete dead binary-protocol code (`encodeResponse`, `decodeResponse`)
  and related types — the agent uses NDJSON + gzip-streamed /v1/files now
- Remove unused `port` option from `RegistryServerOptions`
- Delay `writeHead(200)` until after JSON parsing on /v1/install and
  /v1/files so parse failures return a 400 with a JSON error body
- On late failures, emit `E`-line NDJSON instead of a bogus JSON body
- Use real `computeDiff` stats in the final `L` line
- Validate sha512 digests on /v1/files before trusting them in binary
  headers (rejects invalid hex with 400)
- Normalize agent base URL with trailing `/` + relative paths so path
  prefixes are preserved (client + worker)
- Client: handle server-emitted `E` lines and reject on unexpected
  stream end without a lockfile
- Worker: reject non-2xx /v1/files responses; fail if the stream ends
  without the 64-byte end marker or leaves unparsed bytes
- Worker: validate `setImportConcurrency` input (positive integer)
- Close `realServer` and clean up tmp dir in pnpmRegistry e2e afterAll
- Update agent server and client READMEs (config key `agent`,
  NDJSON+gzip protocol)
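
The base URL normalization in this commit matters because of how WHATWG URL resolution treats the last path segment. A minimal sketch, assuming a hypothetical `agentEndpoint` helper and example hosts:

```typescript
// Resolves an endpoint against the configured agent URL while keeping
// any path prefix (for example when the agent sits behind a reverse proxy).
function agentEndpoint (agentUrl: string, relativePath: string): string {
  // Without the trailing slash, the last path segment of the base
  // would be replaced instead of extended.
  const base = agentUrl.endsWith('/') ? agentUrl : `${agentUrl}/`
  return new URL(relativePath, base).href
}

const plain = agentEndpoint('http://localhost:4873', 'v1/install')
const prefixed = agentEndpoint('https://example.com/pnpm', 'v1/files')
// For contrast: a leading slash makes the path absolute and drops the prefix.
const prefixLost = new URL('/v1/files', 'https://example.com/pnpm/').href
```

Here `plain` is `http://localhost:4873/v1/install`, `prefixed` is `https://example.com/pnpm/v1/files`, and `prefixLost` is `https://example.com/v1/files`.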

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 7 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread installing/deps-installer/src/install/index.ts Outdated
Comment thread pnpm/test/install/pnpmRegistry.ts
Comment thread worker/src/start.ts
Comment thread worker/src/start.ts
Comment thread store/index/src/index.ts
Comment thread agent/server/src/diff.ts
Comment thread agent/server/src/createRegistryServer.ts
zkochan added 2 commits April 17, 2026 21:00
- Reject YAML-unsafe characters (control chars, quotes, backtick,
  backslash) in `project.dir` on /v1/install; a crafted dir could
  otherwise break out of the single-quoted scalar emitted into
  pnpm-workspace.yaml
- Close the agent-install StoreIndex in a finally so a failing
  fetchFromPnpmRegistry doesn't leak a SQLite handle (on Windows
  that also blocks store cleanup)
- Validate /v1/files stream entries against the set of digests the
  worker actually requested — a misbehaving agent can't stream extra
  or unrelated entries and write unbounded files into CAFS
- Wrap the /v1/files stream's `processBuffer()` in try/catch so
  write/mkdir errors reject the worker Promise instead of surfacing
  as uncaughtException and crashing the worker thread
- Retry `StoreIndex.checkpoint()` under SQLITE_BUSY for consistency
  with the rest of the DB ops in this file
- Counting proxy in pnpmRegistry e2e test now guards undefined
  `req.url` and handles `proxyReq` errors with a 502 response

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread installing/deps-installer/src/install/index.ts
Comment thread agent/server/src/createRegistryServer.ts Outdated
Comment thread installing/deps-installer/src/install/index.ts
Comment thread worker/src/start.ts
zkochan added 2 commits April 17, 2026 23:18
- Close `opts.storeController` in the agent install path's finally so
  pending StoreIndex writes are flushed on both success and failure,
  matching the normal install lifecycle
- Reject non-array `digests` in /v1/files with a 400 before the
  iteration so a malformed request body can't reach the stream loop
- Normalize workspace project `dir` to POSIX separators on the client
  — `path.relative()` returns backslashes on Windows which the agent
  server now rejects as unsafe YAML characters
- Worker /v1/files stream now fails if the server ended cleanly but
  omitted any requested digest (tracked via `requestedDigests`);
  previously this silently succeeded and left the CAFS incomplete
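
The `requestedDigests` bookkeeping can be sketched as a small set-based tracker. `DigestTracker` is a hypothetical class for illustration, and treating a repeated digest as an error is an assumption of this sketch rather than documented worker behavior.

```typescript
class DigestTracker {
  private readonly outstanding: Set<string>

  constructor (requested: Iterable<string>) {
    this.outstanding = new Set(requested)
  }

  // Call for every entry parsed from the stream, before writing it.
  accept (digest: string): void {
    if (!this.outstanding.delete(digest)) {
      throw new Error(`Unrequested or duplicate digest in stream: ${digest}`)
    }
  }

  // Call once the stream has ended cleanly.
  assertComplete (): void {
    if (this.outstanding.size > 0) {
      throw new Error(
        `Stream ended with ${this.outstanding.size} requested digest(s) missing`
      )
    }
  }
}
```

Checking in `accept` covers extra entries from a misbehaving server, and `assertComplete` covers the clean-but-incomplete stream described in the last bullet.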

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread agent/client/src/fetchFromPnpmRegistry.ts Outdated
Comment thread worker/src/start.ts
Comment thread worker/src/index.ts
Comment thread installing/deps-installer/src/install/index.ts Outdated
- Filter `readStoreIntegrities()` to actual SRI prefixes (sha512-,
  sha256-, sha1-) so non-integrity StoreIndex keys (e.g. git-hosted
  entries) aren't sent over to the agent server
- Worker /v1/files: when EEXIST hits an existing CAFS file, verify
  the on-disk size matches the received content and atomically
  rewrite if it doesn't — guards against truncated files left by a
  crashed previous process (which would otherwise be silently
  trusted by the agent path that skips integrity verification)
- `setImportConcurrency` disposers now restore only when their own
  limiter is still active, so overlapping installs in the same
  process don't clobber each other's overrides
- Plumb real `InstallationResultStats` from `headlessInstall()` out
  of the agent install path; `mutateModules` now returns those
  instead of all-zeros + a non-existent `updated` field
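
The SRI prefix filter from the first bullet is simple enough to sketch directly. The prefix list comes from the commit message; `filterIntegrityKeys` and the sample keys in the usage lines are hypothetical.

```typescript
const SRI_PREFIXES = ['sha512-', 'sha256-', 'sha1-'] as const

function isIntegrityKey (key: string): boolean {
  return SRI_PREFIXES.some((prefix) => key.startsWith(prefix))
}

// Keeps only keys that look like Subresource Integrity strings.
function filterIntegrityKeys (keys: Iterable<string>): string[] {
  const result: string[] = []
  for (const key of keys) {
    if (isIntegrityKey(key)) result.push(key)
  }
  return result
}

// Usage with made-up keys: the middle entry is not an SRI string.
const kept = filterIntegrityKeys([
  'sha512-AAAA',
  'some-git-hosted-entry',
  'sha1-BBBB',
])
```
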

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 44 out of 45 changed files in this pull request and generated 2 comments.

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

Comment thread agent/server/src/createRegistryServer.ts
Comment on lines +93 to +99
for (const file of files) {
  if (!file.endsWith('.jsonl')) continue
  const pkgName = decodeURIComponent(file.replace('.jsonl', ''))
  const cacheKey = isFullMeta ? `${pkgName}:full` : pkgName

  if (this.has(cacheKey)) continue

Copilot AI Apr 17, 2026

decodeURIComponent() can throw if it encounters a malformed percent-encoding in the filename. Since this code is scanning the filesystem (which may contain partially-written or unexpected filenames), a single bad .jsonl filename would currently abort the whole import. Consider wrapping the decode in a try/catch (or falling back to the raw filename) so importFromCacheDir() remains robust against corrupt entries.
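
One possible shape for that suggestion, as a sketch only: `pkgNameFromCacheFile` is a hypothetical helper, and skipping the entry (rather than falling back to the raw filename) is just one of the two options mentioned above.

```typescript
// Returns the decoded package name for a "<encoded>.jsonl" cache file,
// or undefined if the file should be skipped.
function pkgNameFromCacheFile (file: string): string | undefined {
  if (!file.endsWith('.jsonl')) return undefined
  // Strip only the trailing extension, not the first occurrence.
  const encoded = file.slice(0, -'.jsonl'.length)
  try {
    return decodeURIComponent(encoded)
  } catch (err) {
    // Malformed percent-encoding: skip this file instead of aborting the scan.
    if (err instanceof URIError) return undefined
    throw err
  }
}
```
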

zkochan added 4 commits April 18, 2026 00:20
Reading d.digest/d.executable on a null or malformed entry would throw
and surface as a 500 after headers may have been written. Now each
entry is validated (must be an object with a valid sha512 hex digest
and boolean executable) and any mismatch is rejected with a 400.
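
A minimal sketch of that per-entry validation. The `digest` and `executable` field names come from the commit message; the `FileRequest` type, the lowercase-only hex pattern, and the `parseDigests` helper are assumptions of this example.

```typescript
interface FileRequest { digest: string, executable: boolean }

// A sha512 digest is 64 bytes, i.e. 128 hex characters.
// Accepting only lowercase hex is an assumption of this sketch.
const SHA512_HEX = /^[0-9a-f]{128}$/

function isFileRequest (value: unknown): value is FileRequest {
  if (typeof value !== 'object' || value === null) return false
  const entry = value as Record<string, unknown>
  const digest = entry['digest']
  return typeof digest === 'string' &&
    SHA512_HEX.test(digest) &&
    typeof entry['executable'] === 'boolean'
}

// Returns the validated list, or undefined if the body should get a 400.
function parseDigests (body: unknown): FileRequest[] | undefined {
  if (typeof body !== 'object' || body === null) return undefined
  const digests = (body as Record<string, unknown>)['digests']
  if (!Array.isArray(digests) || !digests.every(isFileRequest)) return undefined
  return digests
}
```

Validating the whole body before the first byte of the response is written keeps the 400 path available.
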
Previously the agent path only handled `mutation === 'install'` — `pnpm
add` and `pnpm remove` would fall through to the local resolver.
Extend it to cover all three mutation types:

- `install`: unchanged.
- `uninstallSome` (pnpm remove): apply `removeDeps()` to the manifest
  client-side before sending it to the agent, so the resulting
  lockfile naturally excludes the dropped deps.
- `installSome` (pnpm add): parse `dependencySelectors` and merge
  them into the manifest (unspecified versions default to "latest"),
  send to the agent, then copy the resolved specifier from each
  lockfile importer entry back into the client manifest. The save
  prefix, catalog substitution, and normalization all happen on the
  server during its resolution pass — we just adopt the result.

Update flags (`update`/`updateMatching`/`updateToLatest`) still fall
through to the normal client-side resolver since they need
client-side lockfile manipulation that the agent protocol doesn't
expose yet.

For partial workspace operations (e.g. `pnpm --filter X add foo`) the
agent now receives every workspace project — mutated ones with their
pre-processed manifest, the rest with their current manifest — so the
returned lockfile contains every importer and headlessInstall doesn't
crash on missing entries.

The agent server's plain-install path writes the user's raw spec
('latest') into the lockfile importer's specifier rather than the
normalized save-prefix range. For 'pnpm add foo' via the agent this
meant the manifest ended up with 'foo: \"latest\"' instead of
'foo: \"^X.Y.Z\"'.

Fix: track whether each installSome selector had a user-provided
version. For deps where the user didn't specify a version, compute
the save-prefix spec from the resolved version in the lockfile
importer's dependencies map using the mutation's pinnedVersion. Deps
with an explicit user spec keep the user's spec verbatim.
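
The rule in this fix can be sketched as below. Everything here is illustrative: `PinLevel`, `specFromResolved` and `chooseSpec` are hypothetical names, and the three pin levels with their `^`, `~` and exact mappings are an assumption about how a save prefix is derived, not a copy of pnpm's implementation.

```typescript
// Hypothetical pin levels for this sketch.
type PinLevel = 'major' | 'minor' | 'patch'

// Builds a manifest spec from the version the lockfile resolved to.
function specFromResolved (resolvedVersion: string, pin: PinLevel): string {
  switch (pin) {
    case 'major': return `^${resolvedVersion}`
    case 'minor': return `~${resolvedVersion}`
    case 'patch': return resolvedVersion
  }
}

// Explicit user specs are kept verbatim; only selectors without a
// version get a spec computed from the resolved version.
function chooseSpec (
  userSpec: string | undefined,
  resolvedVersion: string,
  pin: PinLevel
): string {
  return userSpec ?? specFromResolved(resolvedVersion, pin)
}
```

With these assumptions, `chooseSpec(undefined, '1.2.3', 'major')` yields `^1.2.3`, while `chooseSpec('~1.0.0', '1.0.5', 'major')` stays `~1.0.0`.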

Adds the following e2e tests for the agent path:
- pnpm add without a version (exercises the round-trip)
- pnpm add -D (devDependencies targeting)
- pnpm add with multiple selectors
- pnpm --filter remove in a workspace
@zkochan zkochan requested a review from a team April 19, 2026 00:10
@zkochan zkochan added this to the v11.0 milestone Apr 19, 2026
Drop the workspace-internal name and dotted suffix in favor of an
unscoped name that doubles as the bin name. Set initial version
0.0.1-0 and remove `private: true` so it can be published.

- Update workspace consumers (pnpm CLI, agent/server tests, e2e
  test) to import from 'pnpm-agent'.
- Update @pnpm/agent.client README JSDoc reference.
- Add 'pnpm-agent' to the meta-updater's with-registry preset list
  so its tests still get the registry-mock setup.
- Add agent/server/Dockerfile that installs pnpm-agent globally via
  pnpm and runs it. Image only works once pnpm-agent is published
  to npm.
Comment thread agent/server/src/createRegistryServer.ts
storeIndex.close()
})

return server

Member

Random question: what's the performance like? How many install requests per second can the server handle concurrently?

Member Author

I don't know yet. I have only tested with a single client, and only the worst-case scenario.

}

// Create package.json for each project
await Promise.all(projects.map(async (project) => {

Member

One of my concerns is that each install request could now cause a significant amount of I/O on the server side 🤔, which could limit how many concurrent requests the server can handle.

Member Author

We store the files in a SQLite database, so the amount of I/O should be minimal with a hot cache. But I don't have an answer right now; so far I have concentrated only on performance with a single client.

zkochan added 3 commits April 19, 2026 19:13
Swap node:22-slim + corepack for the official pnpm base image. Node
is now installed inside the container via `pnpm runtime set node 22
-g`, matching the intended flow of the GHCR image (which ships pnpm
but no Node runtime).

Drops the manual PNPM_HOME/PATH setup — the base image already sets
PNPM_HOME=/pnpm and puts /pnpm/bin on PATH, so both `node` and
`pnpm-agent` are resolved out of the same global bin dir.
@zkochan zkochan merged commit ccc606e into main Apr 20, 2026
12 checks passed
@zkochan zkochan deleted the pnpm-agent branch April 20, 2026 09:56