Problem
pnpr currently stores proxied upstream cache and locally-published packages in the same on-disk store, with no way to tell them apart.
Both flows go through the single Cache abstraction (v11/pnpr/crates/pnpr/src/cache.rs), which writes a Verdaccio-shaped tree rooted at config.storage:
    <storage>/
      <package>/
        package.json            # packument
        <name>-<version>.tgz    # tarballs, flat
- Proxied tarballs/packuments are written here by serve_tarball / load_packument_bytes (server.rs).
- Published tarballs/packuments are written here by publish_package (server.rs), via the same Cache::reserve_tarball_paths / write_packument.
There is no marker on disk distinguishing a proxied foo-1.0.0.tgz from a published one. At the packument level it's worse: merge_manifest (publish.rs) seeds the published package.json from the upstream packument and unions versions/dist-tags into it, so a single package.json interleaves published and proxied versions in one file. The doc comment on Config::storage even states the directory doubles as both cache and source of truth.
Consequences
- Cannot clear the proxy cache safely. There is no supported operation to drop just the mirrored upstream artifacts: deleting <storage>/<pkg>/ removes published packages too.
- Published packages share a lifecycle with disposable cache. A naive "clear the cache" wipes the source of truth. Published packages must never be able to disappear.
- Awkward server operations. Backups, durable/replicated volumes, and upgrades all have to treat the entire (potentially huge, fully reconstructible) proxy cache as if it were precious data.
Proposed solution
Physically separate the two stores so they have independent lifecycles.
| | Published store | Proxy cache |
|---|---|---|
| Contents | Locally published tarballs + packument fragments | Mirrored upstream tarballs/packuments |
| Durability | Durable, backed-up. Source of truth. | Disposable: safe to wipe/GC anytime |
| Backing | Persistent volume or object store (S3/GCS) | Local SSD / ephemeral scratch |
| Eviction | Never evicted | TTL + size-cap GC is fine |
Implementation sketch:
- Give Cache two roots (published_root + cache_root). Route publish_package to published_root; route the proxy-fetch paths to cache_root. On read, check published-first, then cache.
- Stop persisting a blended package.json. Store the locally-published packument fragment separately and compose the served packument at request time (published versions overlaid on the cached upstream packument).
- Config: add a published-store path alongside storage (keep storage as the cache root for back-compat, or introduce explicit published_storage + cache_storage).
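
To make the two-root idea concrete, here is a minimal sketch, not the actual cache.rs. Only published_root and cache_root come from the proposal; `Origin`, `write_published`, `write_proxied`, `resolve_tarball`, and `clear_proxy_cache` are hypothetical names chosen for illustration, and the scoped-package file naming is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Which store a tarball was found in (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Origin {
    Published,
    Proxied,
}

/// Hypothetical two-root cache. The real `Cache` has a different API.
pub struct Cache {
    published_root: PathBuf,
    cache_root: PathBuf,
}

impl Cache {
    pub fn new(published_root: impl Into<PathBuf>, cache_root: impl Into<PathBuf>) -> Self {
        Self {
            published_root: published_root.into(),
            cache_root: cache_root.into(),
        }
    }

    /// Relative tarball path, keeping the flat `<package>/<name>-<version>.tgz` layout.
    fn tarball_rel(package: &str, version: &str) -> PathBuf {
        // Assumption: for `@scope/foo` the file name uses only `foo`.
        let name = package.rsplit('/').next().unwrap_or(package);
        Path::new(package).join(format!("{name}-{version}.tgz"))
    }

    fn write(root: &Path, package: &str, version: &str, bytes: &[u8]) -> io::Result<PathBuf> {
        let path = root.join(Self::tarball_rel(package, version));
        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&path, bytes)?;
        Ok(path)
    }

    /// Publish path: only ever touches the durable store.
    pub fn write_published(&self, package: &str, version: &str, bytes: &[u8]) -> io::Result<PathBuf> {
        Self::write(&self.published_root, package, version, bytes)
    }

    /// Proxy-fetch path: only ever touches the disposable store.
    pub fn write_proxied(&self, package: &str, version: &str, bytes: &[u8]) -> io::Result<PathBuf> {
        Self::write(&self.cache_root, package, version, bytes)
    }

    /// Read order from the proposal: published first, then cache.
    pub fn resolve_tarball(&self, package: &str, version: &str) -> Option<(Origin, PathBuf)> {
        let rel = Self::tarball_rel(package, version);
        let published = self.published_root.join(&rel);
        if published.is_file() {
            return Some((Origin::Published, published));
        }
        let proxied = self.cache_root.join(&rel);
        if proxied.is_file() {
            Some((Origin::Proxied, proxied))
        } else {
            None
        }
    }

    /// Dropping the proxy cache becomes a single directory removal that
    /// cannot reach published data. A missing directory counts as success.
    pub fn clear_proxy_cache(&self) -> io::Result<()> {
        match fs::remove_dir_all(&self.cache_root) {
            Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
            _ => Ok(()),
        }
    }
}
```

With this shape, the origin of a tarball is implied by the root it lives under, so no per-file marker is needed, and clearing the cache no longer requires knowing which packages were published.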
Server / deployment benefits
- Published store → durable PVC or S3-compatible object storage; proxy cache → emptyDir / node-local scratch.
- Upgrades retain published packages trivially: swap the binary/image, point it at the same published_root; the cache can start cold and self-heals on first request.
- DR: back up only the published store; the cache never needs backing up.
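
The cold-start behavior relies on composing the served packument at request time instead of persisting a blended package.json. A rough sketch of that overlay follows, under loud assumptions: `Packument` here is a deliberately simplified stand-in (a real packument is JSON with many more fields), `compose_packument` is a hypothetical function, and letting published dist-tags win on conflict is one possible policy, not something the proposal specifies.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a packument: version -> tarball file name,
/// plus dist-tags. Only models what the overlay needs.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct Packument {
    pub versions: BTreeMap<String, String>,
    pub dist_tags: BTreeMap<String, String>,
}

/// Compose the served packument from the cached upstream copy (if present)
/// with the locally published fragment overlaid on top.
/// Returns `None` when the package is unknown to both stores.
pub fn compose_packument(
    upstream: Option<&Packument>,
    published: Option<&Packument>,
) -> Option<Packument> {
    if upstream.is_none() && published.is_none() {
        return None;
    }
    // A cold cache simply means we start from an empty base.
    let mut served = upstream.cloned().unwrap_or_default();
    if let Some(fragment) = published {
        for (version, tarball) in &fragment.versions {
            // Published versions replace any upstream entry with the same key.
            served.versions.insert(version.clone(), tarball.clone());
        }
        for (tag, version) in &fragment.dist_tags {
            // Assumption: published dist-tags win on conflict.
            served.dist_tags.insert(tag.clone(), version.clone());
        }
    }
    Some(served)
}
```

Because the published fragment is never mixed into the cached file, wiping the cache loses nothing: with `upstream` absent, the function returns exactly the published fragment, and the upstream versions reappear once the proxy refetches them.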
This is pnpr-only (no pacquet-side port, no changeset needed).
Written by an agent (Claude Code, claude-opus-4-8).