
pnpr: make remaining local state pluggable for stateless / serverless deployment #12199

Description

@zkochan

Motivation

#12198 moves pnpr's hosted store (published packages + static content) into an S3-compatible object store (S3 / Cloudflare R2 / MinIO / …). That removes the biggest blocker to running pnpr as multiple stateless replicas, but it isn't sufficient on its own: pnpr still keeps other state on local disk that a horizontally-scaled or serverless runtime won't preserve or share.

This issue tracks making the remaining state pluggable so pnpr can run as a stateless, horizontally-scalable service (the realistic near-term target is container-serverless with autoscaling — Cloud Run / Fly / Container Apps / Fargate scale-to-zero; true FaaS is a larger lift, see below).

Organizing principle: every per-instance disk dependency becomes a config-selected backend, defaulting to today's local behavior, following the pattern #12198 established (build the client once at config-load time into an Arc<dyn …> handle so request handlers stay infallible).
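A minimal sketch of that pattern, using only the standard library. Every name here (`BlobStore`, `BlobConfig`, `build_store`, `MemStore`, `ConfigError`) is hypothetical and chosen for illustration; pnpr's real traits and config types may differ, and both backends are stood in for by an in-memory map so the example runs without a disk or a network.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};

/// Hypothetical blob-store trait; the real trait in pnpr may look different.
pub trait BlobStore: Send + Sync {
    fn put(&self, key: &str, bytes: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn kind(&self) -> &'static str;
}

/// Hypothetical config: an absent block (None) selects the local default.
pub enum BlobConfig {
    Fs { root: PathBuf },
    S3 { bucket: String },
}

#[derive(Debug)]
pub enum ConfigError {
    EmptyBucket,
}

/// In-memory stand-in used for both variants in this sketch.
struct MemStore {
    kind: &'static str,
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl BlobStore for MemStore {
    fn put(&self, key: &str, bytes: Vec<u8>) {
        self.objects.lock().unwrap().insert(key.to_string(), bytes);
    }
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.lock().unwrap().get(key).cloned()
    }
    fn kind(&self) -> &'static str {
        self.kind
    }
}

fn mem(kind: &'static str) -> Arc<dyn BlobStore> {
    Arc::new(MemStore { kind, objects: Mutex::new(HashMap::new()) })
}

/// All fallible setup happens here, once, at config-load time.
/// Handlers only ever see the ready-made Arc<dyn BlobStore>.
pub fn build_store(cfg: Option<BlobConfig>) -> Result<Arc<dyn BlobStore>, ConfigError> {
    match cfg {
        None | Some(BlobConfig::Fs { .. }) => Ok(mem("fs")),
        Some(BlobConfig::S3 { bucket }) => {
            if bucket.is_empty() {
                return Err(ConfigError::EmptyBucket);
            }
            Ok(mem("s3"))
        }
    }
}
```

Validation errors surface at startup from `build_store`; request handlers that hold a clone of the returned handle have no construction step left that could fail.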

Two backend "kinds"

Everything pnpr keeps locally is one of two kinds, Blob or Record (the Kind column in the table below).

Components

| Component | Today (code) | Kind | Externalize via |
|---|---|---|---|
| Hosted store | storage.rs HostedStore {Fs\|S3} | Blob | ✅ done (#12198) |
| Proxy cache | storage.rs cached: Store (fs-only) | Blob | Generalize to {Fs\|S3}, reuse S3Store |
| Accelerator CAS (pnpr-store) | install_accelerator.rs store_dir | Blob | Content-addressed → S3 maps directly |
| Auth: users + tokens | auth.rs UserStore / TokenStore | Record | ✅ done (#12206): UserBackend/TokenBackend traits + libsql/Turso impl |
| Grant table | install_accelerator/grant_table.rs | Record | Trait + networked-SQLite impl |
| Public-packages | install_accelerator/public_packages.rs | Record | Trait + networked-SQLite impl |
| Verdict cache | install_accelerator/verdict_cache.rs | Record (pure cache) | Leave ephemeral, or externalize |

Blob side (cache + accelerator CAS)

Reuses #12198. Two fs-isms to handle in the proxy cache:

  • TTL freshness: Store::read_fresh_packument reads file mtime; the S3 variant uses ObjectMeta.last_modified (object_store exposes it).
  • tee_to_cache (streaming.rs) writes incrementally to a local file then renames. S3 can't append, so stage to local tmp and upload on stream completion (multipart for large tarballs) — same stage-then-upload shape the publish path already uses.

No new external service — cheapest large win.
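Both fs-isms reduce to small backend-agnostic pieces. The sketch below is an illustration under stated assumptions, not pnpr's code: `is_fresh` and `tee_then_upload` are hypothetical names, the staging area is a `Vec<u8>` where a real implementation would use a local temp file, and treating a future `last_modified` (clock skew) as fresh is an arbitrary choice made here.

```rust
use std::time::{Duration, SystemTime};

/// Freshness check that works for either backend: pass the file mtime
/// for the fs store, or the object's last_modified for the S3 store.
pub fn is_fresh(last_modified: SystemTime, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(last_modified) {
        Ok(age) => age <= ttl,
        // last_modified lies in the future (clock skew between hosts).
        // Assumption of this sketch: treat the entry as fresh.
        Err(_) => true,
    }
}

/// Stage-then-upload: collect the streamed chunks locally and call
/// `upload` only after the stream has completed without error, so a
/// truncated download never becomes a cached object.
/// Returns the number of bytes uploaded.
pub fn tee_then_upload<I, U>(chunks: I, upload: U) -> Result<usize, String>
where
    I: IntoIterator<Item = Result<Vec<u8>, String>>,
    U: FnOnce(Vec<u8>),
{
    let mut staged = Vec::new();
    for chunk in chunks {
        // Any stream error aborts before `upload` is ever called.
        staged.extend(chunk?);
    }
    let len = staged.len();
    upload(staged);
    Ok(len)
}
```

In a real proxy the chunks would also be forwarded to the client as they arrive; only the cache write is deferred to stream completion.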

Record side (auth + grants)

Narrow async traits, keeping current impls as defaults:

trait UserBackend  { async fn get(..); async fn upsert_or_login(..); async fn count(); }
trait TokenBackend { async fn issue(..); async fn lookup(..); async fn list(..); async fn revoke(..); }
trait GrantBackend { async fn is_granted(..); async fn record(..); async fn clear_package(..); }

Impls: InMemory, LocalSqlite (today, rusqlite), and a networked SQLite impl. Selected by config, built into Arc<dyn …> at load time. (Multiple impls behind dyn means using async-trait.)

DB choice: SQLite-compatible, not Postgres

pnpr is already built on SQLite (rusqlite backs tokens, grants, public-packages, verdict cache), so the schemas and queries already exist. Standardizing on SQLite-compatible services keeps the SQL and makes an all-Cloudflare stack coherent: R2 (blobs) + D1 (records) + compute.

Important nuance: "SQLite" ≠ "rusqlite" once networked. rusqlite opens a local file; a networked SQLite service needs a different driver. Keep the SQL, swap the driver:

| Option | Reach it via | Works with current stack? |
|---|---|---|
| Cloudflare D1 | Workers binding, or REST API (SQL over HTTP) | REST API: yes (Cloud Run/Lambda). Workers binding: only inside a Worker. |
| Turso / libSQL | libsql crate (network + embedded replicas) | Yes, on any tokio runtime |
| LiteFS (Fly) | FUSE-replicated file, rusqlite unchanged, single writer | Yes |
| rqlite | Raft + HTTP | Yes |

Caveats:

  • D1 is happiest inside a Worker, and Workers can't run the current tokio/axum/rusqlite stack (WASM rewrite). From Cloud Run/Lambda you'd use D1's REST API (per-query HTTP).
  • Auth/token lookup is on the hot path (≈ every request). A pure-HTTP-to-D1 backend taxes every call. Mitigate with a short-TTL in-process token cache or Turso embedded replicas (local-fast reads).
  • Eventual consistency on replicas — fine for grants/public-packages (grant table is already best-effort, clear-on-discovery); the one to watch is token-revocation lag. A config knob (replica vs primary read for auth) covers it.
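The short-TTL in-process token cache mentioned above could look roughly like this. It is a sketch, not the implemented design: `TokenCache` and `resolve_at` are hypothetical names, the slow backend lookup is modeled as a plain closure from token to optional user name, and the clock is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Wraps a slow token lookup (for example one HTTP round trip per call)
/// with a short-lived in-process cache.
pub struct TokenCache<F>
where
    F: Fn(&str) -> Option<String>,
{
    lookup: F,
    ttl: Duration,
    // token -> (time cached, resolved user or None for unknown tokens)
    entries: Mutex<HashMap<String, (Instant, Option<String>)>>,
}

impl<F> TokenCache<F>
where
    F: Fn(&str) -> Option<String>,
{
    pub fn new(lookup: F, ttl: Duration) -> Self {
        Self { lookup, ttl, entries: Mutex::new(HashMap::new()) }
    }

    /// Resolve a token, hitting the backend only when the cached entry
    /// is missing or older than `ttl`.
    pub fn resolve_at(&self, token: &str, now: Instant) -> Option<String> {
        let mut entries = self.entries.lock().unwrap();
        if let Some((cached_at, user)) = entries.get(token) {
            if now.duration_since(*cached_at) < self.ttl {
                return user.clone();
            }
        }
        // Simplification: the lock is held across the backend call.
        let user = (self.lookup)(token);
        entries.insert(token.to_string(), (now, user.clone()));
        user
    }
}
```

With this shape, a revoked token can keep working on a given instance for at most `ttl` after revocation, which is the same trade-off as the replica-versus-primary read knob: a shorter TTL means less lag and more backend traffic.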

Recommendation: abstract behind the record-backend trait; default LocalSqlite. For the container-serverless target reachable today, Turso/libSQL is the pragmatic pick (works with the current stack; embedded replicas solve the hot-path auth read). D1 is the north-star for the Workers/edge path, gated on the larger stack rewrite.

Config shape (each block absent ⇒ local default)

storage: ./storage
s3: { bucket: pkgs, ... }          # hosted blobs        (done)
cache:
  s3: { bucket: cache, ... }       # proxy-cache blobs   (phase 2)
accelerator:
  store:
    s3: { bucket: accel, ... }     # accelerator CAS     (phase 2)
backend:
  libsql:
    url: ${PNPR_LIBSQL_URL}
    auth_token: ${PNPR_LIBSQL_TOKEN}
  # or: d1: { account_id, database_id, api_token }

Phased rollout

  1. Hosted → S3 — ✅ done (feat(pnpr): store hosted packages in an S3-compatible object store #12198).
  2. Proxy cache + accelerator CAS → S3 — reuses object_store, no new external service. After this the only local state left is the SQLite record stores. Biggest bang, lowest risk — proposed next step.
  3. Auth (users + tokens) → networked SQLite — ✅ done (feat(pnpr): config-selectable networked-SQLite auth backend (#12199 phase 3) #12206). UserBackend / TokenBackend async traits; local (htpasswd + SQLite) / in-memory / libsql/Turso impls selected by a backend.libsql: block. Embedded-replica read acceleration (replicaPath / syncIntervalSecs) for the hot-path token lookup; strict atomic registration cap.
  4. Grants + public-packages → networked SQLite; leave verdict-cache ephemeral (pure cache, degrades to lower hit-rate).
  5. Deployment glue: Cloud Run needs little more than a Dockerfile and a readiness probe; Lambda needs lambda_http plus response-streaming handling; Workers is a separate, larger effort.

Cross-cutting caveat (independent of where bytes live)

With N replicas, the read-modify-write flows are the real distributed-systems risk, not storage:

  • Publish merges the incoming manifest into the existing packument (publish.rs::merge_manifest); partial-unpublish rewrites it. Two replicas publishing the same package concurrently = last-write-wins on the packument object → a lost version.
  • Single-instance half ✅ done (feat(pnpr): config-selectable networked-SQLite auth backend (#12199 phase 3) #12206): a striped per-package lock serializes the read-modify-write packument flows (publish, dist-tag, partial-unpublish) on one instance. Cross-replica still pending: S3 conditional writes (If-Match / ETag) or a DB-level conditional update. pnpm clients usually serialize per package, but a shared registry shouldn't rely on that.
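To make the cross-replica option concrete, here is a sketch of an optimistic read-merge-write loop guarded by a conditional write. It does not talk to S3: `VersionedStore` is a hypothetical in-memory stand-in whose integer ETag mimics If-Match semantics, the packument is reduced to a set of version strings, and `publish_version` is an illustrative name rather than pnpr's API.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::Mutex;

/// In-memory stand-in for an object store with conditional writes.
#[derive(Default)]
pub struct VersionedStore {
    // key -> (etag, set of published versions)
    objects: Mutex<HashMap<String, (u64, BTreeSet<String>)>>,
}

impl VersionedStore {
    /// Returns (etag, versions). ETag 0 means the object does not exist.
    pub fn get(&self, key: &str) -> (u64, BTreeSet<String>) {
        self.objects.lock().unwrap().get(key).cloned().unwrap_or_default()
    }

    /// Writes only if the stored ETag still equals `expected`
    /// (the analogue of an If-Match precondition). Returns whether it wrote.
    pub fn put_if_match(&self, key: &str, expected: u64, value: BTreeSet<String>) -> bool {
        let mut objects = self.objects.lock().unwrap();
        let current = objects.get(key).map(|(etag, _)| *etag).unwrap_or(0);
        if current != expected {
            return false;
        }
        objects.insert(key.to_string(), (current + 1, value));
        true
    }
}

/// Read, merge, conditionally write; on a lost race, re-read and retry.
/// Returns the attempt number that succeeded.
pub fn publish_version(
    store: &VersionedStore,
    package: &str,
    version: &str,
    max_attempts: u32,
) -> Result<u32, String> {
    for attempt in 1..=max_attempts {
        let (etag, mut versions) = store.get(package);
        versions.insert(version.to_string());
        if store.put_if_match(package, etag, versions) {
            return Ok(attempt);
        }
    }
    Err(format!("gave up publishing {package}@{version} after {max_attempts} attempts"))
}
```

A writer holding a stale ETag gets a rejected write instead of silently overwriting the other replica's version, so the lost-version case turns into a retry.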

Also: keep cold-start cost down — the install accelerator is already lazily built via OnceLock; keep that laziness and use a small connection pool.
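For reference, lazy construction through `std::sync::OnceLock` has roughly the following shape. `AppState`, `Accelerator`, the `pool_size` value, and the build counter are hypothetical and exist only to show that construction runs once, on first use, rather than at startup.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the install accelerator.
pub struct Accelerator {
    pub pool_size: usize,
}

pub struct AppState {
    accelerator: OnceLock<Accelerator>,
    builds: AtomicUsize, // counts constructions, for demonstration only
}

impl AppState {
    /// Cheap: nothing expensive is built at startup.
    pub fn new() -> Self {
        Self { accelerator: OnceLock::new(), builds: AtomicUsize::new(0) }
    }

    /// Builds the accelerator on first call and reuses it afterwards.
    pub fn accelerator(&self) -> &Accelerator {
        self.accelerator.get_or_init(|| {
            self.builds.fetch_add(1, Ordering::SeqCst);
            // A small pool keeps the cost of the first request low.
            Accelerator { pool_size: 2 }
        })
    }

    pub fn build_count(&self) -> usize {
        self.builds.load(Ordering::SeqCst)
    }
}
```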


Written by an agent (Claude Code, claude-opus-4-8).
