Reconstruct tarball URLs client-side so pnpr can omit dist.tarball from abbreviated packuments #12164

Description

@zkochan

Background

#12163 shrinks the abbreviated (application/vnd.npm.install-v1+json) packuments pnpr serves by dropping fields the resolvers never read. The next candidate is dist.tarball, which on a large packument like react's is ~8% of the payload (~249 KB across 2817 versions).

dist.tarball is highly derivable: npm's canonical form is {registry}/{name}/-/{unscoped-name}-{version}.tgz, and pnpr already rewrites every tarball to the deterministic {public_url}/{pkg}/-/{basename} shape. In a scan of react and @types/node, every version's basename followed the convention with zero deviations. Since a client using pnpr has pnpr's URL as its registry, it could reconstruct exactly the URL pnpr currently hands it.
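As a rough illustration of the convention described above, the reconstruction might look like the following sketch. The function names `unscopedName` and `reconstructTarballUrl` are hypothetical and do not come from pnpm or pacquet. The sketch also assumes that a trailing slash on the registry URL is trimmed and that a scoped name is placed in the path unencoded, as in npm's public tarball URLs.

```typescript
// Hypothetical helpers for illustration only; not pnpm's actual API.

// "@types/node" -> "node"; "react" -> "react"
function unscopedName(name: string): string {
  return name.startsWith("@") ? name.slice(name.indexOf("/") + 1) : name;
}

// Builds {registry}/{name}/-/{unscoped-name}-{version}.tgz.
// Assumption: trailing slashes on the registry URL are trimmed first,
// and the scoped name is inserted without percent-encoding.
function reconstructTarballUrl(
  registry: string,
  name: string,
  version: string,
): string {
  const base = registry.replace(/\/+$/, "");
  return `${base}/${name}/-/${unscopedName(name)}-${version}.tgz`;
}
```

For example, `reconstructTarballUrl("https://registry.npmjs.org/", "@types/node", "20.11.0")` yields `https://registry.npmjs.org/@types/node/-/node-20.11.0.tgz`.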

Blocker

Neither client can do this today:

  • pnpm builds the resolution as tarball: normalizeRegistryUrl(pickedPackage.dist.tarball) (resolving/npm-resolver/src/index.ts:589, :749) with no fallback — a missing dist.tarball breaks resolution.
  • pacquet declares dist.tarball as a required String (pacquet/crates/registry/src/package_distribution.rs); a missing value is a hard deserialization failure.

Plan (clients first, then pnpr)

  1. pnpm: when dist.tarball is absent, reconstruct {registry}/{name}/-/{unscoped-name}-{version}.tgz before normalizeRegistryUrl.
  2. pacquet: make dist.tarball Option<String> and apply the same reconstruction in the npm resolver.
  3. pnpr: only then omit dist.tarball from the abbreviated form — and emit it explicitly only when the basename deviates from the convention, so non-standard tarball filenames (which exist registry-wide, unlike in react/@types/node) still resolve correctly.
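The three steps above can be sketched end to end as follows. This is a minimal model under stated assumptions, not the actual pnpm, pacquet, or pnpr code: the `Dist` shape, `abbreviateDist`, and `resolveTarball` are hypothetical names, and the sketch assumes the tarball URL carries no query string, so the basename is everything after the last `/`.

```typescript
// Hypothetical model of the plan; names and types are illustrative.
interface Dist {
  tarball?: string;
  integrity?: string;
}

function conventionalBasename(name: string, version: string): string {
  const unscoped = name.startsWith("@") ? name.slice(name.indexOf("/") + 1) : name;
  return `${unscoped}-${version}.tgz`;
}

// Server side (step 3): drop dist.tarball only when the basename is conventional.
// Assumption: the URL has no query string or fragment.
function abbreviateDist(name: string, version: string, dist: Dist & { tarball: string }): Dist {
  const basename = dist.tarball.slice(dist.tarball.lastIndexOf("/") + 1);
  if (basename !== conventionalBasename(name, version)) {
    return dist; // non-conventional filename: keep the explicit URL
  }
  const abbreviated: Dist = { ...dist };
  delete abbreviated.tarball;
  return abbreviated;
}

// Client side (steps 1 and 2): prefer the explicit URL, otherwise reconstruct it.
function resolveTarball(registry: string, name: string, version: string, dist: Dist): string {
  if (dist.tarball !== undefined) return dist.tarball;
  const base = registry.replace(/\/+$/, "");
  return `${base}/${name}/-/${conventionalBasename(name, version)}`;
}
```

In this model, a conventional URL survives the round trip unchanged (omitted by the server, rebuilt by the client), while a non-conventional one is passed through untouched. This is why the clients have to gain the fallback before pnpr starts omitting the field.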

Notes / edge cases

  • The convention does not hold for the entire registry — some older or republished packages have non-conventional tarball filenames. The explicit URL is the only thing that handles those, so pnpr must keep emitting dist.tarball whenever the basename is non-conventional. Reconstruction is purely an optimization for the common case.
  • Security touchpoint: the lockfile verifier binds the resolved tarball to the packument's dist.tarball (resolving/npm-resolver/src/createNpmResolutionVerifier.ts, ERR_PNPM_TARBALL_URL_MISMATCH). Reconstruction must stay consistent on both sides of that check.
  • Per the repo's parity rule, the pnpm and pacquet reconstruction must match exactly (same URL string, same encoding for scoped names).
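One way to keep the two implementations and both sides of the verifier check aligned is a shared table of input/output vectors that every reconstruction is checked against. The sketch below is illustrative only: the vectors are derived from the convention stated in this issue rather than taken from the repo, `pnpr.example.test` is a placeholder host, and the unencoded scoped-name form is an assumption that the real implementations would need to confirm.

```typescript
// Hypothetical parity fixture; vectors are illustrative, not from the repo.
type Reconstruct = (registry: string, name: string, version: string) => string;

interface ParityVector {
  registry: string;
  name: string;
  version: string;
  expected: string;
}

const PARITY_VECTORS: ParityVector[] = [
  { registry: "https://registry.npmjs.org/", name: "react", version: "18.2.0",
    expected: "https://registry.npmjs.org/react/-/react-18.2.0.tgz" },
  // Assumption: the scope and slash stay unencoded in the path.
  { registry: "https://registry.npmjs.org", name: "@types/node", version: "20.11.0",
    expected: "https://registry.npmjs.org/@types/node/-/node-20.11.0.tgz" },
  // Registry URL with a path prefix and trailing slash.
  { registry: "https://pnpr.example.test/npm/", name: "lodash", version: "4.17.21",
    expected: "https://pnpr.example.test/npm/lodash/-/lodash-4.17.21.tgz" },
  // Prerelease version.
  { registry: "https://pnpr.example.test", name: "example-pkg", version: "1.0.0-beta.1",
    expected: "https://pnpr.example.test/example-pkg/-/example-pkg-1.0.0-beta.1.tgz" },
];

// Returns the vectors an implementation gets wrong; empty means parity holds.
function failingVectors(impl: Reconstruct): ParityVector[] {
  return PARITY_VECTORS.filter((v) => impl(v.registry, v.name, v.version) !== v.expected);
}

// Reference implementation of the convention, used here only to exercise the table.
const referenceImpl: Reconstruct = (registry, name, version) => {
  const unscoped = name.startsWith("@") ? name.slice(name.indexOf("/") + 1) : name;
  return `${registry.replace(/\/+$/, "")}/${name}/-/${unscoped}-${version}.tgz`;
};
```

The same table could be ported to a Rust test in pacquet, and the lockfile verifier could call the same reconstruction as the resolver, so that a lockfile produced from a reconstructed URL does not trip `ERR_PNPM_TARBALL_URL_MISMATCH`.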

Scope

This is intentionally separate from #12163 (which only trims fields the resolvers already ignore). Dropping dist.tarball requires resolver changes in both stacks, so it should land as its own change once steps 1–2 are in place.


Written by an agent (Claude Code, claude-opus-4-8).
