fix(pnpr): preserve non-canonical upstream tarball filenames#12593
Conversation
The proxied-tarball integrity fix (#12570, GHSA-5f9g-98vq-2jxw) started reconstructing each rewritten dist.tarball filename from its version key (`<name>-<version>.tgz`). That assumes upstream tarball names are always canonical, which breaks packages like esprima-fb whose real npm tarball is `esprima-fb-3001.0001.0000-dev-harmony-fb.tgz` for version `3001.1.0-dev-harmony-fb`: the rewritten URL no longer matched what npm hosts, so the client recorded the wrong lockfile URL and the proxied fetch 404'd. Preserve the upstream basename verbatim in the rewrite again, and resolve a tarball request's version and dist.integrity by matching the requested filename against each version's dist.tarball basename instead of parsing the version out of the filename. OSV screening re-runs against the resolved version when it differs from the filename-derived one. The GHSA protection is unchanged: served bytes are still verified against the SRI declared by the version that owns that tarball name, so a preserved name cannot smuggle in bytes of another provenance. Tests that encoded the canonical-reconstruction behavior are updated to assert basename preservation, with new coverage for non-canonical names. Also fix a pre-existing compile break in the pnpr server test target: `UserStore::in_memory` gained a `MaxUsers` argument (#12581) but one test call site was not updated.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughTarball URL rewriting is refactored to preserve the upstream ChangesTarball Basename Preservation and Version Resolution
Sequence Diagram(s)sequenceDiagram
participant Client
participant serve_tarball
participant ensure_osv_allowed
participant expected_tarball_dist
participant packument
Client->>serve_tarball: GET /pkg/-/foo-1.0.0.tgz
serve_tarball->>serve_tarball: parse_tarball_name → name_version (e.g. "1.0.0")
serve_tarball->>ensure_osv_allowed: check name_version
serve_tarball->>packument: load packument
serve_tarball->>expected_tarball_dist: match "foo-1.0.0.tgz" vs dist.tarball basenames
expected_tarball_dist->>packument: iterate versions, extract tarball_basename
expected_tarball_dist-->>serve_tarball: TarballDist { version: "2.0.0", integrity } or error
alt resolved version ≠ name_version
serve_tarball->>ensure_osv_allowed: re-check resolved version
end
serve_tarball-->>Client: serve tarball bytes or not_found / bad_gateway
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Summary by QodoFix pnpr tarball proxying to preserve upstream non-canonical filenames Description
Diagram
High-Level Assessment
Files changed (5)
|
Code Review by Qodo
1.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pnpr/crates/pnpr/src/server.rs`:
- Around line 990-997: The current implementation uses find() to locate the
first version with a matching tarball basename in the versions iterator, but
this silently accepts the first match without validating uniqueness. This is a
security issue because if multiple versions declare the same tarball basename,
the SRI/integrity check could be performed against the wrong version. Replace
the find() logic with a check that either rejects the request if multiple
versions claim the same tarball basename, or verifies that all matching versions
agree on the same tarball URL before proceeding. This ensures the served bytes
are validated against the correct declaring version as required by the
REVIEW_GUIDE.
In `@pnpr/crates/pnpr/src/upstream.rs`:
- Around line 395-398: The current implementation extracts the basename from
tarball_value using rsplit('/').next() which includes query strings and
fragments, causing mismatches when the tarball is later served in serve_tarball.
Create a centralized helper function that extracts the URL path basename by
first stripping everything after the first occurrence of '?' or '#' characters,
then extracting the final path segment after the last '/'. Use this helper
function to properly extract basename in the tarball_value assignment block and
also apply it in the expected_tarball_dist function to ensure consistent and
correct basename extraction across all locations where dist.tarball is
processed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 84675763-4624-4cca-b212-18549a1ef7c1
📒 Files selected for processing (5)
pnpr/crates/pnpr/src/server.rspnpr/crates/pnpr/src/server/tests.rspnpr/crates/pnpr/src/upstream.rspnpr/crates/pnpr/src/upstream/tests.rspnpr/crates/pnpr/tests/server.rs
|
Code review by qodo was updated up to the latest commit 187639b |
|
Code review by qodo was updated up to the latest commit 187639b |
|
Code review by qodo was updated up to the latest commit 324d952 |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12593 +/- ##
==========================================
+ Coverage 88.14% 88.16% +0.01%
==========================================
Files 326 326
Lines 46523 46556 +33
==========================================
+ Hits 41006 41044 +38
+ Misses 5517 5512 -5 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
Code review by qodo was updated up to the latest commit 324d952 |
|
Code review by qodo was updated up to the latest commit 4559f27 |
1 similar comment
|
Code review by qodo was updated up to the latest commit 4559f27 |
Summary
The proxied-tarball integrity fix (#12570, GHSA-5f9g-98vq-2jxw) started reconstructing each rewritten
dist.tarballfilename from its version key (<name>-<version>.tgz). That assumes upstream tarball filenames are always canonical, which breaks packages whose real tarball name differs — e.g.esprima-fb@3001.1.0-dev-harmony-fb, whose npm tarball isesprima-fb-3001.0001.0000-dev-harmony-fb.tgz. The client recorded the wrong lockfile URL, and the proxied fetch 404'd against the upstream.This regressed two tests on
main(@pnpm/installing.deps-installer→test/lockfile.ts):save tarball URL when it is non-standardkeep non-standard tarball URL when lockfileIncludeTarballUrl is falseThis PR restores faithful proxying of non-canonical tarball names while keeping the GHSA integrity protection intact.
Squash Commit Body
Checklist
pacquet/port, or the description notes what still needs porting. — N/A: the change is confined to thepnprregistry server; no pnpm/pacquet CLI surface is affected.pnpm changeset) if this PR changes any published package. — N/A:pnpris a Rust binary, not a published npm package.Written by an agent (Claude Code, claude-opus-4-8).
Summary by CodeRabbit
Bug Fixes
dist.tarballbasename declared in the packument and ensuring integrity/provenance checks align with the declaring version, even when packuments swap filenames.New Features
Tests