Skip to content

fix(registry): verify tarball SHA on publish#11976

Merged
zkochan merged 6 commits into
mainfrom
feat/11975
May 28, 2026
Merged

fix(registry): verify tarball SHA on publish#11976
zkochan merged 6 commits into
mainfrom
feat/11975

Conversation

@zkochan

@zkochan zkochan commented May 26, 2026

Copy link
Copy Markdown
Member

Summary

Closes #11975.

pnpm-registry's PUT /:pkg endpoint accepted tarballs without verifying that their bytes matched the integrity declared in the packument. A buggy or partial-upload client could land a tarball whose hash didn't match dist.integrity / dist.shasum, and a malicious client could deliberately decouple the advertised hash from the bytes — both classes of silent-corruption bug that only surface when downstream pnpm install later fails with EINTEGRITY.

This change hashes every attachment before any I/O happens and rejects mismatches up-front so the bad bytes never reach the cache. The decode + hash + write runs in one streaming pass so the full decoded payload never lives in memory at once.

What lands

  • stream_decode_verify_and_write in publish.rs — pulls base64 chunks out of base64::read::DecoderReader, feeds each 64 KiB chunk to an ssri IntegrityChecker (SHA-512), an optional IntegrityOpts(Sha1) for legacy shasum, and the on-disk tmp file in lockstep. Verifies declared length, integrity, and shasum at the end; any failure removes the tmp file before returning.
  • All failure modes surface as 400 BAD_REQUEST with an EINTEGRITY: prefix in the body so pnpm / npm clients can recognize the error code.
  • extract_attachments no longer base64-decodes eagerly; it returns PendingAttachment { filename, data, declared_length } and lets the streaming path consume data directly.
  • PackageName::parse_tarball_name returns (canonical, version) so the publish handler can pull the matching versions[v].dist block out of the body. canonicalize_tarball_name becomes a thin wrapper.
  • Cache::reserve_tarball_paths + finalize_tarball_slot expose a tmp/final path pair so the publish handler can do the actual write inside tokio::task::spawn_blocking (sync std::fs) and finalize via async tokio::fs::rename afterward.
  • publish_package calls stream_decode_verify_and_write before any tarball lands at its final path. On any failure it removes every in-progress tmp file so a rejected publish leaves no on-disk artifact.

Memory win

On a 100 MiB tarball publish, the old flow held the full payload in three places simultaneously (HTTP body Bytes + serde_json owned base64 String + decoded Vec<u8>). The streaming flow drops the decoded Vec<u8> entirely — only the base64 string and a 64 KiB working buffer remain. Roughly 100 MiB of peak heap saved per concurrent publish.

Test plan

  • cargo test -p pnpm-registry — 115 tests pass (60 unit + 29 publish/auth + 9 + 17 server).
  • New unit tests in publish.rs for stream_decode_verify_and_write: matching integrity passes; mismatched integrity / missing integrity / missing dist entry / mismatched shasum / mismatched length all fail with the right error, and the tmp file is removed.
  • Integration tests in tests/auth_publish.rs:
    • publish_rejects_integrity_mismatch_and_leaves_no_artifacts — corrupt integrity → 400 EINTEGRITY, no packument or tarball on disk.
    • publish_rejects_shasum_mismatch — valid integrity but bad shasum → 400 EINTEGRITY, nothing on disk.
    • publish_rejects_missing_integrity_fielddist.integrity removed → 400 with "EINTEGRITY: dist.integrity is required", nothing on disk.
  • cargo clippy -p pnpm-registry --all-targets -- --deny warnings — clean.
  • cargo fmt -p pnpm-registry --check — clean.

Follow-up left for separate PRs

  • Read-side verification when serving a cached tarball back to a client — called out as a separate gap in pnpr backend parity with verdaccio — Tracking Issue #11973.
  • True end-to-end streaming would require a SAX-style JSON parser to avoid materializing the base64 string inside serde_json::Value — out of scope here; the current change already removes the largest in-memory copy.

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • New Features

    • Tarball uploads are streamed, integrity-verified (including legacy shasum), length-checked, and written via a staged atomic flow for safer publishes.
  • Bug Fixes

    • Publish now validates attachment filenames against per-version metadata and removes partial files on failure; rejects publishes with bad or missing integrity.
  • Tests

    • Expanded tests covering integrity mismatches, missing integrity, shasum mismatches, length checks, multi-attachment failures.
  • Chores

    • Added a dependency to support integrity parsing and verification.

Review Change Stack

Hash incoming publish tarball bytes and reject the request if the
computed SHA-512 doesn't match the `dist.integrity` declared in the
packument, or if `dist.shasum` is present and its SHA-1 doesn't
match. A body that arrives without `dist.integrity` is rejected
outright — npm always emits it, and accepting a publish whose hash
the registry never verified is the silent-corruption vector that
issue #11975 calls out. Both failure modes surface as a 400 with an
`EINTEGRITY:` prefix so pnpm / npm clients recognize the error.

The check runs *before* any cache write, so a bad upload never lands
on disk and can't replicate to downstream consumers via a subsequent
fetch.
@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds ssri dependency; parse tarball filenames for version; stage attachments as base64 PendingAttachment; stream-decode while verifying dist.integrity (ssri) and optional legacy dist.shasum (SHA-1); reserve/finalize atomic tarball slots; update tests for success and multiple EINTEGRITY failure cases.

Changes

Tarball integrity verification on publish

Layer / File(s) Summary
Tarball name parsing with version extraction
registry/crates/pnpm-registry/src/package_name.rs
parse_tarball_name is introduced; canonicalize_tarball_name now derives its result from it, returning canonical <basename>-<version>.tgz and the extracted version.
Staged attachments and ssri dependency
registry/crates/pnpm-registry/Cargo.toml, registry/crates/pnpm-registry/src/publish.rs
Add ssri dependency. extract_attachments now yields PendingAttachment (base64 data + declared length) without decoding; attachments are deferred for streaming verification.
Streaming decode, ssri verification, and helpers
registry/crates/pnpm-registry/src/publish.rs
Add stream_decode_verify_and_write which parses dist.integrity via ssri, streams base64 decode while hashing, validates optional legacy dist.shasum (SHA-1), checks declared length, removes partial files on error, syncs on success, and includes sha1_hex_from_integrity_opts and expanded unit tests.
Verification integration into publish flow
registry/crates/pnpm-registry/src/server.rs
publish_package parses attachment names to find per-version dist, reserves tarball tmp/final slots via Cache, streams verified writes in spawn_blocking, finalizes slots on success, and cleans up temp slots on failure.
Reserve/finalize tarball slots
registry/crates/pnpm-registry/src/cache.rs
Add TarballSlot, reserve_tarball_paths, and finalize_tarball_slot to create unique tmp/final paths and atomically promote tmp files into the cache.
Tests and test helpers
registry/crates/pnpm-registry/tests/auth_publish.rs
Add integration tests for integrity mismatch, shasum mismatch, and missing integrity; compute dist.integrity and dist.shasum from actual tarball bytes and add sri_sha512 / sha1_hex helpers.

Sequence Diagram

sequenceDiagram
  participant Client
  participant publish_package
  participant Cache
  participant stream_verify_write
  participant ssri
  participant FS

  Client->>publish_package: POST publish body with `_attachments` (base64)
  publish_package->>publish_package: parse filenames -> (canonical, version)
  publish_package->>publish_package: lookup `incoming["versions"][version]["dist"]`
  publish_package->>Cache: reserve_tarball_paths(name, canonical)
  publish_package->>stream_verify_write: (base64_data, declared_length, dist, tmp_path)
  stream_verify_write->>ssri: parse dist.integrity
  stream_verify_write->>stream_verify_write: stream base64 decode, compute ssri digest & optional sha1
  stream_verify_write->>FS: write decoded bytes to tmp_path
  stream_verify_write->>ssri: compare computed digest vs declared
  alt integrity matches
    stream_verify_write-->>publish_package: Ok(bytes_written)
    publish_package->>Cache: finalize_tarball_slot(tmp_path -> final_path)
  else integrity mismatch or missing
    stream_verify_write-->>publish_package: Err(EINTEGRITY)
    publish_package->>FS: cleanup_tmp_slots(tmp_path)
  end
  publish_package-->>Client: 200 OK or 400 EINTEGRITY
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble bytes and count each hop,
ssri hums as hashes pop.
If shasum wobbles or integrity strays,
I stash no tarball, refuse the praise.
Hop—publish safe, no corrupted maze.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically describes the main objective: verifying tarball SHA on publish, which aligns with the primary change in the changeset.
Linked Issues check ✅ Passed All coding requirements from issue #11975 are implemented: stream-hash tarballs with SHA-512/SHA-1, compare digests to packument values, verify declared length, reject with 400 EINTEGRITY on mismatch/missing integrity, ensure verification before final writes, remove tmp artifacts on failure.
Out of Scope Changes check ✅ Passed All changes are directly scoped to #11975: tarball verification during publish. No unrelated refactoring, dependency upgrades beyond ssri, or out-of-scope features are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/11975

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-free-for-open-source-projects

Copy link
Copy Markdown

Review Summary by Qodo

Verify tarball integrity on publish before cache write

🐞 Bug fix

Grey Divider

Walkthroughs

Description
• Verify tarball SHA-512 and SHA-1 integrity before writing to cache
• Reject publishes with missing or mismatched dist.integrity field
• Return 400 with EINTEGRITY: prefix for client recognition
• Prevent silent-corruption bugs from bad uploads replicating downstream
Diagram
flowchart LR
  A["Incoming publish request"] --> B["Extract attachments"]
  B --> C["Parse tarball filename"]
  C --> D["Look up versions[v].dist"]
  D --> E["Verify SHA-512 integrity"]
  E --> F{Integrity match?}
  F -->|No| G["Return 400 EINTEGRITY"]
  F -->|Yes| H["Verify SHA-1 shasum"]
  H --> I{Shasum match?}
  I -->|No| G
  I -->|Yes| J["Write to cache"]

Loading

Grey Divider

File Changes

1. registry/crates/pnpm-registry/src/package_name.rs ✨ Enhancement +10/-1

Extract version from tarball filename parsing

• Refactored canonicalize_tarball_name to use new parse_tarball_name method
• Added parse_tarball_name that returns both canonical filename and version
• Version extraction enables publish handler to look up versions[v].dist for verification

registry/crates/pnpm-registry/src/package_name.rs


2. registry/crates/pnpm-registry/src/publish.rs 🐞 Bug fix +126/-1

Implement tarball integrity verification logic

• Added verify_attachment function to validate SHA-512 and SHA-1 digests
• Implemented compute_shasum_hex to convert SHA-1 to 40-char hex format
• All integrity failures return EINTEGRITY: prefixed errors for client recognition
• Added comprehensive unit tests for matching/mismatched integrity, missing fields, and shasum
 validation

registry/crates/pnpm-registry/src/publish.rs


3. registry/crates/pnpm-registry/src/server.rs 🐞 Bug fix +16/-3

Integrate integrity verification into publish handler

• Updated publish_package to call verify_attachment before writing to disk
• Uses parse_tarball_name to extract version for dist lookup
• Verification happens before any cache I/O to prevent bad uploads from persisting
• Added comments explaining the integrity check placement and purpose

registry/crates/pnpm-registry/src/server.rs


View more (2)
4. registry/crates/pnpm-registry/tests/auth_publish.rs 🧪 Tests +118/-4

Add integration tests for integrity verification

• Added publish_rejects_integrity_mismatch_and_leaves_no_artifacts integration test
• Added publish_rejects_shasum_mismatch test for legacy SHA-1 validation
• Added publish_rejects_missing_integrity_field test for required field enforcement
• Updated sample_publish_body helper to compute real SHA-512 and SHA-1 digests
• Added sri_sha512 and sha1_hex helper functions for test fixture generation

registry/crates/pnpm-registry/tests/auth_publish.rs


5. registry/crates/pnpm-registry/Cargo.toml Dependencies +1/-0

Add ssri dependency for digest verification

• Added ssri workspace dependency for SHA-512 and SHA-1 integrity verification

registry/crates/pnpm-registry/Cargo.toml


Grey Divider

Qodo Logo

@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.05      8.1±0.37ms   532.9 KB/sec    1.00      7.7±0.06ms   561.1 KB/sec

@codecov-commenter

codecov-commenter commented May 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.16418% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.19%. Comparing base (f03dc2d) to head (3fbc5b5).
⚠️ Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
registry/crates/pnpm-registry/src/server.rs 79.16% 10 Missing ⚠️
registry/crates/pnpm-registry/src/publish.rs 95.43% 9 Missing ⚠️
registry/crates/pnpm-registry/src/cache.rs 89.47% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11976      +/-   ##
==========================================
+ Coverage   88.03%   88.19%   +0.16%     
==========================================
  Files         228      228              
  Lines       28068    28737     +669     
==========================================
+ Hits        24711    25346     +635     
- Misses       3357     3391      +34     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Integrated-Benchmark Report (Linux)

Scenario: Isolated linker: fresh restore, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 1.946 ± 0.083 1.846 2.082 1.01 ± 0.05
pacquet@main 1.928 ± 0.056 1.854 2.048 1.00
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 1.94615672312,
      "stddev": 0.08311690951970489,
      "median": 1.92196890572,
      "user": 2.7607041599999995,
      "system": 3.23114208,
      "min": 1.84602721922,
      "max": 2.08198523722,
      "times": [
        1.9986837772200001,
        1.9997971052199999,
        1.91461319822,
        1.92932461322,
        1.89864583922,
        2.05556557622,
        1.88404067222,
        2.08198523722,
        1.84602721922,
        1.85288399322
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 1.92817821292,
      "stddev": 0.055697303177662034,
      "median": 1.9216367082199999,
      "user": 2.77955126,
      "system": 3.23657398,
      "min": 1.8537913592200002,
      "max": 2.04783630022,
      "times": [
        1.96901833622,
        1.93852464622,
        1.8537913592200002,
        1.89989764822,
        1.95708187222,
        1.90522050522,
        2.04783630022,
        1.90208022822,
        1.93805291122,
        1.87027832222
      ]
    }
  ]
}

Scenario: Isolated linker: fresh restore, hot cache + hot store

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 646.8 ± 39.2 624.9 755.4 1.02 ± 0.06
pacquet@main 635.1 ± 10.9 621.4 661.0 1.00
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6468001888400001,
      "stddev": 0.03921493845104728,
      "median": 0.63389094984,
      "user": 0.37157146,
      "system": 1.3408782799999999,
      "min": 0.62490847784,
      "max": 0.7553942318400001,
      "times": [
        0.7553942318400001,
        0.6435088578400001,
        0.6524181268400001,
        0.64294234784,
        0.6345392588400001,
        0.62490847784,
        0.6263767878400001,
        0.63324264084,
        0.62630662284,
        0.6283645358400001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.6350578599400001,
      "stddev": 0.010946060549885854,
      "median": 0.6330758163400001,
      "user": 0.36637266,
      "system": 1.34045388,
      "min": 0.6213649278400001,
      "max": 0.6610379798400001,
      "times": [
        0.6610379798400001,
        0.6268870098400001,
        0.6314662858400001,
        0.63800273484,
        0.6346853468400001,
        0.63950649684,
        0.6399745428400001,
        0.63052803284,
        0.6213649278400001,
        0.6271252418400001
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, cold cache + cold store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.209 ± 0.030 2.159 2.265 1.00
pacquet@main 2.230 ± 0.024 2.206 2.281 1.01 ± 0.02
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.20944355774,
      "stddev": 0.030077072250081213,
      "median": 2.2135835191399997,
      "user": 3.827690080000001,
      "system": 3.0086175600000002,
      "min": 2.1585499321399997,
      "max": 2.2648153561399997,
      "times": [
        2.2223783021399997,
        2.20413852714,
        2.22415732614,
        2.1585499321399997,
        2.1856776811399996,
        2.17791006614,
        2.2195886031399996,
        2.2648153561399997,
        2.22964134814,
        2.20757843514
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.2301428483399994,
      "stddev": 0.023759106128624478,
      "median": 2.2211385126399996,
      "user": 3.86044228,
      "system": 3.00369626,
      "min": 2.2059987951399997,
      "max": 2.2814093131399997,
      "times": [
        2.2101345101399996,
        2.2059987951399997,
        2.2190862921399996,
        2.2190956361399996,
        2.23437836814,
        2.25569121614,
        2.2108178481399996,
        2.24163511514,
        2.2231813891399996,
        2.2814093131399997
      ]
    }
  ]
}

Scenario: Isolated linker: fresh install, hot cache + hot store

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 1.466 ± 0.025 1.422 1.503 1.00 ± 0.02
pacquet@main 1.460 ± 0.022 1.432 1.494 1.00
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 1.4657587857799999,
      "stddev": 0.02529892893238504,
      "median": 1.46502933348,
      "user": 1.7283687199999995,
      "system": 1.8782971199999998,
      "min": 1.4219677959800001,
      "max": 1.50322056498,
      "times": [
        1.50322056498,
        1.47315731398,
        1.4635981679799999,
        1.46392911398,
        1.4219677959800001,
        1.47435575798,
        1.45424812698,
        1.43566494398,
        1.50131651898,
        1.46612955298
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 1.4599491793800001,
      "stddev": 0.022082840305512687,
      "median": 1.4537552754799998,
      "user": 1.7154707199999997,
      "system": 1.86129812,
      "min": 1.4315969929799999,
      "max": 1.49377766898,
      "times": [
        1.49377766898,
        1.4315969929799999,
        1.48266971598,
        1.4629552399799999,
        1.4387424549799999,
        1.49043263198,
        1.45592590498,
        1.45158464598,
        1.44062592298,
        1.45118061498
      ]
    }
  ]
}

@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

🐰 Bencher Report

Branchpr/11976
Testbedpacquet
Click to view all benchmark results
BenchmarkLatencyBenchmark Result
milliseconds (ms)
(Result Δ%)
Upper Boundary
milliseconds (ms)
(Limit %)
isolated-linker.fresh-install.cold-cache.cold-store📈 view plot
🚷 view threshold
2,209.44 ms
(-7.10%)Baseline: 2,378.43 ms
2,854.12 ms
(77.41%)
isolated-linker.fresh-install.hot-cache.hot-store📈 view plot
🚷 view threshold
1,465.76 ms
(-3.55%)Baseline: 1,519.75 ms
1,823.70 ms
(80.37%)
isolated-linker.fresh-restore.cold-cache.cold-store📈 view plot
🚷 view threshold
1,946.16 ms
(-4.20%)Baseline: 2,031.57 ms
2,437.88 ms
(79.83%)
isolated-linker.fresh-restore.hot-cache.hot-store📈 view plot
🚷 view threshold
646.80 ms
(-2.34%)Baseline: 662.32 ms
794.78 ms
(81.38%)
🐰 View full continuous benchmarking report in Bencher

Replace the buffer-then-decode-then-hash flow with a streaming pass
through ssri's IntegrityChecker. `extract_attachments` now returns
the base64 payload verbatim (no eager decode), and a new
`stream_decode_verify_and_write` consumes that payload via
`base64::read::DecoderReader`, feeding 64 KiB chunks to the on-disk
file and to the integrity / shasum hashers in lockstep. The publish
handler runs the streaming routine inside `spawn_blocking` and
finalizes (or cleans up) the tmp slots asynchronously.

The decoded tarball never has to live in memory all at once — only
the original base64 string and a fixed-size working buffer do. On a
100 MiB publish that saves roughly 100 MiB of peak heap.

Failure modes (mismatched integrity, mismatched shasum, mismatched
length, base64 decode error) clean up the in-progress tmp file
before returning so a rejected publish leaves no on-disk artifact.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
registry/crates/pnpm-registry/src/server.rs (1)

661-668: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Rollback promoted tarballs if the publish still fails.

After Line 662 renames a tarball into its final cache path, any later finalize_tarball_slot failure or the write_packument failure at Line 667 returns immediately without deleting the tarballs that were already promoted. That leaves a failed publish with persisted artifacts and a partially-updated package directory.

Suggested direction
-    for slot in written_slots {
-        if let Err(err) = state.inner.cache.finalize_tarball_slot(slot).await {
-            return error_response(&err);
-        }
-    }
+    let mut finalized_paths = Vec::with_capacity(written_slots.len());
+    for slot in written_slots {
+        let final_path = slot.final_path.clone();
+        if let Err(err) = state.inner.cache.finalize_tarball_slot(slot).await {
+            cleanup_finalized_paths(finalized_paths).await;
+            return error_response(&err);
+        }
+        finalized_paths.push(final_path);
+    }
 
     if let Err(err) = state.inner.cache.write_packument(&name, &merged_bytes).await {
+        cleanup_finalized_paths(finalized_paths).await;
         return error_response(&err);
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@registry/crates/pnpm-registry/src/server.rs` around lines 661 - 668, The loop
promoting tarball slots (written_slots ->
state.inner.cache.finalize_tarball_slot) can leave promoted files when a later
finalize_tarball_slot or state.inner.cache.write_packument(&name, &merged_bytes)
fails; update the publish error path so that on any failure after promotions you
iterate over the already-promoted slots and call the cache rollback/removal
(e.g., a new state.inner.cache.rollback_tarball_slot or remove_promoted_slot
operation) for each slot to revert them, and ensure rollback errors are
logged/handled but do not mask the original error returned by error_response;
modify the code around finalize_tarball_slot, written_slots and write_packument
to perform this cleanup before returning the error_response.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@registry/crates/pnpm-registry/src/publish.rs`:
- Around line 138-159: The two integrity-failure branches in publish.rs should
return the client-visible EINTEGRITY variant rather than a plain
InvalidAttachment message: update the Err(invalid(format!("base64 decode failed:
{err}"))) in the base64-decode failure branch and the
Err(invalid(format!("length mismatch: header says {expected}, decoded
{total}"))) in the declared_length mismatch branch to emit the same
EINTEGRITY-style error used elsewhere (either by prefixing the message with
"EINTEGRITY: " or by calling the existing helper that constructs EINTEGRITY
errors if one exists), so failures in the base64 decode and declared-length
checks match the other integrity errors; keep the same diagnostic text after the
EINTEGRITY marker and leave the RegistryError::Io(err) handling unchanged.

---

Outside diff comments:
In `@registry/crates/pnpm-registry/src/server.rs`:
- Around line 661-668: The loop promoting tarball slots (written_slots ->
state.inner.cache.finalize_tarball_slot) can leave promoted files when a later
finalize_tarball_slot or state.inner.cache.write_packument(&name, &merged_bytes)
fails; update the publish error path so that on any failure after promotions you
iterate over the already-promoted slots and call the cache rollback/removal
(e.g., a new state.inner.cache.rollback_tarball_slot or remove_promoted_slot
operation) for each slot to revert them, and ensure rollback errors are
logged/handled but do not mask the original error returned by error_response;
modify the code around finalize_tarball_slot, written_slots and write_packument
to perform this cleanup before returning the error_response.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 97746665-8f5f-44c8-a0da-7d87f550174a

📥 Commits

Reviewing files that changed from the base of the PR and between 2a03d2d and 3af4ed8.

📒 Files selected for processing (3)
  • registry/crates/pnpm-registry/src/cache.rs
  • registry/crates/pnpm-registry/src/publish.rs
  • registry/crates/pnpm-registry/src/server.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
registry/crates/**/*.rs

📄 CodeRabbit inference engine (registry/AGENTS.md)

Follow the pacquet code-style guide (../pacquet/CODE_STYLE_GUIDE.md) for Rust-level conventions including imports, naming, ownership, error handling, and test layout

Files:

  • registry/crates/pnpm-registry/src/cache.rs
  • registry/crates/pnpm-registry/src/server.rs
  • registry/crates/pnpm-registry/src/publish.rs

Comment thread registry/crates/pnpm-registry/src/publish.rs Outdated
… EINTEGRITY

The remaining two failure modes in stream_decode_verify_and_write —
base64 decode error and declared-length mismatch — return plain
InvalidAttachment messages rather than the EINTEGRITY-prefixed form
the integrity and shasum branches use. Clients that key off
EINTEGRITY to decide whether to retry the publish would miss these
cases. Both are integrity-class failures (the bytes the client sent
can't be reconstructed, or don't add up to what the client claimed),
so they get the same prefix as the hash mismatches.
zkochan added 2 commits May 28, 2026 01:56
…, and multi-attachment cleanup paths

Four branches in the publish flow had no test coverage:

* Malformed `dist.integrity` — the SRI parser fails before any
  bytes get streamed; previously only mismatch-after-streaming was
  tested.
* Invalid base64 payload — the EINTEGRITY-prefixed branch added in
  response to the CodeRabbit review (commit 41f8582) was
  inferred-correct but unproven.
* Payloads larger than the 64 KiB chunk buffer — every prior happy
  path used a tiny payload that drained in one read. A regression
  where the streaming loop returned after the first chunk (or where
  the hasher only saw the first chunk) would still pass them all.
  The new test uses 200 KiB.
* Multi-attachment cleanup — `cleanup_tmp_slots` removes already-
  written tmp files when a later attachment fails integrity. Real
  npm clients only ever publish one attachment, so this code path
  had no integration coverage. Confirmed by temporarily no-op'ing
  the cleanup call: the new test fails with the expected leftover
  `*.tmp.*` file.
`s` in the data-field match arm and `n` for the read byte count
were the two single-letter identifiers introduced by this PR. The
code style guide prefers descriptive names for production code
even where perfectionist's `single_letter_let_binding` allowlist
would let `n` slide. Switch to `bytes_read`, and collapse the
`Some(Value::String(s)) => s` arm into a `let Some(Value::String(data)) = ... else`
so the binding goes away entirely.
@zkochan zkochan merged commit 87136ed into main May 28, 2026
26 of 28 checks passed
@zkochan zkochan deleted the feat/11975 branch May 28, 2026 00:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pnpr: verify tarball SHA on publish

2 participants